Analytical review of architectures, models, methods, and algorithms for localization and tracking of non-rigid objects
Authors: Гриценко Г. Г., Фраленко В. П.
Journal: Программные системы: теория и приложения @programmnye-sistemy
Section: Artificial intelligence and machine learning
Article in issue: No. 4 (63), Vol. 15, 2024.
Free access
Computer vision requires analyzing a video stream: extracting information from frames, detecting specific objects, and collecting data about them. Once objects are detected, they often need to be tracked through the video stream. Non-rigidity, that is, variability of shape, hinders object analysis, complicates detection and tracking, and degrades localization. This review examines the architectures, models, methods, and algorithms used in practice to detect and track non-rigid objects and highlights promising solutions.
Keywords: non-rigid object, artificial neural network, deep learning, object localization, object tracking, fire and smoke detection, medical image analysis
Short URL: https://sciup.org/143183787
IDR: 143183787 | DOI: 10.25209/2079-3316-2024-15-4-111-151
References
- Ergasheva A., Akhmedov F., Abdusalomov A., Kim W. Advancing maritime safety: early detection of ship fires through computer vision, deep learning approaches, and histogram equalization techniques // Fire.– 2024.– Vol. 7.– No. 3.– id. 84.– 15 pp. https://doi.org/10.3390/fire7030084
- Farkhod A., Abdusalomov A., Makhmudov F., Cho Y. I. LDA-based topic modeling sentiment analysis using topic/document/sentence (TDS) // Applied Sciences.– 2021.– Vol. 11.– No. 23.– id. 11091.– 15 pp. https://doi.org/10.3390/app112311091
- Xu F., Zhang X., Deng T., Xu W. An image-based fire monitoring algorithm resistant to fire-like objects // Fire.– 2024.– Vol. 7.– No. 1.– id. 3.– 12 pp. https://doi.org/10.3390/fire7010003
- Woo S., Park J., Lee J. -Y. CBAM: convolutional block attention module.– 2018.– 17 pp. arXiv 1807.06521v2 [cs.CV] https://doi.org/10.48550/arXiv.1807.06521
- Li G., Chen P., Xu C., Sun C., Ma Y. Anchor-free smoke and flame recognition algorithm with multi-loss // Fire.– 2023.– Vol. 6.– No. 6.– id. 225.– 16 pp. https://doi.org/10.3390/fire6060225
- Li X., Liang Y. Fire-RPG: an urban fire detection network providing warnings in advance // Fire.– 2024.– Vol. 7.– No. 7.– id. 214.– 22 pp. https://doi.org/10.3390/fire7070214
- Ding X., Zhang X., Ma N., Han J., Ding G., Sun J. RepVGG: Making VGG-style ConvNets great again.– 2021.– 10 pp. arXiv 2101.03697 [cs.CV] https://doi.org/10.48550/arXiv.2101.03697
- Tang Y., Han K., Guo J., Xu C., Xu C., Wang Y. GhostNetV2: enhance cheap operation with long-range attention.– 2022.– 12 pp. arXiv 2211.12905 [cs.CV] https://doi.org/10.48550/arXiv.2211.12905
- Zhang Q. L., Yang Y. B. SA-Net: shuffle attention for deep convolutional neural networks.– 2021.– 9 pp. arXiv 2102.00240 [cs.CV] https://doi.org/10.48550/arXiv.2102.00240
- Wang Q., Wu B., Zhu P., Li P., Zuo W., Hu Q. ECA-Net: efficient channel attention for deep convolutional neural networks.– 2020.– 12 pp. arXiv 1910.03151v4 [cs.CV] https://doi.org/10.48550/arXiv.1910.03151
- Yang L., Zhang R. Y., Li L., Xie X. Simple attention module based speaker verification with iterative noisy label detection.– 2021.– 5 pp. arXiv 2110.06534 [cs.CV] https://doi.org/10.48550/arXiv.2110.06534
- Xie J., Zhao H. Forest fire object detection analysis based on knowledge distillation // Fire.– 2023.– Vol. 6.– No. 12.– id. 446.– 15 pp. https://doi.org/10.3390/fire6120446
- Jin C., Wang T., Alhusaini N., Zhao S., Liu H., Xu K., Zhang J. Video fire detection methods based on deep learning: datasets, methods, and future directions // Fire.– 2023.– Vol. 6.– No. 8.– id. 315.– 27 pp. https://doi.org/10.3390/fire6080315
- Yuan F., Zhang L., Wan B., Xia X., Shi J. Convolutional neural networks based on multi-scale additive merging layers for visual smoke recognition // Machine Vision and Applications.– 2019.– Vol. 30.– Pp. 345–358. https://doi.org/10.1007/s00138-018-0990-3
- Muhammad K., Ahmad J., Lv Z., Bellavista P., Yang P., Baik S. W. Efficient deep CNN-based fire detection and localization in video surveillance applications // IEEE Transactions on Systems, Man, and Cybernetics: Systems.– 2019.– Vol. 49.– No. 7.– Pp. 1419–1434. https://doi.org/10.1109/TSMC.2018.2830099
- Iandola F. N., Han S., Moskewicz M. W., Ashraf K., Dally W. J., Keutzer K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.– 2016.– 13 pp. arXiv 1602.07360 [cs.CV] https://doi.org/10.48550/arXiv.1602.07360
- Khudayberdiev O., Zhang J., Abdullahi S. M., Zhang S. Light-FireNet: an efficient lightweight network for fire detection in diverse environments // Multimedia Tools and Applications.– 2022.– Vol. 81.– Pp. 24553–24572. https://doi.org/10.1007/s11042-022-12552-5
- Zheng S., Gao P., Wang W., Zou X. A highly accurate forest fire prediction model based on an improved dynamic convolutional neural network // Applied Sciences.– 2022.– Vol. 12.– No. 13.– id. 6721.– 15 pp. https://doi.org/10.3390/app12136721
- Tao H., Duan Q. An adaptive frame selection network with enhanced dilated convolution for video smoke recognition // Expert Systems with Applications.– 2023.– Vol. 215.– id. 119371.– 11 pp. https://doi.org/10.1016/j.eswa.2022.119371
- Khan Z. A., Hussain T., Ullah F. U. M., Gupta S. K., Lee M. Y., Baik S. W. Randomly initialized CNN with densely connected stacked autoencoder for efficient fire detection // Engineering Applications of Artificial Intelligence.– 2022.– Vol. 116.– id. 105403.– 11 pp. https://doi.org/10.1016/j.engappai.2022.105403
- Hu C., Tang P., Jin W., He Z., Li W. Real-time fire detection based on deep convolutional long-recurrent networks and optical flow method // Proceedings of the 2018 37th Chinese Control Conference (CCC), CCC 2018 (Wuhan, China, 25–27 July, 2018).– IEEE.– 2018.– ISBN 978-1-538-64968-8.– Pp. 9061–9066. https://doi.org/10.23919/ChiCC.2018.8483118
- Li S., Yan Q., Liu P. An efficient fire detection method based on multiscale feature extraction, implicit deep supervision and channel attention mechanism // IEEE Transactions on Image Processing.– 2020.– Vol. 29.– Pp. 8467–8475. https://doi.org/10.1109/TIP.2020.3016431
- Yang C., Pan Y., Cao Y., Lu X. CNN-transformer hybrid architecture for early fire detection // Proceedings of the Artificial Neural Networks and Machine Learning.– V. IV, ICANN 2022: 31st International Conference on Artificial Neural Networks (Bristol, UK, 6–9 September, 2022), Lecture Notes in Computer Science.– vol. 13532, Berlin: Springer.– 2022.– ISBN 978-3-031-15936-7.– Pp. 570–581. https://doi.org/10.1007/978-3-031-15937-4_48
- Wang X., Cai L., Zhou S., Jin Y., Tang L., Zhao Y. Fire safety detection based on CAGSA-YOLO network // Fire.– 2023.– Vol. 6.– No. 8.– id. 297.– 19 pp. https://doi.org/10.3390/fire6080297
- Ding Z., Zhao Y., Li A., Zheng Z. Spatial-temporal attention two-stream convolution neural network for smoke region detection // Fire.– 2021.– Vol. 4.– No. 4.– id. 66.– 12 pp. https://doi.org/10.3390/fire4040066
- Cao Y., Tang Q., Lu X., Li F., Cao J. STCNet: spatio-temporal cross network for industrial smoke detection.– 2020.– 10 pp. arXiv 2011.04863 [cs.CV] https://doi.org/10.48550/arXiv.2011.04863
- Shou Y., Meng T., Ai W., Xie C., Liu H., Wang Y. Object detection in medical images based on hierarchical transformer and mask mechanism // Computational Intelligence and Neuroscience.– 2022.– Vol. 2022.– id. 5863782.– 12 pp. https://doi.org/10.1155/2022/5863782
- Lee S. -G., Kim E., Bae J. S., Kim J. H., Yoon S. Robust end-to-end focal liver lesion detection using unregistered multiphase computed tomography images // IEEE Transactions on Emerging Topics in Computational Intelligence.– 2023.– Vol. 7.– No. 2.– Pp. 319–329. https://doi.org/10.1109/TETCI.2021.3132382
- De Frutos J. P., Pedersen A., Pelanis E., Bouget D., Survarachakan S., Langø T., Elle O. -J., Lindseth F. Learning deep abdominal CT registration through adaptive loss weighting and synthetic data generation // PLOS ONE.– 2023.– Vol. 18.– No. 2.– Pp. 1–14. https://doi.org/10.1371/journal.pone.0282110
- Tyagi A. K., Mohapatra C., Das P., Makharia G., Mehra L., AP P., Mausam. DeGPR: deep guided posterior regularization for multi-class cell detection and counting.– 2023.– 11 pp. arXiv 2304.00741 [cs.CV] https://doi.org/10.48550/arXiv.2304.00741
- Kang M., Ting C. -M., Ting F. F., Phan R. C. -W. RCS-YOLO: a fast and high-accuracy object detector for brain tumor detection.– 2023.– 11 pp. arXiv 2307.16412v2 [cs.CV] https://doi.org/10.48550/arXiv.2307.16412
- Kang M., Ting C. -M., Ting F. F., Phan R. C. -W. BGF-YOLO: enhanced YOLOv8 with multiscale attentional feature fusion for brain tumor detection.– 2023.– 5 pp. arXiv 2309.12585v2 [cs.CV] https://doi.org/10.48550/arXiv.2309.12585
- Xu X., Jiang Y., Chen W., Huang Y., Zhang Y., Sun X. DAMO-YOLO: a report on real-time object detection design.– 2023.– 10 pp. arXiv 2211.15444v4 [cs.CV] https://doi.org/10.48550/arXiv.2211.15444
- Jadon A., Omama M., Varshney A., Ansari M. S., Sharma R. FireNet: a specialized lightweight fire & smoke detection model for real-time IoT applications.– 2019.– 6 pp. arXiv 1905.11922v2 [cs.CV] https://doi.org/10.48550/arXiv.1905.11922
- Shees A., Ansari M. S., Varshney A., Asghar M. N., Kanwal N. FireNet-v2: improved lightweight fire detection model for real-time IoT applications // Procedia Computer Science.– 2023.– Vol. 218.– Pp. 2233–2242. https://doi.org/10.1016/j.procs.2023.01.199
- Altowaijri A. H., Alfaifi M. S., Alshawi T. A., Ibrahim A. B., Alshebeili S. A. A privacy-preserving IoT-Based fire detector // IEEE Access.– 2021.– Vol. 9.– Pp. 51393–51402. https://doi.org/10.1109/ACCESS.2021.3069588
- Valikhujaev Y., Abdusalomov A., Cho Y. I. Automatic fire and smoke detection method for surveillance systems based on dilated CNNs // Atmosphere.– 2020.– Vol. 11.– No. 11.– id. 1241.– 15 pp. https://doi.org/10.3390/atmos11111241
- Muhammad K., Ahmad J., Mehmood I., Rho S., Baik S. W. Convolutional neural networks based fire detection in surveillance videos // IEEE Access.– 2018.– Vol. 6.– Pp. 18174–18183. https://doi.org/10.1109/ACCESS.2018.2812835
- Saponara S., Elhanashi A., Gagliardi A. Real-time video fire/smoke detection based on CNN in antifire surveillance systems // Journal of Real-Time Image Processing.– 2021.– Vol. 18.– Pp. 889–900. https://doi.org/10.1007/s11554-020-01044-0
- Ayala A., Lima E., Fernandes B., Bezerra B. L., Cruz F. Lightweight and efficient octave convolutional neural network for fire recognition // Proceedings of the 2019 IEEE Latin American Conference on Computational Intelligence, LA-CCI’2019 (Guayaquil, Ecuador, 11–15 November, 2019).– IEEE.– 2019.– ISBN 978-1-7281-5666-8.– 6 pp. https://doi.org/10.1109/LA-CCI47412.2019.9037059
- Saponara S., Elhanashi A., Gagliardi A. Exploiting R-CNN for video smoke/fire sensing in antifire surveillance indoor and outdoor systems for smart cities // Proceedings of the 2020 IEEE International Conference on Smart Computing, SMARTCOMP’2020 (Bologna, Italy, 14–17 September, 2020).– IEEE.– 2020.– ISBN 978-1-7281-6997-2.– Pp. 392–397. https://doi.org/10.1109/SMARTCOMP50058.2020.00083
- Thomson W., Bhowmik N., Breckon T. P. Efficient and compact convolutional neural network architectures for non-temporal real-time fire detection.– 2020.– 6 pp. arXiv 2010.08833 [cs.CV] https://doi.org/10.48550/arXiv.2010.08833
- Zoph B., Vasudevan V., Shlens J., Le Q. V. Learning transferable architectures for scalable image recognition // Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), CVPR’18 (Salt Lake City, Utah, 18–22 June, 2018).– IEEE.– 2018.– ISBN 978-1-728-13294-5.– Pp. 8697–8710. https://doi.org/10.1109/CVPR.2018.00907
- Ma N., Zhang X., Zheng H. -T., Sun J. Shufflenet v2: practical guidelines for efficient CNN architecture design // Proceedings of the 2018 European Conference on Computer Vision (ECCV), ECCV’18 (Munich, Germany, 8–14 September, 2018), Lecture Notes in Computer Science.– vol. 11218, Cham: Springer.– 2018.– ISBN 978-3-030-01263-2.– Pp. 122–138. https://doi.org/10.1007/978-3-030-01264-9_8
- Li H., Kadav A., Durdanovic I., Samet H., Graf H. P. Pruning filters for efficient ConvNets.– 2017.– 13 pp. arXiv 1608.08710v3 [cs.CV] https://doi.org/10.48550/arXiv.1608.08710
- Hu Y., Zhan J., Zhou G., Chen A., Cai W., Guo K., Hu Y., Li L. Fast forest fire smoke detection using MVMNet // Knowledge-Based Systems.– 2022.– Vol. 241.– 20 pp. https://doi.org/10.1016/j.knosys.2022.108219
- Yan K., Bagheri M., Summers R. M. 3D context enhanced region-based convolutional neural network for end-to-end lesion detection.– 2018.– 11 pp. arXiv 1806.09648v2 [cs.CV] https://doi.org/10.48550/arXiv.1806.09648
- Zhang P., Liu W., Wang D., Lei Y., Wang H., Shen C., Lu H. Non-rigid object tracking via deep multi-scale spatial-temporal discriminative saliency maps.– 2019.– 12 pp. arXiv 1802.07957v2 [cs.CV] https://doi.org/10.48550/arXiv.1802.07957
- Hong S., You T., Kwak S., Han B. Online tracking by learning discriminative saliency map with convolutional neural network // Proceedings of the 32nd International Conference on Machine Learning, ICML’15 (Lille, France, 6–11 July, 2015), PMLR.– vol. 37.– 2015.– ISBN 978-1-510-81058-7.– Pp. 597–606. https://dl.acm.org/doi/10.5555/3045118.3045183
- Son J., Jung I., Park K., Han B. Tracking-by-segmentation with online gradient boosting decision tree // Proceedings of the 2015 IEEE International Conference on Computer Vision, ICCV’15 (Santiago, Chile, 07–13 December, 2015).– IEEE.– 2015.– ISBN 978-1-4673-8391-2.– Pp. 3056–3064. https://doi.org/10.1109/ICCV.2015.350
- Sun X., Cheung N. -M., Yao H., Guo Y. Non-rigid object tracking via deformable patches using shape-preserved KCF and level sets // Proceedings of the 2017 IEEE International Conference on Computer Vision, ICCV’17 (Venice, Italy, 22–29 October, 2017).– IEEE.– 2017.– ISBN 978-1-5386-1032-9.– Pp. 5496–5504. https://doi.org/10.1109/ICCV.2017.586
- Duffner S., Garcia C. PixelTrack: a fast adaptive algorithm for tracking non-rigid objects // Proceedings of the 2013 IEEE International Conference on Computer Vision, ICCV’13 (Sydney, NSW, Australia, 1–8 December, 2013).– IEEE.– 2013.– ISBN 978-1-4799-2840-8.– Pp. 2480–2487. https://doi.org/10.1109/ICCV.2013.308
- Sevilla-Lara L., Learned-Miller E. Distribution fields for tracking // Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR’12 (Providence, RI, USA, 16–21 June, 2012).– 2012.– ISBN 978-1-4673-1226-4.– Pp. 1910–1917. https://doi.org/10.1109/CVPR.2012.6247891
- Godec M., Roth P. M., Bischof H. Hough-based tracking of non-rigid objects // Proceedings of the 2011 IEEE International Conference on Computer Vision, ICCV’11 (Barcelona, Spain, 06–13 November, 2011).– 2011.– ISBN 978-1-4577-1101-5.– Pp. 81–88. https://doi.org/10.1109/ICCV.2011.6126228
- Sun X., Yao H., Zhang S., Li D. Non-rigid object contour tracking via a novel supervised level set model // IEEE Transactions on Image Processing.– 2015.– Vol. 24.– No. 11.– Pp. 3386–3399. https://doi.org/10.1109/TIP.2015.2447213
- Li Y., Zhu J., Hoi S. Reliable Patch Trackers: Robust visual tracking by exploiting reliable patches // Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, CVPR’15 (Boston, MA, USA, 07–12 June, 2015).– IEEE.– 2015.– ISBN 978-1-4673-6964-0.– Pp. 353–361. https://doi.org/10.1109/CVPR.2015.7298632
- Olszewska J. I., Mathes T., Vleeschouwer C. D., Piater J., Macq B. Non-rigid object tracker based on a robust combination of parametric active contour and point distribution model // Visual Communications and Image Processing 2007 (San Jose, CA, USA, 28 January–1 February, 2007), Proc. SPIE.– vol. 6508.– 2007.– ISBN 978-0-8194-6621-1.– id. 65082A.– 8 pp. https://doi.org/10.1117/12.704234 https://app.amanote.com/v4.1.7/research/note-taking?resourceId=8qyuAnQBKQvf0Bhi_O9Q
- Mathes T., Piater J. Robust non-rigid object tracking using point distribution manifolds // Pattern Recognition, Lecture Notes in Computer Science.– vol. 4174, Berlin–Heidelberg: Springer.– 2006.– ISBN 978-3-540-44414-5.– Pp. 515–524. https://doi.org/10.1007/11861898_52
- Руиз-Родригез М., Кобер В. И., Карнаухов В. Н., Мозеров М. Г. Алгоритм трехмерной реконструкции нежестких объектов с использованием камеры глубины // Информационные процессы.– 2019.– Vol. 19.– No. 4.– Pp. 388–398. http://www.jip.ru/2019/388-398-2019.pdf
- Sipiran I., Bustos B. Harris 3D: a robust extension of the Harris operator for interest point detection on 3D meshes // The Visual Computer.– 2011.– Vol. 27.– No. 11.– Pp. 963–976. https://doi.org/10.1007/s00371-011-0610-y
- Zhong Y. Intrinsic shape signatures: A shape descriptor for 3D object recognition // Proceedings of the 2009 IEEE Conference on Computer Vision Workshops, ICCVW’09 (Kyoto, Japan, 27 September–4 October, 2009).– IEEE.– 2009.– ISBN 978-1-4244-4442-7.– Pp. 689–696. https://doi.org/10.1109/ICCVW.2009.5457637
- Smith S. M., Brady J. M. SUSAN — a new approach to low level image processing // International Journal of Computer Vision.– 1997.– Vol. 23.– No. 1.– Pp. 45–78. https://doi.org/10.1023/A:1007963824710
- Lowe D. G. Distinctive image features from scale-invariant keypoints // International Journal of Computer Vision.– 2004.– Vol. 60.– No. 2.– Pp. 91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94
- Rusu R. B., Marton Z. C., Blodow N., Beetz M. Persistent point feature histograms for 3D point clouds // Proceedings of the 10th International Conference on Intelligent Autonomous Systems, IAS-10 (Baden-Baden, Germany, 23–25 July, 2008).– IOS Press.– 2008.– ISBN 978-1-58603-887-8.– Pp. 119–128. https://doi.org/10.3233/978-1-58603-887-8-119
- Tombari F., Salti S., Stefano L. D. Unique signatures of histograms for local surface description // Proceedings of the 2010 European Conference on Computer Vision, ECCV’10 (Crete, Greece, 5–11 September, 2010), Lecture Notes in Computer Science.– vol. 6313, Berlin–Heidelberg: Springer.– 2010.– ISBN 978-3-642-15557-4.– Pp. 356–369. https://doi.org/10.1007/978-3-642-15558-1_26
- Frome A., Huber D., Kolluri R., Bulow T., Malik J. Recognizing objects in range data using regional point descriptors // Proceedings of the 2004 European Conference on Computer Vision, ECCV’04 (Prague, Czech Republic, 11–14 May, 2004), Berlin–Heidelberg: Springer.– 2004.– ISBN 978-3-540-21982-8.– Pp. 224–237. https://doi.org/10.1007/978-3-540-24672-5_18
- Lazebnik S., Schmid C., Ponce J. A sparse texture representation using local affine regions // IEEE Transactions on Pattern Analysis and Machine Intelligence.– 2005.– Vol. 27.– No. 8.– Pp. 1265–1278. https://doi.org/10.1109/TPAMI.2005.151
- Marton Z. C., Pangercic D., Blodow N., Kleinehellefort J., Beetz M. General 3D modelling of novel objects from a single view // Proceedings of the 2010 IEEE/RSJ Conference on Intelligent Robots and Systems, IROS’10 (Taipei, Taiwan, 18–22 October, 2010).– IEEE.– 2010.– ISBN 978-1-4244-6674-0.– Pp. 3700–3705. https://doi.org/10.1109/IROS.2010.5650434
- Sturm J., Engelhard N., Endres F., Burgard W., Cremers D. A benchmark for the evaluation of RGB-D SLAM systems // Proceedings of the 2012 IEEE/RSJ Conference on Intelligent Robots and Systems (IROS), IROS’12 (Vilamoura-Algarve, Portugal, 7–12 October, 2012).– IEEE.– 2012.– ISBN 978-1-4673-1737-5.– Pp. 573–580. https://doi.org/10.1109/IROS.2012.6385773