Object tracking with deep learning
Authors: V. V. Buryachenko, A. I. Pahirka
Journal: Siberian Aerospace Journal
Section: Informatics, computer technology and management
Issue: No. 2, Vol. 21, 2020.
Free access
Object tracking is a key task of video analytics and computer vision with applications in many fields. Most tracking systems comprise two stages: object detection and tracking of changes in object position. At the first stage, objects of interest are detected in each frame of the video sequence; at the second, correspondences between the detected objects in neighboring frames are established. In difficult video surveillance conditions, however, this task faces a number of challenges associated with changes in frame illumination and in object shape, for example when a person is walking; the task is further complicated when the camera itself moves. The aim of this work is to develop a deep-learning-based method for object tracking that can follow several objects in the frame, including under harsh video surveillance conditions. The paper surveys modern approaches to object tracking, among which the application of deep neural networks is the most promising. The main approach used in this paper is a region-based convolutional neural network (R-CNN), which has proven to be an effective method for detecting and recognizing objects in images. The proposed algorithm uses an ensemble of two deep neural networks: one detects objects, and the other refines the classification results and the object boundaries. The effectiveness of the developed system is evaluated with the standard MOT (Multi-Object Tracking) metrics on well-known publicly available databases and compared against other published work.
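The two-stage scheme described in the abstract (per-frame detection followed by frame-to-frame association) can be illustrated with a minimal sketch. This is not the authors' implementation: their method uses an ensemble of two deep networks, whereas here a greedy IoU matcher stands in for the association stage, and the `detect` callable (e.g. a wrapper around a Faster R-CNN detector), the box format, and the threshold are all assumptions made for illustration.

```python
# Minimal tracking-by-detection sketch: detect objects in each frame,
# then match them to the previous frame's tracks by bounding-box overlap.
# Tracks that find no match are dropped (no re-identification here).

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, threshold=0.3):
    """Greedily match previous-frame tracks to new detections by IoU."""
    matches, unmatched = {}, list(range(len(detections)))
    for track_id, box in tracks.items():
        if not unmatched:
            break
        best = max(unmatched, key=lambda j: iou(box, detections[j]))
        if iou(box, detections[best]) >= threshold:
            matches[track_id] = best
            unmatched.remove(best)
    return matches, unmatched

def track(frames, detect):
    """detect(frame) -> list of boxes; returns per-frame {id: box} maps."""
    tracks, next_id, history = {}, 0, []
    for frame in frames:
        detections = detect(frame)
        matches, unmatched = associate(tracks, detections)
        tracks = {tid: detections[j] for tid, j in matches.items()}
        for j in unmatched:  # each unmatched detection starts a new track
            tracks[next_id] = detections[j]
            next_id += 1
        history.append(dict(tracks))
    return history
```

In practice the greedy matcher is usually replaced by Hungarian assignment and appearance features, which is precisely where deep networks improve robustness under illumination and shape changes.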
Intelligent systems, deep learning, motion estimation, region-based convolutional neural network (R-CNN).
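For reference, the MOT accuracy score mentioned in the abstract (MOTA, as defined for the MOT16 benchmark cited in the reference list below) combines misses, false positives, and identity switches into a single number. The short function below is a sketch of that published formula; the argument names are chosen for illustration.

```python
# MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames, where FN are
# missed objects, FP false positives, IDSW identity switches, and GT the
# total number of ground-truth objects. Higher is better; can be negative.

def mota(false_negatives, false_positives, id_switches, num_gt_objects):
    """Multi-Object Tracking Accuracy over a whole sequence."""
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt_objects
```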
Short address: https://sciup.org/148321730
IDR: 148321730 | DOI: 10.31772/2587-6066-2020-21-2-150-154
References: Object tracking with deep learning
- Krizhevsky A., Sutskever I., Hinton G. E. ImageNet classification with deep convolutional neural networks. Communications of the ACM. 2017, Vol. 60, No. 6, P. 84–90.
- Socher R., Perelygin A., Wu J. Y., Chuang J., Manning C. D., Ng A. Y., Potts C. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). 2013, P. 1631–1642.
- Mnih V., Kavukcuoglu K., Silver D. et al. Human-level control through deep reinforcement learning. Nature. 2015, Vol. 518, P. 529–533. DOI: 10.1038/nature14236.
- Khan G., Tariq Z., Khan M. Multi-Person Tracking Based on Faster R-CNN and Deep Appearance Features. In: Mazzeo P. L., Ramakrishnan S., Spagnolo P. (eds). Visual Object Tracking with Deep Neural Networks. IntechOpen, 2019. DOI: 10.5772/intechopen.85215.
- Ren S., He K., Girshick R., Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv:1506.01497 [cs.CV], 2015.
- Hui T.-W., Tang X., Loy C.-C. LiteFlowNet: A lightweight convolutional neural network for optical flow estimation. IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, USA, 2018, P. 8981–8989.
- Dosovitskiy A., Fischer P., Ilg E., Häusser P., Hazırbaş C., Golkov V., van der Smagt P., Cremers D., Brox T. FlowNet: Learning optical flow with convolutional networks. IEEE International Conference on Computer Vision, 2015, P. 2758–2766.
- Wang M. et al. Deep Online Video Stabilization With Multi-Grid Warping Transformation Learning. IEEE Transactions on Image Processing, 2019, Vol. 28, No. 5, P. 2283–2292. DOI: 10.1109/TIP.2018.2884280.
- Favorskaya M. N., Buryachenko V. V., Zotin A. G., Pahirka A. I. Video completion in digital stabilization task using pseudo-panoramic technique. Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. 2017, Vol. XLII-2/W4, P. 83–90.
- Favorskaya M. N., Buryachenko V. V. Background extraction method for analysis of natural images captured by camera traps. Informatsionno-upravliaiushchie sistemy. 2018, No. 6, P. 35–45. DOI: 10.31799/1684-8853-2018-6-35-45.
- Geiger A., Lenz P., Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite. IEEE 2012 Conference on Computer Vision and Pattern Recognition, Providence, RI, 2012, P. 3354–3361.
- Drone Videos: DJI Mavic Pro Footage in Switzerland. Available at: https://www.kaggle.com/kmader/drone-videos (accessed 05.05.2019).
- Milan A., Leal-Taixé L., Reid I., Roth S., Schindler K. MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs], 2016.
- Sun S., Akhtar N., Song H., Mian A. S., Shah M. Deep Affinity Network for Multiple Object Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, P. 1–15.
- Zhou X., Xie L., Zhang P., Zhang Y. An Ensemble of Deep Neural Networks for Object Tracking. IEEE International Conference on Image Processing (ICIP), 2014.