
Road images augmentation with synthetic traffic signs using neural networks

Authors: Konushin Anton Sergeevich, Faizov Boris Vladimirovich, Shakhuro Vladislav Igorevich

Journal: Computer Optics (Компьютерная оптика)

Section: Image processing, pattern recognition

Issue: Vol. 45, No. 5, 2021.

Free access

Traffic sign recognition is a well-researched problem in computer vision. However, state-of-the-art methods work only for frequent sign classes that are well represented in training datasets. We consider the task of rare traffic sign detection and classification and aim to solve it using synthetic training data. Such training data is obtained by embedding synthetic images of signs in real photos. We propose three methods for making synthetic signs consistent with a scene in appearance. These methods are based on modern generative adversarial network (GAN) architectures. Our proposed methods allow realistic embedding of rare traffic sign classes that are absent from the training set. We adapt a variational autoencoder for sampling plausible locations of new traffic signs in images. We demonstrate that using a mixture of our synthetic data with real data improves the accuracy of both the classifier and the detector.
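
The embedding step mentioned in the abstract can be pictured with a deliberately naive baseline: alpha-blending a sign patch into a photo at a chosen position. The sketch below is an illustration only and is not one of the GAN-based methods the paper proposes. The function name `paste_sign`, the list-of-rows grayscale representation, and the toy data are all assumptions made for this example.

```python
def paste_sign(image, sign, alpha, top, left):
    """Alpha-blend a sign patch into a copy of a grayscale image.

    image, sign and alpha are lists of rows; alpha values lie in [0, 1].
    Returns the new image and the bounding box (left, top, right, bottom).
    """
    h, w = len(sign), len(sign[0])
    if top < 0 or left < 0 or top + h > len(image) or left + w > len(image[0]):
        raise ValueError("sign patch does not fit inside the image")
    out = [row[:] for row in image]  # copy rows so the input stays untouched
    for y in range(h):
        for x in range(w):
            a = alpha[y][x]
            # Standard "over" compositing: the sign weighted by alpha,
            # the background weighted by (1 - alpha).
            out[top + y][left + x] = a * sign[y][x] + (1.0 - a) * image[top + y][left + x]
    return out, (left, top, left + w, top + h)


# Toy data (hypothetical): a 4x4 dark "photo" and a 2x2 bright "sign"
# whose mask is opaque on the left column and softer on the right.
photo = [[10.0] * 4 for _ in range(4)]
sign = [[200.0, 200.0], [200.0, 200.0]]
alpha = [[1.0, 0.5], [1.0, 0.0]]
augmented, box = paste_sign(photo, sign, alpha, top=1, left=2)
```

The returned bounding box can serve as the detection label for the pasted sign. A paste like this typically leaves the sign visually inconsistent with the scene (lighting, blur, noise), which is the gap the abstract's appearance-matching methods are meant to close.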


Keywords: Traffic sign classification, synthetic training samples, neural networks, image recognition, image transforms, neural network compositions

Short URL: https://sciup.org/140290271

IDR: 140290271 | DOI: 10.18287/2412-6179-CO-859

