Transformer point net: cost-efficient classification of on-road objects captured by light ranging sensors on low-resolution conditions
Authors: Pamplona José Fernando, Madrigal Carlos Andrés, Herrera-Ramirez Jorge Alexis
Journal: Computer Optics @computer-optics
Section: Numerical methods and data analysis
Published in: Issue 2, Vol. 46, 2022.
Free access
Three-dimensional perception applications have been growing since Light Detection and Ranging devices became more affordable. Among these applications, navigation and collision avoidance systems stand out for their importance in autonomous vehicles, which are drawing considerable attention these days. On-road object classification from three-dimensional information is a solid base for an autonomous vehicle perception system, but several factors make the analysis of the captured information challenging. In these applications, objects are captured from only one side, their shapes are highly variable, and occlusions are common. The greatest challenge, however, is low resolution, which leads to a significant drop in the performance of classification methods. While most classification architectures tend to grow larger to obtain deeper features, we explore the opposite direction, contributing to the implementation of low-cost mobile platforms that could use low-resolution detection and ranging devices. In this paper, we propose an approach for on-road object classification under extremely low-resolution conditions. It uses three-dimensional point clouds directly as sequences in a transformer-convolutional architecture that could be useful on embedded devices. Our proposal reaches an accuracy of 89.74 % when tested on objects represented with only 16 points extracted from the Waymo, Lyft’s Level 5 and KITTI datasets. It achieves real-time operation (22 Hz) on a single 2.3 GHz processor core.
Lidar, point cloud, deep learning, object classification, transformers, low resolution, autonomous vehicles, low specification computing
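The idea in the abstract of treating a 16-point object as a short sequence fed to a transformer-convolutional classifier can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the layer sizes, the single attention head, the residual connection, the max-pooling step, the number of classes (4) and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def sample_points(cloud, n_points=16, rng=None):
    """Randomly pick n_points rows from an (N, 3) cloud (with repetition if N < n_points)."""
    rng = np.random.default_rng() if rng is None else rng
    replace = cloud.shape[0] < n_points
    idx = rng.choice(cloud.shape[0], size=n_points, replace=replace)
    return cloud[idx]


def normalize(points):
    """Center the object at the origin and scale it into the unit sphere."""
    centered = points - points.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale if scale > 0 else centered


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the point sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    return softmax(scores, axis=-1) @ v


def init_params(d_model=32, n_classes=4, seed=0):
    """Random, untrained weights; d_model and n_classes are hypothetical values."""
    rng = np.random.default_rng(seed)
    shapes = {
        "embed": (3, d_model),        # pointwise (1x1) convolution: xyz -> features
        "w_q": (d_model, d_model),
        "w_k": (d_model, d_model),
        "w_v": (d_model, d_model),
        "head": (d_model, n_classes),
    }
    return {name: rng.normal(0.0, 0.1, size=shape) for name, shape in shapes.items()}


def classify(points, params):
    """Return class probabilities for one (n_points, 3) object."""
    tokens = np.maximum(points @ params["embed"], 0.0)  # shared per-point layer + ReLU
    attended = tokens + self_attention(
        tokens, params["w_q"], params["w_k"], params["w_v"]
    )  # residual connection around the attention layer
    pooled = attended.max(axis=0)  # order-independent pooling over the points
    return softmax(pooled @ params["head"])


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    dense_object = rng.normal(size=(500, 3))  # stand-in for a segmented LiDAR object
    sparse_object = normalize(sample_points(dense_object, 16, rng))
    print(classify(sparse_object, init_params()))
```

Because no positional encoding is added and the features are max-pooled, this sketch gives the same output for any ordering of the 16 points. For scale, the reported 22 Hz rate corresponds to a budget of roughly 1/22 s ≈ 45 ms per classification.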
Short URL: https://sciup.org/140293817
IDR: 140293817