Special aspects of matrix operation implementations for low-precision neural network model on the Elbrus platform

This paper investigates whether computations in low-precision neural network models can be implemented efficiently on the Elbrus platform, which has a VLIW architecture. Such models are widely used in practice to increase the computational efficiency of recognition and are well suited to computers with the x86 and ARM architectures. We consider an 8-bit neural network model in which matrix multiplication is the most resource-intensive part of the implementation. We present an efficient implementation of matrix multiplication that takes into account the features of the Elbrus architecture: several computational channels with various arithmetic and logic devices, an array prefetch buffer, and its own SIMD extension. Theoretical and experimental comparisons of the computational efficiency of low-precision and classical neural network models show that Elbrus processors have considerably greater capabilities for fast floating-point calculations and call for new approaches to increasing the computational efficiency of neural network models.
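To make concrete the kind of computation the abstract refers to, here is a minimal sketch of 8-bit matrix multiplication with wide integer accumulation. It assumes an affine quantization scheme with a scale and a zero point, as used in libraries such as gemmlowp; the abstract does not state which scheme the paper uses. The function names `quantize` and `matmul_u8` are hypothetical, and the sketch shows none of the Elbrus-specific optimizations (parallel channels, array prefetch buffer, SIMD).

```python
def quantize(values, scale, zero_point):
    """Map real values to unsigned 8-bit codes: q = round(x / scale) + zero_point.

    Assumed affine scheme; results are clamped to the range [0, 255].
    """
    return [min(255, max(0, round(v / scale) + zero_point)) for v in values]


def matmul_u8(a, b, a_zero, b_zero):
    """Multiply 8-bit matrices a (m x k) and b (k x n) with integer accumulation.

    Each output element is sum_p (a[i][p] - a_zero) * (b[p][j] - b_zero).
    The accumulator must be wider than 8 bits (typically 32 bits in hardware);
    Python integers are unbounded, so overflow is not modeled here.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += (a[i][p] - a_zero) * (b[p][j] - b_zero)
            out[i][j] = acc
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_u8(a, b, a_zero=0, b_zero=0))
```

Under this assumed scheme, an approximation of the real-valued product is recovered by multiplying each accumulator by the product of the two scales.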

Keywords: low-precision neural networks, computational efficiency, Elbrus architecture, matrix operations

Short URL: https://sciup.org/147232978

IDR: 147232978   |   DOI: 10.14529/mmp200109

Research article