Hardware implementation of a convolutional neural network using calculations in the residue number system

Authors: Chervyakov Nikolay Ivanovich, Lyakhov Pavel Alekseevich, Nagornov Nikolay Nikolaevich, Valueva Maria Vasilyevna, Valuev Georgii Vyacheslavovich

Journal: Computer Optics @computer-optics

Section: Image processing, pattern recognition

Published in: Vol. 43, Issue 5, 2019.

Free access

Modern convolutional neural network architectures are highly resource-intensive, which limits their wide practical use. The article proposes a convolutional neural network architecture split into hardware and software parts to increase computational performance. Modular arithmetic is used to implement the convolutional layer in the hardware part in order to reduce resource costs. A numerical method for quantizing the filter coefficients of the convolutional layer is proposed to minimize the effect of quantization noise on the result of computations in the residue number system and to determine the bit-width of the coefficients. The method is based on scaling the coefficients by a fixed number of bits and rounding up and down. The operations used reduce the resources needed for hardware implementation because they are simple to perform. All computations in the convolutional layer are performed on fixed-point numbers...
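The quantization step mentioned in the abstract (scaling by a fixed number of bits, then rounding down and up) can be illustrated with a minimal Python sketch. The function names, the sample coefficients, and the error measure are assumptions made for illustration; the article's actual procedure for choosing the bit-width is not reproduced here.

```python
import math

def quantize(coeffs, bits):
    """Scale each coefficient by 2**bits, then round down and round up.

    Returns two lists of integers: the floor-rounded and ceil-rounded
    fixed-point representations of the coefficients.
    """
    scale = 1 << bits
    down = [math.floor(c * scale) for c in coeffs]
    up = [math.ceil(c * scale) for c in coeffs]
    return down, up

def max_abs_error(coeffs, quantized, bits):
    """Largest absolute difference between original and quantized values."""
    scale = 1 << bits
    return max(abs(c - q / scale) for c, q in zip(coeffs, quantized))

# Hypothetical flattened 3x3 filter; these values are not from the article.
filter_coeffs = [0.125, -0.3, 0.72, 0.05, -0.61, 0.4, 0.33, -0.08, 0.9]

for bits in (4, 8, 12):
    down, up = quantize(filter_coeffs, bits)
    print(bits,
          max_abs_error(filter_coeffs, down, bits),
          max_abs_error(filter_coeffs, up, bits))
```

With either rounding direction the per-coefficient error stays below 2^-bits, so each additional bit of scaling halves the worst-case quantization error at the cost of a wider integer range.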


Convolutional neural networks, image processing, pattern recognition, residue number system

Short URL: https://sciup.org/140246521

IDR: 140246521   |   DOI: 10.18287/2412-6179-2019-43-5-857-868

Hardware implementation of a convolutional neural network using calculations in the residue number system

Modern convolutional neural network architectures are very resource-intensive, which limits their wide practical application. We propose a convolutional neural network architecture in which the network is divided into hardware and software parts to increase performance and reduce implementation resource costs. We also propose using the residue number system in the hardware part to implement the convolutional layer of the network, which reduces resource costs. A numerical method for quantizing the filter coefficients of a convolutional layer is proposed to minimize the influence of quantization noise on the result of calculations in the residue number system and to determine the bit-width of the filter coefficients. This method is based on scaling the coefficients by a fixed number of bits and rounding up and down. The operations used reduce the resources required in a hardware implementation because they are simple to execute...
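To make the idea of computing a convolution in the residue number system concrete, the following Python sketch performs one multiply-accumulate window in independent residue channels and recovers the signed result with the Chinese remainder theorem. The moduli set, the function names, and the sample data are assumptions chosen for illustration; the article's moduli, converters, and hardware structure are not shown here.

```python
from math import prod

# Hypothetical pairwise coprime moduli (dynamic range M = 5544);
# this set is an assumption, not the one used in the article.
MODULI = (7, 8, 9, 11)

def to_rns(x, moduli=MODULI):
    """Represent an integer by its residues; negatives wrap modulo each m."""
    return tuple(x % m for m in moduli)

def rns_mac(pixels, weights, moduli=MODULI):
    """Multiply-accumulate carried out independently in each residue channel."""
    acc = [0] * len(moduli)
    for p, w in zip(pixels, weights):
        rp, rw = to_rns(p, moduli), to_rns(w, moduli)
        acc = [(a + x * y) % m for a, x, y, m in zip(acc, rp, rw, moduli)]
    return tuple(acc)

def from_rns(residues, moduli=MODULI):
    """CRT reconstruction, mapped to the signed range [-M/2, M/2)."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m) is the modular inverse
    x %= M
    return x - M if x >= (M + 1) // 2 else x

# Hypothetical pixel window and integer (already quantized) filter weights.
pixels = [12, 200, 35, 90]
weights = [2, -5, 11, 3]

result = from_rns(rns_mac(pixels, weights))
expected = sum(p * w for p, w in zip(pixels, weights))
assert result == expected
print(result)
```

Each channel works only with small residues and needs no carries between channels, which is the property that makes this arithmetic attractive for hardware. The result is correct only while the true sum stays inside the dynamic range defined by the product of the moduli.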

