Standardization and secure coding: combining quantization, pruning, and distillation into a single adaptive pipeline for Cortex-M class microcontrollers
Authors: Khudaiberideva G.B., Kozhukhov D.A., Pimenkova A.A.
Journal: Теория и практика современной науки (@modern-j)
Section: Main section
Issue: 8 (122), 2025.
Free access
The deployment of neural networks on Cortex-M class microcontrollers is constrained by limited computing resources, memory, and power consumption. Applying model compression methods such as quantization, pruning, and knowledge distillation in isolation shows limited effectiveness under these constraints. This work studies the synergistic effects of sequentially combining these methods in a single adaptive pipeline. The main focus is on analyzing the interdependencies between the methods, for example the effect of structured pruning on subsequent quantization. A methodology is proposed for building an adaptive tool that automatically determines and tunes the optimal sequence and parameters of the compression methods for a given model, a target Cortex-M microcontroller, and required accuracy targets. Experimental results confirm that the proposed adaptive pipeline outperforms isolated application of the compression methods, providing a higher degree of compression and acceleration while meeting the target accuracy metrics on resource-limited devices.
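The abstract describes a sequential compression pipeline without fixing its details. The PyTorch sketch below is a minimal, hypothetical illustration of one possible ordering of the three stages (structured pruning, then knowledge distillation, then int8 quantization) on a toy model; the network sizes, the 50% pruning ratio, the distillation temperature, and the use of dynamic quantization are assumptions made for illustration, not parameters taken from the paper, whose adaptive tool would select the sequence and settings for each target.

```python
# Hypothetical sketch of a sequential compression pipeline:
# structured pruning -> knowledge distillation -> int8 quantization.
# All sizes and hyperparameters are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

class SmallNet(nn.Module):
    """Toy model at the scale typically deployed on Cortex-M targets."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 32)
        self.fc2 = nn.Linear(32, 10)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

teacher = SmallNet()   # in a real pipeline the teacher would be a larger network
student = SmallNet()

# Step 1: structured pruning -- zero out 50% of fc1's output channels by L2 norm.
# (A deployment flow would physically remove the zeroed channels before export.)
prune.ln_structured(student.fc1, name="weight", amount=0.5, n=2, dim=0)
prune.remove(student.fc1, "weight")  # bake the pruning mask into the weights

# Step 2: knowledge distillation -- one illustrative optimization step in which
# the pruned student matches the teacher's softened outputs (temperature T = 4).
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
x = torch.randn(16, 64)  # stand-in batch of input features
with torch.no_grad():
    soft_targets = F.softmax(teacher(x) / 4.0, dim=1)
loss = F.kl_div(F.log_softmax(student(x) / 4.0, dim=1),
                soft_targets, reduction="batchmean")
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Step 3: int8 quantization of the pruned, distilled student. Dynamic quantization
# is used here only for brevity; an actual Cortex-M deployment would export
# full-integer weights and activations (e.g. via TFLite Micro / CMSIS-NN).
quantized_student = torch.quantization.quantize_dynamic(
    student, {nn.Linear}, dtype=torch.qint8)
print(quantized_student)
```

Ordering pruning before quantization reflects the interdependency the abstract highlights: removing channels first changes the weight distributions that the quantizer must cover, which is exactly the kind of interaction the proposed adaptive tool is meant to account for.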
Keywords: Cortex-M microcontrollers, TinyML
Short URL: https://sciup.org/140312532
IDR: 140312532 | UDC: 004.89