Extreme specialization of large language models based on ontological relevance for industrial tasks
Authors: Khudaiberideva G.B., Kozhukhov D.A., Pimenkova A.A.
Journal: Теория и практика современной науки @modern-j
Section: Main section
Article in issue: 8 (122), 2025.
Free access
A methodology is proposed for the extreme compression of large language models (LLMs) through the purposeful removal of functionality that is irrelevant to a specific, narrowly scoped industrial task. Unlike traditional compression approaches, which aim to preserve a model's general capabilities, the proposed method identifies and then eliminates the parameters and internal representations responsible for processing knowledge outside the required domain. The method involves analyzing the semantic importance of data relative to the target ontology of the task (for example, diagnosing equipment faults from logs), applying structured pruning, and selectively freezing network modules. The result is a significant reduction in model size and computational requirements while the required specialized functionality is preserved. This approach makes it practical to deploy LLMs in resource-constrained industrial environments that demand high efficiency and predictability.
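The abstract does not include code, so the following is only a minimal sketch of the two mechanisms it names, structured pruning guided by domain relevance and selective parameter freezing, applied to a toy feed-forward block in PyTorch. The names `FFN` and `prune_ffn_by_relevance`, and the use of mean absolute activation on in-domain inputs as the relevance score, are illustrative assumptions, not the authors' implementation; in particular, scoring against a task ontology would replace the random stand-in batch with embeddings of ontology-relevant texts.

```python
import torch
import torch.nn as nn

# Toy stand-in for one feed-forward (FFN) block of a transformer layer.
class FFN(nn.Module):
    def __init__(self, d_model=64, d_hidden=256):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.act = nn.GELU()
        self.down = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        return self.down(self.act(self.up(x)))

@torch.no_grad()
def prune_ffn_by_relevance(ffn, domain_batch, keep_ratio=0.25):
    """Keep only the hidden channels that respond most strongly to
    domain-representative inputs, and rebuild smaller Linear layers."""
    # Relevance proxy (an assumption for this sketch): mean |activation|
    # per hidden channel over a batch of in-domain inputs.
    h = ffn.act(ffn.up(domain_batch))          # (batch, d_hidden)
    relevance = h.abs().mean(dim=0)            # (d_hidden,)
    k = max(1, int(keep_ratio * relevance.numel()))
    keep = relevance.topk(k).indices.sort().values

    # Structured pruning: drop whole channels by slicing the rows of
    # `up` and the matching columns of `down`, shrinking both layers.
    new_up = nn.Linear(ffn.up.in_features, k)
    new_up.weight.copy_(ffn.up.weight[keep])
    new_up.bias.copy_(ffn.up.bias[keep])
    new_down = nn.Linear(k, ffn.down.out_features)
    new_down.weight.copy_(ffn.down.weight[:, keep])
    new_down.bias.copy_(ffn.down.bias)
    ffn.up, ffn.down = new_up, new_down
    return ffn

ffn = FFN()
# Stand-in for embeddings of in-domain data (e.g., equipment logs).
domain_batch = torch.randn(32, 64)
prune_ffn_by_relevance(ffn, domain_batch, keep_ratio=0.25)

# Selective freezing: lock the pruned block so any later fine-tuning
# only updates other, task-critical modules.
for p in ffn.parameters():
    p.requires_grad_(False)

print(ffn.up.weight.shape)  # torch.Size([64, 64]) after pruning 256 -> 64
```

Because the pruning here is structured (entire channels are removed and the layers rebuilt at the smaller size), the memory and compute savings are realized directly, without the sparse masks left behind by unstructured weight pruning.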
Keywords: large language models, model compression, extreme specialization, industrial application, ontological relevance, structured pruning, parameter freezing, equipment diagnostics, log analysis, computational efficiency, resource-limited environments
Short address: https://sciup.org/140312538
IDR: 140312538 | UDC: 004.89