A comprehensive data quality assurance model for training neural networks in conditions of unstable sources

Free access

The article presents a comprehensive model for ensuring the quality of data used to train neural network models under conditions of unstable sources. The topic is relevant because data arriving from variable, unstable, and heterogeneous sources contain a high level of defects, which reduces model accuracy and reliability. The paper substantiates the need for systematic quality control and proposes a V-model adapted to the stages of the data lifecycle in machine learning projects. The study covers typical defects such as noise, omissions, drift, and data inconsistency. Special attention is paid to a control architecture involving filtering, recovery, validation, and quality monitoring at the operational stage. To verify the proposed model, an experiment was conducted on simulated data, demonstrating improved predictive accuracy after cleaning and correcting the input streams. The main objective of the study is to develop a standardized approach to ensuring data quality in AI systems. Sources on industrial QA practice, preprocessing methods, ontological alignment, and drift monitoring were used. In conclusion, the possibilities of applying the model in critically sensitive industries are described, and recommendations for its implementation are given. The article will be useful for machine learning specialists, AI system developers, data engineers, and IT project managers involved in integrating unstable data flows into neural network training pipelines.
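The abstract mentions filtering, recovery, and drift monitoring as stages of the control architecture, but the article's own algorithms are not given here. As a purely illustrative sketch (the function names, thresholds, and the median-absolute-deviation noise filter are assumptions, not taken from the article), two of these stages could look like:

```python
import statistics

def clean_stream(values, z_thresh=3.5):
    """Filtering/recovery stage: drop omissions (None) and noisy outliers.

    Uses a robust modified z-score based on the median absolute
    deviation (MAD), so a single extreme value does not mask itself.
    """
    present = [v for v in values if v is not None]  # handle omissions
    med = statistics.median(present)
    mad = statistics.median(abs(v - med) for v in present) or 1e-9
    # keep points whose modified z-score is within the threshold
    return [v for v in present if 0.6745 * abs(v - med) / mad <= z_thresh]

def detect_drift(reference, current, tol=0.5):
    """Monitoring stage: flag drift when the current window's mean
    departs from the reference window's mean by more than `tol`."""
    return abs(statistics.mean(current) - statistics.mean(reference)) > tol

# Example: a raw stream with a missing value and a gross outlier
raw = [1.0, 1.1, None, 0.9, 50.0, 1.05]
cleaned = clean_stream(raw)  # → [1.0, 1.1, 0.9, 1.05]
print(cleaned)
print(detect_drift([1.0, 1.0, 1.1], [2.0, 2.1]))  # drift detected → True
```

In a production pipeline these checks would run continuously on incoming batches, with drift alarms triggering revalidation or retraining, in line with the operational-stage monitoring the article describes.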


V-model

Short address: https://sciup.org/170210811

IDR: 170210811   |   DOI: 10.24412/2500-1000-2025-7-2-262-266

Research article