Formalization of basic processes and mathematical model of the system for monitoring and analysis of publications of electronic media

Free access

The article describes an approach to formalizing the basic processes of a system that collects and analyzes data from electronic media, and to building a mathematical model of that system. As part of a research project, the authors are creating such a system, which includes developing new algorithms, methods and approaches for collecting and analyzing textual information from Internet news sources. The study focuses on applying text mining methods based on artificial neural networks, natural language processing, machine learning and big data processing.

Purpose of the study. To develop a formalized description of the model of a system for monitoring and analyzing the textual information of electronic news media using mathematical modeling methods.

Research methods and tools. The study proposes using the toolkit of mathematical modeling methodology together with methods of system analysis. The following system analysis methods were used to study the system: abstraction, formalization, composition and decomposition, structuring and restructuring, modeling, recognition and identification. The system is treated as a formalized model of an automatic classifier and clusterer for a set of natural-language text documents, expressed as an algebraic system. Machine learning methods based on neural network approaches are proposed for classifying and clustering the texts. The structure of the system, its constituent processes and the external processes that interact with it are presented as a formalized mathematical description.

Results. The formalized mathematical description of the system model shows the interconnections among the system components as well as its internal processes. The approach makes it possible to detail the representation of the system by decomposing it into subsystems and modules. This in turn makes it possible to order the stages of creating the system and to break them down into separate work stages.

Conclusion. The results of the study make it possible to proceed to the next stage in the life cycle of the information system under development: its software development.

Keywords: media information monitoring, data analysis, monitoring and data analysis system, text analysis, mathematical model of the system, data mining, neural network methods, system analysis, text classification, text clustering

Short URL: https://sciup.org/147236499

IDR: 147236499   |   DOI: 10.14529/ctcr210403

Research article