Development of news text thematic classification system using machine learning algorithms
Authors: Chelyshev E.A., Otsokov Sh.A., Raskatova M.V., Shchegolev P.
Section: Informatics and Computer Engineering
Published in issue: 3, 2022.
Free access
The article describes the development of a system for the thematic classification of news texts using machine learning algorithms. The work uses a dataset of news articles, each of which belongs to one of nine categories. A method for preparing the text data for subsequent classification is described. Documents are vectorized with the FastText model. Classifiers are built with four different classification algorithms, and their quality is evaluated on several metrics. The paper also describes a web application developed as part of the thematic classification system, along with its interface.
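The pipeline outlined in the abstract (text preparation, vectorization, classification, evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions: the vectorizer hashes FastText-style character n-grams into a fixed-size vector instead of loading a trained FastText model, a nearest-centroid classifier stands in for the four algorithms, which the abstract does not name, and macro-averaged F1 stands in for the unspecified metrics. All function names, the `DIM` constant, and the sample texts are hypothetical.

```python
import math
import re
import zlib
from collections import defaultdict

DIM = 64  # hypothetical vector size chosen for this toy example


def normalize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]


def vectorize(text):
    """Hash every n-gram into one of DIM buckets, then L2-normalize.

    Assumption: this is a stand-in for a trained FastText model and has
    no learned weights.
    """
    vec = [0.0] * DIM
    for word in normalize(text):
        for gram in char_ngrams(word):
            vec[zlib.crc32(gram.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def fit_centroids(texts, labels):
    """Average the document vectors of each class (nearest-centroid training)."""
    sums = defaultdict(lambda: [0.0] * DIM)
    counts = defaultdict(int)
    for text, label in zip(texts, labels):
        for i, v in enumerate(vectorize(text)):
            sums[label][i] += v
        counts[label] += 1
    return {lab: [v / counts[lab] for v in vec] for lab, vec in sums.items()}


def predict(centroids, text):
    """Return the label whose centroid has the largest dot product with the text."""
    vec = vectorize(text)
    return max(centroids,
               key=lambda lab: sum(a * b for a, b in zip(centroids[lab], vec)))


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical two-category sample; the paper's dataset has nine categories.
    train_texts = ["The team won the football match", "Central bank raises interest rates"]
    train_labels = ["sport", "economy"]
    model = fit_centroids(train_texts, train_labels)
    predictions = [predict(model, t) for t in train_texts]
    print(predictions, macro_f1(train_labels, predictions))
```

In a real system the `vectorize` function would be replaced by document vectors from a trained FastText model, and the centroid classifier by whichever algorithms are being compared; the evaluation step stays the same.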
Keywords: natural language processing, machine learning, classification, category, normalization, performance measure, web application
Short URL: https://sciup.org/148325179
IDR: 148325179 | DOI: 10.18137/RNU.V9187.22.03.P.185