Development of news text thematic classification system using machine learning algorithms

Free access

The article describes the development of a system for the thematic classification of news texts using machine learning algorithms. The work uses a dataset of news articles, each belonging to one of nine categories. The method used to prepare the text data for classification is described. Documents are vectorized with the FastText model. Classifiers were built with four different classification algorithms, and their quality was evaluated against several metrics. The paper also describes the web application developed as part of the classification system, along with its interface.

Keywords: natural language processing, machine learning, classification, category, normalization, performance measure, web application

Short URL: https://sciup.org/148325179

IDR: 148325179   |   DOI: 10.18137/RNU.V9187.22.03.P.185

Research article