Y-method of text classification

Author: Yatsko Viatcheslav

Journal: Грани познания @grani-vspu

Section: Philological sciences

Article in issue: 3 (74), 2021.

Free access

The article examines the specific features of automatic text classification. It describes the procedures of a new classification method based on calculating the deviations of the stop-word distribution from the Zipfian score: recognizing stop-words and creating ranked lists; calculating the deviations of term frequencies from the Zipfian score; calculating document indices based on the standard deviation; and finding the degree of document similarity. The author introduces indicators of classification efficiency: discriminative power, similarative power, and a generalized index. The method was tested and proved efficient for the genre classification task.
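The four procedures listed in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so every concrete choice below is an assumption made for illustration rather than the author's definition: the stop-word list `STOP_WORDS` is hypothetical, the Zipfian score at rank r is taken to be the top frequency divided by r, the document index is taken to be the population standard deviation of the deviations, and similarity is taken to be the ratio of the smaller index to the larger.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the article's actual list is not given here.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is"}

def ranked_stop_words(text):
    """Return stop-word frequencies sorted in descending order (rank 1 first)."""
    tokens = text.lower().split()
    counts = Counter(t for t in tokens if t in STOP_WORDS)
    return sorted(counts.values(), reverse=True)

def zipf_deviations(freqs):
    """Deviation of each observed frequency from the assumed Zipfian score.

    Assumption: the expected frequency at a given rank is top_frequency / rank.
    """
    if not freqs:
        return []
    top = freqs[0]
    return [f - top / rank for rank, f in enumerate(freqs, start=1)]

def document_index(text):
    """Assumed index: population standard deviation of the Zipf deviations."""
    devs = zipf_deviations(ranked_stop_words(text))
    if not devs:
        return 0.0
    mean = sum(devs) / len(devs)
    return math.sqrt(sum((d - mean) ** 2 for d in devs) / len(devs))

def similarity(index_a, index_b):
    """Assumed similarity: ratio of the smaller index to the larger, in [0, 1]."""
    if index_a == index_b:
        return 1.0
    return min(index_a, index_b) / max(index_a, index_b)
```

Under these assumptions, a document whose stop-word frequencies follow the assumed Zipfian curve exactly receives an index of 0, and two documents with equal indices receive a similarity of 1. The efficiency indicators named in the abstract (discriminative power, similarative power, generalized index) are not sketched here because the abstract does not define them.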


Automatic text document classification, methods and algorithms, Zipf distribution, efficiency indices, discriminative power, genre classification, degree of text document similarity

Short URL: https://sciup.org/148322074

IDR: 148322074

Research article