On methods and models of keyword automatic extraction
Authors: Sheremetyeva S.O., Osminin P.G.
Journal: Bulletin of the South Ural State University. Series: Linguistics @vestnik-susu-linguistics
Section: Applied linguistics
Published in: issue 1, vol. 12, 2015.
Open access
The paper presents an overview and classification of the major approaches to automatic keyword extraction from text documents. The approaches fall into two groups, statistical and hybrid, and each group can be further divided into corpus-based and document-based methods. The advantages and shortcomings of particular approaches are analyzed. It is argued that statistical keyword extraction methods are problematic for inflecting languages such as Russian. Requirements for an efficient model of automatic keyword extraction from Russian texts are formulated, and specific recommendations for meeting them are given. It is emphasized that building an effective keyword extractor requires taking into account the linguistic type of the language (analytical, inflecting, agglutinative, isolating), the domain (sublanguage), and the available linguistic and programming resources. The approach is illustrated by a case study of a keyword extractor for Russian texts on mathematical modeling.
Automatic extraction, Russian
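The abstract does not say why purely statistical methods struggle with inflecting languages. One commonly cited reason, offered here only as an assumption, is that a single term is spread over many surface forms, so raw frequency counts underestimate it. The minimal Python sketch below illustrates this on a made-up token list. The `LEMMAS` table is a hand-written stand-in for a real morphological analyzer, and the function names are hypothetical; none of this is taken from the paper.

```python
from collections import Counter

# Made-up toy document: the noun "модель" ("model") occurs in four
# different case forms, while "метод" ("method") occurs twice unchanged.
tokens = ["модель", "модели", "моделью", "моделей", "метод", "метод"]

# Hand-written lemma table (assumption): a stand-in for a real
# morphological analyzer, covering only the forms used above.
LEMMAS = {"модели": "модель", "моделью": "модель", "моделей": "модель"}


def surface_counts(tokens):
    """Count tokens exactly as they appear in the text."""
    return Counter(tokens)


def lemma_counts(tokens):
    """Count tokens after mapping each known form to its dictionary form."""
    return Counter(LEMMAS.get(token, token) for token in tokens)


surface = surface_counts(tokens)
lemmas = lemma_counts(tokens)

top_surface = surface.most_common(1)  # [('метод', 2)]
top_lemma = lemmas.most_common(1)     # [('модель', 4)]
```

On surface forms the top candidate is "метод" with two occurrences, because the four occurrences of "модель" are counted as four different words. After normalization "модель" ranks first with four occurrences.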
Short URL: https://sciup.org/147153946
IDR: 147153946