Lexical and grammatical markers of emotions as parameters for sentiment analysis of internet texts in Russian
Authors: Kolmogorova Anastasia V., Vdovina Lyubov A.
Journal: Perm University Herald. Russian and Foreign Philology (Вестник Пермского университета. Российская и зарубежная филология) @vestnik-psu-philology
Section: Language, Culture, Society
Published in: No. 3, Vol. 11, 2019.
Free access
The article reports intermediate results in building an automatic classifier for Russian-language Internet texts. The classifier assigns each text to one of 8 classes corresponding to the 8 basic emotions proposed by the Swedish biologist Hugo Lövheim: ‘anger / rage’, ‘interest / excitement’, ‘enjoyment / joy’, ‘contempt / disgust’, ‘surprise’, ‘shame / humiliation’, ‘fear / terror’, ‘distress / anguish’. The training sample consists of anonymous texts in the genre of ‘Internet revelations’ posted by users of the social network VKontakte. The classifier is a machine learning model based on the support vector machine method. Its input parameters are the frequency of the punctuation marks ‘?’, ‘!’, ‘?!’, ‘...’; the presence of the negative particle ‘ne’; the constructions ‘takoi + adjective’ and ‘tak + adverb’; the collocation ‘kogda lyudi govoryat’; the presence of parceling, question words, and the particle ‘-to’; lexemes from the lexical fields ‘death’, ‘disease’, ‘family’, ‘loneliness’; and adverbs of measure and degree. The main result of the paper is the validation of the most characteristic verbal markers of specific emotions as parameters that determine the accuracy of the classifier. We conclude that the effectiveness of a parameter depends on how frequently the corresponding verbal marker occurs in the emotional text corpora. The accuracy achieved by the classifier is compared with that of a dummy classifier that assigns classes at random. In conclusion, the paper highlights the most useful verbal markers, assesses the prospects of the project for practical applications, and raises the question of continuing the study to increase the accuracy of attribution.
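To make the input parameters listed in the abstract more concrete, the following is a minimal Python sketch of how a few of the surface markers could be counted in a single text. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_features`, the feature names, the tiny `QUESTION_WORDS` list, and the regular expressions are all hypothetical. Markers that require morphological analysis or lexical-field dictionaries (‘takoi + adjective’, ‘tak + adverb’, parceling, the ‘death’ / ‘disease’ / ‘family’ / ‘loneliness’ fields) are deliberately left out.

```python
import re

# Hypothetical, deliberately tiny list of Russian question words;
# the word lists actually used in the study are not given in the abstract.
QUESTION_WORDS = {"кто", "что", "где", "когда", "почему", "зачем", "как"}

# '?!' and the ellipsis are listed before '?' and '!' so that the longer
# marks are matched first and not double-counted as single marks.
PUNCT_RE = re.compile(r"\?!|\.\.\.|…|\?|!")

# Lowercase Cyrillic or Latin words, keeping hyphenated forms such as 'почему-то'.
WORD_RE = re.compile(r"[а-яёa-z]+(?:-[а-яёa-z]+)*")


def extract_features(text):
    """Count a few surface markers of emotion in one text (crude approximation)."""
    lowered = text.lower()
    words = WORD_RE.findall(lowered)
    marks = PUNCT_RE.findall(text)
    return {
        "question": marks.count("?"),
        "exclamation": marks.count("!"),
        "question_exclamation": marks.count("?!"),
        "ellipsis": marks.count("...") + marks.count("…"),
        # Negative particle 'ne' as a standalone word.
        "neg_ne": words.count("не"),
        # Any word ending in '-to'; this does not distinguish the particle
        # from indefinite pronouns such as 'кто-то'.
        "particle_to": sum(1 for w in words if w.endswith("-то")),
        # Counts every occurrence, including conjunction uses of 'что'.
        "question_words": sum(1 for w in words if w in QUESTION_WORDS),
        # Presence (0 or 1) of the collocation 'kogda lyudi govoryat'.
        "kogda_lyudi_govoryat": int("когда люди говорят" in lowered),
    }


sample = ("Почему-то мне так страшно... Когда люди говорят, что всё хорошо, "
          "я не верю?! Что делать? Не знаю!")
features = extract_features(sample)
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to a support vector machine implementation, with a random-assignment baseline trained on the same data for comparison; which library and which settings the authors used is not stated in the abstract.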
Verbal markers, machine learning, sentiment analysis, ranked classifier, classification of basic emotions, computational linguistics, social media
Short URL: https://sciup.org/147226974
IDR: 147226974 | DOI: 10.17072/2073-6681-2019-3-38-46