Social media analysis and statistical processing of tweets about COVID-19

Free access

This article analyzes Twitter messages to identify the most relevant topics related to the worldwide spread of the coronavirus, using the Russian-language segment of Twitter as a case study. The word2vec machine learning method was used to analyze the proximity of words in the text and to form terminological chains of words. The most common n-grams associated with COVID-19 were extracted. The authors propose an original method for identifying thematic groups of tweets and assessing their significance; it can be applied not only to short Twitter messages but also to the analysis of text documents. In the study, n-grams of various lengths are extracted and analyzed statistically, and thematic groups comprising the most relevant n-grams are formed and their weights determined. A numerical indicator is proposed for assessing the popularity of topics. Both the current popularity of topics and their dynamics are taken into account. The authors note that discussion topics evolve over time and that new pandemic-related terms emerge.
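The abstract does not specify how the n-grams were extracted or counted. As a rough illustration of the idea only, the sketch below counts word n-grams across a handful of tweets using the Python standard library. The tokenization rule, the function names (`tokenize`, `ngrams`, `count_ngrams`), and the three sample tweets are all assumptions made for this example; they are not taken from the article and are not the authors' method.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens (assumed rule: word characters and hyphens)."""
    return re.findall(r"[\w-]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tweets, n):
    """Count n-grams over a collection of tweets, never crossing tweet boundaries."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tokenize(tweet), n))
    return counts


# Hypothetical sample tweets, invented for illustration.
tweets = [
    "New covid-19 cases reported today",
    "covid-19 cases are rising again",
    "Vaccination slows covid-19 cases",
]

bigrams = count_ngrams(tweets, 2)
print(bigrams.most_common(1))
```

Counting per tweet, rather than over one concatenated string, keeps n-grams from spanning two unrelated messages. Raw counts like these could serve as one input to a topic weight, though how the article actually derives its weights and popularity indicator is not stated in the abstract.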


Keywords: analysis of social networks, COVID-19, n-gram, tweets, thematic group

Short URL: https://sciup.org/148324976

IDR: 148324976   |   DOI: 10.18137/RNU.V9187.22.02.P.084

Research article