Social Network Analysis and Statistical Processing of Tweets about COVID-19
Authors: Zolotarev Oleg Vasilyevich, Khakimova Aida Khatifovna
Section: Management of Complex Systems
Published in issue 2, 2022.
This paper analyzes Twitter messages to identify the most topical themes related to the worldwide spread of the coronavirus, using the Russian-language segment of Twitter as an example. The word2vec machine learning method was used to analyze the proximity of words in the text and to build terminological word chains. The most frequent n-grams related to COVID-19 were extracted. The authors propose an original method for identifying thematic groups of tweets and assessing their significance; it can be applied not only to short Twitter messages but also to the analysis of text documents. The study extracts n-grams of various lengths, analyzes them statistically, forms thematic groups comprising the most relevant n-grams, and determines their weights. A numerical indicator is proposed for assessing topic popularity. Both the current popularity of topics and their dynamics are taken into account. The authors note that discussion topics evolve and that new pandemic-related terms emerge.
Keywords: social network analysis, COVID-19, n-grams, tweets, thematic group
Short URL: https://sciup.org/148324976
IDR: 148324976 | DOI: 10.18137/RNU.V9187.22.02.P.084
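The n-gram extraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the function names (`tokenize`, `ngrams`, `count_ngrams`), and the toy English tweets are all assumptions made for illustration, and the word2vec step, the topic weights, and the popularity indicator are not reproduced because the abstract does not define them.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a tweet and split it into tokens (word characters and hyphens)."""
    return re.findall(r"[\w-]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token tuples in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(tweets, n):
    """Count n-grams over a collection of tweets without crossing tweet boundaries."""
    counts = Counter()
    for tweet in tweets:
        counts.update(ngrams(tokenize(tweet), n))
    return counts


# Hypothetical toy corpus, invented for illustration only.
tweets = [
    "Vaccine against COVID-19 is now available",
    "New vaccine against COVID-19 approved",
    "Masks are required again",
]

bigram_counts = count_ngrams(tweets, 2)
for gram, freq in bigram_counts.most_common(3):
    print(" ".join(gram), freq)
```

Counting per tweet keeps n-grams from spanning two unrelated messages, which seems a reasonable choice for short texts. Calling `count_ngrams(tweets, 3)` on the same corpus gives trigram frequencies in the same way.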