Topic modelling and emotion analysis of the tweets of British and American politicians on the topic of war in Ukraine
DOI:
https://doi.org/10.29038/eejpl.2022.9.2.kar
Keywords:
lexical token, raw frequency, relative frequency, virtual discourse, topic modelling, emotion analysis, Twitter
Abstract
This paper examines the content and emotive features of posts that four politicians published on their official Twitter accounts during a three-month period of the russian invasion of Ukraine. We selected two British politicians – Boris Johnson, the Prime Minister of the UK, and Yvette Cooper, Labour MP and Shadow Home Secretary – as well as two American politicians: Joe Biden, President of the USA, and Marco Rubio, a Republican senator. In the first phase, we constructed a dataset of the four politicians' tweets on the topic of the war in Ukraine. To be included, a tweet had to contain words such as Ukraine, russia, war, putin, or invasion occurring in the same context. In the second phase, we identified the most frequent lexical tokens the politicians used to inform the world community about the war in Ukraine. For this purpose, we used Voyant Tools, a web-based application for text analysis. We divided these tokens into three groups by frequency: most frequent, second most frequent, and third most frequent. Additionally, we measured the distribution of the most frequent lexical tokens across the three-month time span to explore how their frequency fluctuated over the study period. In the third phase, we analysed the context of the identified lexical tokens, thereby outlining the subject of the tweets. To do this, we extracted collocations using the Natural Language Toolkit (NLTK) library. In the final phase, we performed topic modelling using the Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM) and emotion analysis using the NRC Lexicon library.
Disclosure Statement
No potential conflict of interest was reported by the authors.
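The frequency and collocation phases summarised in the abstract can be illustrated roughly as follows. This is a minimal standard-library sketch, not the authors' pipeline: the study relied on Voyant Tools and NLTK, whereas the code below only counts tokens and adjacent word pairs. The sample posts, the stop list, the per-1,000-token normalisation, and all function names are assumptions introduced for illustration.

```python
import re
from collections import Counter

# Hypothetical sample posts; NOT taken from the study's dataset.
SAMPLE_POSTS = [
    "We stand with the people of Ukraine.",
    "The invasion of Ukraine must end.",
    "Sanctions will follow the invasion of Ukraine.",
]

# Small illustrative stop list (an assumption; the abstract does not specify one).
STOP_WORDS = {"the", "of", "with", "we", "will", "must"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def token_frequencies(posts):
    """Return {token: (raw frequency, relative frequency per 1,000 tokens)}.

    The per-1,000 scale is an assumed normalisation chosen for readability.
    """
    tokens = [t for post in posts for t in tokenize(post) if t not in STOP_WORDS]
    counts = Counter(tokens)
    total = len(tokens)
    return {tok: (n, 1000 * n / total) for tok, n in counts.items()}


def bigram_counts(posts):
    """Count adjacent word pairs within each post (a crude collocation proxy)."""
    pairs = Counter()
    for post in posts:
        tokens = tokenize(post)
        pairs.update(zip(tokens, tokens[1:]))
    return pairs


if __name__ == "__main__":
    freqs = token_frequencies(SAMPLE_POSTS)
    # Print tokens from most to least frequent.
    for token, (raw, rel) in sorted(freqs.items(), key=lambda kv: -kv[1][0]):
        print(f"{token}: raw={raw}, per_1000={rel:.1f}")
    print(bigram_counts(SAMPLE_POSTS).most_common(3))
```

Raw counts alone favour whichever account posts most, so the relative figure is what allows comparison across politicians with corpora of different sizes. Plain bigram counting ignores association strength; a measure such as pointwise mutual information would be a natural refinement.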
References
Горошко Е. И., Полякова Т. Л. (2011) Лингвистические особенности англоязычного твиттера. Учені записки Таврійського національного університету імені Вернадського. Сер. Філологія. Соціальні комунікації, Т. 24(63). № 2(1), 53–58. URL: http://repository.kpi.kharkov.ua/handle/KhPI-Press/49133
Нерян С. О. (2018) Допис у соцмережі як мовленнєвий жанр інтернет-комунікації. Науковий вісник Херсонського державного університету, Сер. Лінгвістика, 33 (1), 66-70. URL: https://journals.indexcopernicus.com/api/file/viewByFileId/708692.pdf
Ніколаєва, Т. М. (2019). Лексико-семантичні аспекти мови соціальних мереж. Закарпатські філологічні студії, 9 (2), 96-101. URL: https://dspace.uzhnu.edu.ua/jspui/handle/lib/33170
Полякова Т. Л. (2021). Лексичні засоби в жанрі твітинг в англомовній політичній інтернет-комунікації. Закарпатські філологічні студії, 14 (1), 177-181. https://doi.org/10.32782/tps2663-4880/2020.14-1.32
Швелідзе Л. Д. (2021) Мовні засоби реалізації комунікативних стратегій у дискурсі соціальних мереж (на матеріалі української та англійської мов) (дис. … канд. філол. наук). Донецький національний університет імені Василя Стуса, Вінниця. URL: https://abstracts.donnu.edu.ua/article/view/9878
Bird, S. (2006). NLTK: the natural language toolkit. Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions (69-72). https://doi.org/10.3115/1225403.1225421
Crystal, D. (2011). A Microexample: Twitter. In Internet Linguistics: A Student Guide (pp. 36-56). London and New York: Routledge, Taylor & Francis Group.
Mohammad, S. M., & Turney, P. D. (2013). NRC Emotion Lexicon. National Research Council, Canada. https://doi.org/10.4224/21270984
Weisser, C., Gerloff, C., Thielmann, A., Python, A., Reuter, A., Kneib, T., & Säfken, B. (2022). Pseudo-document simulation for comparing LDA, GSDMM and GPM topic models on short and sparse text using Twitter data. Computational Statistics, 1-28. http://dx.doi.org/10.1007/s00180-022-01246-z
Yin, J., & Wang, J. (2014). A dirichlet multinomial mixture model-based approach for short text clustering. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge discovery and data mining, (233-242). https://doi.org/10.1145/2623330.2623715
References (translated and transliterated)
Goroshko E. I., Poliakova T. L. (2011). Lingvisticheskiie osobennosti angloiazychnogo tvittera. [Linguistic features of English-language Twitter]. Ucheni Zapysky Tavriiskoho Natsionalnoho Universytetu Imeni Vernadskoho. Filolohiia. Sotsialni Komunikatsii Series, 24(63), No 2–1, 53–58. Retrieved from http://repository.kpi.kharkov.ua/handle/KhPI-Press/49133
Nerian, S. O. (2018). Dopys u sotsmerezhi yak movlennievyi zhanr internet-komunikatsii. [A post in a social network as a speech genre of Internet communication]. Naukovyi Visnyk Khersonskoho Derzhavnoho Universytetu. Linhvistyka Series, 33 (1), 66-70. Retrieved from https://journals.indexcopernicus.com/api/file/viewByFileId/708692.pdf
Nikolaieva, T. M. (2019) Leksyko-semantychni aspekty movy sotsialnykh merezh [Lexico-semantic aspects of the social networks language]. Zakarpatski Filolohichni Studii, 9(2), 96-101. Retrieved from https://dspace.uzhnu.edu.ua/jspui/handle/lib/33170
Poliakova T. L. (2021). Leksychni zasoby v zhanri tvitynh v anhlomovnii politychnii internet-komunikatsii [Lexical means in the genre Tweeting in English political Internet communication]. Zakarpatski Filolohichni Studii, 14(1), 177-181. https://doi.org/10.32782/tps2663-4880/2020.14-1.32
Shvelidze, L. D. (2021) Movni zasoby realizatsii komunikatyvnykh stratehii u dyskursi sotsialnykh merezh (na materiali ukrainskoi ta anhliyskoi mov) [Linguistic Means of Implementation of Communicative Strategies in the Social Media Discourse (Based on the Ukrainian and English Languages)]. Unpublished doctoral dissertation. Extended abstract. Vasyl' Stus Donetsk National University, Vinnytsya. Retrieved from https://abstracts.donnu.edu.ua/article/view/9878
Source
CJE gives recommendations for the use of words “orcs,” “ruscists,” and “putin” in the media. Retrieved from https://imi.org.ua/en/news/cje-gives-recommendations-for-the-use-of-words-orcs-ruscists-and-putin-in-the-media-i45817 (date of access: 7.12.2022)
License
Copyright (c) 2022 Olena Karpina, Justin Chen
This work is licensed under a Creative Commons Attribution 4.0 International License.