Article Open access Peer-reviewed

The first two months in the war in Ukraine through topic modeling and sentiment analysis

2023; Elsevier BV; Volume: 15; Issue: 1 Language: English

10.1111/rsp3.12632

ISSN

1757-7802

Authors

Clara Maathuis, Iddo Kerkhof

Topic(s)

Social Media and Politics

Abstract

In recent years, social media has remained not only an environment for expression but has also become an active battlefield where users express their thoughts, feelings, beliefs, and experiences about ongoing conflicts and wars, and where social manipulation techniques are sometimes difficult to identify, tackle, and counter. Proper social media analytics solutions for the ongoing war in Ukraine are still incipient, but they are much needed. Consequently, this research aims to extract and analyze the topics discussed and the sentiments experienced by Ukrainian Telegram users, based on data collected in the first 2 months of the war in Ukraine. It does so through design science research in a data science methodological approach, carried out from a multidisciplinary stance and contributing to the ongoing strategic, socio-ethical, and technical discourses and efforts.
Motto: 'Nothing will end war unless the peoples themselves refuse to go to war.' (Albert Einstein)

On March 16 2022, a video of President Zelensky was placed on a Ukrainian news website, Ukraine 24, in which he appeared to tell his soldiers to lay down their arms and surrender in the fight against Russia. However, this video of about 1 minute was a deepfake, built to spread false claims, and it appeared on the website because the news agency had been hacked (Atlantic Council, 2022). The Ukrainian Defense Intelligence had warned about preparations made by Russia in this direction precisely 2 weeks before, on March 2 2022 (Defense Intelligence of Ukraine, 2022). Shortly after, President Zelensky posted a video response on Telegram and other social media channels in which he denied this claim and reaffirmed his position (Allyn, 2022; Techcrunch, 2022). This incident shows that social media, and cyberspace in general, has become an active battlefield in the ongoing war in Ukraine. Moreover, the (still) decentralized nature of social media, together with its spread and use according to one's interests, poses different types of challenges to users as well as to national and international authorities regarding the content contained, together with its nature, validity, target audience (e.g., public opinion), and further spread.
In this context, disinformation or misinformation techniques are used with the possibility of altering content to engage public opinion in discourses and beliefs that, for example, delegitimize, demonize, argue for the use of force, or provide specific arguments as causes for war in the target country (Ciuriak, 2022). The fight against such social media manipulation techniques already shows important results through different artificial intelligence (AI) and governance-based countermeasures embedded by social media platforms, public and private institutions, and organizations. However, these techniques continue to advance, and feelings of insecurity and instability continue to grow, for instance, through the use of advanced AI techniques such as generative adversarial networks (GANs) and autoencoders for building deepfake content (Carmack, 2022). While such content is now, in many cases, easily recognizable as fake, it continues to become more complex and intelligent and, therefore, less and less recognizable by the human eye.

The remainder of this article is structured as follows. The present section tackles the motivation behind this research while depicting the research aim it intends to achieve. Section 1 discusses relevant research for this article, considering and positioning this research in the ongoing discourses of the existing body of knowledge. Section 2 addresses the methodological approach considered in this research, that is, design science research through data science. Section 3 introduces the dataset used in this research together with the necessary preprocessing steps, and discusses the design choices together with implementation aspects involved in the design and development phase of the solution proposed. Section 4 presents the results obtained and positions these results in the ongoing discourses and context of the war in Ukraine.
Section 5 discusses the findings of this research and describes the challenges, limitations, and future work.

In this section, related studies are discussed with respect to the socio-technical nature of this research. Accordingly, related research focusing (i) on datasets and technical techniques used for implementation, and (ii) on policy and social aspects is discussed. Given their borderless nature and the possibility of targeting both physical and cyber targets (Maathuis, 2022), the battlefield options for cyber weapons have significantly increased in recent years through altering, disturbing, damaging, influencing, or manipulating both humans and systems via the social media environment (Chen & Ferrara, 2022). Such weapons have both direct and indirect impacts on people's cognition (NATO, 2021), beliefs, and behaviors, as well as on different (inter)national security perspectives, dimensions, and aspects (Maathuis et al., 2018a). The manipulation strategies used in social media include spreading propaganda, driving division and polarization, expanding the context to other media types, spreading altered or false information, and distracting or diverting attention (Chen et al., 2022). In this context, a great deal of effort is allocated to designing, building, and implementing corresponding strategies and techniques for detecting and countering disinformation and misinformation (Choraś et al., 2021; Islam et al., 2020). Fortunately, the other side of social media, which resonates with its initial purpose, is (live) information spreading and access, as well as communication with others (Krylova-Grek, 2022; Meikle, 2016): essential aspects of modern human existence and society, especially in difficult moments and crises of different natures.
Examples include the Arab Spring, or the so-called 'Facebook Revolution', in 2011 (Wolfsfeld et al., 2013), the influential campaigns with religious and fighting purposes of different extremist groups on Twitter and Facebook (Klausen, 2015; Pearson, 2018), the social movements and protests during EuroMaidan in Ukraine in 2013–2014 (Onuch, 2015), the annexation of Crimea in 2014 through Ukrainian discourse analysis on IQBuzz and strategic implications (Brodovskaya et al., 2019; Saluschev, 2014), and the proceeding discourses of the ongoing war in Ukraine (Chen & Ferrara, 2022). While a rich body of literature can be found on the first examples, considering several technical, social, political, or ethical aspects and implications, information on the ongoing war in Ukraine is scarce, with knowledge being in the incipient phase of its development. This information is much needed to get a clear and transparent view, not only of the events happening but also of the discourses, topics, and feelings expressed by the people experiencing them, their nearest and dearest, and/or those impacted by them. In this regard, initial efforts have been made to collect, analyze, and interpret data from different social media platforms with respect to different human aspects of the ongoing war in Ukraine. Juric (2022) analyzed Facebook, Twitter, Instagram, and YouTube data to gain insights into migration flows and integration trends of Ukrainian refugees, which can be of use to the integration processes into corresponding societies, further improve the response to the ongoing humanitarian crisis, and support labor market integration. Aguerri et al.
(2022) analyzed the information, timeline, and impact of the content of 90 Twitter accounts labeled as Russian affiliated, belonging to individual journalists and media outlets, before and after February 28 using the Autoregressive Integrated Moving Average (ARIMA) model, finding an increase in both activity and impact. Haq et al. (2022) acknowledged the increase in both anti-Ukrainian and anti-Russian social media content and proposed stricter information controls as a possible solution, but this would imply establishing who should have the power to define them, as well as how, and in which circumstances, they should be executed. Moreover, a set of valuable Twitter datasets has been proposed for conducting further social media analytics on different aspects of the war in Ukraine. One to mention here is the Twitter streaming dataset (Pohl et al., 2022), which uses Ukraine, Russia, and conflict as filtering terms and was collected from 1 week before to 1 week after the start of the war for further research purposes. Along the same line, Chen and Ferrara (2022) collected 63 million tweets between February 22 and March 8 2022, using filtering terms such as Ukraine, Russia, NATO, Donetsk, and Zelensky, to support the research community in this direction, and they continue to maintain and update the dataset to help understand the role that social media plays with respect to information sharing, influence campaigns, and mobilization in war times. Haq et al. (2022) also gathered Twitter data from the beginning of the war using filtering keywords such as Russia, Ukraine, Putin, Zelensky, and Kyiv, without applying language or geo-filters, to support the corresponding research community with data updated to the present moment, that is, May 2022. Moreover, Shevtsov et al. (2022) released a recent Twitter dataset containing IDs from 57.3 million tweets.
From the volume analysis conducted by the authors, one can see that the activity of registered users decreases with time, that the daily volume of suspended or deactivated accounts significantly increases over time, and that the users' sentiments vary over time. Among the steps that follow for capturing relevant insights are gathering and structuring the topics tackled in the corresponding discourses, and analyzing the sentiments or feelings expressed by the users. Polyzos (2022) gathered 42 million tweets from February 15 2022 to February 26 2022, using filtering terms such as Russia and Ukraine and limiting the data collection to tweets in English, while aiming to analyze the sentiments and their scores not only at the beginning of the war but also in the immediate period before its start. As part of the results obtained by Polyzos (2022), words such as insane, fight, attack, impose, and threatens were considered as contributing to negative sentiments, while words such as support, protect, promise, justice, and peace were considered as contributing to positive sentiments. Furthermore, Hanley et al. (2022) conducted an in-depth analysis of Western, Russian, and Chinese media discourses regarding the ongoing war in Ukraine, highlighting that while Western press outlets were preoccupied with the military and humanitarian aspects of this war, the Russian media focused on the justification for the 'special military operation' and Chinese news media concentrated on the diplomatic and economic consequences of this conflict. The abovementioned resources reflect the necessity and importance of collecting and analyzing relevant social media datasets and their corresponding discourses and narratives, which can support both the research and practitioner communities with direct and realistic information regarding different aspects of the ongoing war.
As reflected, incipient research exists on different social media platforms such as Twitter, Facebook, and Reddit, while information on other increasingly used social platforms such as Telegram, which turned into a 'digital battlefield' (Bergengruen, 2022), is, to the best of our knowledge, limited. It is, therefore, the scope of this research to tackle this knowledge gap using an approach similar to other studies in this domain, that is, topic modeling and sentiment analysis, taking into consideration their applicability and effectiveness demonstrated in the related studies discussed previously.

This research aims to build a technical solution for capturing topics and sentiments discussed by Ukrainian-speaking Telegram users in the context of the ongoing war in Ukraine. Taking into consideration the nature of this objective, which has a direct societal purpose, contribution, and implications (Offermann et al., 2009; Venable et al., 2017), the design science research methodology (Hevner et al., 2004; Peffers et al., 2018) is used in a data science approach (Shearer, 2000). Accordingly, a series of AI models, that is, artifacts, are built, taking a multidisciplinary approach by merging AI, social media, policy, international relations, and security studies concepts, methods, techniques, and technologies. Hence, the following research activities were undertaken:

The exponential increase in the use of social media platforms is also applicable in crises and conflicts of different natures, where different social, ethical, and legal concerns are raised (Rõigas & Geers, 2015; Saura, Ribeiro-Soriano, & Palacios-Marqués, 2021). The ongoing war in Ukraine is a real example in this case.
While different media outlets, organizations, and academic efforts have intensively tackled this issue, and continue to do so with regard to different dimensions of the ongoing war, a series of identical, similar, or even more specific and personal discourses are found with respect to different topics and perspectives. As the related resources studied in this research show, such discourses tackle war incidents, financial sanctions, movement of troops, military and humanitarian aid, and several types of effects such as injury or death to civilian populations, and damage or destruction to civilian infrastructure. Nevertheless, while recent research exists that investigates and positions data on the ongoing war discourses obtained from social media platforms such as Twitter and Facebook, research on platforms such as Telegram, which is intensively used by Ukrainians, is scarce (Nazaruk, 2022) but much needed, as this platform continues to represent a real alternative to other classical social media platforms and implies by default the so-called 'field nature' through its openness and direct connection to real events. Hence, this research tackles this knowledge gap, aiming to capture topics and sentiments of Ukrainian-speaking users from Telegram discourses for 2 months since the start of the war (i.e., from February 24 to April 23 2022) by building a technical solution for this purpose.

Once the problem to be studied was identified and the objective of this research defined, the corresponding research question was formulated as follows: What are the main topics and sentiments expressed by Ukrainian users in the first 2 months of war?

To be able to properly understand the context of this research and capture the necessary background from the domains involved, an extensive literature review was conducted.
In this process, studies related to this research were found by combining search keywords such as 'topic,' 'sentiment,' 'analysis,' 'Ukraine,' 'war,' and 'social media' into search queries run in the Institute of Electrical and Electronics Engineers (IEEE), Association for Computing Machinery (ACM), Scopus, Wiley, and Google Scholar scientific databases. Accordingly, the main related studies include strategic and policy analysis studies as well as tangential technical approaches for tackling similar societal and policy issues through the same AI techniques, as discussed in the Related Research section of this article.

The solution that this article builds and proposes is a technical artifact composed of a series of models that extract topics and analyze sentiments of users by implementing a series of machine learning models in Python. Accordingly, this process is addressed in the Model Implementation section of this article.

The results obtained in this research are directly interpreted and positioned in the current academic and societal discourses regarding different dimensions characterizing the ongoing war in Ukraine, while future perspectives and research ideas are discussed, considering a socio-technical stance.

The findings of this research (results obtained and models developed), their interpretation, and the research conducted are disseminated by means of this article and scientific presentations.

In this research, Telegram data was collected from the first day of the war, that is, February 24, up to April 23 2022, a total of 2 months, from a Ukrainian channel whose name is kept private for privacy reasons. The dataset contains 9,736 messages in the Ukrainian language, for which the date and text body were saved in a .csv file for further preprocessing.
Furthermore, a translation mechanism was applied to the messages collected using the Google Translate API, which dynamically translates a string into any of the languages supported by the platform using deep learning techniques applied in the natural language processing paradigm, that is, neural machine translation (Google Translate, 2022). In this case, the translation was made from Ukrainian (language code uk) to English (language code en). A code snippet for this process is captured in Figure 1.

Once the messages were translated into English, they were further preprocessed. This involved lower-case conversion, stop-word removal (e.g., the, an), and the removal of URLs and nonalphanumeric characters (e.g., @), except for spaces. A set of these operations is depicted in Figure 2. Next, the messages were split into words, and part-of-speech (PoS) tagging was performed to label each word with its grammatical function in a sentence (e.g., noun, adjective, verb). Lemmatization was then carried out to reduce each word to its semantic root or base form, which is referred to as the lemma. For instance, the words runs and running are reduced to their lemma run, and the word better is attributed the lemma good. In this process, the WordNet interface was used, as shown in Figure 3.

Based on the literature review conducted, considering the applicability of AI techniques in social media and their effectiveness in tackling similar problems in other or related contexts, as discussed in the Related Research section of this article, the following two modeling techniques are further considered: topic modeling and sentiment analysis. Each of these techniques is further explained, and the results obtained from their implementation are depicted and discussed in the next section.
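The cleaning steps described above (lower-casing, URL removal, removal of nonalphanumeric characters except spaces, and stop-word removal) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code shown in the article's Figure 2. The stop-word set, the regular expressions, the function name preprocess, and the example message are assumptions made for illustration.

```python
import re

# Illustrative stop-word subset (an assumption); a real pipeline would use a
# complete English stop-word list.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "is", "are", "was", "were"}

# Hypothetical patterns for URLs and for anything that is not a lower-case
# letter, a digit, or a space.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_PATTERN = re.compile(r"[^a-z0-9 ]+")


def preprocess(message: str) -> list[str]:
    """Lower-case a translated message, strip URLs and nonalphanumeric
    characters (keeping spaces), and drop stop words."""
    text = message.lower()
    # Replace with a space rather than nothing so adjacent words do not merge.
    text = URL_PATTERN.sub(" ", text)
    text = NON_ALNUM_PATTERN.sub(" ", text)
    return [token for token in text.split() if token not in STOP_WORDS]


# Made-up example message, not taken from the dataset.
example = "The sirens are on in Kyiv! Details: https://example.org/news @channel"
print(preprocess(example))  # ['sirens', 'kyiv', 'details', 'channel']
```

The resulting token lists would then be passed to PoS tagging and lemmatization, which are omitted here because they depend on external language resources.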
The lemmatization process is important when conducting topic modeling, a natural language processing area in which different artificial intelligence techniques, for example, machine learning or deep learning, are broadly used to tackle different social media aspects and issues (Alhazmi, 2022). In this process, topics are assigned to text collections, which implies identifying, clustering, and associating clusters of meanings, named topics, with a given text body. For the implementation, the non-negative matrix factorization with Kullback–Leibler divergence (NMF-KL) algorithm was used (Hien & Gillis, 2021). In a nutshell, this algorithm reduces the dimension of the text body using a factor analysis method that gives less significance, or weight, to the words that have less coherence by making use of the KL objective function (Medium, 2020). The number of topics to be searched must be provided in advance, and the words of each topic are ordered by their associated weights. This implies that the more important a word is for a specific topic, the greater its weight will be. Furthermore, to provide useful insights into the ongoing war discourses, two approaches have been considered: capturing topics per day and capturing topics over the entire 2-month period. More specifically, (i) nine topics per day have been considered, and a direct match between them and news events was observed from the initial interpretation conducted, and (ii) 100 topics for the entire period have been considered to present a more diverse palette of topics covered, and thus to identify corresponding trends. A code snippet for this process is depicted in Figure 4. The second part of the modeling process is represented by the sentiment analysis.
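To make the factorization step more concrete, the sketch below factorizes a small document-term count matrix V into nonnegative factors W (documents by topics) and H (topics by terms) by minimizing the generalized KL divergence D(V || WH). It uses the classical multiplicative update rules, chosen here for brevity. This is an assumption for illustration and is not necessarily the algorithm variant or library the authors used in Figure 4. The vocabulary, the counts, and all function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def kl_divergence(v, wh, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V || WH)."""
    total = 0.0
    for v_row, wh_row in zip(v, wh):
        for v_ij, wh_ij in zip(v_row, wh_row):
            if v_ij > 0:
                total += v_ij * math.log(v_ij / (wh_ij + eps))
            total += wh_ij - v_ij
    return total


def nmf_kl(v, n_topics, n_iter=200, seed=0, eps=1e-12):
    """Factorize V into W and H with multiplicative updates for the KL
    objective; also return the divergence after every iteration."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    history = []
    for _ in range(n_iter):
        # Update H from the ratio between observed and reconstructed counts.
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(n_terms)] for i in range(n_docs)]
        for k in range(n_topics):
            w_col_sum = sum(w[i][k] for i in range(n_docs)) + eps
            for j in range(n_terms):
                h[k][j] *= sum(w[i][k] * ratio[i][j] for i in range(n_docs)) / w_col_sum
        # Update W with the symmetric rule, using the refreshed H.
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(n_terms)] for i in range(n_docs)]
        for i in range(n_docs):
            for k in range(n_topics):
                h_row_sum = sum(h[k]) + eps
                w[i][k] *= sum(h[k][j] * ratio[i][j] for j in range(n_terms)) / h_row_sum
        history.append(kl_divergence(v, matmul(w, h)))
    return w, h, history


def top_words(h, vocabulary, n_words=3):
    """Return the n_words highest-weighted vocabulary terms for each topic."""
    return [
        [vocabulary[j] for j in sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:n_words]]
        for row in h
    ]


# Made-up vocabulary and word counts (rows are messages), not study data.
vocabulary = ["attack", "missile", "explosion", "evacuate", "refugee", "aid"]
counts = [
    [3, 2, 2, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 2, 3, 2],
    [0, 0, 1, 3, 2, 3],
    [2, 2, 3, 0, 0, 0],
]
w, h, history = nmf_kl(counts, n_topics=2)
topics = top_words(h, vocabulary)
print(topics)
print(round(history[0], 4), round(history[-1], 4))
```

Each row of H holds the word weights of one topic, so sorting a row by weight yields the ordered word list per topic described in the text. For a corpus of realistic size, a maintained library implementation working on sparse matrices would be the practical choice instead of this pure-Python loop.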
Sentiment analysis represents another area of natural language processing (Rajput, 2020) where different machine learning or deep learning techniques are applied to extract, quantify, analyze, and interpret human opinions, attitudes, perceptions, sentiments, and so on, regarding different topics, systems, or services (Birjali et al., 2021; Yue et al., 2019). Specifically, the sentiments regarding the ongoing war in Ukraine are recognized and classified as being positive or negative, both per topic and per day. Three possible pretrained models were considered: TextBlob, VADER, and Flair (Bokka et al., 2019), which were further analyzed and from which Flair was selected, given its accuracy in additional manual checks, as depicted in Figure 5.

In the previous sections, the working dataset was described, and the implementation of two machine learning-based modeling techniques, that is, topic modeling and sentiment analysis, was discussed in detail. The results obtained for both applied techniques are presented and interpreted in the context of the ongoing war in Ukraine. Regarding modeling the topics discussed by Ukrainian users, the first process refers to capturing nine topics per day to reflect on the progress of the topics tackled, and the second process refers to capturing 100 topics for the entire period to gather a higher-level insight into the most important topics discussed in the whole period. For topics contained in both categories, sentiment scores have been further captured and assessed. As the results obtained contain multiple files that are impossible to reproduce here due to space constraints, for the daily analysis it was decided to split the whole 2-month period (9.5 weeks) into three segments and present the results obtained for the following 3 days: the first day of the war, that is, February 24 2022; March 24 2022; and April 23 2022.
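The per-day and per-topic sentiment scores mentioned above can be obtained by aggregating message-level classifier outputs. The sketch below assumes each message has already been labeled POSITIVE or NEGATIVE with a confidence between 0 and 1, which is the form of output a pretrained classifier such as Flair provides. The signed-score convention, the record layout, and the function names are assumptions for illustration, and the records are made-up values, not results from the study.

```python
from collections import defaultdict
from statistics import mean


def signed_score(label: str, confidence: float) -> float:
    """Map a (label, confidence) pair to a score in [-1, 1].
    Assumed convention: negative labels receive a negative sign."""
    return confidence if label == "POSITIVE" else -confidence


def mean_sentiment_by(records, key):
    """Average the signed scores of all records sharing the same key value,
    for example the same date or the same topic index."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(signed_score(record["label"], record["confidence"]))
    return {group: mean(scores) for group, scores in groups.items()}


# Hypothetical classifier outputs for four messages.
records = [
    {"date": "2022-02-24", "topic": 0, "label": "NEGATIVE", "confidence": 0.9},
    {"date": "2022-02-24", "topic": 1, "label": "NEGATIVE", "confidence": 0.7},
    {"date": "2022-02-24", "topic": 1, "label": "POSITIVE", "confidence": 0.6},
    {"date": "2022-03-24", "topic": 0, "label": "NEGATIVE", "confidence": 0.8},
]

per_day = mean_sentiment_by(records, "date")
per_topic = mean_sentiment_by(records, "topic")
print(per_day)
print(per_topic)
```

Averaging signed scores is only one possible aggregation. Counting negative versus positive messages per group would be an equally simple alternative.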
In the following figures, dark blue shows the weight of words, with light blue representing the number of messages, and red depicting the sentiment scores. Figure 6 shows the resulting nine topics, together with their corresponding weights for February 24 2022, and their corresponding sentiment score together with the count of messages are captured in Figure 7. Figure 8 shows the resulting nine topics together with their corresponding weights for March 24 2022, and their corresponding sentiment score on the negative side together with the count of messages are captured in Figure 9. Figure 10 shows the resulting nine topics together with their corresponding weights for April 23 2022, and their corresponding sentiment score together with the count of messages are captured in Figure 11. Furthermore, from the top 100 topics captured for the entire period, that is, from February 24 to April 23 2022, nine topics were randomly selected to depict the top 10 words characterizing them, together with their corresponding weights and sentiment scores plotted against the number of messages to show the diversity of topics identified. The nine topics are depicted in Figure 12, where dark blue represents the weights of words, light blue shows the number of messages, and red depicts the sentiment scores. At the end, the sentiment score is provided per day over the entire period in Figure 13. From the results obtained, the topics and words characterizing the first day, February 24 2022, are, for example, military, attack, Zelensky, sanction, subway, explosion, Kyiv, evacuate, Chernobyl, and nuclear, for March 24 2022 they are, for example, child, chemical, NATO, missile, territory, evacuation, and donate, while for April 23 2022 they are, for example, Kherson, announcement, Mariupol, Odessa, Chernihiv, injury, damage, leave, partner, help, child, aircraft, Azovstal, and prisoner. In this analysis, several aspects are observed. 
Firstly, the general sentiment is structurally negative over time, which can be explained by the deep impact that the war had on the Ukrainian population and, implicitly, on social media users. Secondly, focusing on the first day of the analysis, a direct match exists between the topics identified and the general feelings of the Ukrainian population, and the initial news, interviews, and discourses from international organizations and media outlets, as well as practitioners and researchers from trusted outlets and institutions, on aspects such as further advancements of Russian troops and attacks surrounding Kyiv, survival through relocation in the subways and evacuation, the field situation, and the capture of the Chernobyl power plant (European Council, 2022; Gov Ukraine, 2022; NATO, 2022; Purvis, 2022; Reuters, 2022; UNHCR, 2022). These perspectives are also aligned with academic studies analyzing social dimensions of this war concerning, for instance, policy, financial, and health aspects. Accordingly, Siddi (2022) argues that the ongoing war in Ukraine represents a 'turning point for European politics, security, and economy' in which Ukraine's place in European politics was modified and the relations between the EU and Russia have changed. On the battlefield, Freedman (2022) acknowledges the differences in number, skills, and strategic coherency between the parties involved, reflecting on particular military attacks and counterattacks such as the one in Kherson and near Crimea. Additionally, the author reflects on a crucial aspect in this war: the underscored perspective on command and control 'in determining ultimate military success' (Freedman, 2022). Such actions not only had a direct impact on the dynamics and relations between Russia and the EU, but also led to a comprehensive set of general and targeted sanctions against Russia alongside those applied for the annexation of Crimea (van Bergeijk, 2022).
At the same time, the embargo on Russian energy products such as oil and gas is impacting European companies and households (Garicano et al., 2022). Importantly, alongside these consequences are the physical and health effects experienced by millions of refugees and their families, as well as the pressure put upon the Ukrainian health system (Leon et al., 2022). Turning to the last day on which data were collected in this study, that is, April 23 2022, which is 2 months after the war started, the topics and words identified include defense, occupier, leave, explosion, Kherson, partner, help, Odessa, missile, Mariupol, child, Chernihiv, settlement, gas, Azovstal, crime, and prisoner. Similarly, these match those of trusted sources on aspects such as further strikes in multiple areas, for example, Chernihiv, Kherson, Mariupol, and Odessa, gas shortages, and the crisis in the Azovstal steel plant (UN News, 2022; UN Brief 1, 2022; US DoD, 2022; President Ukraine, 2022; Jeong, 2022). To add to the relevant studies previously discussed in relation to the first day and period of war, an important event that immediately gathered media and humanitarian attention was the Russian siege of the Azovstal steel plant in Mariupol, which directly fell under scrutiny regarding the balance between International Humanitarian Law, the notion of purpose, and the notion of humanity (Winter, 2022). Additionally, around and after February 24, several cyberattacks were launched on Ukrainian infrastructure, for example, on communication systems of the Kyiv Post, the IsaacWiper attack on government websites, attacks on digital infrastructure for blocking access to financial services and energy, and MarsStealer for stealing user credentials of Ukrainian citizens and organizations (EU Parliament, 2022; Jakub, 2022). These attacks show patterns and similarities with older cyberattacks targeting Ukraine, such as NotPetya from 2017 and BlackEnergy from 2015 (Greenberg, 2018; Liang et al., 2016).
Finally, going through the same reasoning for the whole 2-month period, one can see words such as Zelensky, peace, EU, child, sanction, nuclear, Mariupol, Azovstal, Bucha, body, fight, justice, medicine, food, diplomat, and refugee. Hence, a diverse palette of words that properly describe diverse dimensions of the internal and external struggle of the Ukrainian people and government is also captured in the negative scores from the sentiment analysis conducted and in the results discussed by related studies on this topic. In conclusion, this modeling method and the results obtained focusing on the Telegram platform are meant to (i) supplement the existing studies on discourse analysis, topic extraction, and sentiment analysis conducted on other social media platforms such as Twitter, Facebook, and Reddit; and (ii) further position them in the ongoing policy, social, and technological discourses and studies.

The ongoing war in Ukraine reflects its complex character by merging conventional with unconventional means and methods in different operational battlefields, having long structural, functional, and temporal roots and aiming at engaging both deliberated and dynamically defined targets and, implicitly, humans. This implies targeting human cognition and its main corresponding processes, such as perception, reasoning, and decision-making. It is a conflict that crosses the physical borders of Ukraine through its strategic, military, economic, and informational implications and consequences, transforming it from a regional conflict into an indirect international conflict.
The ongoing battles could be seen as the fulminant part of hybrid warfare (a recent war paradigm that combines military, paramilitary, diplomatic, informational, economic, and other means to achieve strategic political goals), conducted in recent years through a mix of means and methods started in Eastern Ukraine and Crimea, as well as through the building and execution of cyber and information operations through environments such as social media and conventional media channels and outlets in general (Geers, 2015; Jaitner & Geers, 2015; Maathuis et al., 2018b, 2020; Muradov, 2022; Pocheptsov, 2018; Raţiu & Munteanu, 2018; Schmid, 2019; Semotiuk, 2022). While countermeasures for tackling social media manipulation exist, ranging from direct considerations in the development of social media platforms to dedicated organizations and institutions relying on both massive data and human expertise, the road is still long since technological advancements are in the hands and heads of both defenders and offenders. Therefore, understanding users' essential discourses and narrative topics together with their corresponding sentiments would facilitate an intelligent, adaptive, and transparent approach for producing and/or enhancing awareness and tackling and/or countering social media manipulation in times of conflict. Specifically, several studies have focused on tackling these issues in relation to the ongoing war in Ukraine and preceding events of the last decade on social media platforms such as Twitter, Facebook, YouTube, and Reddit, while only a limited number of studies focus on other platforms such as Telegram. Taking these as a theoretical foundation for the implications of, and motivation behind, this research, the aim of this study is to build a technical solution for capturing topics and sentiments of Ukrainian-speaking Telegram users in the context of the ongoing war in Ukraine.
To achieve this aim, the design science research methodology (Hevner et al., 2004; Peffers et al., 2018) is used in a data science approach (Shearer, 2000) for building corresponding machine learning models, adopting a multidisciplinary perspective. Accordingly, text data between February 24 and April 23 2022 is collected and prepared for building topic modeling and sentiment analysis models for analyzing the content per day as well as for the whole period. The results obtained reveal, firstly, aspects characterizing the war in Ukraine, with attacks, particular cities and events, sanctions, children, and refugees as key topics discussed; secondly, a structurally negative sentiment over time, characterizing the negative sphere that corresponds to the gravity and scale of the phenomenon investigated, that is, war; and thirdly, a match with corresponding academic, practitioner, and media discourses and narratives. These results could be further considered in academic research for (i) multiplatform, multichannel, and multimedia discourse analysis for extracting general topics and underlying sentiments and emotions of users; (ii) understanding policy, social, and psychological impacts based on focused research on sentiments and emotions of users about aspects such as migration; and (iii) investigating, both per day and for the total period, the topics contributing to different types of sentiments and emotions of users. At the same time, these results could be further considered in practitioner research for (i) gathering insights into users' discourses and sentiments for communication purposes, as well as for designing public security awareness campaigns against social manipulation campaigns; and (ii) understanding the dynamics involved in a complex and long-standing conflict such as the one ongoing in Ukraine.

'If you know the enemy and know yourself, you need not fear the result of a hundred battles.' (Sun Tzu)

Reference(s)