PROTOCOL: Online interventions for reducing hate speech and cyberhate: A systematic review
2021; The Campbell Collaboration; Volume: 17; Issue: 1; Language: English
DOI: 10.1002/cl2.1133
ISSN: 1891-1803
Authors: Steven Windisch, Susann Wiedlitzka, Ajima Olaghere
Topic(s): Bullying, Victimization, and Aggression
Abstract: The internet has become an everyday tool to communicate and network with people around the globe, but its perceived anonymity, availability, and instant access have made it an environment conducive to spreading hateful content and connecting to like-minded individuals with similar hateful ideologies. Hate speech and other prejudice-motivated behavior, however, need to be considered on a continuum of victimization, and "like other social processes, [be seen as] dynamic and in a state of constant movement and change, rather than static and fixed" (Bowling, 1993, p. 238). It is a social process that is marked by multiple, repeat, and constant victimization (Bowling, 1993), with victims no longer distinguishing between specific hateful events and instead normalizing experiences of hateful conduct "as an everyday, unwanted but routine reality of being 'different'" (Chakraborti, 2016, p. 581). Understanding hateful behavior and victimization as a process allows us to connect "low-level" incidents of hateful behavior to the more serious and life-threatening incidents at the more extreme end of the spectrum (Bowling & Phillips, 2002). The Christchurch attacks in New Zealand and their link to hateful communication on the online platform 8chan are only one example of how online hate speech and cyberhate can escalate into "in real life" attacks, leaving the online sphere and spilling into the offline world. As per Allport's (1954) scale of prejudice, more extreme forms of prejudice-motivated violence are founded on "lower level" acts of prejudice and bias; therefore, hateful content online should not be ignored. It thus becomes important to intervene online to interrupt or counter hateful behavior at the lower end of the scale of prejudice; such online interventions are what this systematic review aims to identify and synthesize.

Allport's (1954) scale of prejudice will be the basis for this systematic review. Early on, Allport (1954) asserted that individuals with negative attitudes toward groups are likely to act on these prejudices "somehow, somewhere" (p. 14), and that the more intense such negative attitudes are, the more hostile the action will be. Allport (1954) put forward a scale of acts of prejudice to illustrate different degrees of acting out negative attitudes, a scale that starts with antilocution (or what we call hate speech), described as explicitly expressing prejudices through negative verbal remarks to either friends or strangers (Allport, 1954). Avoidance is the next level on the scale of prejudice, with people avoiding members of certain groups, followed by discrimination, where distinctions are made between people based on prejudices, which leads to the active exclusion of members of certain groups (Allport, 1954). This level of acting on prejudices is rooted in institutional or systemic prejudices, for example, in the differential treatment of people within employment or education practices, but also within the criminal justice system, or through social exclusion of certain minority group members. Physical attack is the next level on the scale of prejudice, which includes violence against members of certain groups by physically acting on negative attitudes or prejudices. The last level is extermination, the ultimate act of violence against members of specific groups, an expression of prejudice that systematically eradicates an entire group of people (e.g., genocide or lynchings; Allport, 1954). 
Allport's (1954) scale of prejudice makes it clear how hate speech/cyberhate is connected to more extreme forms of violence motivated by specific prejudices and biases, with hate speech (or antilocution) being only the starting point on a 5-point continuum (Bilewicz & Soral, 2020). The importance of this scale of prejudice is not only that it clearly illustrates a range of different ways and intensity levels of acting out prejudices, but also the "progression from verbal aggression to physical violence or, in other words, the performative potential of hate speech" (Allport, 1954; Kopytowska & Baider, 2017, p. 138). This is where interventions at the lower end of the scale of prejudice, interventions targeting hate speech/cyberhate, become important. There is no universal definition of hateful conduct online, but there is some consensus that hate speech targets disadvantaged social groups (Jacobs & Potter, 1998). Bakalis (2018) more narrowly defines cyberhate as "any use of technology to express hatred towards a person or persons because of a protected characteristic—namely race, religion, gender, sexual orientation, disability and transgender identity" (p. 87). Another definition, which points out the ambiguity and challenges involved in identifying more subtle forms of hate speech and makes reference to the potential threat of hate speech escalating to offline violence, is put forward by Fortuna and Nunes (2018), who analyzed various definitions of hate speech: "Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity or other, and it can occur with different linguistic styles, even in subtle forms or when humour is used" (p. 5). In this systematic review, we also distinguish hate speech/cyberhate specifically from other forms of harmful online activity, such as cyber-bullying, harassment, trolling, or flaming, as perpetrators of such online behavior repeatedly and systematically target specific individuals to cause upset, to seek out negative reactions, or to create discord on the internet. In contrast, hate speech/cyberhate is more general and does not necessarily target a specific individual (Al-Hassan & Al-Dossari, 2019); instead, it heavily features prejudice, bias, and intolerance toward certain groups within society. With the majority of hate speech happening online, interventions that take place online are an important way to challenge prejudice and bias, potentially reaching masses of people across the globe. The unique feature of the internet is that individual negative attitudes toward minority groups and more extreme hateful ideology can find their way onto certain platforms and can instantly connect people sharing similar prejudices. By closing the social and spatial distance, the internet creates a form of collective identity (Perry, 2000, p. 123) and can convince individuals with even the most extreme ideologies that others out there share their views (Gerstenfeld et al., 2003). In addition, the enormous frequency of hate speech/cyberhate within online environments creates a sense of normativity around hatred and the potential for acts of intergroup violence or political radicalization (Bilewicz & Soral, 2020, p. 9). 
It is, therefore, important to challenge this hate speech epidemic (Bilewicz & Soral, 2020), especially since hate movements have increasingly crossed into the mainstream (Perry, 2000). With hate speech/cyberhate posing a threat to the social order by violating social norms (Soral et al., 2018), perceptions of social norms as either supporting or opposing prejudice have been found to influence how individuals react online (Hsueh et al., 2015). Seeing other people post prejudiced (as opposed to antiprejudiced) comments online can lead to the adoption of an online group's biases and can influence an individual's own perceptions and feelings toward the targeted stigmatized group (Hsueh et al., 2015). In addition, research on desensitization suggests that exposure to hate speech desensitizes people, which in turn increases outgroup prejudice toward the groups targeted by such speech (Soral et al., 2018). With society increasingly recognizing that it is inappropriate to express prejudices in public settings, many interventions will include some form of social norms nudging to reduce such prejudices, that is, interventions that "nudge behavior in the desired direction" (Titley et al., 2014, p. 60). Hate speech, therefore, not only affects minority group members but also influences the opinions of majority group members (Soral et al., 2018), which makes strategies that can elicit change in people's prejudice-related attitudes crucial (see, e.g., Zitek & Hebl, 2007). Governments around the world face increased demand for understanding and countering hateful ideology and violent extremism both online and offline (e.g., the Christchurch Call in New Zealand). The U.S. Government's 2011 CVE Strategy highlights the importance of ongoing research and analysis, the sharing of knowledge and best practices internationally, and the countering of hateful ideologies and propaganda (see also Department of Homeland Security, 2016, 2019). The goal of this systematic review is to use an integrated and interdisciplinary approach to examine the effectiveness of online campaigns and strategies for reducing hate speech and cyberhate. The internet also provides an opportunity to reach masses of people who have been exposed to hateful content and ideology online; therefore, this systematic review will focus on online interventions addressing online hate speech and cyberhate. The specific settings where we would expect to see online interventions deployed are websites, text messaging applications, and online and social media platforms including, but not limited to, Facebook, Instagram, TikTok, WhatsApp, Google, YouTube, and Snapchat. As mentioned previously, many online interventions will be based on social norm nudges to reduce online hate. These interventions aim to change people's online behavior and encourage individuals or groups to conform to established social norms. The communication of social norms can happen through establishing community standards on online platforms themselves (e.g., Facebook, Twitter), through more formal online training courses, or through anti-hate speech/anti-cyberhate campaigns teaching people to recognize hate, embrace diversity, and stand up to bias. Such prevention campaigns are designed to challenge bias and build ally behaviors by supplying people with constructive responses to combat, for example, antisemitism, racism, and homophobia, as well as providing resources to help people explore and critically reflect on current events. 
Other interventions may add messages to hateful online comments, counter hateful content or extremist ideology, or redirect people to more credible sources. Both peers and parents have been found to foster racial consciousness and identity development, define interracial relationships, and cultivate ethnic heritage and culture (Hagerman, 2016). Socialization influences how children understand their group's social position and their membership within that group by providing an understanding of racial, religious, and sexual privilege (Bowman & Howard, 1985). Socialization often reflects peers' and parents' experiences with racism and discrimination, and their ideological perspectives about race, religion, or sexuality (Umaña-Taylor & Fine, 2004). This is important because peers and parents who feel discriminated against or believe that the "other" is a threat may impart their prejudices to their children or friends, which could lead them to interpret the social world with similar discriminatory views and/or behavior. Individuals who feel socially alienated or rejected are especially vulnerable to such socialization practices, as they feel that adopting these views will provide them with a sense of acceptance and belonging (Leiken, 2012). Regardless of how an individual develops certain racial, religious, or sexual biases, the online interventions under review are expected to target and reduce the production of original hateful content, such as antisemitic Tweets and/or homophobic blog posts, as well as the consumption of hate speech material (e.g., watching or reading hate speech videos or blogs). For example, some interventions take a rather broad messaging approach by implementing racial sensitivity and diversity training through Public Service Announcements, peer-to-peer dialogue workshops, or films that provide opportunities for youth and adults to self-reflect and learn about historical oppression, people of color, women, and the LGBTQIA+ community from credible sources. The factual understanding of diverse groups is often supplemented by experiences with people within the group. These educational programs often identify a cultural guide who is willing to introduce youth to new experiences and who can aid in processing thoughts, feelings, and behaviors. These interventions intend to dispute and contradict negative stereotypes associated with specific cultures, people, and institutions by sharing different points of view based on human rights values such as openness, respect for difference, freedom, and equality (Gomes, 2017). Moreover, such interventions tend to involve blanket bans on specific behaviors enforced through the public promotion of norms or individual sanctions enforced by moderators. Other interventions, such as the "Redirect Method," are narrower in their messaging. These interventions generate curated playlists and collections of authentic content that challenge hate speech/cyberhate narratives and propaganda (Helmus & Klein, 2018). For instance, people who are directly searching for extremist content online may be linked to videos and written content that confronts such claims. These videos are designed to be objective in appearance rather than containing material that explicitly counters extremist propaganda. The underlying goal of this type of intervention is to provide credible content that effectively undermines extremist messaging but does not overtly attack the source of propaganda. 
In addition to confronting hate speech narratives, these interventions provide users with links to numerous social services such as anger management training, drug and alcohol treatment, and mental health resources. Online platforms, such as Twitter and Facebook, have also started to employ a similar method, redirecting people who comment on or share "fake news" or conspiracy theories, which often are fraught with prejudicial undertones and are harmful to minority groups, to more credible content and news sources. The aforementioned interventions are designed to counterbalance biased perceptions (e.g., unsupported claims of the Black community as criminal or the LGBTQIA+ community as pathologized) by blunting the occurrence of racist discourse and reducing the likelihood that these individuals will internalize and normalize racial, religious, and/or sexual prejudices (Qian et al., 2019). Being in new situations is uncomfortable and often awakens fears and apprehensions that can block our experiential development. Acquiring information or being exposed to minority-run businesses, poverty, and writings from minority authors allows a person to understand the thoughts, hopes, fears, and aspirations of people outside their racial perspective rather than from the perspective of the majority society (Dunham et al., 2013; Lee et al., 2017). Doing so counters racist programming by challenging hegemonic beliefs, which can lead to the acceptance of tolerant attitudes and the reduction of hateful expressions online. Findings from the proposed review will enhance our understanding of the effectiveness of online anti-hate speech/anti-cyberhate interventions, will help ensure that programming funds are dedicated to the most effective efforts, and will play a critical role in helping individual programs improve the quality of service provision. The review will also inform governments and policymakers of the current state of such online efforts and what works, indicate which modes of intervention to implement, and help guide economically viable investments in nation-state security. Our search of the scholarly literature identified one review, Blaya (2019), as similar to the proposed topic. Blaya's (2019) review, however, focused on the prevalence, type, and characteristics of existing interventions for counteracting cyberhate and did not include a meta-analysis. Two other similar reviews focused on exposure to extremist online content (Hassan et al., 2018) and communication channels associated with cyber-racism (Bliuc et al., 2018). A search of the Campbell Library using key terms (hate OR radical*) returned two protocols and one review identified for further inspection to assess potential overlap. The protocols are "Psychosocial processes and intervention strategies behind Islamist deradicalization: A scoping review" by de Carvalho et al. (2019) and "Police programs that seek to increase community connectedness for reducing violent extremism behavior, attitudes and beliefs" by Mazerolle et al. (2020). A further review on a similar topic is a recently completed Campbell review (January 2020), "Counter-narratives for the prevention of violent radicalization: A systematic review of targeted interventions" by Carthy et al. (2018) at the National University of Ireland, Galway. Our proposed review is distinguished from the de Carvalho et al. 
(2019) review in that we are focusing on hate speech and cyberhate generally, without delimiting our approach to a specific type of radicalization (e.g., Islamist). Furthermore, we are electing to complete a systematic review and meta-analysis. Likewise, the protocol by Mazerolle et al. (2020) focuses on interventions involving police officers as initiators, recipients, or implementers of community connectedness interventions. Our review will focus specifically on any online intervention, which may or may not involve police, but police will be neither the focus nor the basis of the online intervention strategy. Judging from the Carthy et al. (2018) protocol, we anticipate our review will also capture counter-narrative interventions, but will differ based on the setting, timing, and scope of interventions. Specifically, we are interested in online interventions that extend beyond counter-messaging campaigns to include the broad array of interventions outlined above, and that extend beyond radicalization to include everyday hate and prejudice. In addition to conducting a meta-analysis, the proposed review would build on Blaya's (2019) work by expanding the population parameters to include both adolescents and adults. Blaya (2019) limited her search to interventions aimed toward youth, young people, children, young adults, adolescents, and teenagers and did not focus on extremism. The main objective of this review is to synthesize the available evidence on the effectiveness of online interventions aimed at reducing the creation and/or consumption of online hate speech/cyberhate material. The review addresses three research questions: (1) To what extent are online interventions effective in reducing online hate speech/cyberhate? (2) How is effectiveness related to the type of online hate speech/cyberhate intervention used? (3) How is effectiveness related to the characteristics of individuals experiencing the online hate speech/cyberhate intervention (e.g., age, gender, race/ethnicity, offense history, childhood trauma)? Both experimental and quasi-experimental quantitative studies will be included. These study designs will address research questions #1 to #3. Eligible quantitative study designs include the following. Eligible experimental designs must involve random assignment of participants to distinct treatment and control group(s). Designs that involve quasi-random assignment of participants, such as alternate case assignment, are also eligible and will be coded as experimental designs. All eligible quasi-experimental designs must include a comparison group of participants compared to participants in the treatment condition. Eligible studies include those that report matching procedures (individual- or group-level) and statistical procedures employed to achieve equivalency between groups. Statistical procedures may include, but are not limited to, propensity score matching, regression analysis, and analysis of covariance (see the illustrative sketch following this passage). Furthermore, in anticipation of a limited quantitative evidence base, we will also include quasi-experimental studies with unmatched comparison groups that provide a baseline assessment of outcomes for both groups. Finally, time-series analyses will also be included. Eligible time-series designs include short interrupted time-series designs with a control group (fewer than 25 pre/post observations) and long interrupted time-series designs with or without a control group (more than 25 pre/post observations). 
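To make one of the equivalency procedures named above concrete, the following is a minimal, hypothetical sketch of propensity score matching in Python. It is not part of the protocol; the covariates, treatment indicator, and outcome are simulated purely for illustration.

```python
# Hypothetical illustration of propensity score matching, one of the statistical
# procedures eligible quasi-experimental studies may use to achieve equivalency
# between treatment and comparison groups. All data here are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Simulated covariates (e.g., age, prior exposure to online hate, time online)
# and a treatment indicator (1 = received the online intervention).
X = rng.normal(size=(200, 3))
treated = rng.integers(0, 2, size=200)

# Simulated outcome: count of hateful posts observed after the study period.
outcome = rng.poisson(lam=2, size=200)

# Step 1: estimate each participant's propensity score, P(treated | covariates).
propensity = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: nearest-neighbour matching (with replacement) on the propensity score.
treated_idx = np.flatnonzero(treated == 1)
control_idx = np.flatnonzero(treated == 0)
matches = {
    t: control_idx[np.argmin(np.abs(propensity[control_idx] - propensity[t]))]
    for t in treated_idx
}

# Step 3: compare outcomes across matched pairs (a crude ATT-style estimate).
diffs = [outcome[t] - outcome[c] for t, c in matches.items()]
print(f"Mean matched-pair difference in hateful posts: {np.mean(diffs):.2f}")
```

In practice, studies eligible for the review may use more elaborate matching (e.g., caliper matching or covariate balance checks); the sketch only illustrates the basic logic of estimating propensity scores and comparing matched groups.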
Ineligible quasi-experimental designs include studies that utilize a comparison group consisting of participants who either refused to participate in the study or who initially agreed to participate but then dropped out before the study began. Eligible comparison conditions include other online interventions or conditions in which participants do not receive or experience an online intervention. Both youth and adult participants of any racial/ethnic background, religious affiliation, gender identity, sexual orientation, nationality, or citizenship status will be eligible for this review. The eligible youth population will be study participants aged 10 through 17. The eligible adult population will be study participants aged 18 and older. Studies in which only a subset of the sample is eligible for inclusion—for example, if a study subject participates in both online and offline hate speech interventions—will be excluded. We do not anticipate excluding studies based on sample eligibility, as our inclusion criteria will be wide-ranging, and we will take reasonable steps to locate studies that involve only online interventions. We will resolve differences of opinion regarding the eligibility of a study for inclusion through discussion and consensus. If agreement cannot be reached, we will elicit the opinion of a subject matter expert, whose input will decide the final list of included and excluded studies. Since these studies will be excluded, they will be unavailable for the meta-analysis and any related subgroup/sensitivity analyses. We adopt Blaya's (2019) four-part typology of intervention strategies to outline the potential universe of eligible interventions. The first intervention strategy is the adaptation of legal responses to hate speech/cyberhate, which includes the countering of violent extremism and aims to address cybercrime. More specifically, eligible online interventions range from disrupting hateful content online via specific "crackdowns" (e.g., server shutdowns, deletion of social media accounts) to responding to online hate using targeted strategies (e.g., through counter-narratives or by modifying hateful content). Examples of previous studies focusing on online crackdowns include the monitoring and investigation of online accounts and content takedowns, online content monitoring and censorship (Alvarez-Benjumea & Winter, 2018), modifying hateful online comments into nonhateful comments (Salminen et al., 2018), and possibly changing algorithms to divert users out of online echo chambers. We are also interested in interventions such as the recent takedown of 8chan after this online platform was linked to "in real life" attacks in New Zealand and the United States, and in whether interventions exist that disrupt further hateful online content and radicalization after similar trigger events. Disrupting hateful content online via such crackdowns has raised free speech concerns, as well as concerns that online users and hateful groups will simply move on to other online platforms. Responding to hateful content online using targeted strategies has, therefore, been suggested as an effective online intervention. 
Examples include message priming using endorsements from religious elites (Siegel & Badaan, 2020), the use of bots to sanction online harassers (Munger, 2017), automatically generating responses to intervene in online conversations where hate speech has been detected (Qian et al., 2019), and redirecting online users to YouTube videos debunking, for example, ISIS recruiting themes (https://redirectmethod.org/). Our systematic review will include a broader range of online interventions, many of which have only recently emerged. Two other strategies identified by Blaya (2019) are the automatic identification and regulation of hate speech/cyberhate using technology and the creation of online counter-spaces and counter-communication initiatives. These interventions include online counter-narrative marketing campaigns, the establishment and/or use of online counter-spaces, online education-based interventions, online citizenship training, and online legislative initiatives narrowly defined to address extremist ideologies and hate speech that incites targeted violence and radicalization. In general, such interventions seek to prevent or minimize the occurrence of violent extremism or radicalization, including the spread of hate speech and extremist propaganda, by disrupting recruitment channels and creating opportunities to leave such groups. The fourth and final intervention strategy eligible for this systematic review involves educational programs that, for example, provide people with online literacy skills and challenge racism (Blaya, 2019). We will include online empowerment/resilience approaches, policy programs with an online component (e.g., Prevent and Exit programs), and educational and awareness-raising online interventions. Some of these interventions may evaluate behavioral change in terms of individuals no longer engaging in the creation and/or consumption of cyberhate and extremist material online. These online interventions may be sponsored by nonprofit and nongovernmental organizations, internet service providers, or, in the case of legislative interventions, policy or governmental agencies. The comparison condition may be routine exposure to and engagement with hate speech/cyberhate, or another online intervention. The primary outcome of interest is the creation and/or consumption of hateful content online. By creation, we refer to the production and authorship of original hateful content, such as posting antisemitic Tweets, uploading racist YouTube videos, and/or writing homophobic blog posts. The consumption of hate speech material may include visiting or being a member of a hate website/online group, watching or reading hate speech videos or blogs, being a target of online hate speech/cyberhate, or reporting hate speech material. Secondary outcomes of interest include affective and emotional states of study participants, such as anger, fear, emotional unrest, depression, anxiety, and mood swings, as well as attitudes toward hate speech/cyberhate. Eligible studies must report a primary or secondary outcome (or both) to be included. There will be no exclusion criteria on the source of outcome data. Data for the primary and secondary outcome measures can be obtained from any source, including institutional records, direct observations, and surveys or questionnaires completed by participants. We will include any measure of unintended adverse effects from strategies to increase the scale of implementation of potentially effective anti-hate speech and deradicalization interventions for participants. 
These could include adverse changes to emotional or psychological well-being, defensiveness, guilt, shame, resistance to the teaching, miscommunication, creation of barriers, and dysfunctional adaptation behaviors. Adverse effects can also include nonindividual effects, such as a relocation of hate speech/cyberhate to other platforms rather than a reduction of hate speech/cyberhate. All adverse effects described in eligible studies will be included in the synthesis. We will focus on the period between 1990 and the current year, 2020. The period restriction starting with the year 1990 reflects when the internet transitioned to a wider infrastructure and broad-based global community (Leiner et al., 2009). We are opting for an inclusive approach in bounding the lower end of our search period at 1990. While the odds may be slim, it is conceivable that hate speech/cyberhate was present online through mailing lists or emails, and some studies may capture this. Our population of studies will also be limited to studies published in English, German, Persian, and Arabic, but inclusive of studies completed in any geographical region, as we are focused on online content that can be consumed and shared across geographic and nation-state boundaries. The language parameters reflect the language abilities of the review team. Our full-text coding will consider where studies were conducted and, if possible, the geographic location of included study participants. Any changes in eligibility criteria will be agreed upon prospectively among the members of the review team. These will be documented and reported as a discrepancy from the protocol in the review. In the event of a change in eligibility criteria, we will rescreen citations. The search terms are organized into three groups, combined with AND:
Setting terms: online OR "social media" OR internet OR Twitter OR Facebook OR 8chan OR 8Kun OR Gab OR Telegram OR TikTok OR Reddit OR WhatsApp OR Instagram OR "social networking site*" OR "cybervictimization" OR "online incivility"
Extremism/radicalization/hate terms: "hate speech" OR cyberhate OR extrem* OR narrative OR racis* OR radical* OR speech OR ideolog* OR islamophobi* OR homophobi* OR transphobi* OR misogyny OR disablism OR discrim* OR terror* 
Treatment terms: interven* OR option* OR strategy* OR "counter narrative*" OR "nudge" OR "norm* intervention" OR "norm* nudge" OR counternarrative* OR "alternative narrative*" OR campaign* OR counter* OR peer-to-peer OR prevent* OR disrupt* OR st
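For illustration only (not part of the protocol), the sketch below shows one way to assemble the Boolean search string from the three term groups in Python. The term lists mirror the groups above; the treatment list stops at disrupt* because the final term is truncated in the source.

```python
# Illustrative assembly of the three-group Boolean search string.
setting_terms = [
    'online', '"social media"', 'internet', 'Twitter', 'Facebook', '8chan',
    '8Kun', 'Gab', 'Telegram', 'TikTok', 'Reddit', 'WhatsApp', 'Instagram',
    '"social networking site*"', '"cybervictimization"', '"online incivility"',
]
hate_terms = [
    '"hate speech"', 'cyberhate', 'extrem*', 'narrative', 'racis*', 'radical*',
    'speech', 'ideolog*', 'islamophobi*', 'homophobi*', 'transphobi*',
    'misogyny', 'disablism', 'discrim*', 'terror*',
]
treatment_terms = [
    'interven*', 'option*', 'strategy*', '"counter narrative*"', '"nudge"',
    '"norm* intervention"', '"norm* nudge"', 'counternarrative*',
    '"alternative narrative*"', 'campaign*', 'counter*', 'peer-to-peer',
    'prevent*', 'disrupt*',
]

def or_group(terms):
    """Join a list of terms into a single parenthesised OR clause."""
    return "(" + " OR ".join(terms) + ")"

# Combine the setting, hate, and treatment groups with AND, as specified above.
query = " AND ".join(or_group(g) for g in (setting_terms, hate_terms, treatment_terms))
print(query)
```

A string built this way can be pasted directly into most bibliographic database search interfaces, although individual databases may require adjustments to wildcard or phrase syntax.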