Hate speech remains a serious issue in the Armenian online media landscape, particularly concerning political and gender-related topics. Insults, threats, and dehumanizing comments that proliferate under media publications undermine public discourse and create various challenges for freedom of expression.

To address this issue, the All for Equal Rights Foundation, in collaboration with the Media Diversity Institute (MDI) NGO and with support from the United Kingdom, has developed an artificial intelligence-powered moderation tool called moderAItor. This tool is designed to automatically identify and manage hate speech on social media platforms.

Alexander Martirosyan, program manager at the All for Equal Rights Foundation, said that two years ago their team conducted a study on gender-based online hate speech in the media. Over eight months, the five-member team manually identified and processed 1,519 comments.

“We collected comments from media publications, organized them into an Excel file, and categorized them based on several criteria: the type of hate speech, its underlying reasons, thematic focus, and its relevance to the content of the publication,” explains Alexander.

The work was challenging, as some comments were written in Latin or Cyrillic script, or even in coded form. At that point, the team realized it needed a digital solution and decided to develop a tool capable of identifying hate speech and moderating it automatically.

“After months of extensive work, we successfully developed the AI-powered moderAItor tool. This tool can be integrated with Facebook, Instagram, and YouTube pages to analyze comments and perform moderation. Users can choose from three levels of moderation sensitivity: high, medium, or low. Additionally, they can decide on the actions to take regarding the detected content. The sensitivity level determines the strictness of moderation. Higher sensitivity levels can even address irony and sarcasm, based on the user’s requirements. In cases of very sensitive moderation, users can send all observed comments for review, allowing them to decide how to handle them in the future,” explains Alexander.
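The article does not describe how moderAItor works internally, so the following is only a minimal sketch of how a sensitivity level and a user-chosen action could be combined. It assumes a classifier that returns a hate-speech score between 0 and 1; the score, the threshold values, and every name in the code are hypothetical illustrations, not the tool's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Action(Enum):
    KEEP = "keep"
    HIDE = "hide"
    DELETE = "delete"
    REVIEW = "review"  # queue the comment for a human decision


# Hypothetical cut-offs: a stricter (higher) sensitivity flags
# comments at a lower classifier score.
THRESHOLDS = {
    Sensitivity.LOW: 0.9,
    Sensitivity.MEDIUM: 0.7,
    Sensitivity.HIGH: 0.5,
}


@dataclass
class Policy:
    sensitivity: Sensitivity
    on_flag: Action  # what the page owner chose to do with flagged comments


def decide(score: float, policy: Policy) -> Action:
    """Map an assumed hate-speech score in [0, 1] to a moderation action."""
    if score >= THRESHOLDS[policy.sensitivity]:
        return policy.on_flag
    return Action.KEEP


strict = Policy(Sensitivity.HIGH, Action.REVIEW)
lenient = Policy(Sensitivity.LOW, Action.HIDE)
print(decide(0.6, strict))   # Action.REVIEW
print(decide(0.6, lenient))  # Action.KEEP
```

In this sketch the same borderline comment is sent for review under the strict policy and left untouched under the lenient one, which is the kind of trade-off the three sensitivity levels let a page owner make.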

The authors explain that moderAItor performs contextual analysis, evaluating not only the content of comments but also the context in which they are published. It can also process comments written in Latin or Cyrillic script, and it translates and analyzes comments in any language in order to understand their context. Beyond moderation, the tool compiles a research database from its findings. For example, statistics collected so far show that the most commonly used offensive term in the Armenian online community is “Turk,” which is applied as a label in various contexts.

Screenshot: moderaitor.app

The tool collects detailed analytical data that reveals the nature of hate speech and the grounds it targets. It identifies 12 different types of hate speech, including incitement to violence, dehumanization, and labeling. Each instance is also categorized by the ground on which it is based, such as political views, nationality, or gender.

Screenshot: moderaitor.app
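The two-axis categorization described above (type of hate speech, and the ground it is based on) lends itself to simple tallies. Below is a minimal sketch assuming each flagged comment is stored as a (type, basis) record; the `Finding` structure and the sample records are made up for illustration, and only the three types named in the article are used.

```python
from collections import Counter
from typing import NamedTuple


class Finding(NamedTuple):
    hate_type: str  # e.g. "dehumanization"
    basis: str      # e.g. "gender"


# Made-up sample records, for illustration only.
findings = [
    Finding("labeling", "nationality"),
    Finding("dehumanization", "gender"),
    Finding("labeling", "political views"),
    Finding("incitement to violence", "political views"),
    Finding("labeling", "nationality"),
]

by_type = Counter(f.hate_type for f in findings)
by_basis = Counter(f.basis for f in findings)
by_pair = Counter(findings)  # counts each (type, basis) combination

print(by_type.most_common(1))                      # [('labeling', 3)]
print(by_pair[Finding("labeling", "nationality")])  # 2
```

Keeping the type and the basis as separate fields means the same records can answer both "which kind of hate speech is most frequent" and "which group is targeted most often".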

In partnership with the Media Self-Regulation Initiative, the authors of moderAItor have created a content moderation policy. This policy outlines the types of content that require moderation, user notification methods, principles for managing the online community, and procedures for addressing complaints and disputes.

“This document ensures a transparent and accountable moderation system. Moderation decisions are guided by the principles set out in the international document known as the ‘Rabat Plan of Action’, which establishes standards for identifying and combating hate speech,” states Alexander.

The tool is currently in testing. Over the past 10 days, it has been linked to Factor TV’s social media pages and analyzed over 17,000 comments. Out of these, more than 2,200 comments were identified as containing elements of hate speech and have been moderated.

Screenshot: moderaitor.app website

“Accountability is essential as well. For every deleted or hidden comment, the tool provides a justification stating the type of hate speech it detected. This allows the data to serve as a basis and an explanation in the event of any disputes or complaints from users,” says Alexander.

The authors of moderAItor plan to offer free access to the tool to more than 10 media outlets and civil society representatives during the upcoming election period. Once the current program ends, they will seek new funding sources or explore monetization options for moderAItor.