Methodology for the analysis of the Ukrainian segment of social media and messengers

20 December 2021

Since 2020, Detector Media has been systematically analysing narratives and images on social media such as Facebook, Twitter, YouTube, and the Telegram messenger. In this text, we describe the methodology we use to analyse the content of these platforms. It can be supplemented with approaches for analysing other social media.

Where do we get the data from?

Detector Media respects the principles of privacy and security of personal data on social media. Therefore, we analyse only public data, i.e., data that users have allowed to be collected and processed. Each social network we analyse (Facebook, Twitter, Telegram, YouTube) has its own policies on obtaining, processing, and storing data. Detector Media has taken each network's policies into account, as well as the European Union's legislation on personal data protection (GDPR). The data providers are the social media platforms themselves or companies certified by them.

By the Ukrainian segment of social media such as Facebook, Twitter, and Telegram, we mean posts by profiles, pages, groups, and channels that are located in Ukraine or that have indicated Ukraine as their location.

Twitter

Types of data processed by Detector Media:

  • texts of public posts and replies to them;
  • information on the time of the publication of posts and replies to them;
  • number of likes and shares of the posts and replies to them;
  • names of the pages that authored posts and replies to them;
  • number and list of subscribers and of the pages’ subscriptions.

Facebook

Types of data processed by Detector Media:

  • texts of public posts and comments on them;
  • information on the time of publication of posts and comments;
  • the number and type of interactions with the post (likes, shares, link clicks);
  • names of the groups and pages that authored posts and comments;
  • information about open groups (date of creation, whether the page from which it is administered has been changed);
  • number and list of subscribers.

Telegram

Types of data processed by Detector Media:

  • texts of Telegram channel posts and comments on them;
  • information on the time of publication of posts and comments;
  • information about Telegram channels (date of creation, number of subscribers, and country affiliation);
  • information about forwards and mentions of a message by other Telegram channels.

YouTube

Types of data processed by Detector Media:

  • auto-generated video subtitles;
  • information about the video (creation date, title, description, number of subscribers, number of views, number of likes);
  • information about YouTube channels (creation date, number of subscribers, number of uploaded videos, number of views).

How do we process data?

Detector Media analyses textual and quantitative data using Python libraries for statistical analysis, natural language processing, and machine learning. The types of analysis we use are:

  • n-gram analysis: automated identification and collection of the most frequent words and phrases in the texts;
  • text tone (sentiment) analysis: automatic determination of the positive, negative, or neutral tone of messages;
  • topic modelling: automatic identification of the topics mentioned in posts. Topic modelling gives an overview of the content of a body of documents. It assumes that each document consists of several topics and that each topic consists of words or phrases that often occur together. Since the algorithm does not name the topics, analysts label them manually after automated generation;
  • named entity recognition: extracting proper names (people, organisations, and locations) from texts. In the first stage, the algorithm automatically finds mentions of proper names, categorises them, and, where possible, determines the tone (the post author’s attitude towards the named entity). In the second stage, analysts manually extend the dictionary of proper names so that the algorithm can later automatically “normalise” them (i.e., determine that, for example, “SBU” and “Security Service of Ukraine” refer to the same entity);
  • relationship and network analysis: building a network of relationships between users and posts on social media. This allows us to identify user groups, possible bot networks, and more.
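
To illustrate two of the steps listed above, here is a minimal sketch of n-gram counting and of alias normalisation for named entities. It uses only the Python standard library and is not Detector Media's actual code: the function names and the contents of the `ALIASES` dictionary are assumptions made for this example, apart from the “SBU” pair mentioned in the text.

```python
import re
from collections import Counter

# Hypothetical alias dictionary. In practice, analysts would extend it by hand.
ALIASES = {
    "security service of ukraine": "SBU",
    "sbu": "SBU",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def top_ngrams(texts, n=2, k=5):
    """Return the k most frequent n-grams across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list yields consecutive n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)


def normalize_entity(mention):
    """Map a mention to its canonical name, or return it unchanged."""
    cleaned = mention.strip()
    return ALIASES.get(cleaned.lower(), cleaned)
```

For example, `normalize_entity("Security Service of Ukraine")` and `normalize_entity("sbu")` both map to the same canonical name, so mentions written in different ways can be counted together.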

The general analysis workflow is as follows: first, the data set is processed with automated methods that summarise large volumes of data. This reveals trends, patterns, and correlations, so that the analyst can then explore specific aspects relevant to the subject of the study.

How do we identify hostile information influence on social media?

Approach #1. Detecting coordinated inauthentic behaviour, i.e., bots that promote similar messages.
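
One simple signal of such coordination can be sketched as follows: the same text, after basic normalisation, is posted by several distinct accounts within a short time window. This is an illustrative heuristic, not a description of Detector Media's detection method. The function names, the `(account, timestamp, text)` input shape, and the default thresholds are all assumptions.

```python
import re
from collections import defaultdict
from datetime import timedelta


def normalize(text):
    """Lowercase the text and drop punctuation so near-identical posts match."""
    return " ".join(re.findall(r"\w+", text.lower()))


def find_coordinated(posts, min_accounts=3, window=timedelta(minutes=10)):
    """Flag texts posted by at least min_accounts accounts within one window.

    posts is an iterable of (account, timestamp, text) tuples.
    Returns a dict mapping each flagged normalized text to its sorted accounts.
    """
    by_text = defaultdict(list)
    for account, timestamp, text in posts:
        by_text[normalize(text)].append((timestamp, account))

    flagged = {}
    for text, items in by_text.items():
        items.sort()  # chronological order
        start = 0
        for end in range(len(items)):
            # Shrink the window from the left until it spans at most `window`.
            while items[end][0] - items[start][0] > window:
                start += 1
            accounts = {account for _, account in items[start:end + 1]}
            if len(accounts) >= min_accounts:
                flagged[text] = sorted(accounts)
                break
    return flagged
```

A match from a heuristic like this is only a lead for manual review: popular quotes or news headlines can also be reposted by many genuine users within minutes.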

Approach #2. Analysing relationships and connections between users, groups, and channels.
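
As a rough sketch of what such an analysis can look like, the code below treats each forward or mention as an edge between two channels and groups channels into connected components. The edge list format and the function names are assumptions made for this example. A real analysis would likely use a dedicated graph library and richer measures than connectivity alone.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def connected_groups(edges):
    """Return groups of mutually reachable nodes, largest group first."""
    graph = build_graph(edges)
    seen = set()
    groups = []
    for node in graph:
        if node in seen:
            continue
        # Depth-first traversal collects everything reachable from this node.
        stack = [node]
        group = set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(graph[current] - group)
        seen |= group
        groups.append(group)
    return sorted(groups, key=len, reverse=True)
```

With the hypothetical edges `[("ch_a", "ch_b"), ("ch_b", "ch_c"), ("ch_x", "ch_y")]`, the function returns one group of three channels and one group of two.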

Approach #3. Labelling sources. For example, the SBU has published a list of Telegram channels administered by the General Directorate of the General Staff of the Armed Forces of the Russian Federation.

Approach #4. Fact-checking the claims made in messages.

Approach #5. Checking whether messages echo or resemble the Kremlin’s propaganda and disinformation narratives.
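
A first, purely lexical version of such a comparison can be sketched with cosine similarity over word counts. This is an assumption-laden illustration rather than the method used by Detector Media: the function names are hypothetical, the reference narratives would have to be supplied by analysts, and word overlap alone cannot capture paraphrases or context.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in the text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_narrative(message, narratives):
    """Return the (narrative, score) pair most similar to the message.

    narratives must be a non-empty list of reference texts.
    """
    message_vector = bag_of_words(message)
    scored = ((text, cosine(message_vector, bag_of_words(text))) for text in narratives)
    return max(scored, key=lambda pair: pair[1])
```

A high score only means that a message shares vocabulary with a reference narrative, so an analyst still has to read the flagged messages before drawing conclusions.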

These approaches are not mutually exclusive but complementary. Combining them helps us identify hostile information influence in the Ukrainian segment of social media more effectively and substantiate our findings.
