Pergamos - Library and Information Center of National and Kapodistrian University of Athens

Unit:

Track / specialization: Information and Data Management (ΔΕΔ)
Informatics

Deposit date:

2021-03-22

Year:

2021

Author:

Trispiotis Odysseas

Supervisors info:

Alexis Delis, Professor, Department of Informatics and Telecommunications, NKUA

Original Title:

Real-time Fake-news Detection in Greek using a Browser Extension

Languages:

English

Translated title:

Real-time Fake-news Detection in Greek using a Browser Extension

Summary:

Fake news has become more prevalent in recent years with the increased popularity and use of social media. Such news can be very dangerous, as it deceives readers and can push the public towards harmful actions. We therefore deem it critical to detect this type of publicly available information in real time.

This thesis outlines our work in creating an experimental web browser extension that recognizes whether a web page containing a Greek news article is fake, with the process being entirely transparent to the user. We do that by carrying out our analysis in real time and estimating the probability that the article is illegitimate using contemporary ML techniques.

Initially, we collected a sizeable set of Greek articles (~35,000) and marked the fake ones in order to create a dataset. We then used this dataset, along with basic feature extraction techniques, to create the input for training various classification algorithms. This process produces a machine learning model, which can be stored in a file and used for predictions on new, unseen data. After that, we compared the produced models based on some common metrics. We then chose the model that gave the best results and created a backend REST API based on it. Finally, we created a separate browser extension as a UI client to display the results.
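The train, store, and predict steps described above can be sketched in a few lines. This is a minimal illustration only, not the thesis implementation: the abstract does not name the features, the classifier, or the file format, so the bag-of-words tokenizer, the multinomial Naive Bayes classifier, the JSON model file, and every name below (`tokenize`, `NaiveBayes`, `TOY_TEXTS`) are assumptions, and the four training sentences are made-up toy data.

```python
import json
import math
import os
import re
import tempfile
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware, so Greek works)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.doc_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        return self

    def predict_proba(self, text):
        vocab = set().union(*self.word_counts.values())
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [tok for tok in tokenize(text) if tok in vocab]
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Turn log scores into probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(weights.values())
        return {label: w / norm for label, w in weights.items()}

    def save(self, path):
        """Store the learned counts in a JSON file (file format is an assumption)."""
        with open(path, "w", encoding="utf-8") as fh:
            json.dump({"doc_counts": self.doc_counts, "word_counts": self.word_counts}, fh)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        model = cls()
        model.doc_counts = Counter(data["doc_counts"])
        model.word_counts = {k: Counter(v) for k, v in data["word_counts"].items()}
        return model


# Toy data, invented for illustration only.
TOY_TEXTS = [
    "shocking miracle cure doctors hide the truth",
    "parliament passed the new budget today",
    "unbelievable miracle secret they hide",
    "the ministry announced new measures today",
]
TOY_LABELS = ["fake", "real", "fake", "real"]

model = NaiveBayes().fit(TOY_TEXTS, TOY_LABELS)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    model.save(path)                  # the model is stored in a file ...
    restored = NaiveBayes.load(path)  # ... and reloaded for prediction
probs = restored.predict_proba("miracle cure they hide")
print(probs)
```

In a setup like the one the summary describes, a backend endpoint would call something like `predict_proba` on the article text sent by the browser extension and return the probability for the fake class.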

The results of the above process were quite encouraging considering the amount of available data, and showed that our extension can predict quickly (~35 ms) and with high accuracy (~95%) whether an article is fake news or not. There are several open issues for improvement and future research, such as detecting fake news using various neural networks instead of classification algorithms. The automatic retraining of the model with new data and the identification of which part of a web page's content is the article are also open issues from the current thesis.
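Accuracy is one of several metrics commonly used to compare classifiers. The sketch below shows how such metrics can be computed from true and predicted labels. The summary does not list which metrics were actually compared, so the choice of accuracy, precision, recall, and F1, the function name `binary_metrics`, and the five toy labels are all assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # fake, flagged as fake
    fp = sum(t != positive and p == positive for t, p in pairs)  # real, flagged as fake
    fn = sum(t == positive and p != positive for t, p in pairs)  # fake, missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # real, passed as real
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only.
y_true = ["fake", "fake", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "fake"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when fake articles are a small share of the dataset, since a model that labels everything as real can still score a high accuracy.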

Main subject category:

Technology - Computer science

Keywords:

classification, feature extraction, model, probability prediction, browser extension, data extraction, data tagging

Index:

Yes

Number of index pages:

Contains images:

Yes

Number of references:

Number of pages: