Real-time Fake-news Detection in Greek using a Browser Extension

Postgraduate Thesis uoadl:2939919

Unit:
Track / Specialization: Information and Data Management (ΔΕΔ)
Informatics
Deposit date:
2021-03-22
Year:
2021
Author:
Trispiotis Odysseas
Supervisors info:
Alexis Delis (Αλέξης Δελής), Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens (ΕΚΠΑ)
Original Title:
Real-time Fake-news Detection in Greek using a Browser Extension
Languages:
English
Translated title:
Real-time Fake-news Detection in Greek using a Browser Extension
Summary:
Fake news has become more prevalent in recent years with the increased popularity and use of social media. Such news can be very dangerous, as it deceives readers and can push the public towards harmful actions. We therefore deem it critical to detect this type of publicly available information in real time.

This thesis outlines our work in creating an experimental web browser extension that recognizes whether a web page containing a Greek news article is fake, with the process being entirely transparent to the user. We do so by carrying out our analysis in real time and estimating the probability that the article is illegitimate using contemporary ML techniques.

Initially, we collected a sizeable set of Greek articles (~35,000) and marked the fake ones in order to create a dataset. We then used this dataset, along with basic feature extraction techniques, to create the input for training various classification algorithms. This process produces a machine learning model, which can be stored in a file and used for predictions on new, unseen data. After that, we compared the resulting models on some common metrics. We then chose the model that gave the best results and created a backend REST API based on it. Finally, we created a separate browser extension as a UI client to preview the results.
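The train-then-predict step described above can be illustrated with a minimal sketch. The abstract does not say which features or classifier the thesis settled on, so everything here is an assumption chosen for illustration: bag-of-words counts as the "basic feature extraction", a multinomial Naive Bayes classifier with add-one smoothing, and a four-document toy dataset. The names `tokenize` and `NaiveBayes` are hypothetical and do not come from the thesis source code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (Unicode-aware, so Greek works)."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all words seen in any class.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, text):
        """Return a dict mapping each class to its posterior probability."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            counts = self.word_counts[c]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            log_scores[c] = score
        # Normalise in log space to avoid underflow on long articles.
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


# Toy, made-up training data; the real dataset has ~35,000 labelled articles.
train_texts = [
    "σοκ αποκάλυψη θαύμα γιατροί μισούν",
    "υπουργείο ανακοίνωσε νέα μέτρα",
    "σοκ θαύμα μυστικό κρύβουν",
    "δήμος ανακοίνωσε έργα μέτρα",
]
train_labels = ["fake", "real", "fake", "real"]

model = NaiveBayes().fit(train_texts, train_labels)
probs = model.predict_proba("σοκ μυστικό θαύμα")
print(probs)
```

In a setup like the one the abstract describes, the fitted model would be serialized to a file and loaded by the backend, which would return the "fake" probability to the browser extension for each page.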

The results of the above process were quite encouraging considering the amount of available data, and showed that our extension can predict quickly (~35 ms) and with high accuracy (~95%) whether an article is fake news or not. There are several open issues for improvement and future research, such as detecting fake news using various neural networks instead of the classification algorithms used here. Automatically retraining the model with new data and determining which part of a web page's content is the article also remain open issues from the current thesis.
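For readers unfamiliar with how figures such as the accuracy above are computed, the following sketch shows a handful of common binary-classification metrics. The abstract does not list which metrics were used to compare models, so the choice of accuracy, precision, recall and F1 is an assumption, as are the function name `binary_metrics` and the made-up label lists.

```python
def binary_metrics(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for five articles, for illustration only.
y_true = ["fake", "fake", "real", "real", "fake"]
y_pred = ["fake", "real", "real", "real", "fake"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when fake articles are a small fraction of the dataset, which is why precision and recall on the "fake" class are usually reported alongside it.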
Main subject category:
Technology - Computer science
Keywords:
classification, feature extraction, model, probability prediction, browser extension, data extraction, data tagging
Index:
Yes
Number of index pages:
3
Contains images:
Yes
Number of references:
63
Number of pages:
48
thesis.pdf (1 MB)
thesis_source_code.zip
525 KB
File access is restricted.