Fake News Detection with the GREEK-BERT Model with a focus on COVID-19

Graduate Thesis uoadl:2967905

Unit:
Department of Informatics and Telecommunications
Informatics
Deposit date:
2021-12-02
Year:
2021
Author:
FIORETOS DIMOSTHENES
Supervisors info:
Koubarakis Manolis, Professor, Department of Informatics and Telecommunications, School of Science
Original Title:
Fake News Detection with the GREEK-BERT Model with a focus on COVID-19
Languages:
English
Greek
Translated title:
Fake News Detection with the GREEK-BERT Model with a focus on COVID-19
Summary:
Fake news has existed since ancient times, yet it has become one of the major political and societal issues of recent years. The problem is made more pressing by the widespread use of social media among the general public. During the COVID-19 pandemic in particular, the dissemination of fake news can have very serious, even fatal, consequences for societies as well as individuals.
This thesis outlines our work in creating two classification models for fake news and fake social media posts, alongside a web application for studying the relationships and dissemination patterns of fake and non-fake information on social media platforms. Our work targets the Greek language and the ongoing coronavirus pandemic.
We also present an overview of the research work on which we base our models, as well as related research endeavors regarding fake news detection.
For this purpose we reused an existing Greek fake news data set, which was part of Odysseas Trispiotis' Master of Science thesis [1], and we also created a novel data set for this project. While building this novel data set, we observed that finding reliable sources of fake posts is a hard problem, and that automating the search is harder still. Our classification models are based on the state-of-the-art BERT [2] and GREEK-BERT [3] models.
The results were very encouraging: the final classification models reached accuracy above 90%, with similarly good scores on other standard classification metrics, such as precision, recall, F1 score and AUROC.
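As an illustration of the metrics named above, the following minimal sketch computes accuracy, precision, recall and F1 score for a binary fake/non-fake classifier. The function name `binary_metrics` and the label lists are hypothetical and are not taken from the thesis; AUROC is omitted because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = fake)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, flagged as fake
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real, left unflagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision penalizes real items wrongly flagged as fake, recall penalizes fake items that were missed, and F1 is their harmonic mean, which is why the three are reported together with accuracy.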
Main subject category:
Technology - Computer science
Keywords:
machine learning, natural language processing, automatic data tagging, fake news detection, BERT, GREEK-BERT
Index:
Yes
Number of index pages:
3
Contains images:
Yes
Number of references:
124
Number of pages:
65
dimosthenes_fioretos_thesis.pdf (896 KB)