DistilBERT-EL-KLD: Knowledge Distillation and Greek Language Modeling

Postgraduate Thesis uoadl:3398449

Unit:
Specialty: Language Technology
Informatics
Deposit date:
2024-05-13
Year:
2024
Author:
Koursaris Athanasios
Supervisors info:
First name: Emmanouil
Surname: Koubarakis
Rank: Professor
Department: Informatics and Communications
Institution: NKUA (National and Kapodistrian University of Athens)
Original Title:
DistilBERT-EL-KLD: Knowledge Distillation and Greek Language Modeling
Languages:
English
Translated title:
DistilBERT-EL-KLD: Knowledge Distillation and Greek Language Modeling
Summary:
This thesis implements, pre-trains, fine-tunes, and evaluates a DistilBERT model specialized in Modern Greek. After an in-depth review of the theoretical background of classical Machine Learning and Deep Learning with Neural Networks, aimed at understanding the inner workings of Transformer networks and of BERT and DistilBERT in particular, the thesis gives a detailed account of the model's development: from pre-training on large Modern Greek corpora to fine-tuning and evaluation on downstream Natural Language Processing tasks such as Named Entity Recognition, Part-of-Speech Tagging, and Natural Language Inference. Knowledge Distillation plays a central role in producing models that are faster and computationally cheaper than larger architectures of similar design, while incurring little, if any, loss in accuracy. The model developed for this thesis (DistilBERT-EL-KLD), a compressed version of GREEK-BERT, achieves performance and outputs very close to those of its predecessor.
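For readers unfamiliar with the Knowledge Distillation technique named in the summary (and in the model's KLD suffix), a minimal illustrative sketch follows. It is not the thesis's actual training code: the function names, the temperature `T`, and the mixing weight `alpha` are assumptions chosen for illustration. The standard formulation combines a Kullback-Leibler divergence between the teacher's and student's temperature-softened output distributions with the ordinary cross-entropy against the hard label.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: higher T yields a softer distribution.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """Hypothetical distillation objective: a weighted sum of
    (1) KL(teacher || student) over temperature-softened outputs and
    (2) cross-entropy of the student against the hard label.
    T and alpha are hyperparameters, not values taken from the thesis."""
    p_t = softmax(teacher_logits, T)  # teacher's soft targets
    p_s = softmax(student_logits, T)  # student's soft predictions
    # KL divergence, scaled by T^2 so gradient magnitudes stay comparable
    # across temperatures (the convention used in the distillation literature).
    kld = float(np.sum(p_t * (np.log(p_t) - np.log(p_s)))) * T * T
    # Ordinary cross-entropy on the hard (one-hot) label, at T = 1.
    hard = -float(np.log(softmax(student_logits)[true_label]))
    return alpha * kld + (1.0 - alpha) * hard
```

When student and teacher logits agree, the KL term vanishes and only the weighted hard-label loss remains; as the student drifts from the teacher, the KL term grows, which is what lets a compact student such as DistilBERT-EL-KLD absorb the behavior of a larger teacher such as GREEK-BERT.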
Main subject category:
Technology - Computer science
Keywords:
compression, knowledge distillation, neural networks, deep learning, classification
Index:
No
Number of index pages:
0
Contains images:
Yes
Number of references:
30
Number of pages:
60
Thesis-Koursaris-2024.pdf (2 MB)