Iterative label cleaning for semi-supervised learning

Postgraduate Thesis uoadl:2958160

Unit:
Electronic Automation (H/A) Specialization
Library of the School of Science
Deposit date:
2021-07-22
Year:
2021
Author:
Bellos Filippos
Supervisors info:
Yannis Avrithis, Researcher, Irisa, Inria Rennes-Bretagne Atlantique
Dionysios Reisis, Professor, Department of Physics, National and Kapodistrian University of Athens
Anna Tzanakaki, Associate Professor, National and Kapodistrian University of Athens
Original Title:
Επαναληπτικός καθαρισμός προβλέψεων για ημι-επιβλεπόμενη μάθηση
Languages:
English
Greek
Translated title:
Iterative label cleaning for semi-supervised learning
Summary:
Deep neural networks have become the de facto model for computer vision applications. Their success is partially attributable to their scalability, i.e., the empirical observation that training them on larger datasets produces better performance. Deep networks often achieve their strong performance through supervised learning, which requires a labeled dataset. The performance benefit conferred by the use of a larger dataset can therefore come at a significant cost since labeling data often requires human labor. This cost can be particularly extreme when labeling must be done by an expert.

A powerful approach for training models on a large amount of data without requiring a large amount of labels is semi-supervised learning (SSL). SSL mitigates the requirement for labeled data by providing a means of leveraging unlabeled data. Since unlabeled data can often be obtained with minimal human labor, any performance boost conferred by SSL often comes with low cost. This has led to a plethora of SSL methods that are designed for deep networks.

In this thesis, we propose two methods that combine successful ideas from problems related to the task at hand. In particular, we propose CleanMatch and WeightMatch, two new semi-supervised learning methods that unify dominant approaches and address their limitations. CleanMatch consists of two stages: (1) iterative selection of the most confident pseudo-labels produced by a combination of consistency regularization and pseudo-labeling following FixMatch, and (2) augmentation of the labeled set with the examples selected in the first stage, followed by FixMatch-based semi-supervised training on the augmented dataset. WeightMatch estimates a weight reflecting the confidence of each labeled example, forcing the model to rely more on the confident examples during training.
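To make the description above concrete, the following is a minimal PyTorch sketch of the two ingredients: FixMatch-style selection of confident pseudo-labels (the building block of CleanMatch's first stage) and a per-example weighted supervised loss in the spirit of WeightMatch. This is a sketch under assumptions of my own; the function names, the threshold tau, and the exact loss form are illustrative and not taken from the thesis.

import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, x_weak, x_strong, tau=0.95):
    """FixMatch-style consistency term: pseudo-label the weakly augmented view,
    keep only predictions whose confidence exceeds tau, and train the strongly
    augmented view against those hard pseudo-labels."""
    with torch.no_grad():
        probs = F.softmax(model(x_weak), dim=1)   # soft predictions on the weak view
        conf, pseudo = probs.max(dim=1)           # confidence and hard pseudo-label
        mask = (conf >= tau).float()              # keep confident examples only
    logits_strong = model(x_strong)
    loss = F.cross_entropy(logits_strong, pseudo, reduction="none")
    # The mask also identifies the examples that a CleanMatch-like first stage
    # could move into the labeled set before retraining.
    return (loss * mask).mean(), mask

def weighted_supervised_loss(model, x, y, weights):
    """WeightMatch-style supervised term: each labeled example contributes in
    proportion to an estimated confidence weight in [0, 1]."""
    loss = F.cross_entropy(model(x), y, reduction="none")
    return (loss * weights).sum() / weights.sum().clamp(min=1e-8)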

Our methods improve the state of the art by a large margin on CIFAR-10, SVHN, and CIFAR-100, especially in settings with few labels.
Main subject category:
Science
Other subject categories:
Technology - Computer science
Keywords:
Semi-supervised learning, Noisy labels
Index:
Yes
Number of index pages:
73
Contains images:
Yes
Number of references:
117
Number of pages:
91
File:
Diploma_thesis_Bellos_Filippos.pdf
4 MB
File access is restricted to the UoA intranet.