MVU-GAN: Unfolding the Latent Space of GANs

Graduate Thesis uoadl:2964681

Unit:
Department of Informatics and Telecommunications
Informatics
Deposit date:
2021-11-04
Year:
2021
Author:
PAPAGEORGIOU PANTELIS
Supervisors info:
Ioannis Panagakis, Associate Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Original Title:
MVU-GAN: Unfolding the Latent Space of GANs
Languages:
English
Greek
Translated title:
MVU-GAN: Unfolding the Latent Space of GANs
Summary:
Generative Adversarial Networks (GANs) are deep-learning-based generative models that learn to map latent noise vectors to high-fidelity images. Recent work has shown that the input latent space can be decomposed into semantically meaningful directions, and that moving along these directions corresponds to human-interpretable image transformations. Attributes ranging from high-level aspects such as face shape and general hair style, through smaller-scale facial features, down to color schemes and microstructures can all be controlled by moving in the corresponding direction of the GAN latent space.
To edit images by identifying latent space directions, previous state-of-the-art methods either rely on supervised approaches or leverage Principal Component Analysis (PCA). The former are severely limited in the range of directions they can explore, as they rely on a human-annotated set of scores for each attribute. The latter tend to use the same method with minor modifications, resulting in similar experimental observations.
In this work, we approach the problem of discovering semantic directions in an unsupervised way, using semidefinite programming to perform nonlinear dimensionality reduction of the internal representation of GANs. In particular, we examine the generation mechanism of GANs and use the well-known Maximum Variance Unfolding algorithm, also known as Semidefinite Embedding, to identify semantically meaningful directions by decomposing the pretrained weights. Furthermore, extensive experiments are conducted on the state-of-the-art GAN architectures StyleGAN and StyleGANv2 on 7 different datasets.
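For orientation, the standard Maximum Variance Unfolding problem can be written as the semidefinite program below. This is the textbook formulation, given here as a sketch; which vectors play the role of the inputs $x_i$ in the thesis (for example, rows or columns of the pretrained weights) is not stated in this record, and the neighborhood indicator $\eta_{ij}$ and neighborhood size are assumptions of the generic formulation.

```latex
% Inputs: points x_1, ..., x_n and a k-nearest-neighbor graph, with
% \eta_{ij} = 1 if x_i and x_j are neighbors and 0 otherwise.
% Variable: the Gram matrix K of the unfolded embedding, K_{ij} = y_i^\top y_j.
\begin{aligned}
\max_{K} \quad & \operatorname{tr}(K) \\
\text{s.t.} \quad & K \succeq 0, \\
& \sum_{i,j} K_{ij} = 0, \\
& K_{ii} - 2K_{ij} + K_{jj} = \lVert x_i - x_j \rVert^2
  \quad \text{for all } (i,j) \text{ with } \eta_{ij} = 1 .
\end{aligned}
% The embedding is read off from the eigendecomposition of the optimal K:
% if K v_\alpha = \lambda_\alpha v_\alpha with \lambda_1 \ge \lambda_2 \ge \dots,
% then the \alpha-th coordinate of y_i is \sqrt{\lambda_\alpha}\,(v_\alpha)_i .
```

Maximizing the trace spreads the embedded points apart as much as possible, the centering constraint removes the translational degree of freedom, and the equality constraints preserve local distances between neighbors.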
To our knowledge, this is the first work to approach this problem from the perspective of semidefinite programming. Although the computational cost can be high, the results demonstrate its superiority in several experiments, while in the others they are comparable to those of the most recent supervised and unsupervised methods. Code is available at https://github.com/PanPapag/MVU­GAN.
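The editing operation the summary describes, moving a latent code along a discovered direction, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the directions here come from a plain singular value decomposition of a random stand-in weight matrix, in the spirit of the linear, PCA-style prior methods the summary mentions, and not from the thesis's MVU procedure. The names `top_directions`, `edit_latent`, `weight`, and `alpha` are hypothetical, and no generator is actually invoked.

```python
import numpy as np


def top_directions(weight, k):
    """Return k orthonormal latent-space directions from a weight matrix.

    Assumption: `weight` has shape (out_features, latent_dim). The rows of
    `vt` are right singular vectors, sorted by decreasing singular value.
    """
    _, _, vt = np.linalg.svd(weight, full_matrices=False)
    return vt[:k]


def edit_latent(z, direction, alpha):
    """Shift latent code `z` by step size `alpha` along `direction`."""
    return z + alpha * direction


rng = np.random.default_rng(0)
latent_dim = 8

# Hypothetical stand-in for a pretrained layer; a real model would supply this.
weight = rng.standard_normal((16, latent_dim))

directions = top_directions(weight, k=3)
z = rng.standard_normal(latent_dim)
z_edited = edit_latent(z, directions[0], alpha=2.0)
```

In a real pipeline, `z_edited` would be fed to the generator and the resulting image compared with the one produced from `z`; varying `alpha` traces out the transformation associated with that direction.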
Main subject category:
Technology - Computer science
Keywords:
GAN, Image Editing, Semantic Directions, Latent Space, Semidefinite Programming
Index:
Yes
Number of index pages:
3
Contains images:
Yes
Number of references:
34
Number of pages:
32
thesis.pdf (9 MB)