Supervisors:
Panagiotis Rondogiannis, Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
Angelos Charalambidis, Assistant Professor, Department of Informatics and Telematics, Harokopio University
Kalliopi Kostopoulou, PhD Candidate, Department of Computer Science, Columbia University
Summary:
In recent years, owing to significant advances in computational power, the machine learning community has been experimenting with increasingly complex algorithms designed to detect patterns in data. Under these circumstances, modern machine learning frameworks are expected to be expressive, providing the tools to define arbitrarily complicated neural architectures. At the same time, for such models to be trainable, each framework needs to implement optimization algorithms that rely on computing the derivatives of functions, through a process called Automatic Differentiation. Moreover, most popular frameworks exploit parallel data processing to accelerate the training and inference of machine learning models.
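To make the last point concrete, the following is a minimal sketch of what Automatic Differentiation looks like from the user's perspective in TensorFlow 2.x; the function and values are illustrative and not taken from the thesis itself:

    import tensorflow as tf

    x = tf.Variable(3.0)
    with tf.GradientTape() as tape:
        # y = x^2 + 2x; the tape records the operations applied to x
        y = x * x + 2.0 * x

    # Reverse-mode automatic differentiation: dy/dx = 2x + 2 = 8.0 at x = 3
    dy_dx = tape.gradient(y, x)
    print(dy_dx)  # tf.Tensor(8.0, shape=(), dtype=float32)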
TensorFlow, in particular, is a framework built on the dataflow computational model, an alternative programming paradigm designed to parallelize computations so that they can run efficiently in distributed execution environments. However, this design differs fundamentally from traditional imperative programming. As a result, efficiency has come at the expense of ease of use, and the framework has attempted to bridge this gap by introducing features that adapt dataflow constructs to imperative-style execution. Despite these efforts to provide expressiveness, as of version 2.16 the framework officially lacks support for recursive function definitions. Recursion is generally regarded as a powerful programming abstraction and is supported by most high-level programming languages. For that reason, we believe that TensorFlow, as a machine learning framework, could benefit from supporting such a common feature that programmers are already familiar with; a short sketch of the limitation follows.
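As a hedged illustration (assuming TensorFlow 2.x; the factorial example is ours, not drawn from the thesis): a naive recursive definition generally cannot be traced into a static graph, so the same computation must instead be expressed iteratively with the dataflow loop primitive tf.while_loop:

    import tensorflow as tf

    # Naive recursion: tracing this typically fails, since each recursive
    # call triggers yet another trace of the same function.
    # @tf.function
    # def factorial(n):
    #     return tf.cond(n <= 1, lambda: 1, lambda: n * factorial(n - 1))

    # The dataflow-friendly alternative expresses the computation as a
    # static graph loop instead of a recursive definition.
    @tf.function
    def factorial(n):
        i = tf.constant(1)
        acc = tf.constant(1)
        _, result = tf.while_loop(
            cond=lambda i, acc: i <= n,
            body=lambda i, acc: (i + 1, acc * i),
            loop_vars=(i, acc),
        )
        return result

    print(factorial(tf.constant(5)))  # tf.Tensor(120, shape=(), dtype=int32)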
This thesis proposes a systematic way of handling recursive function definitions in TensorFlow, in a manner that is compatible and consistent with the dataflow computational model upon which the framework is based. A significant portion of this work addresses the problem of implementing Automatic Differentiation for such functions, so that they can be used in practical machine learning settings.
Keywords:
dataflow, tensorflow, recursive functions, automatic differentiation, static dataflow graphs