Design Techniques of Parallel Accelerator Architectures for Real-Time Processing of Learning Algorithms

Doctoral Dissertation uoadl:3314294

Unit:
Department of Physics
Library of the School of Science
Deposit date:
2023-03-28
Year:
2023
Author:
Papatheofanous Elissaios Alexios
Dissertation committee:
Dionysios Reisis, Professor, Department of Physics, NKUA
Dimitrios Soudris, Professor, School of Electrical and Computer Engineering, NTUA
Anna Tzanakaki, Associate Professor, Department of Physics, NKUA
Hector Nistazakis, Professor, Department of Physics, NKUA
Markos Anastasopoulos, Associate Professor, Department of Physics, NKUA
Konstantinos Nikitopoulos, Professor, University of Surrey, United Kingdom
Georgios Lentaris, Assistant Professor, Department of Informatics and Computer Engineering, University of West Attica
Original Title:
Design Techniques of Parallel Accelerator Architectures for Real-Time Processing of Learning Algorithms
Languages:
English
Translated title:
Design Techniques of Parallel Accelerator Architectures for Real-Time Processing of Learning Algorithms
Summary:
This doctoral thesis focuses on Convolutional Neural Networks (CNNs) for computer vision applications, and in particular on deploying CNN inference to embedded accelerators suitable for edge computing. Its objective is to address challenges both in optimizing CNNs for edge deployment and in designing CNN accelerator architectures.
To this end, the thesis considers different deep learning applications, including on-board payload data processing and solar irradiance forecasting, and makes distinct contributions to four challenges in CNN optimization and CNN accelerator design.

First, the thesis contributes to the literature on image processing techniques and deep learning-based image regression for solar irradiance estimation and forecasting.
It proposes an image processing method based on accurate sun localization in sky images, which uses the solar angles and the mapping functions of the lens of the sky imager camera. The extensive study conducted in the thesis shows that applying the method to the sky images before they are processed by the image regression CNNs improves the accuracy of the irradiance values that the CNNs produce in all cases, while introducing only minimal computational overhead.
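
As a rough illustration of how solar angles and a lens mapping function can locate the sun in a sky image, the sketch below projects the sun's zenith and azimuth angles to pixel coordinates. It is not the method of the thesis: it assumes an equidistant fisheye mapping (r = f * theta), a camera pointing at the zenith, north towards the top of the image, and azimuth increasing towards the right of the image. The function name and the parameters `cx`, `cy` and `pixels_per_radian` are hypothetical.

```python
import math


def sun_pixel(zenith_deg, azimuth_deg, cx, cy, pixels_per_radian):
    """Project the sun's angular position to image coordinates.

    Assumptions (not taken from the thesis): equidistant fisheye lens,
    optical axis at the zenith, north at the top of the image, azimuth
    measured from north and increasing towards the right of the image.
    """
    theta = math.radians(zenith_deg)   # angle from the optical axis
    phi = math.radians(azimuth_deg)    # angle from north
    r = pixels_per_radian * theta      # equidistant mapping: r = f * theta
    x = cx + r * math.sin(phi)
    y = cy - r * math.cos(phi)         # image y axis points downwards
    return x, y


if __name__ == "__main__":
    # Hypothetical 1024x1024 image whose horizon circle has a 512-pixel radius.
    f = 512 / (math.pi / 2)
    print(sun_pixel(45.0, 180.0, 512.0, 512.0, f))
```

A real sky imager would need its calibrated mapping function and orientation in place of these assumptions.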

Next, the thesis addresses deep learning-based semantic segmentation to enable cloud detection from satellite imagery in on-board payload data processing applications. In particular, it proposes a lightweight CNN model, based on the U-Net architecture, that aims to provide an improved trade-off between model size and binary semantic segmentation performance. The proposed model uses several CNN techniques to reduce the number of parameters and operations required for inference while maintaining satisfactory performance. The thesis compares CNN models for cloud detection, all evaluated on the same test dataset as the proposed model, and thereby showcases the advantages of the proposed model.

Then, the thesis targets the efficient porting of the inference process of image processing CNNs to edge-oriented embedded accelerator devices. It opts for CNN acceleration on Field-Programmable Gate Arrays (FPGAs) and contributes the adopted development flow, which uses the Xilinx Vitis AI framework. Apart from exploring the capabilities of Vitis AI, including its advanced quantization solutions, the thesis also showcases an approach for accelerating the different processes of a single computer vision task by taking advantage of the heterogeneous resources of the FPGA. The execution time and throughput results of the CNN models on the FPGA, for binary semantic segmentation for cloud detection and for image regression for irradiance estimation, showcase the real-time processing capabilities of the accelerator.

Finally, the thesis contributes the design details of a bi-directional interfacing system for high-throughput and fault-tolerant image transfers between deep learning embedded accelerators, in the context of on-board payload data processing architectures. The system is developed for interfacing an FPGA with the Intel Movidius Myriad 2, and an extensive testing campaign on both commercial and prototype hardware platforms shows that it can achieve duplex image data transfers at a bit rate of up to 2.4 Gbps.
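
To put the 2.4 Gbps figure in perspective, the back-of-the-envelope sketch below converts a link bit rate into an upper bound on image frames per second. The image dimensions and bit depth are hypothetical examples, not values from the thesis, and the calculation assumes that the full rate is available to one direction and ignores any protocol or fault-tolerance overhead.

```python
def max_frames_per_second(bit_rate_bps, width, height, bits_per_pixel):
    """Upper bound on frames per second for raw image transfers.

    Ignores protocol overhead (headers, checksums, retransmissions),
    so a real link would deliver somewhat fewer frames.
    """
    bits_per_frame = width * height * bits_per_pixel
    return bit_rate_bps / bits_per_frame


if __name__ == "__main__":
    # Hypothetical 1024x1024 image with 8 bits per pixel over a 2.4 Gbps link.
    fps = max_frames_per_second(2.4e9, 1024, 1024, 8)
    print(f"{fps:.1f} frames/s")  # about 286 frames/s under these assumptions
```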
Main subject category:
Science
Keywords:
Convolutional Neural Networks, Computer Vision, Deep Learning, Edge Computing, Hardware Acceleration
Index:
No
Number of index pages:
0
Contains images:
Yes
Number of references:
76
Number of pages:
92
Doctoral_Thesis_EAP_Pergamos.pdf (50 MB)