Pergamos - Library and Information Center of National and Kapodistrian University of Athens

Unit:

Direction / specialization Computing Systems: Software and Hardware (ΣΥΣ)
Informatics

Deposit date:

2021-11-26

Year:

2021

Author:

Sartzetakis Dimitrios

Supervisors info:

Gizopoulos Dimitrios, Professor, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens

Original Title:

GPGPU injector 4.0: A Framework for Architectural Vulnerability Factor (AVF) Assessments Across Nvidia GPUs Generations using GPGPU-Sim 4.0 simulator

Languages:

English
Greek

Translated title:

GPGPU injector 4.0: A Framework for Architectural Vulnerability Factor (AVF) Assessments Across Nvidia GPUs Generations using GPGPU-Sim 4.0 simulator

Summary:

A Graphics Processing Unit (GPU) is a programmable processor in which thousands of processing cores run simultaneously in massive parallelism; each core is optimized for efficient computation, enabling real-time processing and analysis of enormous datasets. With the development of general-purpose parallel programming environments and languages, all modern GPUs are general-purpose GPUs (GPGPUs): they can be programmed for non-graphics applications and can direct their processing power towards massively parallel problems. Therefore, as in all general-purpose computing platforms, the reliability of GPU hardware structures is a very important factor that architects need to estimate accurately early in the design cycle to weigh the benefits of error protection techniques against their costs.
In this thesis, we introduce GPGPU injector 4.0, a fault injection framework for Architectural Vulnerability Factor (AVF) assessment of hardware structures and entire GPU chips that runs on top of GPGPU-Sim, the state-of-the-art performance simulator for Nvidia GPU architectures. We use GPGPU injector 4.0 to inject transient faults (soft errors) into CUDA-enabled GPU architectures. The target hardware structures include the register file, the shared memory, the L1 data/texture cache and the L2 cache, which together account for several tens of MBs of on-chip GPU storage. More specifically, we compute the AVF of two widely used recent graphics cards, the RTX 2060 and the Quadro GV100, by running ten different CUDA benchmarks that are simulated on the actual instruction set (SASS).
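As a rough illustration of how an AVF estimate can be derived from a fault injection campaign, the following minimal Python sketch may help. It is not code from GPGPU injector 4.0 or GPGPU-Sim: the function names, the outcome labels ("masked", "sdc", "crash"), the 32-bit word width and the raw FIT rate per bit are all assumptions made for this example. It models a transient fault as a single bit flip, estimates the AVF as the fraction of injections that were not masked, and scales a raw per-bit FIT rate (failures per 10^9 device-hours) by the AVF and the structure size.

```python
import random
from collections import Counter

WORD_BITS = 32  # assumed word width for this sketch


def flip_bit(word: int, bit: int) -> int:
    """Model a single-bit transient fault by inverting one bit of a 32-bit word."""
    if not 0 <= bit < WORD_BITS:
        raise ValueError("bit index out of range")
    return (word ^ (1 << bit)) & 0xFFFFFFFF


def pick_injection_site(num_entries: int, total_cycles: int, rng: random.Random):
    """Pick a uniformly random (entry, bit, cycle) target for one injection run."""
    return (
        rng.randrange(num_entries),
        rng.randrange(WORD_BITS),
        rng.randrange(total_cycles),
    )


def estimate_avf(outcomes) -> float:
    """Estimate AVF as the fraction of injection runs that were NOT masked.

    `outcomes` holds one label per run; the labels are hypothetical:
    "masked" (output unaffected), "sdc" (silent data corruption), "crash".
    """
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no injection outcomes")
    return (total - counts["masked"]) / total


def structure_fit(avf: float, raw_fit_per_bit: float, num_bits: int) -> float:
    """FIT of a structure = AVF x raw FIT rate per bit x number of bits."""
    return avf * raw_fit_per_bit * num_bits


if __name__ == "__main__":
    rng = random.Random(0)
    print("example site (entry, bit, cycle):", pick_injection_site(65536, 100000, rng))
    # Hypothetical campaign of 100 runs: 70 masked, 20 SDC, 10 crashes.
    outcomes = ["masked"] * 70 + ["sdc"] * 20 + ["crash"] * 10
    avf = estimate_avf(outcomes)
    # The raw FIT rate per bit below is an illustrative placeholder, not measured data.
    print("AVF:", avf)
    print("FIT:", structure_fit(avf, 1e-3, 65536 * WORD_BITS))
```

In a real campaign each outcome would come from comparing a faulty simulation run against a fault-free reference run, and the number of injections would be chosen to meet a target statistical confidence.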

Main subject category:

Technology - Computer science

Keywords:

transient faults, AVF estimation, Failures In Time (FIT), register file, shared memory, cache memories, GPGPU-Sim

Index:

Yes

Number of index pages:

Contains images:

Yes

Number of references:

Number of pages:
