Unit:
Department of Informatics and Telecommunications
Informatics
Author:
PAPADAKIS CHARIDIMOS-PORFYRIOS
Supervisors info:
Dimitris Gizopoulos, Professor
Original Title:
Evaluating the Impact of Hardware Faults in Modern Microprocessor Arithmetic Units
Translated title:
Evaluating the Impact of Hardware Faults in Modern Microprocessor Arithmetic Units
Summary:
Nowadays, we are surrounded by an astonishing number of electronic devices constantly
transmitting data to each other or to the internet. Most of our everyday tasks, such as navigation,
communications and payments, are handled by a computer. Our modern world therefore
relies heavily on calculations performed by computers and, consequently, on computer
(micro)processors.
Because of the ease, accuracy and speed that modern computing systems offer, we
have come to accept that the result a computer's processor gives us is always the correct
one. Indeed, an electronic computer chip is far more reliable than any earlier
computing technology (mechanical, magnetic). However, as has been pointed out
for some years now, even computer chips can produce silent errors, that is, errors which
are caused by hardware (silicon) defects and are not noticed (detected) by any hardware or
software mechanism. Over the past few years, large tech companies operating thousands
of servers have reported the existence of such hardware defects.
It has also been made clear that the arithmetic units of modern processors are the most
common culprits of output errors due to hardware defects.
In this context, this thesis studies the patterns, the frequency,
and the severity of the errors caused by such hardware defects in the arithmetic units of processors.
The methodology consists of synthesizing HDL models of those units
and then intentionally injecting specific models of hardware defects (faults) into them, in
order to observe the resulting errors and their patterns. Two types of faults were introduced
into the synthesized models: bridging faults (faults caused by bridging the
outputs of two different gates) and stuck-at faults (faults caused by forcing the
output of a gate to be either constantly high or constantly low).
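
The two fault models can be illustrated on a toy example. The following Python sketch is not the thesis tooling (which works on synthesized HDL models); it is a minimal illustration under stated assumptions: a hand-written gate-level 1-bit full adder, hypothetical net names (`x1`, `a1`, `a2`), a bridging fault modeled as a wired-AND of the two gate outputs, and bridged nets that do not feed each other.

```python
from itertools import product

# Hypothetical gate-level 1-bit full adder, listed in topological order:
# (output net, operation, input nets).
FULL_ADDER = [
    ("x1", "XOR", ("a", "b")),
    ("sum", "XOR", ("x1", "cin")),
    ("a1", "AND", ("a", "b")),
    ("a2", "AND", ("x1", "cin")),
    ("cout", "OR", ("a1", "a2")),
]
OPS = {"AND": lambda p, q: p & q, "OR": lambda p, q: p | q, "XOR": lambda p, q: p ^ q}

def evaluate(netlist, inputs, forced=None):
    """Evaluate the netlist; nets in `forced` take that value instead of their gate output."""
    forced = forced or {}
    nets = dict(inputs)
    for out, op, (i0, i1) in netlist:
        nets[out] = forced.get(out, OPS[op](nets[i0], nets[i1]))
    return nets

def stuck_at(netlist, inputs, net, value):
    """Stuck-at fault: the output of one gate is constantly 0 or constantly 1."""
    return evaluate(netlist, inputs, forced={net: value})

def bridge(netlist, inputs, net_a, net_b):
    """Bridging fault, modeled here (an assumption) as a wired-AND of two gate outputs.
    Assumes neither net lies in the other's fan-in cone, so the fault-free driver
    values can be computed first and the shorted value forced onto both nets."""
    good = evaluate(netlist, inputs)
    shorted = good[net_a] & good[net_b]
    return evaluate(netlist, inputs, forced={net_a: shorted, net_b: shorted})

def wrong_vectors(run_faulty):
    """Count input vectors whose (sum, cout) differ from the fault-free outputs."""
    count = 0
    for a, b, cin in product((0, 1), repeat=3):
        inputs = {"a": a, "b": b, "cin": cin}
        good, bad = evaluate(FULL_ADDER, inputs), run_faulty(inputs)
        count += (good["sum"], good["cout"]) != (bad["sum"], bad["cout"])
    return count
```

For instance, `wrong_vectors(lambda i: stuck_at(FULL_ADDER, i, "x1", 0))` counts how many of the eight input vectors are corrupted when the first XOR output is stuck at 0. Other bridging models (wired-OR, dominant driver) would only change the `shorted` expression.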
By injecting such faults into a large number of different, randomly selected gates inside the synthesized
module and testing the modules with random and patterned inputs, the distribution
of errors on the outputs can be observed and an error model can then be produced for
the output of each arithmetic unit tested.
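
The campaign described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the thesis implementation: the unit under test is a hypothetical gate-level ripple-carry adder built in software, only stuck-at faults are injected, only random inputs are applied, and the function names and parameters (`campaign`, `n_faults`, `n_inputs`, `seed`) are invented for the example. The result is a per-output-bit count of erroneous bits, a very simple stand-in for an output error model.

```python
import random

OPS = {"AND": lambda p, q: p & q, "OR": lambda p, q: p | q, "XOR": lambda p, q: p ^ q}

def ripple_carry_adder(width):
    """Build a topologically ordered gate list for a width-bit ripple-carry adder."""
    gates, carry = [], "c0"
    for i in range(width):
        gates += [
            (f"x{i}", "XOR", f"a{i}", f"b{i}"),
            (f"s{i}", "XOR", f"x{i}", carry),
            (f"g{i}", "AND", f"a{i}", f"b{i}"),
            (f"p{i}", "AND", f"x{i}", carry),
            (f"c{i + 1}", "OR", f"g{i}", f"p{i}"),
        ]
        carry = f"c{i + 1}"
    return gates

def add(gates, width, a, b, stuck=None):
    """Simulate the adder; `stuck` is an optional (net, value) stuck-at fault."""
    nets = {"c0": 0}
    for i in range(width):
        nets[f"a{i}"], nets[f"b{i}"] = (a >> i) & 1, (b >> i) & 1
    for out, op, i0, i1 in gates:
        nets[out] = OPS[op](nets[i0], nets[i1])
        if stuck and stuck[0] == out:
            nets[out] = stuck[1]  # the faulty gate output ignores its inputs
    bits = [nets[f"s{i}"] for i in range(width)] + [nets[f"c{width}"]]
    return sum(bit << i for i, bit in enumerate(bits))

def campaign(width=8, n_faults=200, n_inputs=100, seed=0):
    """Inject random stuck-at faults, apply random inputs, and return
    (per-output-bit error counts, number of corrupted results)."""
    rng = random.Random(seed)
    gates = ripple_carry_adder(width)
    histogram, corrupted = [0] * (width + 1), 0
    for _ in range(n_faults):
        fault = (rng.choice(gates)[0], rng.randint(0, 1))
        for _ in range(n_inputs):
            a, b = rng.randrange(1 << width), rng.randrange(1 << width)
            diff = add(gates, width, a, b) ^ add(gates, width, a, b, fault)
            corrupted += diff != 0
            for bit in range(width + 1):
                histogram[bit] += (diff >> bit) & 1
    return histogram, corrupted
```

Calling `campaign()` returns the histogram together with the number of corrupted results; dividing each count by `n_faults * n_inputs` gives an empirical per-bit error rate. Patterned inputs or bridging faults could be added by replacing the random operand draw or the fault tuple.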
Main subject category:
Technology - Computer science
Keywords:
fault injection, bridging faults analysis, stuck-at faults analysis, processor reliability study, processor fault model
File:
File access is restricted until 2025-04-23.
Thesis_Charidimos_Papadakis_Final.pdf
1 MB
File access is restricted until 2025-04-23.
Thesis_Charidimos_Papadakis_Extras.zip
5 MB
File access is restricted.