Protection of Sensitive Data: Creating, Analyzing and Testing Protocols of Differential Privacy

Graduate Thesis uoadl:2958792

Unit:
Department of Informatics and Telecommunications
Informatics
Deposit date:
2021-07-28
Year:
2021
Author:
GALANIS NIKOLAOS
Supervisors info:
KONSTANTINOS CHATZIKOKOLAKIS, ASSOCIATE PROFESSOR, DEPARTMENT OF INFORMATICS AND TELECOMMUNICATIONS, SCHOOL OF SCIENCE, NKUA
Original Title:
Protection of Sensitive Data: Creating, Analyzing and Testing Protocols of Differential Privacy
Languages:
English
Translated title:
Protection of Sensitive Data: Creating, Analyzing and Testing Protocols of Differential Privacy
Summary:
The problem of preserving privacy while extracting information during data
analysis has been an everlasting one. In the big-data era especially, user
details can easily be compromised by a malicious data handler, which is
both a security and a privacy issue.

Given this, one trivial solution is to deny all access to user data, which
makes mining useful information about a plethora of subjects impossible.
At the other extreme, data could flow without any control, which would
benefit the advancement of science (because of the huge amount of
information made available) but would significantly compromise
individuals' privacy.

Neither of these two extremes, however, solves our problem. The answer
lies in finding a balance that benefits both parties: the users and their
privacy, as well as the researchers. The optimal fix is Differential
Privacy, which is in essence a promise made by the data handler to the
user: they will not be affected by allowing their data to be used in any
analysis, no matter what other studies, databases, or information sources
are available. Meanwhile, the output statistics should remain accurate
enough for researchers to extract useful information from them.
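
The promise sketched above has a standard formalization (not quoted in this record, but widely used in the literature): a randomized mechanism $M$ satisfies $\varepsilon$-differential privacy if, for every pair of datasets $D, D'$ differing in a single individual's record, and for every set $S$ of possible outputs,

$$\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S].$$

Intuitively, whether any one individual's data is included or not changes the probability of any observable outcome by at most a factor of $e^{\varepsilon}$, so no analysis can reliably single that individual out.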

At first sight, this promise seems rather hard to achieve. Nevertheless,
in this thesis we will look closely into the theory that makes this form
of privacy possible through the addition of random noise to the user data.
Differential Privacy is based on probabilistic tools well known since the
20th century; however, it is a rather new technique that has yet to be
implemented in a convenient way for all data miners to use.
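
To make the noise-addition idea concrete, the following is a minimal sketch (not the thesis's own library) of the classic Laplace mechanism: a numeric query result is perturbed with Laplace noise whose scale is the query's sensitivity divided by the privacy budget epsilon. The function names here are illustrative, not taken from the thesis.

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Sample from Laplace(0, scale) via the inverse-CDF transform."""
    u = rng.random() - 0.5          # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value plus noise calibrated to sensitivity/epsilon."""
    return true_value + laplace_noise(sensitivity / epsilon, rng)
```

A smaller epsilon means larger noise and stronger privacy; over many queries the noise averages out, which is why aggregate statistics stay usable while individual contributions are hidden.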

The goal of this thesis is to examine and compare previously created
mechanisms for D.P., while also creating our own mechanism that serves the
purpose of achieving Local D.P., a form of Differential Privacy that is
nowadays widely used in machine learning algorithms and aims to protect
the individuals who send their personal data for analysis. We will do so
by creating a library that is easy to use and complies with all the rules
of data privacy, and then draw conclusions from its use.
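
The simplest mechanism achieving Local D.P. for a yes/no attribute is randomized response: each user flips their true answer with a probability governed by epsilon before sending it, and the aggregator corrects the bias to recover the population proportion. This sketch is a standard textbook construction, not the mechanism developed in the thesis; all names are illustrative.

```python
import math
import random

def randomized_response(true_bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if rng.random() < p_truth else 1 - true_bit

def estimate_proportion(reports, epsilon):
    """Unbiased estimate of the true proportion of 1s from noisy reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[observed] = q * (2p - 1) + (1 - p), so invert the affine map:
    return (observed + p - 1.0) / (2.0 * p - 1.0)
```

Because each user's report is already noisy before it leaves their device, the aggregator never sees raw data, which is exactly the protection Local D.P. provides to individuals contributing to machine-learning pipelines.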

Throughout this thesis, extensive testing will be carried out in order to
demonstrate the usability and the efficiency of Differential Privacy.
Main subject category:
Technology - Computer science
Keywords:
Differential Privacy, Security, User data, Data Privacy, Noisy Data, Aggregation of Data
Index:
Yes
Number of index pages:
5
Contains images:
Yes
Number of references:
17
Number of pages:
65
Thesis_Final.pdf (1 MB)