Benchmarking of clustering algorithms in biological networks

Postgraduate Thesis uoadl:2927426

Unit:
Specialization in Bioinformatics
Library of the School of Science
Deposit date:
2020-11-08
Year:
2020
Author:
Pampalou Andromachi
Supervisors info:
Pantelis Bagos, Professor, Department of Computer Science and Biomedical Informatics, University of Thessaly
Original Title:
Συγκριτική αξιολόγηση αλγορίθμων ομαδοποίησης εφαρμοσμένοι σε βιολογικά δίκτυα
Languages:
Greek
Translated title:
Benchmarking of clustering algorithms in biological networks
Summary:
The function of an organism at the molecular level depends to a large extent on the interactions of its macromolecules. Protein complexes are basic units, responsible for a variety of biological mechanisms within the cell. Identifying the specific interactions is therefore necessary in order to clarify both the structural and the functional organization of living organisms. In the past, the search for protein-protein interactions (PPIs) was limited to experimental methods, which mainly investigated small sets of proteins. In recent decades, thanks to the development of high-throughput methods, scientists have a significant amount of data at their disposal. With computational approaches, such as the analysis of protein-protein interaction networks (PPINs), this information can be processed in order to predict protein complexes with potential biological significance.
The main purpose of this thesis is to provide an overview and evaluation of some of the main clustering algorithms applied to biological networks, specifically protein-protein interaction networks, in order to identify protein complexes. The first step is an extensive study of databases containing such interactions and the selection of a representative dataset of PPIs. Six clustering algorithms (Affinity Propagation, ClusterONE, MCL, MCODE, NCMine, and SPICi) are then run on the interaction networks of those datasets. The results undergo statistical evaluation to assess the reliability of the methods, comparing the output of the different algorithms both with each other and with the protein complexes of a reference dataset. This study makes it possible to determine which algorithm returns the best results and which algorithms produce the most similar results, which can lead to more effective future studies of protein-protein interactions.
Main subject category:
Science
Keywords:
Protein-Protein Interaction Networks (PPINs), Protein-Protein Interactions (PPIs), Biological Networks, Clustering Algorithms, Benchmarking of clustering algorithms
Index:
Yes
Number of index pages:
8
Contains images:
Yes
Number of references:
119
Number of pages:
160
Διπλωματική Εργασία -Παμπάλου Ανδρομάχη.pdf (15 MB)