Efficient algorithms and architectures for protein 3-D structure comparison

Doctoral Dissertation uoadl:2838816

Unit:
Department of Informatics and Telecommunications
Informatics
Deposit date:
2019-01-10
Year:
2019
Author:
Sharma Anuj
Dissertation committee:
Elias Manolakos, Professor, Department of Informatics and Telecommunications, University of Athens
George Panayotou, Researcher A, B.S.R.C. Fleming
Dimitrios Soudris, Assoc. Professor, NTUA
Evangelia Chrysina, Assoc. Professor, Örebro University
Ioannis Emiris, Professor, Department of Informatics and Telecommunications, University of Athens
Stavros Perantonis, Researcher A, Computational Intelligence Lab, DIMOKRITOS
Yannis Cotronis, Assoc. Professor, Department of Informatics and Telecommunications, University of Athens
Original Title:
Efficient algorithms and architectures for protein 3-D structure comparison
Languages:
English
Greek
Translated title:
Efficient algorithms and architectures for protein 3-D structure comparison
Summary:
Protein Structure Comparison (PSC) is a well-developed and active field of computational proteomics, widely used in structural biology and drug discovery. The computational demand of all-to-all protein structure comparison is increasing rapidly, mainly because of three factors: rapidly expanding structural proteomics databases, the high computational complexity of pairwise PSC algorithms, and the trend towards using multiple criteria for comparison and combining their results (MCPSC). In this thesis we have developed a software framework that exploits many-core and multi-core CPUs to implement efficient parallel MCPSC schemes on modern processors, based on three popular PSC methods, namely TMalign, CE, and USM. We evaluate and compare the performance and efficiency of two parallel MCPSC implementations using Intel’s experimental many-core Single-Chip Cloud Computer (SCC) CPU as well as Intel’s Core i7 multi-core processor. Further, we have developed a dataset processing pipeline and implemented it in a Python utility, called pyMCPSC, which allows users to perform MCPSC efficiently on multi-core CPUs. pyMCPSC combines five PSC methods and five different consensus scoring schemes, facilitates the analysis of similarities in protein domain datasets, and can easily be extended to incorporate more PSC methods in the consensus scoring as they become available.
Main subject category:
Technology - Computer science
Keywords:
Protein, Homology, Machine learning, Sequence comparison, Structure comparison
Index:
Yes
Number of index pages:
7
Contains images:
Yes
Number of references:
148
Number of pages:
158
AnujSharmaPhDthesis_final.pdf (9 MB)