Design and implementation of a computational method for peptides classification depending on their bioactivity

Postgraduate Thesis uoadl:2866614

Unit:
Bioinformatics Specialization
Library of the School of Science
Deposit date:
2019-03-18
Year:
2019
Author:
Kolotourou Efstathia
Supervisors info:
Georgakilas Alexandros, Associate Professor, Section of Physics, School of Applied Mathematical and Physical Sciences, National Technical University of Athens
Vorgias Konstantinos, Professor, Section of Biochemistry and Molecular Biology, Department of Biology, National and Kapodistrian University of Athens
Bagos Panteleimon, Professor, Department of Computer Science and Biomedical Informatics, University of Thessaly
Original Title:
Σχεδιασμός και υλοποίηση υπολογιστικής μεθόδου για την ταξινόμηση πεπτιδίων με βάση τη βιοδραστηριότητά τους
Languages:
Greek
Translated title:
Design and implementation of a computational method for peptides classification depending on their bioactivity
Summary:
In recent years, the discovery of new anti-infective therapeutics has become urgent because of the rapidly increasing resistance of infections to conventional antibiotics. The fact that most potential antibiotics fail to kill pathogens has increased the need to design new antimicrobial agents, and clinical research is now invested in identifying new, non-conventional anti-infective therapies. Antimicrobial peptides (AMPs) have attracted research attention as novel drug candidates; they are also known as host defense peptides because they can protect the host from various pathogenic bacteria. They are oligopeptides of varying length, from five to one hundred amino acids, and they are effective against a wide variety of targets, such as bacteria, viruses, fungi and parasites. Natural AMPs are found in both prokaryotes (e.g., bacteria) and eukaryotes (e.g., protozoa, fungi, plants, insects, and animals), and most of them are cationic and amphipathic. This amphipathicity gives them the ability to disrupt the physical integrity of the microbial membrane. Moreover, the electrostatic attraction between the negatively charged bacterial surface and the cationic AMPs is another determinant of the interaction between peptides and the microbial membrane. The in silico design of novel synthetic AMPs has turned out to be a serious challenge because of the broad spectrum of properties that characterize AMPs. Public databases have already been developed and store useful information on hundreds of natural AMPs. The need to identify and characterize AMPs and their functional types has led to many studies, and a number of methods have been proposed. Most of them extract information from sequence alignment; however, the amino acid composition cannot fully explain the interaction mechanisms of AMPs. Recently, various machine learning predictors have been developed, based on the compositional characteristics of AMP amino acid sequences.
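To make the idea of compositional and physicochemical descriptors concrete, the following is a minimal Python sketch of how a few sequence-derived features could be computed for one peptide. It is not the feature set used in the thesis: the function names, the choice of hydrophobic residues, and the simplified net charge (lysine and arginine counted as +1, aspartate and glutamate as -1, histidine and the termini ignored) are all assumptions made for illustration, and the example sequence is a made-up toy peptide.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Simplifying assumptions for this sketch only:
POSITIVE = set("KR")          # residues counted as +1
NEGATIVE = set("DE")          # residues counted as -1
HYDROPHOBIC = set("AILMFVW")  # one common, but not unique, choice


def composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}


def net_charge(seq):
    """Crude net charge: (#K + #R) - (#D + #E)."""
    positive = sum(aa in POSITIVE for aa in seq)
    negative = sum(aa in NEGATIVE for aa in seq)
    return positive - negative


def hydrophobic_fraction(seq):
    """Fraction of residues that belong to the HYDROPHOBIC set."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


def features(seq):
    """Build a small feature dictionary for one peptide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    vec = composition(seq)
    vec["length"] = len(seq)
    vec["net_charge"] = net_charge(seq)
    vec["hydrophobic_fraction"] = hydrophobic_fraction(seq)
    return vec


# Hypothetical toy peptide, not taken from any database.
toy_features = features("KKLLKKAGDE")
print(toy_features["net_charge"], toy_features["hydrophobic_fraction"])
```

A positive net charge together with a substantial hydrophobic fraction is the kind of signal that reflects the cationic, amphipathic character described above; a real pipeline would use many more descriptors.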
In this study, a novel computational methodology has been implemented that makes it possible to characterize peptides according to their anti-inflammatory and/or anticancer bioactivity. The first step was the collection of antimicrobial, anticancer, antibacterial, antifungal, antiviral and insecticidal AMPs from the DRAMP and DAMPD databases, as the positive sample. Peptides that do not appear to present any bioactivity were collected from UniProt, as the negative sample. Boosting and random oversampling were used to increase the number of samples, and the final dataset contained 1491 peptides, 213 from each studied category. A set of 44 physicochemical and sequence-based features, which have been shown to be the most indicative for characterizing antimicrobial peptides, was calculated for each peptide and constituted the final training dataset. The second step was the execution of the EnsembleGASVR classification methodology, a computational framework previously proposed for predicting neutral and pathogenic polymorphism variations by classifying missense SNPs (Single Nucleotide Polymorphisms) as neutral or disease-associated. This methodology employs a two-step algorithm, which combines a Support Vector Regression (SVR) classifier with a Genetic Algorithm. Additionally, the method was further improved by replacing the Genetic Algorithm, in its second step, with an evolutionary multi-objective framework, in an attempt to achieve higher performance. The advantages of this method are its effective handling of missing values and the confidence score that the produced models provide for every prediction. The training dataset was used to train the aforementioned algorithm. Finally, the quality of the method was evaluated by applying the trained models to a test dataset in order to predict the bioactivity category to which each peptide could belong.
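The class-balancing step can be illustrated with a short Python sketch of random oversampling, in which minority classes are padded with randomly repeated members until every class reaches the same size. This is only an assumption about how such a step might look; it does not reproduce the thesis pipeline (which also mentions boosting), and the function name, the `target` parameter, and the toy peptide identifiers are hypothetical. Note that 7 × 213 = 1491, which is consistent with seven equally sized categories (six bioactivity classes plus the negative sample).

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, target, seed=0):
    """Duplicate randomly chosen members of each class until it has `target` items.

    Classes already larger than `target` are rejected, because this sketch
    only oversamples and never discards data.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    out_samples, out_labels = [], []
    for label, members in by_class.items():
        if len(members) > target:
            raise ValueError(f"class {label!r} has more than {target} samples")
        # Draw the missing items with replacement from the existing members.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for sample in members + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: identifiers stand in for peptide feature vectors.
samples = ["p1", "p2", "p3", "p4", "p5", "p6"]
labels = ["anticancer", "anticancer", "anticancer",
          "antiviral", "antiviral", "negative"]

balanced_samples, balanced_labels = random_oversample(samples, labels, target=3)
print(len(balanced_samples))
```

Because oversampling duplicates existing peptides, it is usually applied only to the training portion of the data, so that copies of the same peptide do not appear in both the training and the test set.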
The algorithm was executed five times; the mean accuracy was 66.77, with a variance of 31.07. The ultimate purpose of this algorithm is to apply the best trained models to a superset of uncharacterized peptides found in plants, in order to propose new peptides and plants that can be used in cancer prevention and in the fight against inflammation.
Main subject category:
Science
Keywords:
antimicrobial peptide, bioactivity, antimicrobial features, Machine Learning, Support Vector Regression, Evolutionary algorithms, Classification
Index:
No
Number of index pages:
0
Contains images:
Yes
Number of references:
63
Number of pages:
124
Kolotourou_Efstathia_Diploma_Thesis.pdf (2 MB)