Computational studies of protein aggregation for the diagnosis and treatment of Amyloidoses

Doctoral Dissertation uoadl:3413879

Unit:
Department of Biology
Library of the School of Science
Deposit date:
2024-08-08
Year:
2024
Author:
Apostolakou Avgi-Elena
Dissertation committee:
Vassiliki A. Iconomidou, Associate Professor, Department of Biology, National and Kapodistrian University of Athens
Spyros Efthimiopoulos, Professor, Department of Biology, National and Kapodistrian University of Athens
Ioannis P. Trougakos, Professor, Department of Biology, National and Kapodistrian University of Athens
Pantelis Bagos, Professor, Department of Computer Science and Biomedical Informatics, University of Thessaly
Andreas Scorilas, Professor, Department of Biology, National and Kapodistrian University of Athens
Dido Vassilacopoulou, Associate Professor, Department of Biology, National and Kapodistrian University of Athens
Georgios Pavlopoulos, Researcher A (first rank), Biomedical Sciences Research Center (B.S.R.C.) "Alexander Fleming"
Original Title:
Computational studies of protein aggregation for the diagnosis and treatment of Amyloidoses
Languages:
English
Translated title:
Computational studies of protein aggregation for the diagnosis and treatment of Amyloidoses
Summary:
Under denaturing conditions, many proteins fail to acquire or maintain their native structure and instead adopt an alternative, non-native fold. While the cell has mechanisms to preserve protein homeostasis, these can fail, resulting in the accumulation of misfolded proteins in the cell. The misfolded proteins can then assemble into either amorphous aggregates or organized aggregates, the latter called amyloid fibrils. A wide range of diseases are caused by the deposition of amyloid fibrils; these are called amyloidoses and are a subset of conformational diseases. Examples of amyloidoses include neurodegenerative diseases, such as Alzheimer’s disease and Parkinson’s disease, as well as type 2 diabetes and transthyretin amyloidosis. At least 38 proteins and protein products have been shown to assemble into pathological amyloids in humans, as either wild-type or mutated variants. The amyloid deposits formed by these amyloidogenic proteins may contain other proteins; the role of these co-deposited proteins in disease is under investigation.
A wealth of knowledge has been accumulated about aggregation and amyloidogenicity, yet many questions remain unanswered. The aim of this doctoral thesis, which took place at the Department of Cell Biology & Biophysics under the supervision of Associate Professor Vassiliki A. Iconomidou, was therefore to conduct computational studies that help answer some of these questions. Amyloidogenic proteins and proteins related to protein aggregation were explored systematically using computational methodologies. For example, the analysis of biological networks, such as protein-protein interaction networks, was used to explore the relationships underlying amyloidogenicity, to compare disease-related molecular mechanisms in humans and model organisms, and to identify promising drugs and drug targets. The ultimate aim of this dissertation was the computational study of amyloidogenic proteins and their aggregation mechanism in order to contribute, along with experimental approaches, to the diagnosis and treatment of amyloidoses.
The results of this dissertation are split into five projects, each corresponding to a scientific research article that is published, submitted, or ready for submission.
Amyloidogenic proteins and polymorphisms
Protein aggregation and amyloid formation are processes that can be greatly affected by changes in the protein sequence. Some amyloidoses have been associated with specific mutations; transthyretin amyloidosis is an extensively studied example. However, no study has examined mutations across amyloidogenic proteins as a whole. To fill this gap, a study was conducted to evaluate the effect of single nucleotide polymorphisms (SNPs) on amyloidogenic proteins. SNPs are genetic variations at a single base position in the DNA that typically have a frequency of 1% or more in a given population. SNPs can be categorized based on their influence on the protein product. Here the focus was on missense SNPs (msSNPs), which lead to an amino acid substitution.
For this study, all known human amyloidogenic proteins were collected. Proteins characterized as precursors (causative) in the Amyloidoses Collection (AmyCo) database were extracted. These were supplemented using the most recent list of amyloidogenic proteins curated by the International Society of Amyloidosis (ISA). Pharmaceutical products not corresponding to human genes were excluded, as were immunoglobulins, which are extremely variable. Next, msSNPs for these proteins were collected from three databases: UniProt, ClinVar, and dbSNP. These data were mapped to the primary protein sequences from UniProt, and duplicate entries were removed. All msSNPs were classified using a unified terminology as Pathogenic, Benign or Unclassified, depending on their association with disease(s). Additionally, the polymorphisms related specifically to amyloidoses and related phenotypes were isolated, and a subset was created for further analysis. Finally, to assess how msSNPs might affect protein aggregation, the aggregation-prone regions of each protein were predicted using AMYLPRED2, a consensus method for predicting amyloidogenic determinants from the protein sequence.
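The data preparation described above (removing duplicate entries, unifying the classification, and checking whether a substitution falls in a predicted aggregation-prone region) can be sketched roughly as follows. This is an illustrative outline only, not the pipeline used in the study: the label mapping, the record layout, the function names and the toy records are all assumptions, and the region coordinates stand in for AMYLPRED2 output.

```python
from dataclasses import dataclass

# Hypothetical mapping of source-database labels onto the unified terms;
# the labels actually used in the study are not given in the summary.
UNIFIED_LABELS = {
    "pathogenic": "Pathogenic",
    "likely pathogenic": "Pathogenic",
    "benign": "Benign",
    "likely benign": "Benign",
}


@dataclass(frozen=True)
class MsSNP:
    protein: str   # protein identifier (placeholder values below)
    position: int  # 1-based residue position in the primary sequence
    ref: str       # original amino acid (one-letter code)
    alt: str       # substituted amino acid (one-letter code)
    label: str     # raw clinical-significance label from the source


def unify(label):
    """Map a raw label to Pathogenic, Benign or Unclassified."""
    return UNIFIED_LABELS.get(label.strip().lower(), "Unclassified")


def in_apr(position, aprs):
    """True if the position lies in any inclusive (start, end) region."""
    return any(start <= position <= end for start, end in aprs)


def annotate(snps, aprs_by_protein):
    """Drop duplicate substitutions, then tag each with class and APR status."""
    seen = set()
    rows = []
    for snp in snps:
        key = (snp.protein, snp.position, snp.ref, snp.alt)
        if key in seen:  # same substitution already seen; first record wins
            continue
        seen.add(key)
        aprs = aprs_by_protein.get(snp.protein, [])
        rows.append(
            (snp.protein, snp.position, unify(snp.label), in_apr(snp.position, aprs))
        )
    return rows


# Toy input, invented for illustration only.
snps = [
    MsSNP("PROT_A", 12, "E", "K", "Pathogenic"),
    MsSNP("PROT_A", 12, "E", "K", "pathogenic"),  # duplicate entry
    MsSNP("PROT_A", 40, "R", "H", "Likely benign"),
    MsSNP("PROT_B", 7, "L", "P", "uncertain significance"),
]
aprs_by_protein = {"PROT_A": [(10, 16)], "PROT_B": [(20, 27)]}

rows = annotate(snps, aprs_by_protein)
for row in rows:
    print(row)
```

In this sketch the first record of a duplicated substitution is kept; a real pipeline would need a rule for reconciling conflicting labels across databases.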
Statistical analysis of the collected data was accomplished using three methodologies, each applied where appropriate. First, the chi-squared test was used to determine whether the classification of an msSNP is related to its location inside or outside of aggregation-prone regions (APRs). Indeed, aggregation-prone regions were shown to have a larger percentage of pathogenic msSNPs. However, the difference was statistically significant (p<0.05) only for the dataset of amyloidoses-related msSNPs. Next, logistic regression was performed to examine whether an amino acid substitution that results in a change in biophysical properties (e.g., polar to non-polar) is related to the pathogenicity status of the msSNP. Most notably, substitutions of a negatively charged residue by a residue of any other category were more likely to be pathogenic than benign. Lastly, to estimate the statistical significance of each possible amino acid substitution, a bootstrap method was used to expand the available data through random, unbiased resampling. The most frequent substitutions among the pathogenic msSNPs in aggregation-prone regions were Glutamic acid by Lysine, Arginine by Histidine, and Leucine by Proline. In addition, a case study on Alzheimer’s disease (AD) and the critical APP protein was explored further.
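As a rough illustration of the first test, the sketch below computes Pearson's chi-squared statistic for a 2x2 table of classification (Pathogenic or Benign) against location (inside or outside an APR). The counts are invented for the example and are not results from the study; the function name is hypothetical, no continuity correction is applied, and the p-value uses the closed form for one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table.

    table[i][j] holds the observed count for row i and column j.
    Assumes every expected count is non-zero; no Yates correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: rows = inside APR, outside APR; columns = Pathogenic, Benign.
toy_counts = [[30, 10], [20, 40]]
chi2, p_value = chi_squared_2x2(toy_counts)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2e}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

With small expected counts, an exact test would usually be preferred over this approximation.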
The results of this work have been presented at conferences and a manuscript has been prepared for publication. (Galanis, F. P., Apostolakou, A. E., et al., In silico study of missense SNPs on Amyloidogenic Proteins)
Amyloidoses network
Besides the common element of amyloid deposition, amyloidoses and related diseases are diverse in their presentation and biology, although they share other characteristics. Studies of these diseases are typically limited to one or a few of them; therefore, these diseases were studied as a whole in order to identify common elements in their molecular mechanisms. This is relevant because most amyloidoses and other diseases related to amyloid deposition have limited treatment options. Drug development is a long and costly process, which has led to the adoption of strategies to mitigate these issues, such as the use of screening assays. The problem is exacerbated for rare diseases, which affect a small number of people and have fewer resources dedicated to studying them. A promising approach for drug development with shorter timelines and lower costs is drug repurposing, that is, the use of existing drugs beyond their original scope.
In this work, a network of amyloidoses and diseases related to amyloid deposition was constructed. Amyloidogenic proteins and co-deposited proteins were the primary focus; these were collected from the Amyloidoses Collection (AmyCo). This dataset was updated through an extensive literature search, which resulted in an additional disease. The top 100 protein interactors for each protein were programmatically extracted from the STRING database. Drugs targeting any of the proteins in the dataset were collected from DrugBank, a comprehensive database of drugs and drug targets. Furthermore, DrugBank was also used to find drugs indicated for the diseases under study. A network of associations between diseases, proteins and drugs was constructed and visualized using the freely available tool Cytoscape. Finally, the literature was reviewed for any previously established associations between the drugs and the diseases under study, and promising drugs and drug targets were identified.
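A minimal sketch of how such a disease-protein-drug association network can be represented and queried is given below. It is not the workflow used in the study (which relied on AmyCo, STRING, DrugBank and Cytoscape); the function names and all node names are placeholders, and a drug is treated as a repurposing candidate simply when it targets a protein linked to a disease without being indicated for that disease.

```python
from collections import defaultdict


def build_network(disease_protein_pairs, protein_drug_pairs):
    """Undirected adjacency map over typed nodes such as ("disease", name)."""
    graph = defaultdict(set)
    for disease, protein in disease_protein_pairs:
        graph[("disease", disease)].add(("protein", protein))
        graph[("protein", protein)].add(("disease", disease))
    for protein, drug in protein_drug_pairs:
        graph[("protein", protein)].add(("drug", drug))
        graph[("drug", drug)].add(("protein", protein))
    return graph


def repurposing_candidates(graph, disease, indicated_drugs):
    """Drugs reachable via a disease's proteins that are not indicated for it."""
    candidates = set()
    for _, protein in graph.get(("disease", disease), ()):
        for kind, name in graph.get(("protein", protein), ()):
            if kind == "drug" and name not in indicated_drugs:
                candidates.add(name)
    return sorted(candidates)


# Placeholder associations, invented for illustration only.
disease_protein_pairs = [
    ("disease_1", "protein_A"),
    ("disease_1", "protein_B"),
    ("disease_2", "protein_B"),
]
protein_drug_pairs = [
    ("protein_A", "drug_X"),
    ("protein_B", "drug_Y"),
    ("protein_B", "drug_Z"),
]
indications = {"disease_1": {"drug_X"}, "disease_2": set()}

network = build_network(disease_protein_pairs, protein_drug_pairs)
candidates_1 = repurposing_candidates(network, "disease_1", indications["disease_1"])
candidates_2 = repurposing_candidates(network, "disease_2", indications["disease_2"])
print(candidates_1)
print(candidates_2)
```

Candidates produced this way are only starting points; as described above, each one still needs to be checked against the literature.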
A total of 76 diseases caused by or associated with amyloids were studied. Central proteins were identified in the network of diseases and their associated proteins, that is, the precursor and co-deposited proteins. These central proteins corresponded to two amyloid signature proteins and to the two serum amyloid A proteins, which are associated with several inflammatory diseases where secondary amyloidosis can develop. The total network is composed of 76 diseases, 768 proteins and 1,414 drugs. The disease-protein-drug association network was strongly connected, with only four diseases being isolated. Fewer than half of the diseases have available drugs, but most of the remaining proteins are indirectly associated with drugs; therefore, there is potential for repurposing these drugs. Within this network, important drug targets and their associated drugs were found, such as G-protein coupled receptors (GPCRs) and protein kinases. Lastly, the findings from this network analysis can be used to guide experimental studies for discovering treatments for amyloidoses.
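Two of the network properties mentioned here, central nodes and isolated diseases, can be illustrated with the sketch below. It assumes that centrality is approximated by node degree and that a disease counts as isolated when it lies outside the largest connected component; the summary does not state which measures were actually used. The function names and the toy network are invented.

```python
from collections import deque


def top_by_degree(adjacency, k):
    """Return the k nodes with the most neighbours (ties broken by name)."""
    return sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))[:k]


def connected_components(adjacency):
    """Breadth-first search over an undirected adjacency map."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Toy undirected network: D* are diseases, P* proteins, R* drugs.
adjacency = {
    "D1": {"P1"},
    "D2": {"P1"},
    "P1": {"D1", "D2", "R1"},
    "R1": {"P1"},
    "D3": {"P2"},
    "P2": {"D3"},
}

central = top_by_degree(adjacency, 1)
components = connected_components(adjacency)
largest = max(components, key=len)
isolated_diseases = sorted(
    node for node in adjacency if node.startswith("D") and node not in largest
)
print(central)
print(isolated_diseases)
```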
The results of this work have been presented at conferences and a manuscript has been prepared for publication. (Rizou, A. E. I., Nasi, G. I., Apostolakou, A. E., et al., Integrated network-based analysis of diseases associated with amyloid deposition through a disease-protein-drug network.)
Alzheimer’s disease and G-protein Coupled Receptors
G-protein coupled receptors (GPCRs) are the largest family of eukaryotic membrane receptors and are responsible for many important physiological functions, such as vision, olfaction, and neurotransmission. GPCRs are considered ideal drug targets and have been implicated in Alzheimer’s disease (AD), which makes them subjects of interest for treating AD. As part of this study, the relationship between AD and GPCR signaling was explored computationally. First, the human GPCR signaling network was constructed and made available via a purpose-built web application named hGPCRnet. This application was then used to study the muscarinic acetylcholine receptor pathways that are known to be associated with AD pathology. Finally, proteins that are found exclusively in either the M1 or M2 receptor signaling pathway were shown to be prime pharmacological targets for treating AD.
The results of this work were published in a peer-reviewed international scientific journal. (Apostolakou, A. E., Baltoumas, F. A., Stravopodis, D. J., & Iconomidou, V. A. (2020). Extended Human G-Protein Coupled Receptor Network: Cell-Type-Specific Analysis of G-Protein Coupled Receptor Signaling Pathway. Journal of Proteome Research, 19(1), 511–524. https://doi.org/10.1021/acs.jproteome.9b00754)
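Identifying proteins that belong to only one of two signaling pathways amounts to a set difference. The short sketch below shows the idea with placeholder member names; it does not use hGPCRnet data, and the function name is hypothetical.

```python
def exclusive_members(pathway_a, pathway_b):
    """Return (only in A, only in B) as sorted lists."""
    a, b = set(pathway_a), set(pathway_b)
    return sorted(a - b), sorted(b - a)


# Placeholder pathway memberships, invented for illustration only.
m1_pathway = ["receptor_M1", "protein_1", "protein_2", "shared_kinase"]
m2_pathway = ["receptor_M2", "protein_3", "shared_kinase"]

only_m1, only_m2 = exclusive_members(m1_pathway, m2_pathway)
print(only_m1)
print(only_m2)
```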
Alzheimer’s disease and drug repurposing
One of the most studied amyloidoses is Alzheimer’s disease (AD), the most common type of dementia. AD is a neurodegenerative disease that is characterized by the presence of amyloid plaques, which are composed primarily of amyloid-β peptide, a cleavage product of APP. Most attempts to find a treatment for AD have focused on the primary component of amyloid plaques, with little to no consideration of the co-deposited proteins. While some of these proteins have an unknown role in AD and may not influence its course, others have been shown to affect amyloid formation and disease progression, such as the molecular chaperone clusterin and apolipoprotein E. The aim of this work was to determine whether these proteins have potential as therapeutic targets and whether any existing drugs that target them can be used for drug repurposing.
The proteins found on amyloid plaques were collected from AmyCo. Their protein interactors were retrieved from IntAct, a database of experimentally determined molecular interactions. Next, the drugs targeting the proteins found on amyloid plaques were collected from DrugBank. Additionally, the drugs approved by the FDA for use in AD and their protein targets were gathered from DrugBank. The network of proteins and drugs interacting with the amyloid plaques was constructed and compared to the existing treatments. Lastly, the literature was reviewed to gather information about the drugs that could be repurposed for use in AD.
Including the precursor protein APP, a total of 12 proteins were found in amyloid plaques. These interacted with over 500 proteins and 72 drugs, forming a large, connected network. Only one co-deposited protein was not targeted by drugs, while APP was the main protein associated with drugs. In fact, the only drug found in the network that is approved by the FDA for the treatment of AD was an anti-amyloid-β monoclonal antibody, with the other AD drugs primarily prescribed as palliative care and therefore not targeting the cause. However, there were several drugs that target proteins central to amyloid plaques and that could be useful in combating AD. After literature research, a total of 28 drug candidates were suggested for further investigation. Of these, 15 drugs had a known association with AD and related processes, and 13 drugs had no or limited association with AD. These candidates were not among the drugs proposed by previous similar studies on AD, which supports the novelty of this approach.
The results of this work have been presented at conferences and a manuscript has been prepared for publication. (Apostolakou, A. E., et al., Co-deposited proteins in Alzheimer's disease as a potential treasure trove for drug repurposing.)
Alzheimer’s disease and Caenorhabditis elegans
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder with no known treatment and whose pathogenesis has not yet been elucidated. The main pathological hallmarks of AD are amyloid plaques, which are extracellular deposits of fibrils consisting of abnormally folded amyloid-β peptide (Aβ), a cleavage product of amyloid precursor protein (APP), and neurofibrillary tangles composed of intracellular filaments of the microtubule-associated Tau protein. Because of their central role in AD, APP and more recently Tau have been the primary focus of study. However, other biological processes have been connected to AD and are under scrutiny, such as oxidative stress, inflammation and neurotransmission. Evidence has shown that these AD-related processes are interconnected and that a number of feedback loops connect and regulate the various pathways.
Age-related diseases, such as AD, often require the use of model organisms due to ethical limitations, the long human lifespan, and the slow progression of the disease. Various organisms have been employed to study AD, including the commonly used Caenorhabditis elegans (nematode), Mus musculus (mouse) and Drosophila melanogaster (fruit fly). How appropriate an organism is for modeling a specific disease depends in part on how similar the relevant proteins and pathways are between the model and the human. Systems biology approaches, such as protein-protein interaction networks, can reveal the similarities and differences in AD-related pathways between humans and the organisms used to study this disease.
As part of this doctoral study, computational methods were applied to the study of AD and the amyloidogenic proteins involved, APP and Tau. More specifically, a comparison of AD-related pathways between Homo sapiens and C. elegans was made using a network alignment approach. Protein interaction data were collected from the STRING and IntAct databases. C. elegans proteins and their human orthologs were extracted from WormBase, a database about nematode biology curated by experts. The networks were created and visualized using Cytoscape and multiple network alignment algorithms were tested. Key conserved processes in the two organisms were the APP processing and Tau phosphorylation pathways. Finally, a number of conserved interactions and specific proteins were identified as potential targets of experimental studies in C. elegans.
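The notion of a conserved interaction can be illustrated with a simple ortholog-based check: a human interaction counts as conserved when orthologs of both partners also interact in the other species. This is a deliberately simplified stand-in and not one of the network alignment algorithms tested in the study; the function name, protein names, ortholog table and edges are placeholders.

```python
def conserved_interactions(human_edges, worm_edges, orthologs):
    """Pair each human edge with worm edges formed by orthologs of its partners.

    human_edges, worm_edges: iterables of (protein, protein) pairs, undirected.
    orthologs: dict mapping a human protein to a list of worm orthologs.
    """
    worm_edge_set = {frozenset(edge) for edge in worm_edges}
    conserved = []
    for human_a, human_b in human_edges:
        for worm_a in orthologs.get(human_a, ()):
            for worm_b in orthologs.get(human_b, ()):
                if worm_a != worm_b and frozenset((worm_a, worm_b)) in worm_edge_set:
                    conserved.append(((human_a, human_b), (worm_a, worm_b)))
    return conserved


# Placeholder networks and ortholog table, invented for illustration only.
human_edges = [("H1", "H2"), ("H2", "H3"), ("H1", "H4")]
worm_edges = [("W2", "W1"), ("W3", "W5")]
orthologs = {"H1": ["W1"], "H2": ["W2"], "H3": ["W3"]}  # H4 has no ortholog

conserved = conserved_interactions(human_edges, worm_edges, orthologs)
for human_edge, worm_edge in conserved:
    print(human_edge, "<->", worm_edge)
```

Dedicated alignment algorithms also weigh network topology and sequence similarity, which this direct lookup ignores.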
The results of this work were published in a peer-reviewed international scientific journal. (Apostolakou, A. E., Sula, X. K., Nastou, K. C., Nasi, G. I., & Iconomidou, V. A. (2021). Exploring the conservation of Alzheimer-related pathways between H. sapiens and C. elegans: a network alignment approach. Scientific Reports, 11(1), 4572. https://doi.org/10.1038/s41598-021-83892-9)
Conclusions
As part of this doctoral dissertation, studies were conducted on the entirety of human amyloidogenic proteins, as well as studies focused on Alzheimer's disease. Relevant data from biological databases and the scientific literature, such as genetic polymorphisms or molecular interactions, were analyzed with the use of computational methods and tools. This work could be expanded through further computational studies, such as a structural analysis of polymorphisms, a comparison of Alzheimer-related networks between humans and other model organisms, and the study of other diseases related to protein aggregation. Furthermore, these results must be validated experimentally, and for this reason many targets of study are suggested, including drugs with potential for repurposing and C. elegans interactions. Lastly, to facilitate future studies, the majority of the collected and resulting data is available through web applications.
Main subject category:
Science
Keywords:
Protein aggregation, Amyloids, Amyloidoses, Alzheimer's disease, G-protein coupled receptors, Bioinformatics, Single nucleotide polymorphisms, Network analysis, Protein interactions, Drug repurposing
Index:
Yes
Number of index pages:
3
Contains images:
Yes
Number of references:
508
Number of pages:
244
File:
File access is restricted until 2027-10-04.

Apostolakou_PhD_2024_final.pdf
13 MB