Understanding the viral phylogenetic distribution and biodiversity at a protein family level.

Postgraduate Thesis uoadl:2964113

Unit:
Specialization: Bioinformatics - Biomedical Data Science
Informatics
Deposit date:
2021-10-29
Year:
2021
Author:
Voutsadaki Kleanthi
Supervisors info:
Georgios A. Pavlopoulos, Researcher B, BSRC Alexander Fleming
Ioannis Z. Emiris, Professor, NKUA
Martin Rezcko, Staff Scientist, Level A, NKUA/BSRC Alexander Fleming
Original Title:
Understanding the viral phylogenetic distribution and biodiversity at a protein family level.
Languages:
English
Translated title:
Understanding the viral phylogenetic distribution and biodiversity at a protein family level.
Summary:
Viruses play an important role in an ecosystem’s integrity, as they interact with many
living organisms ranging from bacteria to eukaryotic species. Understanding viral
biodiversity can therefore provide insight into their role in a biological system, for
example in host-virus interactions. The aim of this research is to study and compare the phylogeny,
functional properties and habitat distribution of viral protein families before and after
their enrichment with metagenomic sequences. All viral protein sequences were extracted
from NCBI and clustered into families using sequence similarity approaches, such as
LAST sequence alignment and MCL clustering. After clustering, the protein families were
further enriched with sequences from metagenomes and metatranscriptomes using
sequence alignment approaches.
The metagenome metadata allowed us to shed light on the habitat distribution of the
enriched families and unravel possible origins and current locations of the families. The
families were annotated with functions and biological processes using the Pfam and Gene
Ontology databases, so that functions could be compared before and after
enrichment. Furthermore, comparative taxonomic analysis was conducted at
different lineage levels using network-based approaches. The taxonomic analysis
showed that enrichment with metagenomes did not significantly affect the original
taxonomic lineage. Nevertheless, some new taxonomic labels were
introduced and some initial labels disappeared. In addition, most protein domains associated
with our protein families were not annotated in the Gene Ontology database. Regarding
habitat distribution, some protein families were unique to a single habitat, while
others were shared across different habitats.
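
The two comparisons described in the summary (habitat-unique versus shared families, and taxonomic labels introduced or lost after enrichment) can be illustrated with a minimal Python sketch. It is not the thesis pipeline: it assumes each family is represented as a set of habitat labels and a set of taxonomic labels, and the function names, family IDs, habitats and taxon names below are hypothetical toy values chosen for illustration.

```python
from typing import Dict, Set, Tuple


def split_by_habitat(
    family_habitats: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (habitat_unique, shared) family IDs.

    A family counts as habitat-unique when all of its sequences come from
    exactly one habitat, and as shared when it spans two or more.
    """
    unique = {fam for fam, habs in family_habitats.items() if len(habs) == 1}
    shared = {fam for fam, habs in family_habitats.items() if len(habs) > 1}
    return unique, shared


def label_changes(before: Set[str], after: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (introduced, lost) taxonomic labels for one family."""
    return after - before, before - after


# Hypothetical toy data, not taken from the thesis.
family_habitats = {
    "fam_001": {"marine"},
    "fam_002": {"marine", "soil"},
    "fam_003": {"freshwater"},
}
unique, shared = split_by_habitat(family_habitats)

introduced, lost = label_changes(
    before={"Caudovirales", "Myoviridae"},
    after={"Caudovirales", "Siphoviridae"},
)
```

Plain set operations keep the comparison independent of any particular clustering or alignment tool, so the same helpers would apply whatever produced the family assignments.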
Main subject category:
Science
Keywords:
metagenomes, phylogenetics, networks, sequence alignment, habitats, protein functions
Index:
Yes
Number of index pages:
5
Contains images:
Yes
Number of references:
143
Number of pages:
94
File:
File access is restricted only to the intranet of UoA.

Thesis_Manuscript_Voutsadaki_Pergamos_.pdf
12 MB