Analysis and classification of constrained DNA elements with N-gram graphs and genomic signatures

Scientific publication - Journal article uoadl:3025033

Unit:
NKUA research material
Title:
Analysis and classification of constrained DNA elements with N-gram graphs and genomic signatures
Document languages:
English
Abstract:
Most common methods for investigating genomic sequence composition are based on the bag-of-words approach and thus largely ignore the original sequence structure or the relative positioning of its constituent oligonucleotides. Here we present a novel methodology that takes into account both word representation and relative positioning at various length scales in the form of n-gram graphs (NGG). We applied the NGG approach to short vertebrate and invertebrate constrained genomic sequences of various origins and predicted functionalities and were able to efficiently distinguish DNA sequences belonging to the same species (intra-species classification). As an alternative method, we also applied the Genomic Signatures (GS) approach to the same sequences. To our knowledge, this is the first time that GS have been applied to short sequences rather than whole genomes. Together, the presented results suggest that NGG is an efficient method for classifying sequences originating from a given genome according to their function. © 2014 Springer International Publishing.
Publication year:
2014
Authors:
Polychronopoulos, D.
Krithara, A.
Nikolaou, C.
Paliouras, G.
Almirantis, Y.
Giannakopoulos, G.
Journal:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher:
Springer-Verlag
Volume:
8542 LNBI
Pages:
220-234
Keywords:
Bioinformatics; Classification (of information); Oligonucleotides, CNEs; Genomic sequence; genomic signatures; n-gram graphs; UCEs, Genes
Official URL (Publisher):
DOI:
10.1007/978-3-319-07953-0_18
The digital material of this item is not available.