CAST: an iterative algorithm for the complexity analysis of sequence tracts

Scientific publication - Journal article uoadl:3053896

Unit:
NKUA research material
Title:
CAST: an iterative algorithm for the complexity analysis of sequence
tracts
Document languages:
English
Abstract:
Motivation: Sensitive detection and masking of low-complexity regions in
protein sequences. Filtered sequences can be used in sequence comparison
without the risk of matching compositionally biased regions. The main
advantage of the method over similar approaches is the selective masking
of single residue types without affecting other, possibly important,
regions.
Results: A novel algorithm for low-complexity region detection and
selective masking. The algorithm is based on multiple-pass
Smith-Waterman comparison of the query sequence against twenty
homopolymers with infinite gap penalties. The output of the algorithm is
both the masked query sequence for further analysis, e.g. database
searches, as well as the regions of low complexity. The detection of
low-complexity regions is highly specific for single residue types. It
is shown that this approach is sufficient for masking database query
sequences without generating false positives. The algorithm is
benchmarked against widely available algorithms using the 210 genes of
Plasmodium falciparum chromosome 2, a dataset known to contain a large
number of low-complexity regions.
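
The iterative procedure described in the abstract can be sketched roughly as follows. With infinite gap penalties, a local alignment of the query against a homopolymer cannot contain gaps, so it reduces to finding the highest-scoring contiguous stretch of the query when every position is scored against a single residue type. The sketch below is an illustration under stated assumptions, not the authors' implementation: the match/mismatch scores, the threshold, the treatment of already-masked positions, and the names score, best_segment and mask_low_complexity are all hypothetical placeholders.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty homopolymer residue types
MASK = "X"                            # masking character (assumption)


def score(query_residue, homopolymer_residue, match=2, mismatch=-1):
    """Toy scoring scheme (assumption): a masked position counts as a mismatch."""
    if query_residue == MASK:
        return mismatch
    return match if query_residue == homopolymer_residue else mismatch


def best_segment(seq, residue):
    """Best ungapped local alignment of seq against a homopolymer of `residue`.

    Returns (score, start, end), with end exclusive. This is a maximum-scoring
    contiguous segment search, which is what a Smith-Waterman comparison
    reduces to when gaps are forbidden and the target is a homopolymer.
    """
    best = (0, 0, 0)
    running, start = 0, 0
    for i, ch in enumerate(seq):
        if running <= 0:              # restart the local alignment here
            running, start = 0, i
        running += score(ch, residue)
        if running > best[0]:
            best = (running, start, i + 1)
    return best


def mask_low_complexity(seq, threshold=10):
    """Repeat passes until no homopolymer comparison reaches `threshold`.

    In each pass, only residues of the detected type inside the best-scoring
    region are masked; other residues in that region are left untouched.
    """
    seq = list(seq)
    regions = []
    while True:
        hits = ((*best_segment(seq, aa), aa) for aa in AMINO_ACIDS)
        s, start, end, aa = max(hits, key=lambda hit: hit[0])
        if s < threshold:
            break
        for i in range(start, end):
            if seq[i] == aa:
                seq[i] = MASK
        regions.append((aa, start, end, s))
    return "".join(seq), regions


masked, regions = mask_low_complexity("MKTAYNNNNNNNNGLVE")
print(masked)   # MKTAYXXXXXXXXGLVE
print(regions)  # [('N', 5, 13, 16)]
```

Each pass masks at least one residue, because a region that reaches a positive threshold must contain a match, so the loop terminates. For a region interrupted by another residue type, such as "QQQQAQQQQ", this sketch masks only the Q positions and keeps the A, which mirrors the selective masking the abstract describes.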
Publication year:
2000
Authors:
Promponas, VJ
Enright, AJ
Tsoka, S
Kreil, DP
Leroy, C
Hamodrakas, S
Sander, C
Ouzounis, CA
Journal:
Bioinformatics
Publisher:
Oxford University Press
Volume:
16
Number / issue:
10
Pages:
915-922
Official URL (Publisher):
DOI:
10.1093/bioinformatics/16.10.915
The digital material of the document is not available.