CAST: an iterative algorithm for the complexity analysis of sequence
tracts

Promponas, VJ; Enright, AJ; Tsoka, S; Kreil, DP; Leroy, C; Hamodrakas, S; Sander, C; Ouzounis, CA

doi:10.1093/bioinformatics/16.10.915

Unit:

NKUA research material

Title:

CAST: an iterative algorithm for the complexity analysis of sequence
tracts

Document languages:

English

Abstract:

Motivation: Sensitive detection and masking of low-complexity regions in
protein sequences. Filtered sequences can be used in sequence comparison
without the risk of matching compositionally biased regions. The main
advantage of the method over similar approaches is the selective masking
of single residue types without affecting other, possibly important,
regions.
Results: A novel algorithm for low-complexity region detection and
selective masking. The algorithm is based on multiple-pass
Smith-Waterman comparison of the query sequence against twenty
homopolymers with infinite gap penalties. The output of the algorithm is
both the masked query sequence for further analysis, e.g. database
searches, as well as the regions of low complexity. The detection of
low-complexity regions is highly specific for single residue types. It
is shown that this approach is sufficient for masking database query
sequences without generating false positives. The algorithm is
benchmarked against widely available algorithms using the 210 genes of
Plasmodium falciparum chromosome 2, a dataset known to contain a large
number of low-complexity regions.
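
The multiple-pass idea described above can be illustrated with a minimal sketch; it is not the authors' implementation. It relies on one observation: when gaps are forbidden (infinite gap penalty), the best local alignment of a query against a homopolymer of residue r is simply the highest-scoring contiguous segment of the query scored against r. The match/mismatch scores, the threshold value, and the names `score`, `best_segment` and `mask_low_complexity` are assumptions made for illustration only; the abstract does not specify a scoring scheme or cutoff.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty homopolymer residue types


def score(a, b, match=2, mismatch=-1):
    # Toy scoring scheme (assumption); a real run would use a substitution matrix.
    return match if a == b else mismatch


def best_segment(seq, residue):
    """Highest-scoring contiguous segment of seq against a homopolymer of residue.

    Equivalent to gapless local alignment; returns (score, start, end), end exclusive.
    """
    best = (0, 0, 0)
    run, start = 0, 0
    for i, c in enumerate(seq):
        if run <= 0:  # restart the segment once the running score is exhausted
            run, start = 0, i
        run += score(c, residue)
        if run > best[0]:
            best = (run, start, i + 1)
    return best


def mask_low_complexity(seq, threshold=10):
    """Repeat passes until no homopolymer scores at or above threshold (assumed > 0)."""
    seq = list(seq)
    regions = []
    while True:
        # One pass: compare the current sequence against all twenty homopolymers.
        s, start, end, residue = max(
            (best_segment(seq, r) + (r,) for r in AMINO_ACIDS),
            key=lambda t: t[0],
        )
        if s < threshold:
            break
        # Selective masking: only the biased residue type is replaced by 'X'.
        for i in range(start, end):
            if seq[i] == residue:
                seq[i] = "X"
        regions.append((residue, start, end, s))
    return "".join(seq), regions


masked, regions = mask_low_complexity("MKTAQQQQQQQQLVD")
print(masked)
print(regions)
```

In this sketch each pass masks at least one residue, so the loop terminates; residues of other types inside a detected region are left untouched, which mirrors the selective masking described in the abstract.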

Publication year:

2000

Authors:

Promponas, VJ
Enright, AJ
Tsoka, S
Kreil, DP
Leroy, C
Hamodrakas, S
Sander, C
Ouzounis, CA

Journal:

BIOINFORMATICS

Publisher:

Oxford University Press

Volume:

Number / issue:

Pages:

915-922