Conservative extraction of over-represented extensible motifs
Author(s) - Alberto Apostolico, Matteo Comin, Laxmi Parida
Publication year - 2005
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bti1051
Subject(s) - motif (music), computer science, syntax, extensibility, structural motif, variety (cybernetics), data science, theoretical computer science, artificial intelligence, biology, programming language, physics, biochemistry, acoustics
The discovery of motifs in biosequences is frequently torn between the rigidity of the model on the one hand and the abundance of candidates on the other. In particular, the variety of motifs described by strings that include 'don't care' (dot) patterns escalates exponentially with the length of the motif, and this only gets worse if a dot is allowed to stretch up to some prescribed maximum length. This circumstance tends to generate daunting computational burdens, and often gives rise to tables that are impossible to visualize and digest. This is unfortunate, as it seems to preclude precisely those large-scale analyses that have become conceivable with the increasing availability of massive genomic and protein data. Although part of the problem is endemic, another part seems rooted in the various characterizations offered for the notion of a motif, which are typically based either on syntax or on statistics alone. It seems worthwhile to consider alternatives that result from a prudent combination of these two aspects in the model.
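The growth in the number of candidates described above can be made concrete with a small counting sketch. This is an illustration only, not the authors' method: the function names, the DNA alphabet size of 4, and the simplified motif model (solid characters at both ends, and gaps modelled as runs of 0 to `max_gap` dots) are all assumptions introduced here.

```python
import re


def count_rigid_motifs(length, alphabet_size=4):
    """Count rigid candidates of a given length (assumed model).

    Each candidate is a string over the alphabet plus the dot character,
    with a solid (non-dot) character required at the first and last position.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if length == 1:
        return alphabet_size
    # Two solid ends, and (alphabet_size + 1) choices for each interior position.
    return alphabet_size ** 2 * (alphabet_size + 1) ** (length - 2)


def count_extensible_motifs(solid, max_gap, alphabet_size=4):
    """Count candidates with `solid` solid characters (assumed model).

    Each of the (solid - 1) gaps between adjacent solid characters
    may hold any number of dots from 0 to max_gap.
    """
    if solid < 1:
        raise ValueError("solid must be at least 1")
    return alphabet_size ** solid * (max_gap + 1) ** (solid - 1)


def extensible_to_regex(solids, max_gap):
    """Build a regex in which adjacent solid characters are separated
    by between 0 and max_gap arbitrary characters."""
    gap = ".{0,%d}" % max_gap
    return re.compile(gap.join(re.escape(c) for c in solids))


if __name__ == "__main__":
    for m in (5, 10, 15):
        print(m, count_rigid_motifs(m))
    for d in (0, 1, 2, 3):
        print(d, count_extensible_motifs(6, d))
    pattern = extensible_to_regex("ACG", 2)
    print(bool(pattern.search("ATTCG")))   # gap of 2 between A and C
    print(bool(pattern.search("ATTTCG")))  # gap of 3 exceeds max_gap
```

Under these assumptions the rigid count grows by a factor of 5 with each added position, and allowing gaps of up to `max_gap` dots multiplies the count by a further `(max_gap + 1)` for every gap, which is the blow-up the abstract refers to.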
