Finding flexible patterns in unaligned protein sequences
Author(s) - Jonassen Inge, Collins John F., Higgins Desmond G.
Publication year - 1995
Publication title - Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1002/pro.5560040817
Subject(s) - ambiguity, class (philosophy), set (abstract data type), measure (data warehouse), identification (biology), computer science, computational biology, conserved sequence, theoretical computer science, biology, mathematics, pattern recognition (psychology), artificial intelligence, genetics, data mining, base sequence, gene, botany, programming language
We present a new method for the identification of conserved patterns in a set of unaligned related protein sequences. It is able to discover patterns of a quite general form, allowing for both ambiguous positions and for variable length wildcard regions. It allows the user to define a class of patterns (e.g., the degree of ambiguity allowed and the length and number of gaps), and the method is then guaranteed to find the conserved patterns in this class scoring highest according to a significance measure defined. Identified patterns may be refined using one of two new algorithms. We present a new (nonstatistical) significance measure for flexible patterns. The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.
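
To make the pattern class concrete, the following minimal sketch shows what a flexible pattern with an ambiguous position and a variable-length wildcard region looks like when matched against unaligned sequences. It is not the authors' discovery algorithm or their significance measure: it only translates a PROSITE-style pattern into a regular expression and counts how many sequences contain a match. The function names `prosite_to_regex` and `support`, the example pattern, and the toy sequences are all hypothetical, chosen for illustration.

```python
import re
from typing import Iterable

# One pattern element: a wildcard "x", a single residue, or an ambiguous
# position such as "[DE]", optionally followed by "(n)" or "(n,m)".
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern, e.g. 'C-x(2,4)-[DE]-H', to a regex."""
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        symbol, low, high = m.groups()
        core = "." if symbol == "x" else symbol  # "x" matches any residue
        if low is None:
            repeat = ""                           # exactly one position
        elif high is None:
            repeat = "{%s}" % low                 # fixed-length region
        else:
            repeat = "{%s,%s}" % (low, high)      # variable-length wildcard
        parts.append(core + repeat)
    return "".join(parts)


def support(pattern: str, sequences: Iterable[str]) -> int:
    """Count the sequences that contain at least one match of the pattern."""
    regex = re.compile(prosite_to_regex(pattern))
    return sum(1 for seq in sequences if regex.search(seq))


if __name__ == "__main__":
    toy_sequences = ["MKCAAEHLL", "TTCGGGGDHQ", "ACAEH"]  # made-up data
    print(prosite_to_regex("C-x(2,4)-[DE]-H"))
    print(support("C-x(2,4)-[DE]-H", toy_sequences))
```

Here `[DE]` is an ambiguous position and `x(2,4)` is a wildcard region of two to four residues, which are the two kinds of flexibility the abstract mentions. A discovery method would search over such patterns rather than test a single given one.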
