Structural analysis based on state‐space modeling
Author(s) - Stultz Collin M., White James V., Smith Temple F.
Publication year - 1993
Publication title - Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1002/pro.5560020302
Subject(s) - sequence (biology), protein secondary structure, algorithm, markov chain, hidden markov model, loop modeling, smoothing, computer science, set (abstract data type), markov model, mathematics, artificial intelligence, protein structure prediction, protein structure, statistics, machine learning, biology, genetics, programming language, biochemistry
Abstract A new method has been developed to compute the probability that each amino acid in a protein sequence is in a particular secondary structural element. Each of these probabilities is computed using the entire sequence and a set of predefined structural class models. This set of structural classes is patterned after Jane Richardson's taxonomy for the domains of globular proteins. For each structural class considered, a mathematical model is constructed to represent constraints on the pattern of secondary structural elements characteristic of that class. These are stochastic models having discrete state spaces (referred to as hidden Markov models by researchers in signal processing and automatic speech recognition). Each model is a mathematical generator of amino acid sequences; the sequence under consideration is modeled as having been generated by one model in the set of candidates. The probability that each model generated the given sequence is computed using a filtering algorithm. The protein is then classified as belonging to the structural class having the most probable model. The secondary structure of the sequence is then analyzed using a “smoothing” algorithm that is optimal for that structural class model. For each residue position in the sequence, the smoother computes the probability that the residue is contained within each of the defined secondary structural elements of the model. This method has two important advantages: (1) the probability of each residue being in each of the modeled secondary structural elements is computed using the totality of the amino acid sequence, and (2) these probabilities are consistent with prior knowledge of realizable domain folds as encoded in each model. As an example of the method's utility, we present its application to flavodoxin, a prototypical α/β protein having a central β-sheet, and to thioredoxin, which belongs to a similar structural class but shares no significant sequence similarity.
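
The filter-then-smooth procedure described in the abstract can be illustrated with a generic scaled forward-backward pass over a discrete hidden Markov model. The sketch below is not the authors' implementation: the two-letter alphabet ("H" for hydrophobic, "P" for polar), the two toy class models, every probability value, the uniform prior over models, and the function names `forward_backward` and `classify` are all assumptions made for illustration.

```python
import math

def forward_backward(seq, init, trans, emit):
    """Scaled forward-backward pass for a discrete HMM.

    Returns (log_likelihood, posteriors), where posteriors[t][s] is the
    probability of state s at position t given the whole sequence.
    Assumes every symbol has non-zero probability under the model.
    """
    n, T = len(init), len(seq)
    alpha, scales = [], []
    # Forward (filtering) pass: normalise at each step and keep the scale.
    for t, sym in enumerate(seq):
        if t == 0:
            a = [init[s] * emit[s][sym] for s in range(n)]
        else:
            prev = alpha[-1]
            a = [emit[s][sym] * sum(prev[r] * trans[r][s] for r in range(n))
                 for s in range(n)]
        c = sum(a)
        alpha.append([x / c for x in a])
        scales.append(c)
    log_likelihood = sum(math.log(c) for c in scales)
    # Backward pass, reusing the forward scales.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        sym = seq[t + 1]
        beta[t] = [sum(trans[s][r] * emit[r][sym] * beta[t + 1][r]
                       for r in range(n)) / scales[t + 1]
                   for s in range(n)]
    # Smoothed per-position state probabilities.
    post = [[alpha[t][s] * beta[t][s] for s in range(n)] for t in range(T)]
    return log_likelihood, post

def classify(seq, models):
    """Posterior probability of each model, assuming a uniform prior."""
    logs = {name: forward_backward(seq, *m)[0] for name, m in models.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(v - top) for name, v in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical toy models: (initial distribution, transitions, emissions).
# State 0 is a structured element (helix or strand), state 1 is a loop.
MODELS = {
    "toy_alpha": ([0.5, 0.5],
                  [[0.9, 0.1], [0.2, 0.8]],
                  [{"H": 0.6, "P": 0.4}, {"H": 0.3, "P": 0.7}]),
    "toy_beta": ([0.5, 0.5],
                 [[0.7, 0.3], [0.4, 0.6]],
                 [{"H": 0.8, "P": 0.2}, {"H": 0.2, "P": 0.8}]),
}

if __name__ == "__main__":
    sequence = "HPPHHPPH"
    model_probs = classify(sequence, MODELS)
    best = max(model_probs, key=model_probs.get)
    _, posteriors = forward_backward(sequence, *MODELS[best])
    print(best, model_probs[best])
    for residue, row in zip(sequence, posteriors):
        print(residue, [round(p, 3) for p in row])
```

Each row of `posteriors` depends on the entire sequence rather than a local window, which corresponds to the first advantage the abstract lists. The transition matrix of each model is where constraints on realizable patterns of secondary structure would be encoded.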
