Discovering structural correlations in α‐helices
Author(s) - Klingler Tod M., Brutlag Douglas L.
Publication year - 1994
Publication title - Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1002/pro.5560031024
Subject(s) - amino acid , sequence (biology) , structural motif , protein structure , sequence logo , protein secondary structure , computational biology , representation (politics) , probabilistic logic , conditional independence , protein structure prediction , computer science , peptide sequence , sequence alignment , artificial intelligence , biology , genetics , biochemistry , politics , gene , political science , law
We have developed a new representation for structural and functional motifs in protein sequences based on correlations between pairs of amino acids and applied it to α‐helical and β‐sheet sequences. Existing probabilistic methods for representing and analyzing protein sequences have traditionally assumed conditional independence of evidence. In other words, amino acids are assumed to have no effect on each other. However, analyses of protein structures have repeatedly demonstrated the importance of interactions between amino acids in conferring both structure and function. Using Bayesian networks, we are able to model the relationships between amino acids at distinct positions in a protein sequence in addition to the amino acid distributions at each position. We have also developed an automated program for discovering sequence correlations using standard statistical tests and validation techniques. In this paper, we test this program on sequences from secondary structure motifs, namely α‐helices and β‐sheets. In each case, the correlations our program discovers correspond well with known physical and chemical interactions between amino acids in structures. Furthermore, we show that, using different chemical alphabets for the amino acids, we discover structural relationships based on the same chemical principle used in constructing the alphabet. This new representation of 3‐dimensional features in protein motifs, such as those arising from structural or functional constraints on the sequence, can be used to improve sequence analysis tools including pattern analysis and database search.
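As a rough illustration of what testing for correlations between pairs of positions can look like, the sketch below computes a Pearson chi-square statistic of independence for every pair of columns in a set of equal-length sequences and ranks the pairs by that statistic. It is a minimal sketch under stated assumptions, not the authors' program: the function names `chi_square_pair` and `rank_pairs` and the data in `toy_helices` are hypothetical, the abstract does not say which statistical tests were used, and no Bayesian network is built here.

```python
from collections import Counter
from itertools import combinations


def chi_square_pair(seqs, i, j):
    """Pearson chi-square statistic for independence of columns i and j.

    Returns (statistic, degrees_of_freedom). Assumes all sequences have
    the same length; this is an illustrative helper, not the paper's code.
    """
    n = len(seqs)
    pair_counts = Counter((s[i], s[j]) for s in seqs)  # observed joint counts
    left = Counter(s[i] for s in seqs)                 # marginal counts, column i
    right = Counter(s[j] for s in seqs)                # marginal counts, column j
    stat = 0.0
    for a in left:
        for b in right:
            expected = left[a] * right[b] / n          # count expected under independence
            observed = pair_counts.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(left) - 1) * (len(right) - 1)
    return stat, dof


def rank_pairs(seqs):
    """Score every pair of columns and sort by descending statistic."""
    length = len(seqs[0])
    scored = []
    for i, j in combinations(range(length), 2):
        stat, dof = chi_square_pair(seqs, i, j)
        scored.append((stat, dof, i, j))
    return sorted(scored, reverse=True)


# Made-up toy data (not from the paper): columns 0 and 4 always hold
# oppositely charged residues (E with K, or K with E).
toy_helices = ["EAALK", "KAALE", "EAGLK", "KAGLE", "EALLK", "KALLE"]

if __name__ == "__main__":
    for stat, dof, i, j in rank_pairs(toy_helices)[:3]:
        print(f"columns {i},{j}: chi2={stat:.2f}, dof={dof}")
```

In the toy data the coupled columns 0 and 4 rank first, while columns that vary independently of each other score zero. A real analysis would also need a significance threshold, a correction for testing many pairs, and validation on held-out sequences, as the abstract's mention of validation techniques suggests.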