Sequence codes for extended conformation: A neighbor‐dependent sequence analysis of loops in proteins
Author(s) - Crasto Chiquito J., Feng Jinan
Publication year - 2001
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/1097-0134(20010215)42:3<399::aid-prot100>3.0.co;2-e
Subject(s) - loop modeling, loop (graph theory), sequence (biology), amino acid residue, protein data bank, peptide sequence, amino acid, protein structure, biology, chemistry, genetics, biochemistry, protein structure prediction, mathematics, combinatorics, gene
We performed an extensive sequence analysis on the loops of proteins. By dividing a loop databank derived from the Protein Data Bank into groups, we analyzed the chemical characteristics and the sequence preferences of loops of different lengths and loops connecting different secondary structures in proteins. We found that a large population of loops in our loop databank (94.4%) is either partially or completely surface‐exposed. A majority of surface loops in proteins are hydrophilic, whereas the chemical characteristics of interior loops are relatively neutral according to Eisenberg's consensus hydrophobicity scale. As a first step in investigating the intrinsic sequence–structure relationship of loop sequences in proteins, we performed a neighbor‐dependent sequence analysis that calculated the effect of the neighboring amino acid type on the loop propensity of residues in loops. This method enhances the statistical significance of residue propensity, thus allowing us to explore the positional preference of amino acids in loops. Our analysis yielded a series of amino acid dyads that showed high preference for loop conformation. The data presented in this study should prove useful for developing potential codes in recognizing loop sequences in proteins. Proteins 2001;42:399–413. © 2001 Wiley‐Liss, Inc.
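The abstract does not give the formula behind the neighbor-dependent analysis, so the following is only a rough sketch of how a dyad loop propensity might be computed. It assumes a common ratio-of-frequencies definition: the frequency of a dyad among loop dyads divided by its frequency among all dyads. The function name `dyad_loop_propensity`, the `"L"` mask character, and the input layout are hypothetical and are not taken from the paper.

```python
from collections import Counter


def dyad_loop_propensity(sequences, loop_masks):
    """Estimate loop propensity for adjacent residue pairs (dyads).

    sequences  : list of one-letter amino acid strings.
    loop_masks : list of strings of equal length, with "L" marking loop
                 residues (hypothetical encoding; any other character
                 counts as non-loop).

    Assumed definition (not confirmed by the source):
        propensity(ab) = (n_loop(ab) / N_loop) / (n_all(ab) / N_all)
    A value above 1 means the dyad is over-represented in loops.
    """
    loop_counts = Counter()
    all_counts = Counter()

    for seq, mask in zip(sequences, loop_masks):
        if len(seq) != len(mask):
            raise ValueError("sequence and mask must have equal length")
        for i in range(len(seq) - 1):
            dyad = seq[i:i + 2]
            all_counts[dyad] += 1
            # Count the dyad as a loop dyad only if both residues are in a loop.
            if mask[i] == "L" and mask[i + 1] == "L":
                loop_counts[dyad] += 1

    total_loop = sum(loop_counts.values())
    total_all = sum(all_counts.values())
    if total_loop == 0:
        return {}

    return {
        dyad: (n / total_loop) / (all_counts[dyad] / total_all)
        for dyad, n in loop_counts.items()
    }


if __name__ == "__main__":
    # Toy input for illustration only; not data from the study.
    toy_sequences = ["GPGA", "AAGP"]
    toy_masks = ["LLL-", "--LL"]
    for dyad, value in sorted(dyad_loop_propensity(toy_sequences, toy_masks).items()):
        print(dyad, round(value, 2))
```

With real data, dyads with small counts would need a significance filter before their propensities could be interpreted; that step is omitted here.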