Protein contact prediction using patterns of correlation
Author(s) - Hamilton Nicholas, Burrage Kevin, Ragan Mark A., Huber Thomas
Publication year - 2004
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.20160
Subject(s) - artificial neural network, correlation, disjoint sets, sequence (biology), protein structure prediction, set (abstract data type), computer science, artificial intelligence, pattern recognition (psychology), algorithm, mathematics, protein structure, biology, combinatorics, genetics, geometry, biochemistry, programming language
Abstract - We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two “windows” of size 5 centered on the residues of interest. While the individual pair‐wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Proteins 2004. © 2004 Wiley‐Liss, Inc.
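As an illustration of the window input and the "best L/k" scoring mentioned in the abstract, the following Python sketch may help. It is not the authors' implementation: it assumes a precomputed L×L matrix of pairwise correlated-mutation scores (the abstract does not say how these are computed), and the function names `window_features` and `top_fraction_accuracy`, the zero padding at the sequence ends, and the restriction to pairs with i < j are all hypothetical choices made here for the example.

```python
import numpy as np


def window_features(corr, i, j, size=5):
    """Return the size*size correlations between all residue pairs drawn
    from two windows of width `size` centered on residues i and j."""
    half = size // 2
    # Assumption: pad with zeros so windows near the sequence ends
    # stay in bounds; the abstract does not say how ends are handled.
    padded = np.pad(corr, half, mode="constant", constant_values=0.0)
    # After padding, original index k sits at k + half, so the window
    # i-half..i+half corresponds to padded rows i..i+size-1.
    block = padded[i:i + size, j:j + size]
    return block.ravel()  # 25 values when size == 5


def top_fraction_accuracy(scores, contacts, divisor):
    """Fraction of true contacts among the best L/divisor scored pairs,
    where L is the sequence length (only pairs with i < j are ranked)."""
    L = scores.shape[0]
    n_pick = max(1, L // divisor)
    iu, ju = np.triu_indices(L, k=1)
    # Rank candidate pairs from highest to lowest predicted score.
    order = np.argsort(scores[iu, ju])[::-1][:n_pick]
    return float(contacts[iu[order], ju[order]].mean())
```

In this sketch, `window_features(corr, i, j)` would supply the 25 correlation inputs for one candidate pair, and `top_fraction_accuracy(scores, contacts, 2)` or `top_fraction_accuracy(scores, contacts, 10)` would correspond to the L/2 and L/10 evaluations, given a matrix of network output scores and a boolean matrix of known contacts.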