The importance of larger data sets for protein secondary structure prediction with neural networks


Author(s) - John-Marc Chandonia, Martin Karplus

Publication year - 1996

Publication title - Protein Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.353

H-Index - 175

eISSN - 1469-896X

pISSN - 0961-8368

DOI - 10.1002/pro.5560050422

Subject(s) - artificial neural network, conjugate gradient method, gradient descent, protein secondary structure, sequence (biology), computer science, algorithm, class (philosophy), data mining, artificial intelligence, chemistry, biochemistry

A neural network algorithm is applied to secondary structure and structural class prediction for a database of 318 nonhomologous protein chains. Significant improvement in accuracy is obtained compared with performance on smaller databases. A systematic study of the effects of network topology shows that, for the larger database, better results are obtained with more units in the hidden layer. In a 32‐fold cross-validated test, secondary structure prediction accuracy is 67.0%, relative to 62.6% obtained previously, without any evolutionary information on the sequence. Introduction of sequence profiles increases this value to 72.9%, suggesting that the two types of information are essentially independent. Tertiary structural class is predicted with 80.2% accuracy, relative to 73.9% obtained previously. The use of a larger database is facilitated by the introduction of a scaled conjugate gradient algorithm for optimizing the neural network. This algorithm is about 10–20 times as fast as the standard steepest descent algorithm.
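The evaluation protocol mentioned in the abstract can be illustrated with a minimal sketch. It assumes that accuracy is the per-residue fraction of correctly assigned states over three classes (helix H, strand E, coil C) and that cross-validation folds are formed by holding out whole chains; the abstract states neither detail explicitly. The function names `q3_accuracy` and `k_fold_splits` are hypothetical and are not taken from the paper.

```python
import random


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state.

    Assumption: states are single characters such as 'H', 'E', 'C'.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(observed) == 0:
        raise ValueError("cannot score an empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


def k_fold_splits(chain_ids, k, seed=0):
    """Yield (train, test) lists of chain ids; each chain is held out exactly once.

    Assumption: folds are built from whole chains, shuffled with a fixed seed.
    """
    ids = list(chain_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled chains round-robin into k folds of near-equal size.
    folds = [ids[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        yield train, test


if __name__ == "__main__":
    # 318 chains in 32 folds gives folds of 9 or 10 chains each.
    splits = list(k_fold_splits(range(318), 32))
    print(len(splits), sorted({len(test) for _, test in splits}))
    print(q3_accuracy("HHEC", "HHCC"))
```

In a full experiment, a network would be trained on each `train` list and scored on the matching `test` list, with the reported accuracy pooled over all 32 held-out folds.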
