Self‐organized neural maps of human protein sequences
Author(s) - Ferrán Edgardo A., Pflugfelder Bernard, Ferrara Pascual
Publication year - 1994
Publication title - Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1002/pro.5560030316
Subject(s) - self-organizing map, artificial neural network, computer science, artificial intelligence, cluster analysis, unsupervised learning, pattern recognition (psychology)
We have recently described a method based on artificial neural networks to cluster protein sequences into families. The network was trained with Kohonen's unsupervised learning algorithm using, as inputs, the matrix patterns derived from the dipeptide composition of the proteins. We present here a large‐scale application of that method to classify the 1,758 human protein sequences longer than 50 amino acids that are stored in the SwissProt database (release 19.0). In the final 2‐dimensional topologically ordered map of 15 × 15 neurons, proteins belonging to known families were associated with the same neuron or with neighboring ones. Also, in an attempt to reduce the time‐consuming learning procedure, we compared 2 learning protocols: one of 500 epochs (100 SUN CPU‐hours [CPU‐h]), and another of 30 epochs (6.7 CPU‐h). A further reduction of learning‐computing time, by a factor of about 3.3, with similar protein clustering results, was achieved using a matrix of 11 × 11 components to represent the sequences. Although network training is time consuming, the classification of a new protein in the final ordered map is very fast (14.6 CPU‐seconds). We also show a comparison between the artificial neural network approach and conventional methods of biosequence analysis.
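
The pipeline summarized in the abstract (dipeptide composition as input, Kohonen's unsupervised learning on a 2‐dimensional grid) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linearly decaying learning rate, the Gaussian neighborhood, the random weight initialization, and the toy sequences are all assumptions made for illustration, and the input is assumed to be the full 20 × 20 dipeptide matrix flattened to 400 components.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dipeptide_vector(seq):
    """Flattened 20 x 20 dipeptide frequency matrix (400 components)."""
    counts = [0.0] * 400
    total = 0
    for a, b in zip(seq, seq[1:]):
        if a in INDEX and b in INDEX:  # skip pairs with non-standard symbols
            counts[INDEX[a] * 20 + INDEX[b]] += 1.0
            total += 1
    return [c / total for c in counts] if total else counts


def best_matching_unit(weights, x):
    """Grid position (row, col) of the neuron whose weights are closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def train_som(vectors, side=15, epochs=30, seed=0):
    """Train a side x side Kohonen map; returns {(row, col): weight list}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() / dim for _ in range(dim)]
        for r in range(side)
        for c in range(side)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                       # assumed linear decay
        radius = max(1.0, (side / 2.0) * (1.0 - frac))  # assumed shrinking radius
        for x in vectors:
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += rate * h * (x[i] - w[i])
    return weights


# Toy usage with hypothetical sequences (not taken from SwissProt).
toy_sequences = ["ACDEFGHIKLMNPQRSTVWY", "ACDEFGHIKLMNPQRSTVWA", "WWWWYYYYWWWWYYYY"]
toy_vectors = [dipeptide_vector(s) for s in toy_sequences]
toy_map = train_som(toy_vectors, side=4, epochs=5)
toy_positions = [best_matching_unit(toy_map, v) for v in toy_vectors]
```

Once the map is trained, classifying a new sequence only requires one `best_matching_unit` lookup, which is why classification is cheap compared with training. If the full input matrix has 20 × 20 = 400 components, an 11 × 11 representation has 121, and 400/121 ≈ 3.3, which would be consistent with the reported reduction in learning time.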
