cnnAlpha: Protein disordered regions prediction by reduced amino acid alphabets and convolutional neural networks

Author(s) - Oberti Mauricio, Vaisman Iosif I.

Publication year - 2020

Publication title - Proteins: Structure, Function, and Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.699

H-Index - 191

eISSN - 1097-0134

pISSN - 0887-3585

DOI - 10.1002/prot.25966

Subject(s) - convolutional neural network, computer science, artificial intelligence, sequence (biology), reduction (mathematics), pattern recognition (psychology), class (philosophy), intrinsically disordered proteins, proteome, machine learning, computational biology, data mining, bioinformatics, chemistry, mathematics, biology, biochemistry, geometry

Intrinsically disordered regions (IDRs) play an important role in key biological processes and are closely related to human diseases. IDRs have great potential to serve as targets for drug discovery, most notably in disordered binding regions. Accurate prediction of IDRs is challenging because their genome-wide occurrence and the low ratio of disordered residues make them difficult targets for traditional classification techniques. Existing computational methods mostly rely on sequence profiles to improve accuracy, which is time-consuming and computationally expensive. This article describes an ab initio, sequence-only prediction method, based on reduced amino acid alphabets and convolutional neural networks (CNNs), that aims to overcome this challenge. We experiment with six different 3-letter reduced alphabets. We argue that the dimensional reduction in the input alphabet facilitates the detection of complex patterns within the sequence by the convolutional step. Experimental results show that our proposed IDR predictor performs as well as or better than other state-of-the-art methods in the same class, achieving an accuracy of 0.76 and an AUC of 0.85 on the publicly available Critical Assessment of protein Structure Prediction dataset (CASP10). Our method is therefore suitable for proteome-wide disorder prediction, yielding accuracy similar to or better than that of existing approaches at a faster speed.
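As a rough illustration of the input reduction described in the abstract, the sketch below maps the 20 standard amino acids onto a 3-letter alphabet, one-hot encodes the result, and slides a single fixed filter over it. The grouping, the letter names `h`/`p`/`c`, the helper functions, and the hand-written filter are all assumptions made for this example; the abstract does not list the six alphabets or the network architecture, and a real CNN would learn its filters from data.

```python
# Hypothetical 3-letter grouping (an assumption for illustration only;
# not one of the six alphabets evaluated in the article).
GROUPS = {
    "h": "AVLIMFWC",  # broadly hydrophobic
    "p": "STNQYGP",   # broadly polar or small
    "c": "DEKRH",     # broadly charged
}

# Lookup table: amino acid letter -> reduced letter.
REDUCE = {aa: letter for letter, members in GROUPS.items() for aa in members}
LETTERS = sorted(GROUPS)  # fixed channel order: ['c', 'h', 'p']


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet.

    Raises KeyError for non-standard residues such as 'X'.
    """
    return "".join(REDUCE[aa] for aa in seq.upper())


def one_hot(reduced):
    """Encode each reduced letter as a 3-element 0/1 vector."""
    return [[1 if ch == letter else 0 for letter in LETTERS] for ch in reduced]


def conv1d(rows, kernel):
    """Slide one filter along the sequence (no padding, stride 1)."""
    width = len(kernel)
    channels = len(kernel[0])
    out = []
    for start in range(len(rows) - width + 1):
        total = 0.0
        for k in range(width):
            for ch in range(channels):
                total += rows[start + k][ch] * kernel[k][ch]
        out.append(total)
    return out


if __name__ == "__main__":
    reduced = reduce_sequence("MKDE")
    encoded = one_hot(reduced)
    # Hand-written filter that responds to two adjacent 'c' residues.
    charged_pair = [[1, 0, 0], [1, 0, 0]]
    print(reduced)
    print(conv1d(encoded, charged_pair))
```

With only three input channels instead of twenty, each filter has far fewer weights per position, which is one plausible reading of how the reduction could make sequence patterns easier for the convolutional step to pick up.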
