Open Access
Ultrafast end-to-end protein structure prediction enables high-throughput exploration of uncharacterized proteins
Author(s) - Shaun M. Kandathil, Joe G. Greener, Andy M. Lau, David T. Jones
Publication year - 2022
Publication title - Proceedings of the National Academy of Sciences of the United States of America
Language(s) - English
Resource type - Journals
eISSN - 1091-6490
pISSN - 0027-8424
DOI - 10.1073/pnas.2113348119
Subject(s) - computer science, preprocessor, convolutional neural network, uniprot, end to end principle, representation (politics), sequence (biology), set (abstract data type), protein structure prediction, artificial intelligence, pattern recognition (psychology), feature (linguistics), algorithm, protein sequencing, computational biology, data mining, protein structure, peptide sequence, biology, genetics, biochemistry, linguistics, philosophy, politics, political science, law, gene, programming language
Significance - We present a deep learning-based predictor of protein tertiary structure that uses only a multiple sequence alignment (MSA) as input. To date, most emphasis has been on the accuracy of such deep learning methods, but here we show that accurate structure prediction is also possible in very short timeframes (a few hundred milliseconds). In our method, the backbone coordinates of the target protein are output directly from the neural network, which makes the predictor extremely fast. As a demonstration, we generated over 1.3 million models of uncharacterized proteins in the BFD, a large sequence database including many metagenomic sequences. Our results showcase the utility of ultrafast and accurate tertiary structure prediction in rapidly exploring the “dark space” of proteins.
