
A general-purpose protein design framework based on mining sequence–structure relationships in known protein structures
Author(s) - Jianfu Zhou, Alexandra E. Panaitiu, Gevorg Grigoryan
Publication year - 2019
Publication title - Proceedings of the National Academy of Sciences of the United States of America
Language(s) - English
Resource type - Journals
eISSN - 1091-6490
pISSN - 0027-8424
DOI - 10.1073/pnas.1908723117
Subject(s) - computer science, protein design, protein structure prediction, protein structure, sequence (biology), protein data bank, computational biology, data mining, bioinformatics, theoretical computer science, biology, genetics, biochemistry
Significance - Evolution has given us proteins that perform amazingly complex tasks in living systems, each molecule appearing “custom-built” for its particular purpose. Protein design seeks to enable the “custom building” of proteins at will, for specific tasks, without waiting for evolution. This is a grand challenge, because how a protein’s 3-dimensional structure and function are encoded in its amino acid sequence is exceedingly difficult to model. In this paper, we argue that sequence–structure encodings can instead be learned directly from proteins of known structure, which enables an approach to design. We are at an exciting time in protein science, where emergent principles inferred from data may allow us to make headway in cases where application of first principles is challenging.