Large-scale design and refinement of stable proteins using sequence-only models

Author(s) - Jedediah M. Singer, Scott Novotney, Devin Strickland, Hugh K. Haddox, Nicholas Leiby, Gabriel J. Rocklin, Cameron M. Chow, Anindya Roy, Asim K. Bera, Francis C. Motta, Longxing Cao, EvaMaria Strauch, Tamuka M. Chidyausiku, Alex Ford, Ethan Ho, Alexander Zaitzeff, Craig O. Mackenzie, Hamed Eramian, Frank DiMaio, Gevorg Grigoryan, Matthew Vaughn, Lance Stewart, David Baker, Eric Klavins

Open Access

Publication year - 2022

Publication title - PLOS ONE

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.99

H-Index - 332

ISSN - 1932-6203

DOI - 10.1371/journal.pone.0265020

Subject(s) - stability (learning theory), sequence (biology), computer science, computational biology, set (abstract data type), function (biology), protein design, amino acid, fidelity, peptide sequence, biological system, protein structure, biology, machine learning, genetics, biochemistry, telecommunications, gene, programming language

Engineered proteins generally must possess a stable structure to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally to find stable ones, which is expensive in both time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.
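As a rough illustration of what "sequence-only" prediction and model-guided refinement could look like in practice, the sketch below one-hot encodes an amino acid sequence and greedily applies single-residue substitutions that raise a predicted score. It is a minimal sketch under stated assumptions, not the authors' method: `one_hot`, `toy_stability_score`, and `refine` are hypothetical names, and the scoring function is a placeholder standing in for a trained neural network.

```python
from typing import Callable

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard residues


def one_hot(sequence: str) -> list[list[int]]:
    """Encode a sequence as one row of 20 indicator values per residue."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [[1 if index[aa] == j else 0 for j in range(len(AMINO_ACIDS))]
            for aa in sequence]


def toy_stability_score(sequence: str) -> float:
    """Hypothetical stand-in for a trained predictor.

    Returns the fraction of hydrophobic residues. This is NOT a real
    stability model; it only gives the search below something to optimize.
    """
    hydrophobic = set("AILMFVW")
    return sum(aa in hydrophobic for aa in sequence) / len(sequence)


def refine(sequence: str,
           score: Callable[[str], float],
           max_steps: int = 3) -> str:
    """Greedy refinement: at each step, apply the single substitution
    that most improves the predicted score; stop when none helps."""
    current = sequence
    for _ in range(max_steps):
        best, best_score = current, score(current)
        for pos in range(len(current)):
            for aa in AMINO_ACIDS:
                if aa == current[pos]:
                    continue
                candidate = current[:pos] + aa + current[pos + 1:]
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
        if best == current:  # no improving substitution was found
            break
        current = best
    return current


if __name__ == "__main__":
    start = "GSGSGS"
    refined = refine(start, toy_stability_score, max_steps=2)
    print(start, "->", refined)
```

Swapping `toy_stability_score` for a real trained predictor would leave the search loop unchanged, which is the main reason the scorer is passed in as a plain callable here.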
