
Analysis of AlphaFold2 for Modeling Structures of Wildtype and Variant Protein Sequences
Author(s) - Anowarul Kabir, Toki Inan, Amarda Shehu
Publication year - 2022
Publication title - EPiC Series in Computing
Language(s) - English
Resource type - Conference proceedings
SCImago Journal Rank - 0.21
H-Index - 7
ISSN - 2398-7340
DOI - 10.29007/5g4v
Subject(s) - computer science , protein structure prediction , sequence (biology) , protein tertiary structure , benchmark (surveying) , mutation , protein design , wild type , protein structure , computational biology , function (biology) , stability (learning theory) , artificial intelligence , biology , genetics , machine learning , mutant , gene , biochemistry , geodesy , geography
ResNet and, more recently, AlphaFold2 have demonstrated that deep neural networks can now predict the tertiary structure of a protein from its amino-acid sequence with high accuracy. This seminal development will allow molecular biology researchers to advance various studies linking sequence, structure, and function. Many studies will undoubtedly focus on the impact of sequence mutations on stability, fold, and function. In this paper, we evaluate the ability of AlphaFold2 to predict accurate tertiary structures of wildtype and mutated sequences of protein molecules. We do so on a benchmark dataset used in mutation modeling studies. Our empirical evaluation uses global and local structure analyses and yields several interesting observations. It shows, for instance, that AlphaFold2 performs similarly on wildtype and variant sequences, and that the placement of the main chain of a protein molecule is highly accurate. However, while AlphaFold2 reports similar confidence in its predictions for wildtype and variant sequences, its placement of side chains is less accurate than its placement of the main chain. Overall, the analysis supports the premise that AlphaFold2-predicted structures can be used in downstream tasks, but that further refinement of these structures may be necessary.
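
The contrast between main-chain and side-chain accuracy can be illustrated with a small sketch. The abstract does not name the metrics the authors used, so the example below assumes root-mean-square deviation (RMSD) after a Kabsch least-squares superposition fitted on main-chain atoms. The function names, the `BACKBONE` atom set, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
import numpy as np

# Assumption: main-chain (backbone) atoms follow standard PDB atom naming.
BACKBONE = {"N", "CA", "C", "O"}


def kabsch_fit(mobile, target):
    """Return (R, t) minimizing sum ||R @ mobile_i + t - target_i||^2."""
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center
    q = target - target_center
    # Cross-covariance matrix and its singular value decomposition.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so that R is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = target_center - rotation @ mobile_center
    return rotation, translation


def rmsd(a, b):
    """Root-mean-square deviation between two (n, 3) coordinate arrays."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))


def main_and_side_chain_rmsd(atom_names, predicted, reference):
    """Superpose on main-chain atoms, then report RMSD for each atom group."""
    is_backbone = np.array([name in BACKBONE for name in atom_names])
    rotation, translation = kabsch_fit(predicted[is_backbone],
                                       reference[is_backbone])
    fitted = predicted @ rotation.T + translation
    return (rmsd(fitted[is_backbone], reference[is_backbone]),
            rmsd(fitted[~is_backbone], reference[~is_backbone]))


# Toy data: three residues with four main-chain and two side-chain atoms each.
names = ["N", "CA", "C", "O", "CB", "CG"] * 3
rng = np.random.default_rng(0)
reference = rng.normal(scale=5.0, size=(len(names), 3))

# Build a "prediction": displace side-chain atoms by 1 unit along x,
# then apply an arbitrary rigid-body rotation and translation.
side_chain = np.array([name not in BACKBONE for name in names])
predicted = reference.copy()
predicted[side_chain] += np.array([1.0, 0.0, 0.0])
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
predicted = predicted @ rot_z.T + np.array([3.0, -2.0, 5.0])

backbone_rmsd, side_chain_rmsd = main_and_side_chain_rmsd(
    names, predicted, reference)
print(f"main-chain RMSD: {backbone_rmsd:.3f}")
print(f"side-chain RMSD: {side_chain_rmsd:.3f}")
```

By construction, the main-chain atoms of this toy "prediction" match the reference exactly after superposition, while every side-chain atom is displaced by one unit. The two values therefore differ in the same direction as the trend the abstract describes, but they say nothing about the magnitudes observed in the study.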