Predicting phenotype from genotype: Improving accuracy through more robust experimental and computational modeling
Author(s) -
Gallion Jonathan,
Koire Amanda,
Katsonis Panagiotis,
Schoenegge AnneMarie,
Bouvier Michel,
Lichtarge Olivier
Publication year - 2017
Publication title -
Human Mutation
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.981
H-Index - 162
eISSN - 1098-1004
pISSN - 1059-7794
DOI - 10.1002/humu.23193
Subject(s) - computational model , experimental data , computation , computer science , function (biology) , scalability , relevance (law) , computational biology , biology , phenotype , affect (linguistics) , computational complexity theory , machine learning , artificial intelligence , algorithm , genetics , statistics , mathematics , database , political science , gene , law , linguistics , philosophy
Computational prediction yields efficient and scalable initial assessments of how variants of unknown significance may affect human health. However, when discrepancies arise between these predictions and direct experimental measurements of functional impact, inaccurate computational predictions are frequently assumed to be the source. Here, we present a methodological analysis indicating that shortcomings in both computational and biological data can contribute to these disagreements. We demonstrate that incomplete assaying of multifunctional proteins can affect the strength of correlations between prediction and experiment; a variant's full impact on function is better quantified by considering multiple assays that probe an ensemble of protein functions. Additionally, many variant predictions are sensitive to protein alignment construction and can be customized to maximize the relevance of predictions to a specific experimental question. We conclude that inconsistencies between computation and experiment can often be attributed to the fact that they do not test identical hypotheses. Aligning the design of the computational input with the design of the experimental output will require cooperation between computational and biological scientists, but will also lead to improved estimations of computational prediction accuracy and a better understanding of the genotype–phenotype relationship.