Improving the theranostics of Mendelian diseases: from ad hoc to evidence‐based tailored thresholds
Author(s) - Simcikova Daniela, Heneberg Petr
Publication year - 2018
Publication title - The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.2018.32.1_supplement.532.16
Subject(s) - epistasis, missense mutation, inference, computational biology, computer science, novelty, machine learning, gene, artificial intelligence, mutation, biology, bioinformatics, genetics, psychology, social psychology
Prediction methods are an integral part of biomedical and biotechnological research, particularly given the recent applications of next‐generation sequencing and ever‐growing databases. Most prediction methods are based on evolutionary sequence information combined with structural features of proteins. We examined how applicable the state‐of‐the‐art prediction methods are. We tested the recently developed method EVmutation, which accounts for epistasis, on a large set of clinically observed variations in genes associated with Mendelian diseases. Our study included 44 genes, 7,178 missense variations with known clinical phenotypes of their carriers, and 221,337 theoretical missense variations. We compared the outputs of EVmutation with those of SNAP2 and PoPMuSiC 2.1. Despite its novelty, EVmutation showed a lack of specificity similar to that of the other methods. This lack of specificity seems to be a common constraint of all recently used prediction methods, even though their predictions are associated with high sensitivity, often close to 100%. Hence, we tailored the default settings of the prediction methods, which appeared to be burdened by a kind of systemic error, in both a general and a gene‐specific manner. We found threshold settings that allowed high specificity and high sensitivity of EVmutation and/or SNAP2 for precise prediction of up to 41% of variations. In addition, we addressed the unsettled criteria for variants of unknown significance (VUS) and the inference of variation conservation from multiple sequence alignments (MSAs). In particular, the effectiveness of prediction methods based on MSAs can be crippled by low variability among the analyzed protein sequences. As a proof of concept, we demonstrated improved prediction of the effects of clinically observed variations in the highly conserved genes AR and PTEN by using a modified MSA‐building approach.
The study confirmed that predictions for variations in enzymatically active proteins and/or in highly conserved domain structures are more precise than predictions for variations within flexible parts of protein chains or in domains without catalytic function, which maintain their function irrespective of their low sequence identity. When we analyzed the distribution of EVmutation and SNAP2 scores among clinically observed and theoretical variations, we identified several large, previously unreported pools of variations that are under negative selection during molecular evolution and are absent in patients; these variations were particularly prominent in G6PD, PTPN11, HNF4A and HBB. The study showed that predicting clinical effects differs from simply predicting effects at the biochemical or molecular level. We thus argue for re‐evaluating the currently used approaches before the field is confused by the vast number of newly developed prediction approaches, all of which fail to cope with the specificity issue unless appropriately modified. This abstract is from the Experimental Biology 2018 Meeting. There is no full text article associated with this abstract published in The FASEB Journal.
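The idea of replacing a default cutoff with an evidence‐based one can be illustrated with a minimal sketch. The abstract does not state how the tailored thresholds were selected, so everything below is an assumption made for illustration: the function names `confusion_rates` and `tailor_threshold` are hypothetical, the toy scores and labels are invented, lower scores are assumed to indicate a damaging variation, and Youden's J (sensitivity + specificity - 1) is swapped in as one common selection criterion rather than the authors' own.

```python
from typing import Optional, Sequence, Tuple


def confusion_rates(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one cutoff.

    Assumption: a variation is called pathogenic when score <= threshold.
    labels[i] is True when the variation is clinically pathogenic.
    """
    tp = fn = tn = fp = 0
    for score, is_pathogenic in zip(scores, labels):
        predicted_pathogenic = score <= threshold
        if is_pathogenic and predicted_pathogenic:
            tp += 1
        elif is_pathogenic:
            fn += 1
        elif predicted_pathogenic:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def tailor_threshold(
    scores: Sequence[float], labels: Sequence[bool]
) -> Optional[Tuple[float, float, float, float]]:
    """Scan every observed score as a candidate cutoff and keep the one
    with the highest Youden's J. Returns (threshold, J, sens, spec)."""
    best = None
    for candidate in sorted(set(scores)):
        sens, spec = confusion_rates(scores, labels, candidate)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (candidate, j, sens, spec)
    return best


# Invented toy data for a single hypothetical gene (not from the study).
scores = [-9.0, -7.5, -6.0, -4.0, -3.5, -2.0, -1.0, -0.5]
labels = [True, True, True, False, True, False, False, False]

# A permissive default cutoff flags everything: high sensitivity, no specificity.
print(confusion_rates(scores, labels, 0.0))
# The tailored cutoff trades some sensitivity for specificity.
print(tailor_threshold(scores, labels))
```

Running the same scan separately on the variations of each gene would give gene‐specific cutoffs, while running it on the pooled data would give a general one; which of the two is preferable depends on how many labeled variations each gene has.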