In silico prediction of deleterious single amino acid polymorphisms from amino acid sequence
Author(s) - Li Shuyan, Xi Lili, Li Jiazhong, Wang Chengqi, Lei Beilei, Shen Yulin, Liu Huanxiang, Yao Xiaojun, Li Biao
Publication year - 2011
Publication title - Journal of Computational Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.907
H-Index - 188
eISSN - 1096-987X
pISSN - 0192-8651
DOI - 10.1002/jcc.21701
Subject(s) - in silico, computational biology, human disease, identification (biology), amino acid, random forest, sequence (biology), amino acid residue, informatics, computer science, peptide sequence, machine learning, biology, genetics, gene, engineering, botany, electrical engineering
The molecular cause of human disease has remained one of the most attractive targets of scientific research for decades. An effective approach to this topic is the analysis and identification of disease‐related amino acid polymorphisms. In this work, we developed a concise and promising method for identifying deleterious amino acid polymorphisms, SeqSubPred, based on 44 features extracted solely from the protein sequence. SeqSubPred achieved surprisingly good predictive ability, with an accuracy of 0.88 and an area under the receiver operating characteristic curve of 0.94, without resorting to homology or evolutionary information, which is frequently used in similar methods and is usually more complex and time‐consuming to obtain. The random forest model underlying SeqSubPred also identified several critical sequence features, and these features offer some interesting insights into the factors affecting human disease‐related amino acid substitutions. The online version of SeqSubPred is available at montana.informatics.indiana.edu/cgi-bin/seqmut/seqsubpred.cgi. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011
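The two figures quoted in the abstract, accuracy and area under the receiver operating characteristic curve, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers it prints have no relation to the SeqSubPred results.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its pairwise-ranking interpretation.

    It equals the probability that a randomly chosen positive sample
    (label 1, e.g. deleterious) scores higher than a randomly chosen
    negative sample (label 0, e.g. neutral); ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = deleterious, 0 = neutral.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(accuracy(labels, scores))  # 0.75
print(roc_auc(labels, scores))   # 0.9375
```

Accuracy depends on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two values are usually reported together.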