In Silico Models to Discriminate Compounds Inducing and Noninducing Toxic Myopathy
Author(s) - Hu Xiaoying, Yan Aixia
Publication year - 2012
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201100067
Subject(s) - myopathy, support vector machine, test set, training set, artificial intelligence, in silico, set (abstract data type), pattern recognition (psychology), chemistry, computer science, biochemistry, medicine, pathology, gene, programming language
Toxic myopathy is a muscular disease in which the muscle fibers do not function, resulting in muscular weakness. Some drugs, such as lipid‐lowering drugs and antihistamines, can cause toxic myopathy. In this work, a dataset of 349 compounds was collected: 232 chemical compounds that induce toxic myopathy (IM‐compounds) and 117 drugs that do not (notIM‐compounds). The dataset was split into a training set of 270 compounds and a test set of 79 compounds. A Kohonen self‐organizing map (SOM) and a support vector machine (SVM) were used to develop classification models that distinguish IM‐compounds from notIM‐compounds. The models were built from descriptors related to polarizability, electronegativity, atomic charge, and H‐bonding, together with atom identity and molecular shape descriptors. The SOM model achieved classification accuracies of 88.4 % on the training set and 88.2 % on the test set; the SVM model achieved 95.6 % on the training set and 86.1 % on the test set. In addition, extended connectivity fingerprints (ECFP_4) were calculated and analyzed to identify important molecular substructures related to toxic myopathy.
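The evaluation protocol in the abstract (one labeled dataset, a 270/79 training/test split, and an accuracy reported for each set) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the split was made, so the stratified random split below is a stand-in, and the names `stratified_split` and `accuracy` are hypothetical helpers, not part of the authors' workflow. No descriptors or models are reproduced here.

```python
import random


def stratified_split(labels, n_train, seed=0):
    """Split indices into train/test, keeping class ratios roughly equal.

    Assumption: the abstract does not describe the splitting method;
    a stratified random split is used purely for illustration.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train = []
    remaining = n_train
    classes = sorted(by_class)
    for i, label in enumerate(classes):
        members = by_class[label]
        rng.shuffle(members)
        if i == len(classes) - 1:
            k = remaining  # last class takes whatever is left
        else:
            k = round(n_train * len(members) / len(labels))
        train.extend(members[:k])
        remaining -= k

    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Class sizes taken from the abstract: 232 IM and 117 notIM compounds.
labels = ["IM"] * 232 + ["notIM"] * 117
train_idx, test_idx = stratified_split(labels, n_train=270)
print(len(train_idx), len(test_idx))  # 270 79
```

With predictions from any classifier, `accuracy` would be called once on the training labels and once on the test labels, which is how the paired figures in the abstract (for example 95.6 % and 86.1 % for the SVM) are to be read.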