Predictive Toxicology Modeling: Protocols for Exploring hERG Classification and Tetrahymena pyriformis End Point Predictions

BoHan Su; Yi-Shu Tu; Emilio Xavier Esposito; Yufeng Jane Tseng

Open Access



Publication year - 2012

Publication title - Journal of Chemical Information and Modeling

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.24

H-Index - 160

eISSN - 1549-960X

pISSN - 1549-9596

DOI - 10.1021/ci300060b

Subject(s) - quantitative structure–activity relationship, support vector machine, Tetrahymena pyriformis, hERG, artificial intelligence, computer science, molecular descriptor, machine learning, test set, data mining, data set, field (mathematics), mathematics, chemistry, biology, Tetrahymena, potassium channel, biochemistry, biophysics, pure mathematics

The inclusion and accessibility of different methodologies for exploring chemical data sets have benefited the field of predictive modeling, particularly Quantitative Structure-Activity Relationship (QSAR) modeling in the chemical sciences. This study discusses the use of contemporary protocols and QSAR modeling methods to model two biomolecular systems that have historically performed poorly with traditional and three-dimensional QSAR methodologies. Herein, we explore, analyze, and discuss the creation of a classification model for the human Ether-a-go-go Related Gene (hERG) potassium channel and a continuous model for Tetrahymena pyriformis (T. pyriformis), using Support Vector Machine (SVM) and Support Vector Regression (SVR), respectively. The models are constructed with three types of molecular descriptors that capture the gross physicochemical features of the compounds: (i) 2D, 2½D, and 3D physical features, (ii) VolSurf-like molecular interaction fields, and (iii) 4D-Fingerprints. The best hERG SVM model achieved 89% accuracy, and the three best SVM models were able to screen a PubChem data set with an accuracy of 97%. The best T. pyriformis model had an R² value of 0.924 for the training set and predicted the continuous end points for two test sets with R² values of 0.832 and 0.620, respectively. These studies demonstrate the predictive ability (for both classification and continuous end points) of QSAR models constructed from curated data sets, biologically relevant molecular descriptors, and Support Vector Machines and Support Vector Regression. The ability of these protocols and methodologies to accommodate large data sets (several thousand compounds) that are chemically diverse and, in the case of classification modeling, unbalanced (one experimental outcome dominates the data set) allows scientists to further explore a remarkable amount of biological and chemical information.
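The abstract reports model quality as classification accuracy and as a coefficient of determination, and it notes that the classification data are unbalanced. The sketch below is not the authors' code. It is a minimal, standard-library illustration of how such figures are commonly computed, assuming the coefficient of determination is defined as 1 - SS_res/SS_tot (some QSAR studies instead report the squared correlation coefficient, and the abstract does not say which was used). The function names and the toy labels are hypothetical and do not come from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted class equals the observed class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the per-class recalls; each class counts equally."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    recalls = []
    for cls in sorted(set(y_true)):
        members = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in members)
        recalls.append(hits / len(members))
    return sum(recalls) / len(recalls)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Toy, made-up labels (NOT data from the study): 9 inactive (0), 1 active (1).
toy_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A trivial "model" that always predicts the dominant class.
toy_pred = [0] * 10

toy_accuracy = accuracy(toy_true, toy_pred)
toy_balanced = balanced_accuracy(toy_true, toy_pred)
```

On this toy set the always-majority predictor scores high on plain accuracy while its balanced accuracy is no better than chance, which is why accuracy figures on unbalanced data sets are often read alongside per-class measures.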
