The One‐Class Classification Approach to Data Description and to Models Applicability Domain
Author(s) - Baskin Igor I., Kireeva Natalia, Varnek Alexandre
Publication year - 2010
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201000063
Subject(s) - quantitative structure–activity relationship, applicability domain, hyperplane, chemical space, computer science, class (philosophy), support vector machine, set (abstract data type), test set, feature vector, kernel (algebra), stability (learning theory), domain (mathematical analysis), cheminformatics, data mining, artificial intelligence, training set, feature (linguistics), machine learning, mathematics, computational chemistry, chemistry, discrete mathematics, mathematical analysis, biochemistry, geometry, programming language, drug discovery, linguistics, philosophy
In this paper, we associate the applicability domain (AD) of QSAR/QSPR models with the region of the input (descriptor) space in which the density of training data points exceeds a certain threshold. It can be shown that the predictive performance of models built on the training set is higher for test compounds inside this high‐density region than for those outside it. Instead of searching for a decision surface separating high‐ and low‐density regions in the input space, the one‐class classification 1‐SVM approach looks for a hyperplane in the associated feature space. Unlike other AD definitions reported in the literature, this approach: (i) is purely “data‐based”, i.e., it assigns the same AD to all models built on the same training set; (ii) provides results that depend only on the initial descriptor pool generated for the training set; and (iii) can be used with a very large number of descriptors, as well as within structured kernel‐based approaches, e.g., chemical graph kernels. The approach has been applied to improve the performance of QSPR models for the stability constants of complexes of organic ligands with alkaline‐earth metals in water.
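The density-threshold definition of the AD given in the abstract can be illustrated with a small sketch. This is not the authors' 1‐SVM procedure: it swaps in a plain Gaussian Parzen-window density estimate in the input space, which is only a simplified stand-in for the hyperplane search in feature space. The function names (`gaussian_density`, `fit_density_ad`, `in_domain`), the bandwidth, the 5% outlier fraction, and the synthetic two-descriptor training set are all assumptions made for illustration.

```python
import math
import random


def gaussian_density(x, train, bandwidth):
    """Parzen-window density estimate at x (normalising constant dropped)."""
    total = 0.0
    for t in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(train)


def fit_density_ad(train, bandwidth=1.0, outlier_fraction=0.05):
    """Pick a density threshold that leaves roughly `outlier_fraction`
    of the training points below it (hypothetical helper)."""
    densities = sorted(gaussian_density(x, train, bandwidth) for x in train)
    cut = int(outlier_fraction * len(densities))
    return densities[cut]


def in_domain(x, train, threshold, bandwidth=1.0):
    """A compound is inside the AD if its estimated density reaches the threshold."""
    return gaussian_density(x, train, bandwidth) >= threshold


# Synthetic "descriptor" vectors standing in for a training set (assumption).
random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
threshold = fit_density_ad(train, bandwidth=0.5, outlier_fraction=0.05)

near = (0.1, -0.2)   # close to the bulk of the training data
far = (8.0, 8.0)     # far from every training point

print(in_domain(near, train, threshold, bandwidth=0.5))
print(in_domain(far, train, threshold, bandwidth=0.5))
```

Because the threshold depends only on the training descriptors, the same AD would apply to every model built on that training set, which mirrors point (i) of the abstract. In practice one would more likely use an existing one-class SVM implementation with a suitable kernel than this hand-rolled estimate.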
