Predicting pKa for Small Molecules on Public and In‐house Datasets Using Fast Prediction Methods Combined with Data Fusion
Author(s) - Kalliokoski Tuomo, Sinervo Kai
Publication year - 2019
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201800163
Subject(s) - fusion , context (archaeology) , sensor fusion , mean squared prediction error , data mining , computer science , molecule , small molecule , chemistry , artificial intelligence , machine learning , paleontology , philosophy , linguistics , biochemistry , organic chemistry , biology
A data fusion approach was investigated in the context of pKa prediction for 391 small molecules derived from a public data source and for 681 compounds from an internal corporate database. Four pKa prediction methods (Simulations Plus ADMET‐Predictor S+pKa, ACD/Labs Percepta Classic, ACD/Labs Percepta GALAS and Epik) were used to predict the most acidic or basic pKa of each compound. With data fusion, the median absolute error for the internal compounds fell from 0.69, the value of the best-performing single model, to 0.50. Beyond improving accuracy, data fusion also enabled predictions for every compound in the dataset, since individual methods failed on some of the molecules.
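The abstract does not state which fusion rule was applied, so the following is only a minimal sketch under an assumption: each compound's fused pKa is taken as the arithmetic mean of whichever methods returned a value. The function names (`fuse`, `median_absolute_error`) and all numbers in the demo are hypothetical and are not taken from the paper. The sketch illustrates the two effects the abstract describes: a fused value exists whenever at least one method succeeds, and accuracy is summarized as a median absolute error.

```python
from statistics import median
from typing import Optional, Sequence


def fuse(predictions: Sequence[Optional[float]]) -> Optional[float]:
    """Fuse per-method pKa predictions for one compound.

    Assumption: fusion is a plain arithmetic mean over the methods that
    produced a value. A failed method is represented by None and skipped.
    """
    available = [p for p in predictions if p is not None]
    if not available:
        return None  # every method failed for this compound
    return sum(available) / len(available)


def median_absolute_error(
    predicted: Sequence[Optional[float]],
    experimental: Sequence[float],
) -> float:
    """Median of |predicted - experimental| over compounds with a prediction."""
    errors = [
        abs(p - e)
        for p, e in zip(predicted, experimental)
        if p is not None
    ]
    return median(errors)


if __name__ == "__main__":
    # Hypothetical predictions from four methods for three compounds.
    # None marks a method that failed on that compound.
    per_compound = [
        [4.0, None, 5.0, 4.5],
        [9.1, 8.7, None, None],
        [None, 3.2, 3.6, 3.4],
    ]
    experimental = [4.4, 9.0, 3.5]  # hypothetical measured pKa values

    fused = [fuse(preds) for preds in per_compound]
    print("fused pKa:", fused)
    print("median absolute error:", median_absolute_error(fused, experimental))
```

Skipping failed methods, rather than discarding the compound, is what lets a fused model cover molecules that any single method would miss. Other fusion rules, such as the median or a weighted mean, would fit the same interface.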