Open Access
Do you know your r²?
Author(s) - Alex Avdeef
Publication year - 2020
Publication title - ADMET and DMPK
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 13
ISSN - 1848-7718
DOI - 10.5599/admet.888
Subject(s) - pearson product moment correlation coefficient , mean squared error , statistics , correlation coefficient , mathematics , correlation , mathematical statistics , computer science , geometry
Predicting the solubility of drugs usually involves several open-source or commercially available computer programs across the various calculation steps. Popular statistics for indicating the strength of a prediction model include the coefficient of determination (r²), Pearson’s linear correlation coefficient (rPearson), and the root-mean-square error (RMSE), among many others. Different programs may calculate these statistics using slightly different definitions. This commentary briefly reviews the definitions of three types of r² and RMSE statistics (model validation, bias compensation, and Pearson) and how the choice of statistical index changes the way systematic errors, arising from shortcomings in solubility prediction models, are indicated. The indices we employed in recently published papers on the prediction of solubility of druglike molecules were unclear, especially for drugs from ‘beyond the Rule of 5’ chemical space, as simple prediction models showed a distinctive ‘bias-tilt’ type of systematic scatter.
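To make the distinction between the three kinds of statistic concrete, the sketch below computes one plausible version of each for a small set of invented numbers. The abstract does not give the formulas, so the definitions here are assumptions: "model validation" is taken as scatter about the identity line (observed = predicted), "bias compensation" as the same scatter after subtracting the mean residual, and "Pearson" as scatter about the best-fit straight line. The function name `fit_statistics` and the toy data are hypothetical and serve only as illustration.

```python
import math


def fit_statistics(observed, predicted):
    """Return three assumed variants of r^2 and RMSE for paired values.

    The three definitions are illustrative assumptions, not formulas
    quoted from the commentary.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Total sum of squares of the observed values about their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)

    # Residuals relative to the identity line (observed = predicted).
    residuals = [y - p for y, p in zip(observed, predicted)]
    bias = sum(residuals) / n

    # Assumed "model validation" form: raw residuals, no correction.
    ss_res_val = sum(e ** 2 for e in residuals)

    # Assumed "bias compensation" form: constant offset removed.
    ss_res_bias = sum((e - bias) ** 2 for e in residuals)

    # Pearson form: squared linear correlation coefficient, which
    # ignores both a constant offset (bias) and a slope error (tilt).
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((y - mean_obs) * (p - mean_pred)
              for y, p in zip(observed, predicted))
    r_pearson = cov / math.sqrt(ss_tot * ss_pred)
    ss_res_line = max(0.0, ss_tot * (1.0 - r_pearson ** 2))

    return {
        "bias": bias,
        "r2_validation": 1.0 - ss_res_val / ss_tot,
        "r2_bias_compensated": 1.0 - ss_res_bias / ss_tot,
        "r2_pearson": r_pearson ** 2,
        "rmse_validation": math.sqrt(ss_res_val / n),
        "rmse_bias_compensated": math.sqrt(ss_res_bias / n),
        "rmse_pearson": math.sqrt(ss_res_line / n),
    }


# Invented data: the predictions are perfectly correlated with the
# observations but carry both an offset and a slope error.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [3.0, 3.5, 4.0, 4.5, 5.0]

for name, value in fit_statistics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

For this toy data the three r² values are 0.25, 0.75, and 1.00, respectively: the same set of predictions looks poor, moderate, or perfect depending on which definition a program applies. Under these assumed definitions, a model with bias and tilt can report a high Pearson r² while still predicting individual values badly.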