Matched Molecular Pair Analysis on Large Melting Point Datasets: A Big Data Perspective
Author(s) - Withnall Michael, Chen Hongming, Tetko Igor V.
Publication year - 2018
Publication title - ChemMedChem
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.817
H-Index - 100
eISSN - 1860-7187
pISSN - 1860-7179
DOI - 10.1002/cmdc.201700303
Subject(s) - set (abstract data type), melting point, chemical space, perspective (graphical), point (geometry), computer science, molecular descriptor, simple (philosophy), data set, data mining, chemistry, mathematics, bioinformatics, biology, artificial intelligence, machine learning, quantitative structure–activity relationship, drug discovery, organic chemistry, philosophy, geometry, epistemology, programming language
A matched molecular pair (MMP) analysis was used to examine the change in melting point (MP) between pairs of similar molecules in a set of ∼275k compounds. We found many cases in which the change in MP (ΔMP) of compounds correlates with changes in functional groups. In line with the results of a previous study, correlations between ΔMP and simple molecular descriptors, such as the number of hydrogen bond donors, were identified. Using a larger dataset that covers a wider chemical space and range of melting points, we observed that the method remains stable and scales well. This MMP‐based method could find use as a simple privacy‐preserving technique to analyze large proprietary databases and share findings between participating research groups.
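As a rough illustration of the idea in the abstract, the sketch below groups matched pairs by their structural transformation and reports the number of pairs and the mean ΔMP for each one. It is not the authors' implementation: the input format, the transformation labels, the function name `summarize_delta_mp`, the `min_count` threshold, and all melting point values are assumptions invented for this example. Pair generation itself (fragmenting molecules to find matched pairs) is assumed to have been done already.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input: each matched pair is (transformation, mp_first, mp_second),
# with melting points in degrees Celsius. These values are invented placeholders,
# not data from the study.
pairs = [
    ("H>>OH", 50.0, 90.0),
    ("H>>OH", 70.0, 100.0),
    ("H>>CH3", 80.0, 75.0),
]

def summarize_delta_mp(pairs, min_count=1):
    """Return {transformation: (pair count, mean delta MP)}.

    Transformations supported by fewer than `min_count` pairs are dropped,
    so only aggregate statistics leave the function, never single compounds.
    """
    deltas = defaultdict(list)
    for transformation, mp_first, mp_second in pairs:
        # Delta MP is the change on going from the first to the second molecule.
        deltas[transformation].append(mp_second - mp_first)
    return {
        t: (len(d), mean(d))
        for t, d in deltas.items()
        if len(d) >= min_count
    }

# Keep only transformations seen in at least two pairs.
summary = summarize_delta_mp(pairs, min_count=2)
```

Sharing only such per-transformation aggregates, rather than structures or individual melting points, is one way to read the privacy-preserving use the abstract suggests. The threshold that would be appropriate for real proprietary data is not stated in the source.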