Open Access
Predicting partition coefficients for the SAMPL7 physical property challenge using the ClassicalGSG method
Author(s) - Nazanin Donyapour, Alex Dickson
Publication year - 2021
Publication title - Journal of Computer-Aided Molecular Design
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.749
H-Index - 101
eISSN - 1573-4951
pISSN - 0920-654X
DOI - 10.1007/s10822-021-00400-x
Subject(s) - force field (fiction) , representation (politics) , molecular dynamics , field (mathematics) , computer science , mean squared error , statistical physics , mathematics , artificial intelligence , computational chemistry , chemistry , physics , statistics , pure mathematics , politics , political science , law
The prediction of log P values is one part of the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) blind challenges. Here, we use a molecular graph representation method called Geometric Scattering for Graphs (GSG) to transform atomic attributes into molecular features. The atomic attributes used here are parameters from classical molecular force fields, including partial charges and Lennard-Jones interaction parameters. The molecular features from GSG are used as inputs to neural networks that are trained using a "master" dataset comprising over 41,000 unique log P values. The specific molecular targets in the SAMPL7 log P prediction challenge were unique in that they all contained a sulfonyl moiety. This motivated a set of ClassicalGSG submissions in which predictors were trained on different subsets of the master dataset, filtered according to chemical types and/or the presence of the sulfonyl moiety. We find that our ranked prediction obtained 5th place with an RMSE of 0.77 log P units and an MAE of 0.62, while one of our non-ranked predictions achieved first place among all submissions with an RMSE of 0.55 and an MAE of 0.44. After the conclusion of the challenge, we also examined the performance of open-source force field parameters that allow for an end-to-end log P predictor model: General AMBER Force Field (GAFF), Universal Force Field (UFF), Merck Molecular Force Field 94 (MMFF94) and Ghemical. We find that ClassicalGSG models trained with atomic attributes from MMFF94 can yield more accurate predictions than those trained with CGenFF atomic attributes.
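
The step of turning per-atom attributes into fixed-length molecular features can be illustrated with a small sketch. The code below is not the authors' ClassicalGSG implementation; it is a minimal NumPy illustration of one common formulation of geometric scattering on graphs, assuming a lazy random walk operator, dyadic diffusion wavelets, and only zeroth- and first-order moments. The function names, the toy three-atom graph, and the attribute values (charge, Lennard-Jones radius, Lennard-Jones well depth) are all hypothetical.

```python
import numpy as np


def lazy_random_walk(adj):
    """Return P = 0.5 * (I + A D^-1) for an undirected graph.

    Assumes every node has at least one neighbour (no isolated atoms).
    """
    deg = adj.sum(axis=0)
    # Broadcasting divides column j of A by deg[j], which is A @ D^-1.
    return 0.5 * (np.eye(adj.shape[0]) + adj / deg)


def scattering_features(adj, attrs, num_scales=3, moments=(1, 2)):
    """Zeroth- and first-order scattering moments of node attributes.

    adj:   (n, n) symmetric adjacency matrix of the molecular graph
    attrs: (n, f) matrix with one row of attributes per atom
    """
    n = adj.shape[0]
    p = lazy_random_walk(adj)

    # Dyadic wavelets: Psi_0 = I - P, Psi_j = P^(2^(j-1)) - P^(2^j), j >= 1.
    wavelets = [np.eye(n) - p]
    power = p
    for _ in range(1, num_scales):
        next_power = power @ power
        wavelets.append(power - next_power)
        power = next_power

    feats = []
    # Zeroth order: moments of the raw attributes, summed over atoms.
    for q in moments:
        feats.append((attrs ** q).sum(axis=0))
    # First order: moments of the absolute wavelet coefficients.
    for psi in wavelets:
        coeff = np.abs(psi @ attrs)
        for q in moments:
            feats.append((coeff ** q).sum(axis=0))
    return np.concatenate(feats)


# Toy example: a three-atom chain with hypothetical attribute values
# (partial charge, Lennard-Jones radius, Lennard-Jones well depth).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
attrs = np.array([[-0.4, 1.8, 0.10],
                  [0.6, 2.0, 0.07],
                  [-0.2, 1.3, 0.03]])
features = scattering_features(adj, attrs)
```

Because each feature is a sum over atoms, the resulting vector has the same length for any molecule and does not depend on atom ordering, which is what makes it usable as a neural network input. Higher-order scattering terms are omitted here for brevity.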
