Open Access
Prediction of Retention Time and Collision Cross Section (CCSH+, CCSH–, and CCSNa+) of Emerging Contaminants Using Multiple Adaptive Regression Splines
Author(s) - Alberto Celma, Richard Bade, J. Sancho, Félix Hernández, Melissa Humphries, Lubertus Bijlsma
Publication year - 2022
Publication title - Journal of Chemical Information and Modeling
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.24
H-Index - 160
eISSN - 1549-960X
pISSN - 1549-9596
DOI - 10.1021/acs.jcim.2c00847
Subject(s) - univariate, multivariate adaptive regression splines, mars exploration program, chemistry, multivariate statistics, data mining, mass spectrometry, standard deviation, linear regression, computer science, statistics, chromatography, mathematics, machine learning, physics, bayesian multivariate linear regression, astronomy
Ultra-high performance liquid chromatography coupled to ion mobility separation and high-resolution mass spectrometry has proven very valuable for screening emerging contaminants in the aquatic environment. However, when applying suspect or nontarget approaches (i.e., when no reference standards are available), there is no information on retention time (RT) and collision cross-section (CCS) values to facilitate identification. In silico prediction tools for RT and CCS can therefore be of great utility in reducing the number of candidates to investigate. In this work, Multiple Adaptive Regression Splines (MARS) were evaluated for the prediction of both RT and CCS. MARS prediction models were developed and validated using a database of 477 protonated molecules, 169 deprotonated molecules, and 249 sodium adducts. Multivariate and univariate models were evaluated, and the univariate models showed a better fit to the experimental data. The RT model (R² = 0.855) showed a deviation between predicted and experimental data of ±2.32 min (95% confidence intervals). The deviation observed for CCS data of protonated molecules using the CCSH model (R² = 0.966) was ±4.05% (95% confidence intervals). The CCSH model was also tested for the prediction of deprotonated molecules, resulting in deviations below ±5.86% for 95% of the cases. Finally, a third model was developed for sodium adducts (CCSNa, R² = 0.954), with deviations below ±5.25% for 95% of the cases. The developed models have been incorporated into an open-access, user-friendly online platform, which allows third-party research laboratories to predict both RT and CCS data.
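
As a rough illustration of the two ideas in the abstract, the sketch below shows how a univariate MARS-style model is evaluated as a sum of hinge functions, and how a statement such as "deviation below ±X% for 95% of the cases" can be computed from predicted and experimental values. Everything here is an assumption made for illustration: the function names, the knots and coefficients, and the data are hypothetical and are not taken from the paper or its online platform, and the nearest-rank percentile is only one possible way to define the 95% bound.

```python
import math


def hinge(x, knot, direction=1):
    """MARS hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a univariate MARS-style model: intercept + sum(coef * hinge)."""
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)


def percent_deviation_bound(predicted, experimental, coverage=0.95):
    """Smallest absolute relative error (in %) that covers `coverage` of the cases."""
    errors = sorted(abs(p - e) / e * 100.0 for p, e in zip(predicted, experimental))
    # Nearest-rank percentile: take the ceil(coverage * n)-th smallest error.
    rank = math.ceil(coverage * len(errors))
    return errors[rank - 1]


# Hypothetical model and data, for illustration only (NOT values from the paper).
INTERCEPT = 140.0
TERMS = [(0.30, 200.0, +1), (-0.10, 400.0, +1)]  # (coefficient, knot, direction)
x_values = [150.0, 250.0, 350.0, 450.0, 550.0]   # a single made-up descriptor
experimental_ccs = [141.0, 154.0, 186.0, 212.0, 229.0]

predicted_ccs = [mars_predict(x, INTERCEPT, TERMS) for x in x_values]
bound = percent_deviation_bound(predicted_ccs, experimental_ccs)
print(f"95% of cases deviate by less than ±{bound:.2f}%")
```

In a real workflow the knots and coefficients would be selected by a MARS fitting procedure rather than written by hand; the sketch only shows how a fitted model is evaluated and how its deviation bound might be summarised.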
