Manifold preprocessing for laser‐induced breakdown spectroscopy under Mars conditions
Author(s) -
Boucher Thomas,
Carey CJ,
Dyar Melinda Darby,
Mahadevan Sridhar,
Clegg Samuel,
Wiens Roger
Publication year - 2015
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.2727
Subject(s) - curse of dimensionality, mars exploration program, laser induced breakdown spectroscopy, preprocessor, linear subspace, exploration of mars, manifold (fluid mechanics), embedding, subspace topology, nonlinear dimensionality reduction, artificial intelligence, partial least squares regression, computer science, algorithm, mathematics, spectroscopy, machine learning, dimensionality reduction, physics, geometry, astrobiology, engineering, mechanical engineering, quantum mechanics
Laser‐induced breakdown spectroscopy (LIBS) is currently being used onboard the Mars Science Laboratory rover Curiosity to predict elemental abundances in dust, rocks, and soils using a partial least squares regression model developed by the ChemCam team. Accuracy of that model is constrained by the number of samples needed in the calibration, which grows exponentially with the dimensionality of the data, a phenomenon known as the curse of dimensionality. LIBS data are very high dimensional, and the number of ground‐truth samples (i.e., standards) recorded with the ChemCam before departing for Mars was small compared with the dimensionality, so strategies to optimize prediction accuracy are needed. In this study, we first use an existing machine learning algorithm, locally linear embedding (LLE), to combat the curse of dimensionality by embedding the data into a low‐dimensional manifold subspace before regressing. LLE constructs its embedding by maintaining local neighborhood distances and discarding large global geodesic distances between samples, in an attempt to preserve the underlying geometric structure of the data. We also introduce a novel supervised version, LLE for regression (LLER), which takes into account the known chemical composition of the training data when embedding. LLER is shown to outperform traditional LLE when predicting most major elements. We show the effectiveness of both algorithms using three different LIBS datasets recorded under Mars‐like conditions. Copyright © 2015 John Wiley & Sons, Ltd.
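The embed-then-regress idea described in the abstract can be illustrated with a minimal NumPy sketch of standard, unsupervised LLE followed by an ordinary least squares fit. This is not the authors' implementation. The function name `lle_embed`, the synthetic "spectra", the neighbor count, the regularization constant, and the choice of least squares as the regressor are all assumptions made for illustration. The supervised variant (LLER) is not shown because the abstract does not specify how it uses the composition data. For simplicity, all samples are embedded together and only the regression is fit on a training subset.

```python
import numpy as np


def lle_embed(X, n_neighbors=8, n_components=2, reg=1e-3):
    """Embed the rows of X with standard (unsupervised) LLE. Illustrative only."""
    n = X.shape[0]

    # Pairwise squared Euclidean distances. Only each point's nearest
    # neighbors are used; larger, global distances are discarded.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each point is approximated as a weighted sum
    # of its neighbors, with the weights constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]   # neighbors centered on x_i
        C = Z @ Z.T             # local Gram matrix, shape (k, k)
        C += reg * max(np.trace(C), 1e-12) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()

    # Low-dimensional coordinates: bottom eigenvectors of (I - W)^T (I - W),
    # skipping the first (near-constant) eigenvector.
    I_W = np.eye(n) - W
    M = I_W.T @ I_W
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    return eigvecs[:, 1:n_components + 1]


# Synthetic stand-in for spectra (an assumption, not LIBS data): one peak
# whose position depends on a hidden parameter t, sampled on 50 channels.
rng = np.random.default_rng(0)
n, n_channels = 120, 50
t = rng.uniform(0.0, 1.0, size=n)
channels = np.linspace(0.0, 1.0, n_channels)
X = np.exp(-((channels[None, :] - t[:, None]) ** 2) / (2 * 0.1 ** 2))
y = 10.0 * t  # hypothetical "abundance" to predict

# Embed all samples, then fit least squares on a training subset only.
Y_emb = lle_embed(X, n_neighbors=8, n_components=2)
train = np.arange(n) < 90
A = np.column_stack([Y_emb, np.ones(n)])  # embedding plus intercept column
coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
pred = A[~train] @ coef

rmse = np.sqrt(np.mean((pred - y[~train]) ** 2))
print(f"held-out RMSE on synthetic data: {rmse:.3f}")
```

The regression here sees only two embedding coordinates instead of 50 channels, which is the sense in which the embedding reduces the number of calibration samples needed. Embedding genuinely new spectra would require an out-of-sample extension, which this sketch omits.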