Classification and calibration of organic matter fluorescence data with multiway analysis methods and artificial neural networks: an operational tool for improved drinking water treatment
Author(s) -
Bieroza Magdalena,
Baker Andy,
Bridgeman John
Publication year - 2011
Publication title -
Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.1045
Subject(s) - calibration , partial least squares regression , artificial neural network , curse of dimensionality , chemometrics , biological system , fluorescence , fluorescence spectroscopy , artificial intelligence , computer science , pattern recognition (psychology) , chemistry , machine learning , mathematics , statistics , biology , physics , quantum mechanics
Abstract Fluorescence spectroscopy enables fast and sensitive analysis of environmental samples containing various organic matter constituents. However, to retrieve valuable information from fluorescence spectra, robust techniques for data analysis should be employed. Here, different multivariate analysis methods and artificial neural networks (ANNs) were applied for decomposition and calibration of fluorescence excitation–emission matrices (EEMs). This is the first paper summarizing the application of different data mining methods, from multiway analysis to ANNs, to the fluorescence EEM technique used to characterize organic matter properties and removal in the field of drinking water treatment. Fluorescence analysis was carried out on municipal water treatment samples of raw and partially‐treated water. The parallel factor analysis (PARAFAC) method and self‐organizing maps were used to analyse EEMs, extract information on the organic matter constituents and reduce the dimensionality of the data to enhance the efficiency of calibration methods. Partial least squares (PLS), multiple linear regression (MLR) and a neural network with back‐propagation were employed for calibration of fluorescence data with actual total organic carbon (TOC) concentrations. All models except PARAFAC‐MLR produced consistent results with correlation coefficient R² = 0.93 for the validation dataset. This is the first such comparative analysis of fluorescence data modelling that clarifies fundamental fluorescence data analysis questions regarding the suitability of different decomposition and calibration methods. Copyright © 2010 John Wiley & Sons, Ltd.
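The decompose-then-calibrate workflow named in the abstract (PARAFAC followed by MLR against TOC) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy is available, uses a plain alternating least squares fit, and runs on synthetic, exactly trilinear EEMs with made-up TOC values. The helper names (`khatri_rao`, `parafac_als`, `design`), the two-component rank, and the 40/20 train/validation split are all hypothetical choices made for illustration.

```python
import numpy as np

def khatri_rao(a, b):
    # Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)
    r = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, r)

def parafac_als(x, rank, n_iter=200, seed=0):
    # Plain alternating least squares for a 3-way array
    # x[sample, excitation, emission]; no constraints, no convergence check.
    rng = np.random.default_rng(seed)
    n, i, j = x.shape
    b = rng.random((i, rank))
    c = rng.random((j, rank))
    x0 = x.reshape(n, i * j)                            # sample-mode unfolding
    x1 = np.transpose(x, (1, 0, 2)).reshape(i, n * j)   # excitation-mode unfolding
    x2 = np.transpose(x, (2, 0, 1)).reshape(j, n * i)   # emission-mode unfolding
    for _ in range(n_iter):
        a = x0 @ np.linalg.pinv(khatri_rao(b, c).T)
        b = x1 @ np.linalg.pinv(khatri_rao(a, c).T)
        c = x2 @ np.linalg.pinv(khatri_rao(a, b).T)
    return a, b, c

# Synthetic data (assumption): two fluorophores with Gaussian-shaped peaks.
rng = np.random.default_rng(1)
n, i, j, rank = 60, 15, 25, 2
ex = np.linspace(0, 1, i)[:, None]
em = np.linspace(0, 1, j)[:, None]
b_true = np.exp(-((ex - np.array([0.3, 0.7])) ** 2) / 0.02)
c_true = np.exp(-((em - np.array([0.4, 0.8])) ** 2) / 0.02)
a_true = rng.random((n, rank))
eems = np.einsum("nr,ir,jr->nij", a_true, b_true, c_true)
eems += 0.01 * rng.standard_normal(eems.shape)
# Made-up "TOC" that depends linearly on the component scores.
toc = a_true @ np.array([2.0, 1.0]) + 0.5 + 0.02 * rng.standard_normal(n)

# Decompose the training EEMs, then project validation EEMs onto the loadings.
a_tr, b, c = parafac_als(eems[:40], rank)
a_val = eems[40:].reshape(20, -1) @ np.linalg.pinv(khatri_rao(b, c).T)

def design(scores):
    # Score columns plus an intercept column for the linear regression
    return np.column_stack([scores, np.ones(len(scores))])

# MLR calibration: TOC regressed on PARAFAC scores, scored on held-out samples.
coef, *_ = np.linalg.lstsq(design(a_tr), toc[:40], rcond=None)
pred = design(a_val) @ coef
resid = toc[40:] - pred
r2 = 1.0 - resid @ resid / np.sum((toc[40:] - toc[40:].mean()) ** 2)
print(f"validation R^2 = {r2:.3f}")
```

Real EEMs would also need preprocessing that this sketch omits, such as scatter removal and a check on the number of components. The validation score here reflects only the idealized synthetic data and says nothing about the results reported in the paper.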