
Regional Forecasting of Fine Particulate Matter Concentrations: A Novel Hybrid Model Based on Principal Component Regression and EOF
Author(s) -
Wu Xianghua,
Xie Kang,
Liu Jane,
Liu Duanyang,
Zhou Jieqin,
Tang Lili
Publication year - 2021
Publication title -
Earth and Space Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.843
H-Index - 23
ISSN - 2333-5084
DOI - 10.1029/2021ea001694
Subject(s) - empirical orthogonal functions , principal component analysis , environmental science , air quality index , econometrics , variance (accounting) , regression , principal component regression , regression analysis , explained variation , statistics , computer science , meteorology , mathematics , geography , accounting , business
Many cities need quantitative air quality forecasts to adjust industrial production plans and urban development, yet building an efficient forecast model remains a challenge. Quantitative prediction of air quality can no longer rely on single‐site observations, so approaches that require fewer input data and are more efficient and more reliable need to be explored. This paper presents the principles and steps of a new hybrid model that combines empirical orthogonal function (EOF) analysis with principal component regression (PCR). The EOF‐PCR approach decomposes the panel data of PM2.5 and its predictors into spatial structures (EOFs) and time expansion coefficients (ECs) by EOF analysis, fits a PCR to the ECs of PM2.5, and simulates the PM2.5 concentration in each city by projecting the fitted ECs back through the EOFs. The core of the new model is the PCR modeling and projection step. Results are presented for PM2.5 concentrations over Jiangsu Province in eastern China. They show that the EOF‐PCR model, which is based on EC1s with a cumulative variance contribution rate above 90%, has an average prediction accuracy of 65%. The model performs best in spring and autumn, less well in summer, and worst in winter. Most predictors have a maximum lag of one week, and they differ considerably among seasons. Because it considers the spatial distributions of the predictors rather than covariates at a single site, the model can reflect regional influences and effectively improve the simulation.
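
The three steps named in the abstract (EOF decomposition, regression on the expansion coefficients, projection back through the EOFs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The data are synthetic, the function names `eof_decompose`, `fit_pcr`, and `predict_pcr` are hypothetical, the EOFs are obtained from a singular value decomposition of the anomaly matrix, and the regression is a plain in-sample least-squares fit with no lagged predictors or seasonal split.

```python
import numpy as np


def eof_decompose(field, n_modes):
    """Split a (time x site) panel into spatial EOFs and time expansion coefficients."""
    mean = field.mean(axis=0)                      # time mean at each site
    anomalies = field - mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                            # spatial patterns, (n_modes, n_sites)
    ecs = u[:, :n_modes] * s[:n_modes]             # expansion coefficients, (n_time, n_modes)
    explained = (s ** 2 / np.sum(s ** 2))[:n_modes]  # variance fraction per mode
    return mean, eofs, ecs, explained


def fit_pcr(pred_ecs, target_ecs):
    """Least-squares regression of the target ECs on the predictor ECs plus an intercept."""
    design = np.column_stack([np.ones(len(pred_ecs)), pred_ecs])
    coef, *_ = np.linalg.lstsq(design, target_ecs, rcond=None)
    return coef


def predict_pcr(coef, pred_ecs):
    """Apply the fitted regression to a set of predictor ECs."""
    design = np.column_stack([np.ones(len(pred_ecs)), pred_ecs])
    return design @ coef


# Synthetic panel (assumption): one shared driver modulates both fields.
rng = np.random.default_rng(0)
n_time, n_sites = 200, 13                          # hypothetical days and cities
driver = rng.normal(size=n_time)
pm_pattern = rng.normal(size=n_sites)
met_pattern = rng.normal(size=n_sites)
pm25 = 50 + 10 * np.outer(driver, pm_pattern) + rng.normal(size=(n_time, n_sites))
predictor = np.outer(driver, met_pattern) + 0.1 * rng.normal(size=(n_time, n_sites))

# Step 1: EOF analysis of the target and predictor panels.
pm_mean, pm_eofs, pm_ecs, pm_var = eof_decompose(pm25, n_modes=1)
_, _, met_ecs, _ = eof_decompose(predictor, n_modes=2)

# Step 2: regress the PM2.5 ECs on the predictor ECs.
coef = fit_pcr(met_ecs, pm_ecs)
fitted_ecs = predict_pcr(coef, met_ecs)

# Step 3: project the fitted ECs through the EOFs to get a value for each site.
pm25_fitted = pm_mean + fitted_ecs @ pm_eofs

print("variance explained by the leading PM2.5 mode:", pm_var[0])
print("fitted panel shape:", pm25_fitted.shape)
```

Because the predictor ECs are mutually orthogonal by construction, the regression avoids the collinearity that raw multi-site covariates would introduce, which is the usual motivation for regressing on principal components.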