Generalized additive models for large data sets
Author(s) - Wood Simon N., Goude Yannig, Shaw Simon
Publication year - 2015
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/rssc.12068
Subject(s) - smoothing, generalized additive model, computer science, grid, spline (mechanical), mathematical optimization, smoothing spline, residual, algorithm, additive model, data set, set (abstract data type), data mining, mathematics, machine learning, artificial intelligence, geometry, structural engineering, bilinear interpolation, engineering, computer vision, programming language, spline interpolation
Summary - We consider an application in electricity grid load prediction, where generalized additive models are appropriate, but where the data set's size can make their use practically intractable with existing methods. We therefore develop practical generalized additive model fitting methods for large data sets in the case in which the smooth terms in the model are represented by using penalized regression splines. The methods use iterative update schemes to obtain factors of the model matrix while requiring only subblocks of the model matrix to be computed at any one time. We show that efficient smoothing parameter estimation can be carried out in a well‐justified manner. The grid load prediction problem requires updates of the model fit, as new data become available, and some means for dealing with residual auto‐correlation in grid load. Methods are provided for these problems and parallel implementation is covered. The methods allow estimation of generalized additive models for large data sets by using modest computer hardware, and the grid load prediction problem illustrates the utility of reduced rank spline smoothing methods for dealing with complex modelling problems.
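The idea of obtaining a factor of the model matrix from subblocks alone can be illustrated with a blockwise QR update. The following is a minimal sketch under stated assumptions, not the authors' implementation: it handles only a single smooth term with Gaussian errors, the piecewise-linear basis (`hat_basis`), the second-difference penalty `S`, the fixed smoothing parameter `lam`, and all function names are hypothetical choices made for illustration, and the smoothing parameter is held fixed rather than estimated. Each pass builds one block of model-matrix rows, stacks it under the current triangular factor `R`, and refactorizes, so the full n-by-p model matrix is never stored.

```python
import numpy as np

def hat_basis(x, p):
    """Piecewise-linear (degree-1 B-spline) basis on [0, 1] with p evenly spaced knots.
    Hypothetical stand-in for a penalized regression spline basis."""
    knots = np.linspace(0.0, 1.0, p)
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)

def blockwise_qr(blocks, p):
    """Accumulate R and f = Q'y from (X_block, y_block) pairs, one block at a time."""
    R = np.zeros((0, p))   # current triangular factor (empty before the first block)
    f = np.zeros(0)        # current Q'y
    yty = 0.0              # running sum of squares of the response
    for X_blk, y_blk in blocks:
        stacked = np.vstack([R, X_blk])      # at most (p + block size) rows in memory
        rhs = np.concatenate([f, y_blk])
        Q, R = np.linalg.qr(stacked, mode="reduced")
        f = Q.T @ rhs
        yty += float(y_blk @ y_blk)
    return R, f, yty

def penalized_fit(R, f, S, lam):
    """Solve (R'R + lam * S) beta = R'f, the penalized least squares normal equations."""
    return np.linalg.solve(R.T @ R + lam * S, R.T @ f)

# Simulated data: only x and y are held in full; model-matrix rows are built per block.
rng = np.random.default_rng(0)
n, p, block_size, lam = 10_000, 12, 500, 5.0
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, n)

D = np.diff(np.eye(p), n=2, axis=0)   # second-difference operator on the coefficients
S = D.T @ D                           # penalty matrix

blocks = ((hat_basis(x[i:i + block_size], p), y[i:i + block_size])
          for i in range(0, n, block_size))
R, f, yty = blockwise_qr(blocks, p)
beta = penalized_fit(R, f, S, lam)

# Residual sum of squares recovered from the accumulated quantities alone.
rss = float(np.sum((f - R @ beta) ** 2) + yty - f @ f)
```

Because `R.T @ R` equals the crossproduct of the full model matrix and `R.T @ f` equals its product with the response, quantities such as `beta` and `rss` can be recomputed for many trial values of `lam` without revisiting the data. The same update also accommodates a new block of observations arriving later: it is stacked under the stored `R` and refactorized in the same way.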