Can multi‐model combination really enhance the prediction skill of probabilistic ensemble forecasts?
Author(s) - Weigel A. P., Liniger M. A., Appenzeller C.
Publication year - 2008
Publication title - Quarterly Journal of the Royal Meteorological Society
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.744
H-Index - 143
eISSN - 1477-870X
pISSN - 0035-9009
DOI - 10.1002/qj.210
Subject(s) - overconfidence effect , probabilistic logic , computer science , forecast skill , econometrics , ensemble forecasting , dispersion (optics) , machine learning , artificial intelligence , statistics , mathematics , physics , optics , psychology , social psychology
The success of multi‐model ensemble combination has been demonstrated in many studies. Given that a multi‐model contains information from all participating models, including the less skilful ones, the question remains as to why, and under what conditions, a multi‐model can outperform the best participating single model. It is the aim of this paper to resolve this apparent paradox. The study is based on a synthetic forecast generator, allowing the generation of perfectly‐calibrated single‐model ensembles of any size and skill. Additionally, the degree of ensemble under‐dispersion (or overconfidence) can be prescribed. Multi‐model ensembles are then constructed from both weighted and unweighted averages of these single‐model ensembles. Applying this toy model, we carry out systematic model‐combination experiments. We evaluate how multi‐model performance depends on the skill and overconfidence of the participating single models. It turns out that multi‐model ensembles can indeed locally outperform a ‘best‐model’ approach, but only if the single‐model ensembles are overconfident. The reason is that multi‐model combination reduces overconfidence, i.e. ensemble spread is widened while average ensemble‐mean error is reduced. This implies a net gain in prediction skill, because probabilistic skill scores penalize overconfidence. Under these conditions, even the addition of an objectively‐poor model can improve multi‐model skill. It seems that simple ensemble inflation methods cannot yield the same skill improvement. Using seasonal near‐surface temperature forecasts from the DEMETER dataset, we show that the conclusions drawn from the toy‐model experiments hold equally in a real multi‐model ensemble prediction system. Copyright © 2008 Royal Meteorological Society
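The mechanism summarized above can be illustrated with a short numerical sketch. The following Python snippet is a minimal toy example, not the authors' actual forecast generator and not the DEMETER data: two overconfident single-model ensembles are drawn around a common predictable signal, pooled without weighting, and scored with the continuous ranked probability score (CRPS, a probabilistic score for which lower is better). All names and parameter values (signal variance, overconfidence levels, ensemble size) are illustrative assumptions. When the single models are overconfident (ensemble spread smaller than the unpredictable ensemble-mean error), the pooled multi-model typically scores better than the better of the two models, consistent with the paper's finding.

```python
# Illustrative toy sketch (assumed setup, not the paper's exact generator):
# overconfident single-model ensembles, unweighted multi-model pooling, CRPS scoring.
import numpy as np

rng = np.random.default_rng(0)

def crps_ensemble(members, obs):
    """Ensemble CRPS estimate: E|X - y| - 0.5*E|X - X'| (lower is better)."""
    members = np.asarray(members, dtype=float)
    term_err = np.mean(np.abs(members - obs))
    term_spread = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term_err - term_spread

def overconfident_forecast(signal, m, beta_sd, spread_sd):
    """One single-model ensemble of size m.

    beta_sd   -- std of the unpredictable ensemble-mean error
    spread_sd -- ensemble spread; overconfident when spread_sd < beta_sd
    """
    beta = rng.normal(0.0, beta_sd)          # error shared by all members
    return signal + beta + rng.normal(0.0, spread_sd, size=m)

n_cases, m = 2000, 9
crps_best, crps_multi = [], []
for _ in range(n_cases):
    signal = rng.normal(0.0, 1.0)            # predictable part of the truth
    obs = signal + rng.normal(0.0, 0.5)      # verifying observation
    f1 = overconfident_forecast(signal, m, beta_sd=0.6, spread_sd=0.3)  # better model
    f2 = overconfident_forecast(signal, m, beta_sd=0.8, spread_sd=0.3)  # poorer model
    crps_best.append(crps_ensemble(f1, obs))
    crps_multi.append(crps_ensemble(np.concatenate([f1, f2]), obs))    # unweighted pooling

print(f"best single model, mean CRPS:  {np.mean(crps_best):.3f}")
print(f"pooled multi-model, mean CRPS: {np.mean(crps_multi):.3f}")
```

Pooling widens the ensemble spread while the independent ensemble-mean errors of the two models partially offset, which is exactly the reduction of overconfidence that the probabilistic score rewards; this is why adding even the poorer model can improve the combined skill in this regime.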
