Enhancing Skill of Medium-Range Forecasts with Machine Learning based Multimodel Superensemble
Abstract
Accurate medium-range weather forecasting is a longstanding challenge due to the chaotic nature of the atmosphere and the imperfect representation of physical processes. Classical Numerical Weather Prediction (NWP) systems are highly skillful but computationally intensive, limiting accessibility for many operational users. Recently, purely data-driven global AI models (e.g., FourCastNet, Pangu-Weather, GraphCast, FuXi) have emerged as efficient alternatives that produce forecasts rapidly on GPUs, but like NWP, they too exhibit persistent systematic biases that vary by region and variable.
This work develops and evaluates a machine-learning based multimodel superensemble (MMSE) that reduces persistent systematic biases and enhances the forecast skill of modern AI weather models.
The approach builds on the superensemble principle of bias-aware and performance-weighted combination of forecasts (rather than equal-weight averaging), adapted here for AI forecasts with interpretable learning and spatiotemporal deep networks that preserve geophysical structure. Ground truth is provided by ERA5 and GPM IMERG, with inputs harmonized at 0.25 degrees and 6-hour intervals.
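The contrast between performance-weighted and equal-weight combination can be illustrated with a minimal sketch of a regression-based superensemble at a single grid point. It is not the thesis's XGBoost or CNN model: it assumes a plain least-squares fit on training-period anomalies, and the function names (`fit_superensemble`, `predict_superensemble`) and all data are hypothetical and synthetic.

```python
import numpy as np

def fit_superensemble(train_fcsts, train_obs):
    """Fit per-model weights on a training period.

    train_fcsts: array (n_models, n_samples); train_obs: array (n_samples,).
    Removing each model's own training mean handles its systematic bias;
    the least-squares weights reward models that track the observations.
    """
    model_means = train_fcsts.mean(axis=1)
    obs_mean = train_obs.mean()
    anomalies = (train_fcsts - model_means[:, None]).T  # (n_samples, n_models)
    weights, *_ = np.linalg.lstsq(anomalies, train_obs - obs_mean, rcond=None)
    return weights, model_means, obs_mean

def predict_superensemble(fcsts, weights, model_means, obs_mean):
    """Combine new forecasts (n_models, n_samples) with the fitted weights."""
    return obs_mean + weights @ (fcsts - model_means[:, None])

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic data (assumption): three "models" with different biases and noise.
rng = np.random.default_rng(0)
truth = rng.normal(size=500)
biases = np.array([1.0, -0.5, 2.0])
noise = np.array([0.3, 0.6, 1.0])
fcsts = truth + biases[:, None] + noise[:, None] * rng.normal(size=(3, 500))

train, test = slice(0, 400), slice(400, 500)
w, model_means, obs_mean = fit_superensemble(fcsts[:, train], truth[train])

rmse_sup = rmse(predict_superensemble(fcsts[:, test], w, model_means, obs_mean), truth[test])
rmse_equal = rmse(fcsts[:, test].mean(axis=0), truth[test])
print(f"weights={np.round(w, 2)}  superensemble RMSE={rmse_sup:.3f}  equal-weight RMSE={rmse_equal:.3f}")
```

In this toy setup the equal-weight mean inherits the average of the model biases, while the fitted combination removes them and gives the least noisy model the largest weight.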
To illustrate its utility, the MMSE is configured for three applications: (i) January 10 m winds over Germany for renewable-energy operations (wind-ramp planning, day-ahead scheduling, grid balancing); (ii) July rainfall over India for core monsoon operations (flood early warning, agricultural decisions); (iii) May 2 m air temperature over India for early detection of heatwave conditions to support public-health advisories.
We first implement a tabular XGBoost-based MMSE (XGB–MMSE), which performs strongly on absolute-error metrics (RMSE) but yields little improvement in the spatial anomaly correlation coefficient (ACC), because flattening the gridded data discards spatial structure. We therefore develop a 3D CNN-based MMSE (CNN–MMSE) that ingests the full spatiotemporal forecast fields, preserving coherent patterns and improving both RMSE and ACC at medium leads. Model interpretability is maintained via SHAP-based explanations, which indicate each contributor’s relative strengths.
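RMSE and ACC measure different aspects of skill, and that difference can be shown with a short sketch. It assumes the common definitions (root-mean-square error, and the uncentered correlation of forecast and observed anomalies about a climatology) with optional cosine-latitude area weights. The exact formulation used in the thesis is not stated in the abstract, and the grid, climatology, and field values below are synthetic.

```python
import numpy as np

def rmse(fcst, obs, weights=None):
    """Area-weighted root-mean-square error over a 2D field."""
    return float(np.sqrt(np.average((fcst - obs) ** 2, weights=weights)))

def acc(fcst, obs, clim, weights=None):
    """Anomaly correlation coefficient: correlation of anomalies about `clim`."""
    fa, oa = fcst - clim, obs - clim
    num = np.average(fa * oa, weights=weights)
    den = np.sqrt(np.average(fa ** 2, weights=weights) * np.average(oa ** 2, weights=weights))
    return float(num / den)

# Synthetic 30 x 40 temperature-like field (assumption, illustration only).
rng = np.random.default_rng(1)
lats = np.linspace(8.0, 37.0, 30)
clim = np.full((30, 40), 300.0)
obs = clim + rng.normal(size=(30, 40))
w = np.broadcast_to(np.cos(np.deg2rad(lats))[:, None], obs.shape)

# A forecast with the right spatial pattern but half the anomaly amplitude.
damped = clim + 0.5 * (obs - clim)
acc_damped = acc(damped, obs, clim, w)
rmse_damped = rmse(damped, obs, w)
print(f"ACC={acc_damped:.3f}  RMSE={rmse_damped:.3f}")
```

The damped forecast has a perfect pattern match (ACC of 1) yet a nonzero RMSE, which is why a correction can improve one metric without improving the other.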
Across all three configurations, the MMSE consistently outperforms its constituent AI models, demonstrating a forecast gain of about 1 day. Here, “forecast gain” denotes the difference in lead time at which the MMSE and a comparison model achieve the same accuracy (e.g., at a fixed RMSE/ACC threshold). With a one-day head start, we can issue advisories sooner, place resources in advance, and improve safety.
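The forecast-gain definition above can be sketched as a small calculation. The example assumes RMSE grows monotonically with lead time and uses linear interpolation to find where each curve crosses a fixed threshold. The curves, the threshold, and the helper name `lead_at_threshold` are hypothetical, and the numbers are constructed so that the gain is one day by design; they are not results from the thesis.

```python
import numpy as np

def lead_at_threshold(lead_days, rmse_curve, threshold):
    """Lead time (days) at which an increasing RMSE curve reaches `threshold`."""
    # np.interp requires increasing x-coordinates, hence the monotonicity assumption.
    return float(np.interp(threshold, rmse_curve, lead_days))

lead_days = np.arange(1, 11, dtype=float)

# Made-up error-growth curves: the MMSE curve is the reference shifted by one day.
rmse_ref = 1.0 + 0.4 * lead_days
rmse_mmse = 1.0 + 0.4 * (lead_days - 1.0)

threshold = 3.0
lead_ref = lead_at_threshold(lead_days, rmse_ref, threshold)
lead_mmse = lead_at_threshold(lead_days, rmse_mmse, threshold)
gain_days = lead_mmse - lead_ref
print(f"reference reaches RMSE {threshold} at day {lead_ref:.1f}, "
      f"MMSE at day {lead_mmse:.1f}, gain = {gain_days:.1f} day(s)")
```

For a skill score that decreases with lead, such as ACC, the same idea applies after reversing the curve so that the interpolation abscissa is increasing.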
Once trained, the AI-powered superensemble produces full 15-day forecasts on a single GPU in minutes. The methodology is modular and can be readily adapted to other regions, seasons, or variables. It can also be extended to include more contributing models, either numerical (NWP) or data-driven, which enables broader adoption in renewable energy, disaster preparedness, and agriculture.
Keywords: multimodel superensemble; AI weather models; medium-range forecasting; XGBoost; 3D CNN; interpretability; Germany; India.