Data-driven regional modelling

Today we are joined by guest co-authors from MET Norway to tell you about ongoing work led by MET Norway with support from ECMWF.

Limited-area modelling (LAM) is a vital tool for regional weather forecasting, providing more precise and localised forecasts than global models. This is achieved by running a model at higher resolution and utilising more local data than can be afforded globally.

Alongside the very exciting results in data-driven global medium-range weather forecasting, there has also been some great work for limited-area modelling with Neural-LAM (Oskarsson et al. 2023). Training on just a few years of data (in contrast to the decades of ERA5), the authors created an accurate 10 km emulator of the MetCoOp Ensemble Prediction System (MEPS). They used a limited-area graph neural network over the Nordic region, forced by a boundary of grid points from a physical model. This approach is an elegant adaptation of the limited-area modelling method to data-driven weather forecasting.
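To make the idea of boundary forcing concrete, here is a minimal sketch of an autoregressive rollout on a small limited-area grid. It assumes that forcing is applied by overwriting the boundary points with values from a host model after every step; this is one simple way to realise the idea and is not taken from the Neural-LAM implementation. The names `boundary_mask`, `apply_boundary`, `rollout` and `toy_step` are hypothetical, and the "model" is a toy function standing in for a trained network.

```python
def boundary_mask(n_rows, n_cols, width):
    """True for grid points within `width` points of the domain edge."""
    return [
        [r < width or r >= n_rows - width or c < width or c >= n_cols - width
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


def apply_boundary(state, host, mask):
    """Replace boundary points of `state` with the host-model values."""
    return [
        [host[r][c] if mask[r][c] else state[r][c] for c in range(len(row))]
        for r, row in enumerate(state)
    ]


def rollout(step_fn, state, host_states, mask):
    """Step the model forward, re-imposing the host boundary after each step."""
    forecast = []
    for host in host_states:
        state = apply_boundary(step_fn(state), host, mask)
        forecast.append(state)
    return forecast


def toy_step(state):
    """Stand-in for a trained emulator: adds 1 to every grid point."""
    return [[value + 1.0 for value in row] for row in state]


n = 4
mask = boundary_mask(n, n, width=1)
initial = [[0.0] * n for _ in range(n)]
# Hypothetical host-model fields: all zeros at both forecast steps.
host_states = [[[0.0] * n for _ in range(n)] for _ in range(2)]

forecast = rollout(toy_step, initial, host_states, mask)
print(forecast[-1])
```

In this toy setup the interior evolves freely under the emulator, while the outermost ring of points always follows the host model.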

In a collaboration between MET Norway and ECMWF, we have adapted the tooling for the global AIFS approach to take advantage of high-resolution data over the Nordics.

In contrast to Neural-LAM, the approach we are currently testing keeps a global model. Over the domain where high-resolution data are available, here the Nordics, a finer-resolution graph is used in the model, to be able to ingest and output data from high-resolution regional analysis, reanalysis or forecast datasets. Throughout the rest of the domain, ERA5 can be used as training data. The graph neural network architecture used in AIFS is well suited for an application like this, where the resolution varies throughout the domain.
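As a rough illustration of what a stretched grid looks like, the sketch below builds a set of points that is coarse globally and fine inside a regional box. It is a simplification under stated assumptions: it uses regular latitude-longitude cell centres rather than the grids used in the actual model, the box in `REGION` is a hypothetical rectangle and not the real regional domain, and the spacings of 1.0 and 0.1 degrees are chosen only to mimic the ~100 km and ~10 km resolutions mentioned in this post.

```python
def cell_centres(lat_min, lat_max, lon_min, lon_max, step):
    """Return (lat, lon) cell centres of a regular grid with spacing `step` degrees."""
    n_lat = round((lat_max - lat_min) / step)
    n_lon = round((lon_max - lon_min) / step)
    return [
        (lat_min + (i + 0.5) * step, lon_min + (j + 0.5) * step)
        for i in range(n_lat)
        for j in range(n_lon)
    ]


# Hypothetical box roughly covering the Nordics (an assumption for illustration).
REGION = (53.0, 73.0, 0.0, 32.0)  # lat_min, lat_max, lon_min, lon_max


def in_region(point, region=REGION):
    """True if the point lies inside the high-resolution box."""
    lat, lon = point
    lat_min, lat_max, lon_min, lon_max = region
    return lat_min <= lat < lat_max and lon_min <= lon < lon_max


coarse = cell_centres(-90.0, 90.0, -180.0, 180.0, step=1.0)  # roughly 100 km
fine = cell_centres(*REGION, step=0.1)                       # roughly 10 km

# Stretched grid: coarse points outside the box, fine points inside it.
stretched = [p for p in coarse if not in_region(p)] + fine

print(len(coarse), len(fine), len(stretched))
```

Even with this small box, the fine points make up about half of the stretched grid, which shows how quickly the regional domain comes to dominate the node count as its resolution increases.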

An advantage of such an approach is that information from far beyond the domain of interest can be utilised if deemed useful by the model (Figure 1). A potential disadvantage is that the whole globe is simulated, even when only a limited area may be of interest. However, for a data-driven model, once trained, this cost is still only a few minutes on a single GPU.
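The way a node near the edge of the regional domain can draw on information from both sides can be sketched with a simple neighbour search. The example below assumes that edges are created between nodes lying within a fixed great-circle radius of each other; this is an illustrative choice, and the graph construction used in the real model may differ. The node positions, the `REGION` box and the 300 km radius are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def neighbours_within(node, nodes, radius_km):
    """All other nodes within `radius_km` of `node`."""
    return [n for n in nodes if n != node and haversine_km(node, n) <= radius_km]


# Hypothetical regional box: lat_min, lat_max, lon_min, lon_max.
REGION = (55.0, 70.0, 5.0, 30.0)


def in_region(point):
    """True if the point lies inside the regional box (edges included)."""
    lat, lon = point
    return REGION[0] <= lat <= REGION[1] and REGION[2] <= lon <= REGION[3]


# Fine nodes just inside the western edge, coarse nodes just outside it.
fine_nodes = [(60.0 + 0.5 * i, 5.0 + 0.5 * j) for i in range(5) for j in range(5)]
coarse_nodes = [(58.0 + 2.0 * i, -3.0 + 2.0 * j) for i in range(4) for j in range(4)]
nodes = fine_nodes + coarse_nodes

example_node = (61.0, 5.0)  # sits on the edge of the regional box
connected = neighbours_within(example_node, nodes, radius_km=300.0)
inside = [n for n in connected if in_region(n)]
outside = [n for n in connected if not in_region(n)]

print(len(inside), "neighbours inside the region,", len(outside), "outside")
```

The example node ends up connected to both fine nodes inside the box and coarse nodes outside it, which is the property that lets weather systems pass information across the boundary.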

We are taking the first steps in this project, but the results are already promising. In Figure 2, which shows the evolution of a forecast, we see that the model successfully creates higher-resolution structure over the domain of interest. Additionally, weather systems appear to move seamlessly from the global to the regional domain and vice versa.

Map showing UK, Ireland, and Nordic countries. Grey dots show the nodes in the hidden mesh in the stretched-grid approach, where a higher-resolution mesh is used in the Nordics and lower resolution elsewhere.

Figure 1: This shows the nodes (grey) in the hidden mesh in the stretched-grid approach, where a higher-resolution mesh is used in the Nordics and lower resolution elsewhere. The example node (yellow) is connected to a set of other nodes (red), both inside and outside the regional domain.

The next steps will be to further explore the optimal model structure for this stretched grid, including the choice of graphs. Beyond that, we will move to 2.5 km resolution, which is the original resolution of the regional model.

Animation showing 7-day forecast of 10m wind speed and sea-level pressure using data-driven stretched-grid approach. The model successfully creates higher-resolution structure over the Nordics.

Figure 2: The animation shows a 7-day forecast of 10-metre wind speed (shading) and sea-level pressure (contours) using the data-driven stretched-grid approach. The model has learned to forecast at high resolution (here ~10 km) inside the Nordic region, and at low resolution (here ~100 km) outside this domain. The model successfully creates higher-resolution structure over the Nordics.

Envisaging a wider use of the tooling supporting the AIFS, we are creating a framework called Anemoi¹. This framework builds on top of PyTorch² and is used to create the AIFS, but it can also create data-driven weather forecasting models at regional or global scale. Users are able to bring datasets and ideas for the architecture configuration, and train their own model for an application, e.g. at a national meteorological service. The framework creates ML-ready datasets from raw GRIB or NetCDF data, configured for efficient training. It supports training global models, including ones with stretched grids like the above, and limited-area models using neural networks (as demonstrated by Neural-LAM). We plan to open-source this toolbox soon, once it is sufficiently mature to allow easy use.

In an ECMWF pilot project on ML, led by MET Norway and MeteoSwiss and with contributions from 13 meteorological organisations across Europe, we will further develop these tools and explore together the role of data-driven models in weather prediction.

Authors

From MET Norway

Thomas Nipen and the rest of the MET Norway AIFS team (Håvard Homleid Haugen, Magnus Sikora Ingstad, Aram Farhad Salihi, Ivar Seierstad, Paulina Tedesco)

And from ECMWF

Matthew Chantry and the rest of the AIFS team (Rilwan Adewoyin, Mihai Alexe, Zied Ben Bouallègue, Mariana Clare, Jesper Dramsch, Sara Hahner, Simon Lang, Christian Lessig, Linus Magnusson, Michael Maier-Gerber, Gert Mertes, Gabriel Moldovan, Ana Prieto Nemesio, Cathal O’Brien, Florian Pinault, Baudouin Raoult, Mario Santa Cruz, Helen Theissen, Steffen Tietsche)

¹ Anemoi were the gods of winds in ancient Greece.

² PyTorch is a popular open-source ML training framework that is used in the AIFS and many other projects across disciplines.