ECMWF Newsletter #173

Progress on emulating the radiation scheme via machine learning

Matthew Chantry, Peter Düben, Robin Hogan (all ECMWF), Peter Ukkonen (Danish Meteorological Institute)

One of the most promising applications of machine learning in weather and climate modelling is the creation of so-called emulators. To build an emulator, the model is run and the inputs and outputs of the chosen model component are stored. These input/output pairs are then used to train a machine learning tool, typically a deep neural network. Once trained, the emulator can be used within the forecast model in place of the original component. Here we describe progress in emulating the radiation scheme used in numerical weather forecasting.
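The workflow can be illustrated with a minimal sketch. Everything in it is a stand-in chosen for illustration: `expensive_scheme` is a hypothetical toy function rather than an IFS component, and a least-squares straight line takes the place of the deep neural network so that the example stays self-contained.

```python
import math

def expensive_scheme(x: float) -> float:
    """Hypothetical stand-in for a costly model component."""
    return 3.0 * x + 0.5 + 0.1 * math.sin(x)

# Step 1: run the "model" and store input/output pairs of the component.
inputs = [i / 99 for i in range(100)]
outputs = [expensive_scheme(x) for x in inputs]

# Step 2: train the emulator on the stored pairs. Here this is a closed-form
# least-squares line; a real emulator would train a neural network instead.
n = len(inputs)
mean_x = sum(inputs) / n
mean_y = sum(outputs) / n
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
variance = sum((x - mean_x) ** 2 for x in inputs)
slope = covariance / variance
intercept = mean_y - slope * mean_x

def emulator(x: float) -> float:
    """Cheap replacement that can be called instead of expensive_scheme."""
    return slope * x + intercept

# Step 3: check how closely the emulator reproduces the original component.
rms_error = math.sqrt(
    sum((emulator(x) - y) ** 2 for x, y in zip(inputs, outputs)) / n
)
print(f"RMS emulation error: {rms_error:.4f}")
```

In practice the stored pairs would be high-dimensional atmospheric profiles and the check would be made on data held back from training.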

Why is emulation useful? For computationally expensive model components, the emulator may be much cheaper than the original component. For instance, emulators could be used to speed up parametrizations that are already used operationally, saving computing time, or to make expensive components such as cloud bin microphysics schemes affordable in an operational forecast. Emulators are also easier to port to heterogeneous hardware because they can rely on existing machine learning libraries. In addition, neural networks can be trained to produce accurate answers at reduced numerical precision, as low as half precision with 16 bits per variable, which provides a further route to better performance.
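To give a feel for what 16 bits per variable means, the sketch below evaluates a single hypothetical network layer twice: once in double precision and once with every stored and intermediate value rounded to IEEE 754 half precision. The weights and inputs are made-up numbers, and the sketch only covers reduced-precision evaluation, not training.

```python
import math
import struct

def to_half(value: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def tiny_layer(x, weights, bias, cast=lambda v: v):
    """One tanh neuron; `cast` is applied to every stored and computed value."""
    acc = cast(bias)
    for xi, wi in zip(x, weights):
        acc = cast(acc + cast(cast(wi) * cast(xi)))
    return cast(math.tanh(acc))

# Hypothetical weights and inputs, chosen only for illustration.
weights = [0.3127, -0.8421, 0.1553]
bias = 0.05
x = [0.9, 0.25, -0.6]

full_precision = tiny_layer(x, weights, bias)
half_precision = tiny_layer(x, weights, bias, cast=to_half)
abs_diff = abs(full_precision - half_precision)

print(f"bytes per half-precision value: {len(struct.pack('<e', 1.0))}")
print(f"double: {full_precision:.6f}  half: {half_precision:.6f}")
print(f"absolute difference: {abs_diff:.2e}")
```

Whether a difference of this size is acceptable depends on the application; the point of training at reduced precision is that the network can learn to stay accurate despite such rounding.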

Towards emulating the radiation scheme

As a first step, emulation of the gravity wave drag parametrization scheme was investigated. Results have shown that a deep-learning emulator can represent the scheme accurately. It can also be used to generate tangent linear and adjoint model code for use within 4D‑Var data assimilation. However, to make the approach viable for operational implementation, a more costly component of the Integrated Forecasting System (IFS) – the ecRad radiation scheme – would need to be emulated. Consequently, tests to emulate the ecRad radiation scheme were performed as part of the MAELSTROM EuroHPC-Joint Undertaking project. Datasets have been developed and published for emulating radiative transfer within the IFS. These datasets can be used to learn the TripleClouds solver, which will be deployed operationally in the IFS in an upcoming cycle, or the more expensive SPARTACUS solver, which incorporates 3D cloud effects.

Changes in temperature error. The charts show the change in root-mean-square (RMS) error of temperature if the emulator is used to represent the TripleClouds radiation scheme in 24-hour and 120-hour forecasts (top), and the same change if McICA, the current operational scheme, is compared with TripleClouds (bottom). The experiments were carried out from 1 June to 31 August 2021. Red indicates where the emulator/McICA forecasts are degraded relative to TripleClouds, blue where they are improved. Cross-hatching indicates at least 95% confidence.
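As a rough illustration of the quantity plotted, the sketch below computes a change in RMS error between two forecasts. The temperatures are invented numbers, not IFS output, and the choice of verifying values (for example an analysis) is an assumption of the sketch. With this sign convention, a positive change corresponds to degradation (red) and a negative change to improvement (blue).

```python
import math

def rmse(forecast, verification):
    """Root-mean-square error of a forecast against verifying values."""
    squared = [(f - v) ** 2 for f, v in zip(forecast, verification)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical temperatures in kelvin at four grid points.
verification = [250.0, 260.0, 270.0, 280.0]
reference_forecast = [251.0, 259.0, 271.0, 279.0]  # e.g. reference scheme
test_forecast = [251.0, 260.0, 270.0, 280.0]       # e.g. emulator

# Positive: test forecast is degraded; negative: test forecast is improved.
change = rmse(test_forecast, verification) - rmse(reference_forecast, verification)
print(f"Change in RMS error: {change:+.2f} K")
```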

Accurate neural network emulators have been created for the TripleClouds solver. Shortwave and longwave processes are solved separately. For the shortwave process, a series of recurrent neural networks, chosen to mimic the existing algorithm, was created by Peter Ukkonen from the Danish Meteorological Institute and has been adopted here. For the longwave process, a convolutional neural network is used. In simulations at about 29 km resolution in which the neural network emulator is coupled to the IFS in place of the TripleClouds solver, we find good agreement with the reference simulation in the troposphere, with only small degradations in the stratosphere above 100 hPa (see the top two figure panels). To put these changes into context, we also plot the impact of using the current solver, McICA, instead of the upcoming TripleClouds solver, which was used to generate the machine learning data (see the bottom two panels).

Ultimately, however, emulators will only be useful in the IFS if they are more efficient or more portable than the conventional model component; if they emulate a version of the parametrization scheme that would otherwise be too expensive for operational use; or if they can be used to generate tangent linear and adjoint model code for data assimilation.

The Infero library was used to run the machine learning models within the code framework of the IFS. So far, tests have been limited to CPU hardware, and the efficiency gains are only about 25%. In the next step, execution on GPUs will be tested, where significant improvements in efficiency are anticipated.

Additional tests

Work has also begun to investigate the emulation of expensive versions of the parametrization scheme. A study in collaboration with the University of Reading has shown that neural networks can represent 3D cloud effects as represented in the SPARTACUS solver. Here, emulators were trained to predict the difference between the default TripleClouds radiation scheme and SPARTACUS. Results have been published, but model evaluation has so far only been done in offline simulations, so further work on the coupling and applications within IFS simulations will be required.
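The idea of learning only the difference between two solvers can be sketched as follows. The two solver functions are hypothetical toy formulas standing in for TripleClouds and SPARTACUS, and a two-parameter least-squares fit stands in for the neural network; none of this reflects the actual schemes or the published emulator.

```python
def cheap_solver(cloud_fraction: float) -> float:
    """Toy stand-in for the cheaper solver (not the real TripleClouds)."""
    return 100.0 * (1.0 - 0.6 * cloud_fraction)

def expensive_solver(cloud_fraction: float) -> float:
    """Toy stand-in for the costlier solver (not the real SPARTACUS)."""
    return cheap_solver(cloud_fraction) + 5.0 * cloud_fraction * (1.0 - cloud_fraction)

# Training targets are the differences between the two solvers.
samples = [i / 20 for i in range(21)]
targets = [expensive_solver(c) - cheap_solver(c) for c in samples]

# Fit correction(c) = a*c + b*c**2 by least squares (2x2 normal equations).
s11 = sum(c ** 2 for c in samples)
s12 = sum(c ** 3 for c in samples)
s22 = sum(c ** 4 for c in samples)
t1 = sum(c * t for c, t in zip(samples, targets))
t2 = sum(c ** 2 * t for c, t in zip(samples, targets))
det = s11 * s22 - s12 ** 2
a = (t1 * s22 - t2 * s12) / det
b = (s11 * t2 - s12 * t1) / det

def corrected_solver(cloud_fraction: float) -> float:
    """Cheap solver plus the learned correction."""
    return cheap_solver(cloud_fraction) + a * cloud_fraction + b * cloud_fraction ** 2

print(f"fitted coefficients: a={a:.3f}, b={b:.3f}")
print(f"corrected: {corrected_solver(0.37):.4f}  expensive: {expensive_solver(0.37):.4f}")
```

One appeal of this design is that the emulator only has to capture a comparatively small correction, while the bulk of the signal still comes from the conventional solver.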

Finally, tests to generate tangent linear and adjoint versions of the neural network emulator will be performed shortly. While it is still unclear whether simulations will stay stable during the 4D-Var minimisation process, the approach is very promising: it would allow data assimilation to use the latest version of the radiation scheme, whereas the current data assimilation is still performed with tangent linear and adjoint model code that was derived for an old version of the radiation scheme.
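To make the terms concrete, the sketch below hand-derives the tangent linear and adjoint of a tiny hypothetical network (two inputs, three tanh hidden units, one output) and checks them with the standard dot-product test and a finite-difference comparison. The weights are made-up numbers; in practice such code would more likely be obtained through the automatic differentiation built into machine learning libraries.

```python
import math

# Hypothetical weights of a tiny network: 2 inputs -> 3 hidden -> 1 output.
W1 = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
B1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -0.5, 0.25]]
B2 = [0.2]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matvec_transposed(matrix, vector):
    rows, cols = len(matrix), len(matrix[0])
    return [sum(matrix[i][j] * vector[i] for i in range(rows)) for j in range(cols)]

def hidden_preactivation(x):
    return [z + b for z, b in zip(matvec(W1, x), B1)]

def forward(x):
    """Nonlinear model: y = W2 tanh(W1 x + b1) + b2."""
    hidden = [math.tanh(z) for z in hidden_preactivation(x)]
    return [y + b for y, b in zip(matvec(W2, hidden), B2)]

def tangent_linear(x, dx):
    """Propagate an input perturbation dx to an output perturbation."""
    slopes = [1.0 - math.tanh(z) ** 2 for z in hidden_preactivation(x)]
    dh = [s * dz for s, dz in zip(slopes, matvec(W1, dx))]
    return matvec(W2, dh)

def adjoint(x, y_hat):
    """Propagate an output sensitivity y_hat back to the inputs."""
    slopes = [1.0 - math.tanh(z) ** 2 for z in hidden_preactivation(x)]
    z_hat = [s * g for s, g in zip(slopes, matvec_transposed(W2, y_hat))]
    return matvec_transposed(W1, z_hat)

x, dx, y_hat = [0.3, -0.7], [0.01, 0.02], [1.5]

# Dot-product test: <TL(dx), y_hat> must equal <dx, AD(y_hat)>.
lhs = sum(a * b for a, b in zip(tangent_linear(x, dx), y_hat))
rhs = sum(a * b for a, b in zip(dx, adjoint(x, y_hat)))
dot_test_gap = abs(lhs - rhs)

# Finite-difference check of the tangent linear model.
eps = 1e-6
x_pert = [xi + eps * di for xi, di in zip(x, dx)]
fd = [(p - q) / eps for p, q in zip(forward(x_pert), forward(x))]
fd_gap = max(abs(f - t) for f, t in zip(fd, tangent_linear(x, dx)))

print(f"dot-product test gap: {dot_test_gap:.2e}")
print(f"finite-difference gap: {fd_gap:.2e}")
```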