Towards sub-seasonal to seasonal forecasts for EFAS

Fredrik Wetterhall

ECMWF is currently developing a sub-seasonal to seasonal (S2S) hydrometeorological forecasting system for the European Flood Awareness System (EFAS) to complement the existing seasonal hydrometeorological forecasts that have been produced operationally since December 2016. The work on EFAS-S2S is part of ECMWF’s role as the computational centre for EFAS, the early warning system for floods of the European Commission’s Copernicus Emergency Management Service (CEMS). The new system will use ECMWF extended-range forecasts, which are issued every Monday and Thursday with a lead time of up to 46 days. As expected, tests have shown that forecast skill is comparable to that of seasonal forecasts when the forecasts are initialised at the same time. The advantage of EFAS-S2S compared to seasonal forecasts lies in the more frequent updates of the hydrological and meteorological initial conditions. The new system is planned to be made operational later this year.

The growing EFAS portfolio

EFAS has been running operationally since 2012. ECMWF’s responsibilities include running the hydrometeorological computations, archiving the data and disseminating the forecasts through the EFAS web interface. EFAS delivers forecasts over many time ranges, from nowcasting flash floods to probabilistic medium-range and seasonal forecasts. The medium-range forecasts are produced twice daily up to 10 days ahead, whereas the seasonal forecasts, forced with ECMWF’s seasonal forecasting system SEAS5, are issued once a month and provide an outlook of up to 8 weeks ahead (Arnal et al., 2018). The main idea of the EFAS-S2S forecasts is to provide more frequent outlooks of the hydrological situation than the seasonal forecasts can provide. This would ideally provide decision-makers with more up-to-date, actionable information at the timescales they require (Wetterhall & Di Giuseppe, 2018).

In the numerical weather prediction community, there is growing interest in the sub-seasonal to seasonal range, loosely defined as the range beyond 15 days up to 2 months. This has manifested itself in many projects and initiatives, including the S2S Prediction Project launched by the World Meteorological Organization in 2013. At the S2S time range, the predictability of meteorological surface variables over Europe is in general quite low, although this depends on the spatial and temporal scales. This limits the use of S2S predictions of such variables for decision-making. However, for many catchment areas the skill of S2S hydrometeorological forecasts, e.g. of river discharge and water levels of rivers and lakes, depends to a considerable extent on hydrological initial conditions, such as snow, soil moisture and groundwater storage. Furthermore, the hydrological time of concentration, meaning the time it takes for water to flow from the most remote point in a watershed to the outlet, can be weeks or even months for the largest river systems. Therefore, the skill of hydrometeorological forecasts is expected to be higher than that of forecasts of meteorological variables for the same areas.

S2S experiment

To test hydrometeorological predictability at the S2S range, an experiment was set up using extended-range hydrometeorological ensemble re-forecasts (hereafter referred to as EXT) covering 20 years up to June 2017. The 11‑member re-forecasts were produced using the latest available operational version of ECMWF’s Integrated Forecasting System (IFS Cycle 43r3) and were issued twice weekly (Mondays and Thursdays) with a lead time of 46 days. The EXT re-forecasts were compared with hydrometeorological re-forecasts forced with ECMWF SEAS5 seasonal re-forecasts and referred to hereafter as SEAS. SEAS5 meteorological re-forecasts cover 36 years (1981–2016), have 25 ensemble members and are initialised on the first of each month, but only re-forecasts from the same period as for the EXT forecasts were used in the experiment. The meteorological re-forecasts used to produce SEAS and EXT were initialised using the ERA-Interim reanalysis. For more details on the forcing data, see Table 1.






Forecast                   Lead time    Horizontal resolution       IFS cycle
Extended-range forecasts   15/46 days   18/36 km (TCo639/TCo319)    IFS Cycle 43r3
SEAS5 forecasts            214 days     36 km                       IFS Cycle 43r1


Table 1 Details of the meteorological data used in the experiment.

The meteorological forcing was run through the hydrological model LISFLOOD (the operational model in EFAS) across the European domain to produce the hydrometeorological re-forecasts. LISFLOOD has been calibrated and set up on a 5x5 km grid across all of mainland Europe. It includes a routing component, which translates the runoff into modelled discharge over the river network. The model also uses static maps to provide information on the soil, vegetation, elevation etc. The system used in this study is the same as in the operational EFAS.

The hydrometeorological re-forecasts were compared against a hydrological reanalysis run using observed precipitation and temperature as forcing. This is referred to as ‘simulations forced with observations (SFO)’ and is used as a proxy for observations. Using simulated discharge as the ground truth means that EXT and SEAS forecast skill is compared without any effects caused by biases in the hydrological model. The scores used were the continuous ranked probability score (CRPS), bias and reliability for the modelled discharge over a selected number of points across the domain where the hydrological model was calibrated, hereafter referred to as ‘outlet points’ (see Mazzetti & Prudhomme, 2018). CRPS was adjusted to account for the difference in ensemble size between EXT and SEAS, in accordance with the method presented in Ferro et al. (2008).
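
The ensemble-size adjustment can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name and arguments are hypothetical, and the formula is one common way of writing an ensemble-size-adjusted CRPS estimator in the spirit of Ferro et al. (2008). The idea is that the spread term of the CRPS is weighted by (1 - 1/M), so the score of an m-member ensemble can be re-expressed as the score expected from an M-member ensemble.

```python
from itertools import permutations


def crps_for_ensemble_size(members, obs, target_size=None):
    """Estimate the CRPS that an ensemble of `target_size` members would score.

    members     : forecast values from the m available members (m >= 2)
    obs         : verifying value (in this article, the SFO discharge)
    target_size : hypothetical ensemble size M; None means infinitely large
    """
    m = len(members)
    if m < 2:
        raise ValueError("need at least two ensemble members")
    # Mean absolute error of the members against the verifying value.
    mae = sum(abs(x - obs) for x in members) / m
    # Unbiased estimate of the expected absolute difference between two
    # members, averaged over all ordered pairs of distinct members.
    spread = sum(abs(a - b) for a, b in permutations(members, 2)) / (m * (m - 1))
    # The spread term is weighted by (1 - 1/M); for M = m this reduces to the
    # usual ensemble CRPS estimator.
    weight = 1.0 if target_size is None else 1.0 - 1.0 / target_size
    return mae - 0.5 * weight * spread
```

Evaluating the 11-member EXT and the 25-member SEAS re-forecasts with the same `target_size` puts the two systems on an equal footing before their scores are compared.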

A CRPS skill score (CRPSS) was calculated against a reference forecast consisting of randomly selected SFO simulations with the same start date as the forecasts but selected from all other years (hereafter referred to as CLIM). This reference forecast has no predictive skill, but it has the advantage of having perfect reliability and being unbiased. Bias was defined as the ensemble mean minus SFO, such that a positive bias means that the forecast is too wet. It is not straightforward to correct for the bias in the forecast, since discharge is quite a complex variable. Attempts to apply bias correction to the forcing meteorological variables have shown promise but are not unproblematic. In this study, forecasts are bias-corrected by multiplying them by a scaling factor that is a function of lead time, month of year and location. The scaling factor was calculated by applying a smoothing filter to the relative mean error of the ensemble mean in comparison with the SFO.
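
The bias definition and the multiplicative correction can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the smoothing filter is assumed to be a centred moving average (the article does not say which filter was used), and the scaling factor is assumed to be 1 / (1 + smoothed relative error). The sketch covers a single location and month; in the study the factor also varies with both.

```python
def mean_bias(ensemble_means, sfo):
    """Mean of (ensemble mean - SFO); a positive value means too wet."""
    return sum(f - o for f, o in zip(ensemble_means, sfo)) / len(sfo)


def relative_mean_error(ensemble_means, sfo):
    """Relative mean error of the ensemble mean with respect to SFO."""
    return (sum(ensemble_means) - sum(sfo)) / sum(sfo)


def smooth(values, half_width=1):
    """Centred moving average (assumed filter); the window shrinks at the ends."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half_width): i + half_width + 1]
        out.append(sum(window) / len(window))
    return out


def scaling_factors(rel_errors_by_lead, half_width=1):
    """One multiplicative factor per lead time from smoothed relative errors."""
    return [1.0 / (1.0 + e) for e in smooth(rel_errors_by_lead, half_width)]
```

With these assumptions, a forecast that is on average 25% too wet at a given lead time receives a factor of 0.8, and the corrected forecast is the raw forecast multiplied by that factor.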

Limit of predictability

To understand the potential for using forecasts in decision-making, it is essential to understand the limit of predictability. Figure 1 shows the CRPSS of discharge for all outlet points across Europe as a function of lead time for EXT and SEAS. Forecast skill is compared against the reference forecast, CLIM. The limit of predictability is here defined as the lead time at which the CRPSS drops below 0.1. As shown in Figure 1, that time is in the region of 25 to 35 days across all locations and seasons. The discharge forecasts are thus much more skilful than precipitation forecasts over the same area. Judging from Figure 1 alone, it might seem that SEAS performs better than EXT. However, when comparing EXT and SEAS for all starting months from January to December separately (Figure 2), the forecasts perform somewhat differently depending on the season. SEAS has higher skill in spring (March to May), whereas EXT has higher skill in late summer and autumn (August to November). The higher skill of EXT in late summer and autumn can be explained by the higher resolution of the first 15 days of the extended forecast, which leads to a better representation of precipitation in EXT in summer and autumn. The difference in skill in spring needs further investigation.
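
The skill score and the limit of predictability as defined above can be written down in a few lines. The sketch below uses the standard definition of a skill score relative to a reference (1 minus the ratio of the forecast score to the reference score); the function names are hypothetical and the input is assumed to be one CRPSS value per lead day.

```python
def crpss(crps_forecast, crps_reference):
    """Skill score: 1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - crps_forecast / crps_reference


def limit_of_predictability(crpss_by_lead_day, threshold=0.1):
    """First lead day (1-based) on which the CRPSS is below the threshold.

    Returns None if the skill stays at or above the threshold throughout.
    """
    for lead_day, score in enumerate(crpss_by_lead_day, start=1):
        if score < threshold:
            return lead_day
    return None
```

Here `crps_reference` would be the CRPS of the CLIM reference forecast, and the default threshold of 0.1 is the one used in this article.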

Figure 1  CRPSS of daily discharge as a function of lead time for EXT and SEAS forecasts for all starting months and evaluation points, with and without bias correction.

Bias and reliability

The relatively sharp decline in CRPSS can to some extent be explained by a bias in both EXT and SEAS forecasts. This is the result of a bias in the underlying meteorological forecast, especially over the winter months, which translates into a bias in discharge. The uncorrected forecast bias in EXT is not spatially consistent: it is negative (too dry) over the Alpine catchments and positive (too wet) in central-eastern Europe. The pattern is similar for SEAS. The dry bias over the Alpine catchments can to some extent be explained by a slight underestimation of precipitation in these areas, which then translates into an underestimation of the river flow. This also explains some of the lower skill in the winter months seen in Figure 2.

Figure 2  Difference in CRPSS of daily discharge of the EXT and SEAS forecasts. Positive values mean that EXT forecasts are better than SEAS forecasts.

Reliability of a forecast is important in terms of its usefulness for decision-making. A reliable forecast can be trusted to predict the correct probability for different outcomes, regardless of the accuracy of the ensemble mean. A strongly unreliable forecast is in practice of no use and can lead to poor decisions. Figure 3 is a reliability diagram, which shows to what extent predicted probabilities are matched by the observed frequency of occurrence when such predictions are made. The closer the lines are to the diagonal, the more reliable the forecast is. The figure shows that both EXT and SEAS are slightly overconfident when it comes to predicting flow at or above the observed median. For example, high probabilities of such an outcome in the forecast are not quite matched by similarly high frequencies of occurrence. This can be attributed to an underestimation of the ensemble spread. The reliability regarding the prediction of low flows (dashed line, Figure 3) indicates an underprediction of low flows in EXT, which can be explained by the wet bias in the lower distribution of precipitation. SEAS performs better than EXT in this regard. High flow predictions are generally not reliable in either system, but EXT performs slightly better than SEAS.

Figure 3  Reliability diagram for EXT and SEAS for week 4 for all outlet points. The solid lines indicate the reliability for discharge at or above the median of observed discharge; the dashed (dotted) lines indicate the reliability for discharge below (above) the 10th (90th) percentile of observed discharge.
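
The points of a reliability diagram can be computed along the following lines. This is a minimal sketch, not the code behind Figure 3: the function name, the equal-width binning and the number of bins are assumptions. Each forecast probability is taken to be the fraction of ensemble members predicting the event (for example, flow at or above the median), and the outcome is whether the event occurred in the SFO reference.

```python
def reliability_curve(forecast_probs, event_occurred, n_bins=5):
    """Return (mean forecast probability, observed frequency, count) per bin.

    forecast_probs : fraction of members predicting the event, in [0, 1]
    event_occurred : True/False, whether the event was seen in the reference
    Empty bins are skipped.
    """
    bins = [[] for _ in range(n_bins)]
    for p, hit in zip(forecast_probs, event_occurred):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, hit))
    curve = []
    for cases in bins:
        if cases:
            mean_p = sum(p for p, _ in cases) / len(cases)
            frequency = sum(1 for _, hit in cases if hit) / len(cases)
            curve.append((mean_p, frequency, len(cases)))
    return curve
```

A point with a high mean forecast probability but a clearly lower observed frequency lies below the diagonal, which is the signature of the overconfidence described above.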

Towards an actionable forecast

Since EXT and SEAS are comparable in performance, the main justification for the use of EXT in an operational context lies in the time gain in a response situation. More frequent forecast updates are potentially useful in decision-making. As an example, we analysed the predicted low flow for the river Rhine at a station just upstream of Cologne, Germany, during the European heatwave in the summer of 2003. This was an exceptional meteorological event, which combined significant precipitation deficits with record-breaking high temperatures. At its peak in August, extremely low discharge levels of rivers were reported in large parts of Europe. For several months, inland navigation was severely disrupted and shipping on the Danube and the Rhine came to a complete halt.

Despite the fact that, in 2003, conditions were extremely unusual from a climatological point of view, the upcoming deficit in precipitation and the high temperatures are well predicted by SEAS5 seasonal re-forecasts. The good predictability of the event is confirmed by the low discharge prediction provided by SEAS for the Rhine upstream of Cologne (Figure 4). More than 30% of the ensemble members predict extreme low-flow conditions. In fact, the observed discharge confirms that the river flow on two separate occasions, from mid‑ to late August (with a short interruption) and from mid‑ to late September, went below the 3rd percentile of the climatological distribution for the season (Figure 4). While most SEAS ensemble members predict the extreme conditions two to four weeks ahead, there is only a weak indication of the recovery period observed between the two events in the forecast starting on 1 August. A more detailed picture of this temporary recovery is conveyed by the EXT forecasts. Thanks to the more frequent updates, there are indications of a temporary increase in river flow, giving a potential advantage of two to three weeks for planning actions. SEAS does indicate the second low flow but underestimates the severity of the event. EXT gives a much more detailed forecast of the two events.

Figure 4  Percentage of ensemble members predicting a low river discharge anomaly (lower than the 3rd percentile of the climatological distribution) on the river Rhine at a location north of Cologne during August and September 2003 for (a) SEAS and (b) EXT, for different starting dates. River discharge below the 3rd percentile was observed during three periods in August and September 2003, as indicated.
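
The quantity plotted in Figure 4 can be sketched as follows. The function names are hypothetical, and the way the climatological percentile was estimated in the study is not described in the article; linear interpolation between sorted values is assumed here.

```python
def climatological_percentile(climatology, percent):
    """Percentile of a climatological sample, with linear interpolation."""
    values = sorted(climatology)
    position = (len(values) - 1) * percent / 100.0
    lower = int(position)
    upper = min(lower + 1, len(values) - 1)
    fraction = position - lower
    return values[lower] + fraction * (values[upper] - values[lower])


def percent_members_below(members, threshold):
    """Percentage of ensemble members with discharge below the threshold."""
    return 100.0 * sum(1 for q in members if q < threshold) / len(members)
```

For each start date and verification time, the threshold would be the 3rd percentile of the climatological distribution for the season, and the members would be the discharge values of the SEAS or EXT ensemble.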

Even though the SEAS forecast is reasonable, the information it provides is indicative (an anomaly condition) rather than actionable. In the above example, a decision-maker would have to make a decision based on a forecast that was issued 2.5 weeks earlier, which would inherently make the decision rather uncertain if they only had the seasonal forecast to go by. With a more frequently updated system, such as EXT, a decision-maker would gain the same early indication of a hazardous event and have the benefit of more frequent updates. In this particular case, the EXT forecast for the first event is more unstable for some ensemble members, but in general the event is well captured. EXT is also able to give an indication of the recovery with higher water levels between the extreme low-flow events. The onset of the second low-flow period is correctly predicted by the EXT system about a week in advance, whereas this event is not well predicted by SEAS (Figure 5). Similar results are obtained when using different thresholds, for example below the 10th or 5th percentile (not shown).

Figure 5  Weekly forecast plumes for the event in September 2003 by (a) SEAS initialised on 1 September, (b) EXT initialised on 4 September, (c) EXT initialised on 7 September and (d) EXT initialised on 11 September.

Going forward

The example given above highlights the potential for the use of sub-seasonal to seasonal forecasts in the case of an extreme low-flow situation on the river Rhine. The higher frequency of EXT means that these forecasts are more actionable than seasonal forecasts. However, care should be taken when using the forecasts in decision-making since their reliability over Europe is only “marginally useful” (Weisheimer & Palmer, 2014). It is therefore important to assess the reliability and skill of the forecasts at a given location and over the season of interest.

EXT and SEAS used very similar versions of the IFS, and they were both initialised using the same reanalysis. The results indicate that in these conditions they are very similar in skill despite some small differences in performance depending on the season and area. However, the system that produces ECMWF’s operational ensemble forecasts is updated more frequently than the seasonal forecasting system, so it is expected that the skill of EXT will increase more quickly than that of SEAS: every new IFS upgrade can be expected to further improve EXT. Wetterhall & Di Giuseppe (2018) showed such a difference in skill when comparing System 4 (IFS Cycle 36r4) seasonal forecasts with extended-range forecasts using IFS Cycles 41r1 and 41r2.

The EFAS extended-range forecast is planned to be made available to users operationally later this year. It will show weekly anomalies against a model climatology rather than daily values of discharge. This is intended to avoid over-interpretation of the forecasts. The operational S2S forecasts will be disseminated with a disclaimer regarding their skill and reliability. Further efforts will be made to improve the bias correction of the forecasts to achieve a hydrometeorological forecast that is as reliable and skilful as possible on time ranges that are useful for decision-makers. Experiments in which the operational ensemble forecasts are merged with extended-range forecasts are also planned. This could lead to sub-seasonal hydrometeorological forecasts being issued on a daily basis.

Further reading

Arnal, L., H.L. Cloke, E. Stephens, F. Wetterhall, C. Prudhomme, J. Neumann, B. Krzeminski & F. Pappenberger, 2018: Skilful seasonal forecasts of streamflow over Europe?, Hydrology and Earth System Sciences, 22, 2057–2072, doi:10.5194/hess-22-2057-2018.

Ferro, C.A.T., D.S. Richardson & A.P. Weigel, 2008: On the effect of ensemble size on the discrete and continuous ranked probability scores, Meteorological Applications, 15, 19–24, doi:10.1002/met.45.

Mazzetti, C. & C. Prudhomme, 2018: Major upgrade for European flood forecasts, ECMWF Newsletter No. 157, 32–38.

Weisheimer, A. & T. Palmer, 2014: On the reliability of seasonal climate forecasts, Journal of The Royal Society Interface, 11, doi:10.1098/rsif.2013.1162.

Wetterhall, F. & F. Di Giuseppe, 2018: The benefit of seamless forecasts for hydrological predictions over Europe, Hydrology and Earth System Sciences, 22, 3409–3420, doi:10.5194/hess-22-3409-2018.