Reference evapotranspiration forecasting in the North Etna aquifer: a comparative analysis of statistical, deep learning, and machine learning models

We compared statistical, machine learning, and deep learning models to forecast reference evapotranspiration (ET₀) in the North Etna aquifer. Results show deep learning offers superior accuracy, supporting sustainable water management under climate stress.

Explore the Research

Reference evapotranspiration forecasting in the North Etna aquifer: a comparative analysis of statistical, deep learning, and machine learning models - Theoretical and Applied Climatology

Accurate forecasting of reference evapotranspiration (ET₀) is essential for sustainable water resource management, particularly in drought-prone regions. This study evaluates the performance of statistical, machine learning (ML), and deep learning (DL) models in forecasting ET₀ in the North Etna aquifer (Sicily, Italy), a Mediterranean area characterized by warm, semi-arid to sub-humid conditions at mid elevations. Two ET₀ estimation methods were used: Hargreaves-Samani (HS), calculated monthly from long-term meteorological data collected at four weather stations, and FAO Penman-Monteith (FAO-PM), obtained as monthly records from the SIAS regional agrometeorological database. The forecasting models applied include statistical approaches (Linear Regression, Exponential Smoothing, Prophet), machine learning methods (Support Vector Regression, Random Forest, Extreme Gradient Boosting), and deep learning architectures (Long Short-Term Memory, Gated Recurrent Unit, and Temporal Convolutional Network). Model performance was evaluated using standard error metrics: RMSE, MAPE, R², and NSE. Results indicate that statistical models perform well for HS-based ET₀, which relies on a limited set of climatic variables. In contrast, ML and DL models perform better for FAO-PM-based ET₀, which incorporates more complex meteorological inputs such as wind speed and humidity. Among DL models, Long Short-Term Memory and Temporal Convolutional Network show superior ability to capture long-term temporal dependencies, making them suitable for extended sequence forecasting. This comparative analysis highlights the importance of aligning model complexity with data requirements and ET₀ formulation. The findings offer practical insights for improving irrigation scheduling and water planning in Mediterranean environments facing increasing climate variability.

Reference Evapotranspiration Forecasting in the North Etna Aquifer: How a Mediterranean PhD Journey Led to a Multi-Model Breakthrough

Water is the lifeblood of Mediterranean agriculture and urban life. In semi-arid regions like eastern Sicily, every litre counts. Yet climate variability is making water supply ever harder to predict, and the pressure on aquifers is growing. Our new article, “Reference evapotranspiration forecasting in the North Etna aquifer: a comparative analysis of statistical, deep learning, and machine learning models,” explores how advanced forecasting can give water managers the early warnings they need.

From Morocco to Sicily: the spark behind the study

This project has its roots in my 2022 Erasmus PhD exchange from Abdelmalek Essaâdi University in Tetouan, Morocco, to the University of Messina (UniMe), Italy. During that nine-month stay I focused on collecting and checking ground-based climatic data across eastern Sicily. This was not just a technical task; it meant long hours coordinating with local monitoring networks, negotiating data access, and validating every station record to be sure it reflected the real climate signals of Mount Etna’s northern slopes.

After returning home, I spent an additional year refining the research question. The North Etna aquifer is vital for both agriculture and urban supply, yet the scientific community still lacked a robust, long-range method to forecast reference evapotranspiration (ET₀) — a key driver of irrigation demand and groundwater use. I wanted to design a study that would both advance the science and speak directly to water-resource managers in this fragile region.

The “catch”: testing many forecasting minds

Out of that reflection came the central idea of the paper: compare multiple forecasting paradigms side by side. Instead of betting on a single algorithm, we tested classical statistical models, modern machine-learning (ML) tools, and state-of-the-art deep-learning (DL) architectures.

A further insight shaped the project’s originality. In many basins, key weather data needed for the FAO-Penman–Monteith (FAO-PM) method are incomplete. We therefore examined whether Hargreaves–Samani (HS)–based forecasts could serve as reliable proxies for FAO-PM values, effectively transferring information from simpler data to the more demanding standard method.

To our knowledge, this is the first study to systematically evaluate the transferability of HS-based forecasts for approximating FAO-PM values, offering practical utility in data-scarce environments. By fusing this proxy concept with a broad multi-model comparison, the research provides a fresh way to anticipate ET₀ dynamics and secure water management in the North Etna aquifer.
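
For readers less familiar with the simpler of the two methods, the Hargreaves–Samani estimate can be sketched in a few lines of Python. The sketch uses the widely published FAO-56 form of the equation and is not the code used in the paper; the function name, argument names, and example values are illustrative assumptions. Extraterrestrial radiation is taken as an input here rather than computed from latitude and date.

```python
import math

# Converts radiation in MJ m^-2 day^-1 to mm/day of equivalent evaporation.
MJ_TO_MM = 0.408


def hargreaves_samani_et0(t_max: float, t_min: float, ra_mj: float) -> float:
    """Return ET0 in mm/day from temperatures (deg C) and extraterrestrial
    radiation ra_mj (MJ m^-2 day^-1), using the Hargreaves-Samani form."""
    if t_max < t_min:
        raise ValueError("t_max must be greater than or equal to t_min")
    t_mean = (t_max + t_min) / 2.0
    ra_mm = MJ_TO_MM * ra_mj
    return 0.0023 * (t_mean + 17.8) * math.sqrt(t_max - t_min) * ra_mm


# Hypothetical summer values, for illustration only (not taken from the SIAS data).
et0_daily = hargreaves_samani_et0(t_max=30.0, t_min=20.0, ra_mj=40.0)
et0_monthly = et0_daily * 31  # a monthly total scales by the days in the month
print(f"{et0_daily:.2f} mm/day, {et0_monthly:.1f} mm/month")
```

Because only temperature and extraterrestrial radiation are required, this estimate can still be produced at stations where wind speed and humidity records are missing, which is what makes HS attractive as a proxy.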

Data and methods in practice

We assembled a long, high-quality climate dataset from regional ground stations, courtesy of Servizio Informativo Agrometeorologico Siciliano (SIAS). To mimic real forecasting conditions and reduce the risk of overfitting, we applied a rolling-window strategy: within each window, the first 60% of the data was used for training, the next 20% for validation, and the final 20% for testing, with the window moved step by step through the entire 2002–2023 period.
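
A minimal sketch of such a rolling 60/20/20 split is given here, assuming a monthly series covering 2002–2023 (264 values). The window length of 120 months and the 12-month step are hypothetical choices for illustration, as is the function name; the settings used in the paper may differ.

```python
from typing import Iterator, Tuple

Split = Tuple[range, range, range]


def rolling_window_splits(
    n_samples: int,
    window: int,
    step: int,
    train_frac: float = 0.6,
    val_frac: float = 0.2,
) -> Iterator[Split]:
    """Yield chronological (train, validation, test) index ranges
    for each position of a window sliding through the series."""
    if window > n_samples:
        raise ValueError("window is longer than the series")
    n_train = round(window * train_frac)
    n_val = round(window * val_frac)
    for start in range(0, n_samples - window + 1, step):
        train_end = start + n_train
        val_end = train_end + n_val
        # The remainder of the window (about 20%) is held out for testing.
        yield (
            range(start, train_end),
            range(train_end, val_end),
            range(val_end, start + window),
        )


n_months = 22 * 12  # monthly values for 2002-2023
splits = list(rolling_window_splits(n_months, window=120, step=12))
for train, val, test in splits[:2]:
    print(f"train {train.start}-{train.stop - 1}, "
          f"val {val.start}-{val.stop - 1}, "
          f"test {test.start}-{test.stop - 1}")
```

Keeping the three blocks in chronological order inside every window means a model is never validated or tested on months that precede its training data.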

Forecast horizons extended up to 24 months, a challenging target that tests a model’s ability to capture both seasonal cycles and slower climate signals.

Among the nine tested approaches were Gradient Boosting, Random Forest, Support Vector Regression, Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRU). The comparison focused on standard skill scores such as RMSE, MAE, and R², enabling an objective ranking of forecasting performance.
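
As an illustration of how such skill scores can be computed, the following sketch evaluates a forecast against observations in plain Python. It also includes MAPE and NSE, which the article itself reports. Two points are assumptions rather than details taken from the paper: R² is computed as the squared Pearson correlation (one common convention), and the function name and sample numbers are made up for the example.

```python
import math
from typing import Dict, Sequence


def forecast_metrics(obs: Sequence[float], pred: Sequence[float]) -> Dict[str, float]:
    """Return RMSE, MAE, MAPE (%), R2 and NSE for paired observations and forecasts."""
    if len(obs) != len(pred) or len(obs) < 2:
        raise ValueError("obs and pred must have the same length (at least 2)")
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    errors = [p - o for o, p in zip(obs, pred)]
    sse = sum(e * e for e in errors)                  # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in obs)       # variability of observations
    spp = sum((p - mean_pred) ** 2 for p in pred)     # variability of forecasts
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    return {
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE assumes no observation is zero, which holds for positive monthly ET0.
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, obs)) / n,
        "R2": (cov * cov) / (sst * spp),              # squared Pearson correlation
        "NSE": 1.0 - sse / sst,                       # Nash-Sutcliffe efficiency
    }


# Made-up values: a forecast that tracks the pattern but is biased by +1.
scores = forecast_metrics(obs=[1.0, 2.0, 3.0, 4.0], pred=[2.0, 3.0, 4.0, 5.0])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the correlation-based R² is perfect while NSE is low, because NSE penalizes the constant bias. That is one reason to report several metrics side by side rather than rely on a single score.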

Key findings

The results were striking:

  • Deep-learning models (GRU and LSTM) consistently delivered the highest predictive accuracy, capturing non-linear temporal dependencies that statistical methods could not.

  • Machine-learning algorithms like Random Forest and Gradient Boosting performed well for short lead times, but their skill declined for forecasts beyond a few months.

  • The rolling-window validation proved crucial for assessing long-term robustness and avoiding false confidence.

Overall, our work demonstrates that accurate two-year ET₀ forecasts are feasible, providing a practical early-warning tool for farmers, utilities, and water authorities.

Why this matters

For the North Etna aquifer — and for Mediterranean basins worldwide — anticipating evapotranspiration is central to sustainable water management. Reliable ET₀ forecasts help:

  • plan irrigation schedules more efficiently,

  • reduce the risk of groundwater over-extraction,

  • and align agricultural decisions with anticipated drought conditions.

By rigorously comparing diverse modeling families, we also provide guidance for researchers and agencies choosing forecasting tools under similar climatic and data constraints.

Behind the science: lessons learned

Looking back, this project was as much about persistence as algorithms:

  • Months of data collection and harmonization were necessary to build trust in the climatic record.

  • Designing a comparative, multi-model framework required careful calibration to ensure a fair test for every method.

  • Bridging academic research and on-the-ground water management meant constant dialogue with local experts and attention to practical constraints.

These experiences shaped not only the study but also my own growth as a scientist, highlighting how international collaboration and patience in data work can unlock new scientific insights.

Next steps

Our future plans include extending the approach to real-time operational forecasting and integrating satellite-based observations to enhance spatial coverage. The methodology is also adaptable to other Mediterranean aquifers facing similar climatic stress.

Acknowledgments

I sincerely thank SIAS for providing the climatic dataset and my co-authors and mentors at UniMe and Abdelmalek Essaâdi University for their guidance. Their collaboration made this cross-Mediterranean study possible.


 👉 Read the full article here: [https://link.springer.com/epdf/10.1007/s00704-025-05726-2?sharing_token=6fWujxtt5sjbqY3_d30iu_e4RwlQNchNByi7wbcMAY67Zb4GyJCvXa-0hRorzCljuH-AMpEMJ2jlePvzra7bWGP-KqMTE_UYBnNpAZsAknAzdtD61pFG_hsf7WV06r_OkCbT0AJ9Ll_JDcQvJjwk0xHCReJaCBwXYGWEiIeI69w%3D]
