On the importance of data uniformity and access

To advance the generation of high-resolution terrestrial water cycle components, the European Space Agency (ESA) initiated the "Hyper-resolution Earth observations and land-surface modeling for a better understanding of the water cycle" project, referred to as 4DHydro. This initiative aims to facilitate extensive collaboration between the Earth Observation (EO) and Land Surface and Hydrologic Model (LSM/HM) communities, striving for better integration between innovative high-resolution satellite products and hyper-resolution modeling of the hydrological cycle.

An essential first step of 4DHydro involves a comprehensive assessment of the uncertainty, both systemic and due to inputs, inherent in existing LSM/HM simulations. Such an assessment will give a better understanding of the limitations in simulating high-resolution terrestrial water cycle components and will be used as a baseline for LSM/HM performance under EO integration.

However, such an assessment sounds easier than it is. As part of the 4DHydro project, five modeling groups have come together to provide LSM/HM simulations from eight different models: Community Land Model (CLM), GEOframe, mesoscale Hydrologic Model (mHM), Parflow-CLM, PCRaster Global Balance (PCR-GLOBWB), TETIS, Terrestrial Systems Modeling Platform (TSMP) and wflow_sbm. These models represent a diverse range of model structures, conceptualizations and implementations, which makes comparing their simulations complicated.

This brings us to the importance of data uniformity and access. One of the major steps taken during the project is the development of a consistent storage protocol that ensures simulation outputs are uniformly formatted. For example, all output files are named consistently and are provided in the same file format. Therefore, despite the model diversity, outputs can be compared directly across simulations.
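To illustrate what a consistent naming rule buys in practice, here is a minimal Python sketch of a filename check. The naming pattern (`<model>_<variable>_<startyear>-<endyear>.nc`), the function names, and the example filenames are all assumptions made up for this illustration; they are not the actual 4DHydro storage protocol.

```python
import re

# Hypothetical naming convention, for illustration only:
#   <model>_<variable>_<startyear>-<endyear>.nc
# The real 4DHydro protocol may differ.
FILENAME_PATTERN = re.compile(
    r"^(?P<model>[A-Za-z0-9-]+)_(?P<variable>[a-z]+)"
    r"_(?P<start>\d{4})-(?P<end>\d{4})\.nc$"
)


def parse_output_name(filename):
    """Return the components of a filename, or None if it breaks the convention."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["start"] = int(parts["start"])
    parts["end"] = int(parts["end"])
    # A simulation period cannot end before it starts.
    if parts["start"] > parts["end"]:
        return None
    return parts


def find_nonconforming(filenames):
    """List the filenames that do not follow the convention."""
    return [name for name in filenames if parse_output_name(name) is None]


# Made-up filenames, not real project outputs.
example_files = [
    "mHM_discharge_2000-2010.nc",
    "PCR-GLOBWB_evaporation_2000-2010.nc",
    "output_final.nc4",
]
print(find_nonconforming(example_files))
```

Once every file passes a check of this kind, the model, variable and period can be read from the name alone, so a comparison script does not need model-specific logic to locate its inputs.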

Furthermore, the LSM/HM simulations are publicly available through the 4DHydro open science catalog at 4dhydro.eu/catalog. Although each modeling group has stored its simulations in a public repository, all simulations are also referenced following the SpatioTemporal Asset Catalog (STAC) protocol on the open science catalog. This catalog ensures simulations are discoverable and enables open collaboration with end-users, including the scientific community and the general public.
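For readers unfamiliar with STAC, the sketch below builds a minimal STAC Item as a plain Python dictionary, using only the fields the STAC Item specification requires. The item ID, bounding box, dates and asset URL are placeholders invented for this example; they do not correspond to actual entries in the 4DHydro catalog.

```python
import json


def make_stac_item(item_id, bbox, start, end, asset_href):
    """Build a minimal STAC Item dictionary for one simulation output.

    bbox is (west, south, east, north) in degrees.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        # Footprint as a closed GeoJSON polygon ring (first point == last point).
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south],
                [east, south],
                [east, north],
                [west, north],
                [west, south],
            ]],
        },
        # For a time range, STAC allows "datetime" to be null as long as
        # start_datetime and end_datetime are both given.
        "properties": {
            "datetime": None,
            "start_datetime": start,
            "end_datetime": end,
        },
        "links": [],
        # The asset points to wherever the modeling group hosts the file.
        "assets": {
            "data": {
                "href": asset_href,
                "type": "application/x-netcdf",
                "roles": ["data"],
            }
        },
    }


# All values below are placeholders, not real catalog entries.
item = make_stac_item(
    item_id="example-model-discharge",
    bbox=(5.0, 44.0, 15.0, 48.0),
    start="2000-01-01T00:00:00Z",
    end="2010-12-31T23:59:59Z",
    asset_href="https://example.org/simulations/example-model-discharge.nc",
)
print(json.dumps(item, indent=2))
```

The point of this design is that the data stay in each group's own repository, while the catalog holds only lightweight records like this one that describe where, when and what each simulation covers.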

As a result of these steps, we were able to combine and present a hydrological reference dataset. This dataset comprises 19 existing LSM/HM simulations from previous studies that represent the current state of the art of our LSM/HMs. Moreover, these steps allowed us to perform a comprehensive benchmark that validates our simulations against observations from discharge gauges, evapotranspiration towers, soil moisture stations, and total water storage anomaly satellite products.

Considering that the main aim of the 4DHydro initiative is to facilitate collaboration, we welcome the scientific community to scrutinize our results and contribute to our hydrological reference dataset, including its benchmark, following the approach outlined in our paper.
