Introducing EnUSM: A newly integrated soil moisture data package over the Continental United States

Published in Earth & Environment

Soil moisture, i.e., the amount of water stored within the pore spaces of soil, is essential to the terrestrial carbon, water, and energy cycles, land–atmosphere interactions, and agricultural practices. However, accurately monitoring soil moisture and quantifying data uncertainties at regional to global scales is challenging. To address these challenges, this paper in Scientific Data presents the Ensemble Unified Soil Moisture (EnUSM) data package, which integrates 19 datasets from land surface models, remote sensing, reanalysis, and machine-learning-based methods under a standardized protocol across the continental United States (CONUS). The datasets are processed for surface soil moisture at a 0.25-degree spatial resolution and a monthly temporal resolution. We then perform an inter-data comparison across the six Köppen-Geiger Climate Classifications (KGCCs) of CONUS. This comprehensive soil moisture data package aims to enhance our understanding of soil moisture characteristics across the CONUS.

In this blog post, we further explore why soil moisture matters, and how EnUSM is essential to studies of terrestrial ecosystems, land–atmosphere interactions, and decision-making activities.

 

Why Is Soil Moisture Important?

Soil moisture is determined by various processes, through which water circulates within and across terrestrial ecosystems, as well as between the land and the atmosphere. These processes include:

  1. Precipitation: water in any form, liquid or solid, falls from the atmosphere to the Earth’s surface
  2. Soil evaporation: water evaporates directly from soil pores
  3. Transpiration: water moves out of plant cells and is released as vapor through the plant's stomata
  4. Runoff: excess water moves over the Earth's surface when the soil cannot absorb all the precipitation

Soil moisture plays a vital role in supporting plant growth, regulating the water cycle, influencing weather and climate patterns, and maintaining overall soil health. A comprehensive assessment of soil moisture is essential for effective soil management, sustaining agricultural productivity, preventing land degradation, and conserving valuable water resources.

 

What Is EnUSM?

EnUSM has the following features, making it important for both scientific research and decision-making activities:

  1. A comprehensive data package integrating data from 19 sources, including land surface models, remote sensing, reanalysis, and machine-learning-based methods
  2. A novel inter-data comparison framework that integrates and processes datasets at consistent resolutions, minimizing the need for data interpolation
  3. A moderate spatiotemporal resolution that captures the seasonality and interannual variability of soil moisture and facilitates the benchmarking of regional and global models over CONUS
  4. A feature importance analysis that identifies the environmental factors that most strongly influence soil moisture characteristics

 

What Does EnUSM Show?

By processing soil moisture data from various sources, we reveal the spatial and temporal variations in soil moisture across the CONUS and within different KGCCs. The extracted soil moisture characteristics highlight distinct patterns: the western CONUS exhibits lower soil moisture but a higher coefficient of variation than the eastern CONUS. Remote sensing datasets tend to indicate drier conditions, while reanalysis products suggest wetter conditions. Explainable machine learning is used to quantify the importance of different environmental factors in influencing soil moisture characteristics. The results show that precipitation, leaf area index (LAI), surface air temperature, and soil texture are the primary factors driving the spatial distribution of the mean soil moisture across the 19 datasets. We also incorporate in-situ soil moisture observations into a wavelet power spectrum analysis to highlight discrepancies in temporal scales across datasets. This study provides a comprehensive soil moisture data package and an analytical framework that can be applied to Earth system model evaluation, uncertainty quantification, drought impact assessment, land–atmosphere interaction studies, and drought response planning.
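To make the two summary statistics mentioned above concrete, here is a minimal sketch of how an ensemble mean and a coefficient of variation (standard deviation divided by mean) could be computed for a single grid cell. The product names and values are invented for illustration and are not taken from EnUSM; the function name is also hypothetical, and this is not the processing code used in the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(series):
    """Return the population standard deviation divided by the mean."""
    avg = mean(series)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(series) / avg


# Hypothetical monthly surface soil moisture (m3/m3) at one grid cell
# from three made-up products. These are NOT EnUSM values.
products = {
    "product_a": [0.10, 0.14, 0.08, 0.12],
    "product_b": [0.12, 0.18, 0.10, 0.16],
    "product_c": [0.08, 0.13, 0.06, 0.09],
}

# Ensemble mean at each time step (average across products) ...
ensemble_mean = [mean(values) for values in zip(*products.values())]

# ... and the temporal variability of that ensemble mean.
ensemble_cv = coefficient_of_variation(ensemble_mean)

print("ensemble mean per month:", ensemble_mean)
print("coefficient of variation:", ensemble_cv)
```

A cell with a low mean and large month-to-month swings gets a high coefficient of variation, which is the pattern described for the western CONUS.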

(a) The Köppen-Geiger Climate Classification (KGCC) map of the continental United States (CONUS); (b) mean soil moisture (SM) of all 19 SM products over the growing season (April–October) of 2016–2018; (c) averaged soil moisture over CONUS (gray bars) for all 19 datasets (ALL) and for each data type: land surface model (LSM), remote sensing (RS), reanalysis (RE), and machine learning (ML). In (c), colored bars show the soil moisture within each KGCC for each data type, and error bars show the spatial average of the soil moisture standard deviation for each data type. The histograms in (a) and (b) show the frequency distribution of the climate zones and of the soil moisture mean of the 19 datasets, respectively.

 

How Can You Access EnUSM?

EnUSM and the scripts used for the data processing are freely available on Zenodo at https://zenodo.org/records/14542239. We have also recompiled the datasets at a 0.25-degree spatial resolution and a daily temporal resolution; the entire data package is 39 gigabytes. The daily soil moisture records are not published on Zenodo, but users can request them by contacting our research team.
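For readers who want to relate daily records to the monthly resolution used in EnUSM, the following sketch shows one simple way to average daily values into monthly means. It assumes the data have already been read into (date, value) pairs for a single grid cell; the function name, the sample values, and the handling of missing days are illustrative assumptions, not the package's actual processing scripts.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(daily_records):
    """Average (date, value) pairs into {(year, month): mean}, skipping None."""
    buckets = defaultdict(list)
    for day, value in daily_records:
        if value is not None:  # treat None as a missing observation
            buckets[(day.year, day.month)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical daily soil moisture (m3/m3) at one grid cell; not EnUSM data.
daily = [
    (date(2016, 4, 1), 0.20),
    (date(2016, 4, 2), 0.22),
    (date(2016, 4, 3), None),
    (date(2016, 5, 1), 0.30),
]

monthly = monthly_means(daily)
for (year, month), value in monthly.items():
    print(f"{year}-{month:02d}: {value:.3f}")
```

Skipping missing days keeps gaps from biasing the mean toward zero, although a month with very few valid days may still be unrepresentative.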

Would you like to use EnUSM in your research? Let us know in the comments.  
