Wolfset: A High-Quality Underwater Acoustic Dataset for Algorithm Development and Analysis

Wolfset is a high-quality acoustic dataset recorded in an anechoic tank with a Brüel & Kjær 8104 hydrophone. It contains a variety of outboard and electric motor sounds, combined with realistic noise sources, for developing and testing sound classification algorithms.

Published in Earth & Environment and Physics

General Notes:
Collecting underwater acoustic data is costly and time-consuming, so we built Wolfset as a ready-to-use benchmark: about 1.5 GB of data in 168 WAV files, totaling roughly 5 hours of recordings. Every recording was analyzed and validated for consistency and quality before being added to the final dataset, as illustrated in Figure 1.

Figure 1 - Simplified diagram of the dataset creation scheme.
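
As a rough sanity check on these figures, the sketch below estimates the size of uncompressed PCM audio and tallies the WAV files in a local copy of the dataset. It assumes mono recordings, which the post does not state, and uses the 16-bit, 44.1 kHz format described under Instrumentation Chain. The function names and the folder path are hypothetical.

```python
import wave
from pathlib import Path


def expected_pcm_bytes(duration_s, sample_rate_hz, bit_depth, channels=1):
    """Size of raw PCM audio in bytes (WAV headers ignored)."""
    return int(duration_s * sample_rate_hz * (bit_depth // 8) * channels)


def summarize_wavs(root):
    """Return (file_count, total_seconds, total_bytes) for *.wav files under root."""
    count, seconds, size = 0, 0.0, 0
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            seconds += wf.getnframes() / wf.getframerate()
        size += path.stat().st_size
        count += 1
    return count, seconds, size


if __name__ == "__main__":
    # Assumption: mono audio. Five hours at 44.1 kHz, 16 bits per sample.
    estimate = expected_pcm_bytes(5 * 3600, 44_100, 16, channels=1)
    print(f"Estimated size: {estimate / 1e9:.2f} GB")
    # "wolfset/" is a placeholder path for a local copy of the dataset.
    print(summarize_wavs("wolfset/"))
```

Under the mono assumption, five hours comes to about 1.59 GB, the same order as the quoted "about 1.5 GB" for "roughly 5 hours".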

Controlled Facility:
All sounds were recorded inside the anechoic tank at Lisbon Naval Base, as illustrated in Figure 2. The tank, built in 1976, measures 8 m × 5 m × 5 m, is lined with cork-rubber absorbent panels, and is equipped with two movable bridges that position sensors and sources anywhere within the water volume.

Figure 2 - The empty anechoic tank. Its spike-shaped absorbent plates, made of cork agglomerate and rubber, minimize sound reflections.

Instrumentation Chain:
Signals were recorded with a calibrated Brüel & Kjær 8104 hydrophone (0.1 Hz – 200 kHz, with constant directivity up to 20 kHz) placed 2.5 m deep at the center of the tank, as illustrated in Figure 3. The hydrophone output fed a two-stage, adjustable-gain Brüel & Kjær 2636 amplifier with a 22.4 kHz low-pass filter, followed by a 16-bit sound card sampling at 44.1 kHz. Levels were monitored with an HP oscilloscope and spectrum analyzer.

Figure 3 - Brüel & Kjær type 8104 hydrophone (left) and its typical directivity pattern (right).
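
The sketch below works through what the recording chain implies numerically: the Nyquist frequency for 44.1 kHz sampling, the ideal dynamic range of 16-bit quantization, and a generic conversion from a raw sample to acoustic pressure. The conversion parameters (ADC full-scale voltage, amplifier gain, hydrophone sensitivity) are not given in the post; the values in the usage lines are placeholders, not the settings used for Wolfset.

```python
import math

SAMPLE_RATE_HZ = 44_100     # from the text
BIT_DEPTH = 16              # from the text
LOWPASS_CUTOFF_HZ = 22_400  # from the text


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing."""
    return sample_rate_hz / 2


def ideal_dynamic_range_db(bit_depth):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bit_depth)


def sample_to_pascals(sample, full_scale_volts, gain_db, sensitivity_db_re_1v_per_upa):
    """Convert a signed 16-bit sample to pressure in pascals.

    All three calibration arguments are assumptions the user must supply
    from their own calibration records.
    """
    v_adc = sample / 32768.0 * full_scale_volts       # volts at the sound card
    v_hydrophone = v_adc / 10 ** (gain_db / 20)       # undo amplifier gain
    upa = v_hydrophone / 10 ** (sensitivity_db_re_1v_per_upa / 20)
    return upa * 1e-6                                 # micropascals to pascals


if __name__ == "__main__":
    print("Nyquist (Hz):", nyquist_hz(SAMPLE_RATE_HZ))
    print("Filter cutoff minus Nyquist (Hz):", LOWPASS_CUTOFF_HZ - nyquist_hz(SAMPLE_RATE_HZ))
    print("Ideal 16-bit dynamic range (dB):", round(ideal_dynamic_range_db(BIT_DEPTH), 2))
    # Placeholder calibration values, for illustration only.
    print("Pressure (Pa):", sample_to_pascals(16384, 1.0, 0.0, -200.0))
```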

Target sources:
Five propulsion units were tested: four Mercury outboard engines rated at 3.6, 4.5, 8, and 18 horsepower (Figure 4), along with an electric motor from a radio-controlled ship model (Figure 5). Each unit was recorded under various operating conditions, from idle (gear disengaged) to medium forward, with additional cyclic accelerations where specified.

Figure 4 - Mercury outboard motors: 3.6 horsepower (left), 4.5 horsepower (center left), 8 horsepower (center right), and 18 horsepower (right).
Figure 5 - The electric ship model used (left) and its 3-blade propeller (right).

Background noise and transients:
To mimic coastal clutter, we introduced controlled disturbances: intense and mild compressed-air bubbling, low- and high-flow water hoses, water-bucket pours, metallic-tube impacts with a mallet or hammer, and discrete air-rifle shots, as illustrated in Figures 6 and 7. Pure-noise and pure-transient segments were recorded separately to support data augmentation and detection tasks.

Figure 6 - Metallic bars used to generate the transients present in the dataset.
Figure 7 - Bubble generation with compressed-air hoses, used to produce noise for the dataset.
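
One common way to use separately recorded noise for augmentation is to add it to a clean clip at a chosen signal-to-noise ratio. The sketch below illustrates that general technique; it is not a description of how the Wolfset recordings themselves were produced. The function names are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale noise so that signal power / noise power equals snr_db, then add."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise segment is silent")
    target_noise_power = p_signal / 10 ** (snr_db / 10)
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(signal, noise)]


if __name__ == "__main__":
    # Toy samples standing in for a motor clip and a bubble-noise clip.
    motor = [1.0, -1.0, 1.0, -1.0]
    bubbles = [2.0, 2.0, -2.0, -2.0]
    print(mix_at_snr(motor, bubbles, snr_db=0.0))
```

In practice the two clips would be loaded from WAV files, trimmed or looped to equal length, and the mixed output checked for clipping before being written back as 16-bit audio.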

Extended scenarios and use cases:
Wolfset includes all ten pairwise motor combinations plus one triple-motor case, recorded without added noise or transients to isolate interaction tones. These controlled mixes, along with the single-source clips, support research in source separation, classification, and domain adaptation between pristine and cluttered underwater environments. The strict control of space, hardware, and metadata makes Wolfset a reproducible reference for benchmarking modern machine learning models in underwater acoustics.
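
The count of ten pairs follows from choosing 2 of the 5 propulsion units. A minimal sketch that enumerates them is shown below; the unit labels are hypothetical and do not reflect the dataset's actual file naming.

```python
from itertools import combinations

# Hypothetical labels for the five propulsion units described above.
UNITS = [
    "mercury_3.6hp",
    "mercury_4.5hp",
    "mercury_8hp",
    "mercury_18hp",
    "electric_model",
]

# Every unordered pair of distinct units: C(5, 2) = 10.
pairs = list(combinations(UNITS, 2))

if __name__ == "__main__":
    for first, second in pairs:
        print(f"{first} + {second}")
    print("Number of pairs:", len(pairs))
```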
