Behind the Paper

A deep learning framework based on structured space model for detecting small objects in complex underwater environments

Underwater target detection plays a crucial role in monitoring the marine ecological environment. In this paper, we propose a deep learning framework that combines a structured state space model (SSM) with a CNN, designed specifically for detecting small targets in complex underwater environments.

Published in Sustainability

Feb 25, 2025

Yaoming Zhuang

Assistant Professor, Northeastern University

Research Background

Regular monitoring of marine life is crucial for maintaining the stability of marine ecosystems, and effective monitoring relies on accurately identifying the species of marine organisms and counting their numbers. Underwater target detection algorithms therefore play a significant role in assessing the stability of marine ecosystems. However, current underwater target detection algorithms face three main challenges:

Underwater scenes are often affected by a blue-green light shift, causing confusion between the target and background, which increases the difficulty of detection.
The underwater environment contains numerous small targets, which are prone to stacking and occlusion, making it difficult for existing detection algorithms to identify them accurately.
Underwater robots are the main tools for ocean exploration and documentation, but their computational capacity is limited by their hardware, so underwater target detection algorithms must be lightweight to meet real-time detection requirements.

Why Consider Applying the Mamba Model to Object Detection Tasks?

The Mamba model is based on the structured state space model (SSM), and its core advantage lies in its ability to model global context, which addresses the limited local receptive fields of traditional convolutional neural networks (CNNs). Transformer models also offer global modeling capabilities, but their computational complexity grows quadratically with sequence length, which places high demands on hardware resources. In contrast, the Mamba model overcomes this limitation with its selective scanning mechanism and linear computational complexity, making it particularly well-suited for resource-constrained environments.
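
As a rough illustration of why a state space recurrence scales linearly, the sketch below runs a discrete SSM over a sequence in a single pass, doing a fixed amount of work per step. It is a minimal sketch under stated assumptions, not the Mamba implementation: the function name ssm_scan and the toy parameters A, B, and C are hypothetical, the state matrix is assumed diagonal, and the parameters are fixed rather than input-dependent as in Mamba's selective scan.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run a discrete linear state space recurrence over a 1-D sequence.

    h_t = A * h_{t-1} + B * x_t   (A is diagonal, stored as a vector)
    y_t = C . h_t

    One pass over the sequence, so the cost is linear in len(x).
    """
    h = np.zeros_like(A)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = A * h + B * x_t   # update the hidden state
        y[t] = C @ h          # read out the output for this step
    return y

# Toy parameters (assumptions for illustration only).
A = np.array([0.5, 0.9])
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])

# An impulse at t = 0 decays through the state at later steps.
x = np.array([1.0, 0.0, 0.0])
y = ssm_scan(x, A, B, C)
```

Self-attention, by comparison, scores every pair of positions, so its cost grows with the square of the sequence length.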

Despite Mamba's strong global modeling capabilities, we found that relying solely on SSM for feature extraction in object detection did not achieve the desired results. This is because, as a causal modeling method similar to RNNs, the Mamba model processes image blocks sequentially and therefore lacks sensitivity to long-range dependencies between non-adjacent pixels. To address this issue, we combined SSM with CNNs: the CNNs supply richer local information and enhance the feature representation before it is processed by the SSM.
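
The idea of enriching features with a convolution before a sequential scan can be sketched as follows. This is only an illustrative toy under stated assumptions, not the UWNet architecture: the helper names local_conv3x3, causal_scan, and conv_then_scan are hypothetical, a 3x3 averaging kernel stands in for learned CNN weights, and a scalar decaying recurrence stands in for the SSM.

```python
import numpy as np

def local_conv3x3(feat, kernel):
    """Apply one 3x3 convolution (cross-correlation) with zero padding."""
    h, w = feat.shape
    padded = np.pad(feat, 1)  # zero padding of width 1 on every side
    out = np.zeros((h, w), dtype=float)
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return out

def causal_scan(seq, decay=0.9):
    """Scalar recurrence h_t = decay * h_{t-1} + x_t (stand-in for the SSM)."""
    out = np.empty(len(seq))
    h = 0.0
    for t, v in enumerate(seq):
        h = decay * h + v
        out[t] = h
    return out

def conv_then_scan(feat, kernel, decay=0.9):
    """Mix each pixel with its 2-D neighbours, then scan in raster order."""
    local = local_conv3x3(feat, kernel)
    seq = local.reshape(-1)  # flatten row by row
    return causal_scan(seq, decay).reshape(feat.shape)

# A single active pixel at row 1, column 1 of a 3x4 feature map.
feat = np.zeros((3, 4))
feat[1, 1] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)  # averaging kernel (assumption)

scan_only = causal_scan(feat.reshape(-1)).reshape(feat.shape)
local = local_conv3x3(feat, kernel)
mixed = conv_then_scan(feat, kernel)
```

In this toy, the scan on its own leaves the position directly above the active pixel at zero, because that position comes earlier in raster order. After the convolution, the same position already carries information from its 2-D neighbourhood before the scan runs.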

Summary and Future Directions

This paper proposes UWNet, an underwater small target detection method that combines the Mamba model with the YOLO framework. By introducing the Mamba model and a multi-scale implicit feature fusion module, we significantly improve detection accuracy for small underwater targets, particularly in complex underwater scenes, where the method shows stronger robustness and accuracy than traditional detection algorithms. Experimental results show that UWNet outperforms existing object detection methods across several test sets.

Although UWNet has achieved good accuracy in underwater target detection, further optimization of the method is still possible in future research. First, model pruning and knowledge distillation techniques can be employed to further reduce computational costs and model complexity, enhancing real-time detection capabilities. Second, underwater image enhancement techniques can be considered to improve image clarity and reduce the impact of color distortion on detection results. Alternatively, diffusion models can be used to augment underwater datasets by generating underwater images in various scenes and styles, thereby enhancing data diversity and improving the model's generalization ability.
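
To make the color-distortion point concrete, the sketch below applies gray-world white balancing, a classic and very simple color correction. It is named here only as an example of the kind of enhancement that could precede detection; the text above does not specify a technique, and this is not part of UWNet. The function name gray_world and the assumption of a float RGB image with values in [0, 1] are illustrative.

```python
import numpy as np

def gray_world(img):
    """Scale each channel so its mean matches the overall mean intensity.

    img: H x W x 3 array with values in [0, 1] (assumption).
    """
    img = img.astype(float)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    target = channel_means.mean()
    # Guard against division by zero for an all-black channel.
    gains = target / np.maximum(channel_means, 1e-8)
    return np.clip(img * gains, 0.0, 1.0)

# A toy image with a blue-green cast: the red channel is weak.
img = np.empty((2, 2, 3))
img[..., 0] = 0.2  # red
img[..., 1] = 0.5  # green
img[..., 2] = 0.5  # blue

balanced = gray_world(img)
```

On this uniform toy image, all three channels end up at the same level after balancing. Real underwater scenes would need more careful handling, since the gray-world assumption often fails when one color dominates the scene content itself.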

Communications Engineering

A selective open access journal from Nature Portfolio publishing high-quality research, reviews and commentary in all areas of engineering.
