A water surface target detection algorithm based on SOE-YOLO lightweight network

doi:10.11996/JG.j.2095-302X.2024040736

Abstract

To address missed and false detections in complex, changing water surface environments, as well as the limited computing resources of detection platforms, a lightweight water surface object detection algorithm based on YOLOv8, SOE-YOLO, is proposed. First, the Slim-Neck design paradigm built on GSConv is adopted to make the Neck of the model lighter. Second, the Backbone is reconstructed with the ODConv (omni-dimensional dynamic convolution) module, reducing the number of parameters and improving the detection speed of the network. Finally, the EMA (efficient multi-scale attention) mechanism is introduced to strengthen the network's multi-scale feature extraction and thereby improve small-target detection accuracy. Experimental results on the WSODD (water surface object detection dataset) test set show that SOE-YOLO has 2.8 M parameters and a computational cost of 6.6 GFLOPs, which are 12.5% and 18.6% lower than those of the original model. Meanwhile, mAP@0.5 and mAP@0.5:0.95 reach 79.9% and 47.2%, which are 2.4 and 1.6 percentage points higher than the original model; the missed detection rate decreases significantly, and the model outperforms current popular object detection algorithms. The model runs at 64.25 FPS, meeting the requirements of real-time detection of water surface targets. SOE-YOLO thus achieves better detection performance while remaining lightweight, satisfying deployment requirements in computing-resource-constrained environments.

Key words: water surface object detection, YOLOv8, lightweight improvement, Slim-Neck design paradigm, attention mechanisms

CLC Number:

TP391
U665

ZENG Zhichao, XU Yue, WANG Jingyu, YE Yuanlong, HUANG Zhikai, WANG Huan. A water surface target detection algorithm based on SOE-YOLO lightweight network[J]. Journal of Graphics, 2024, 45(4): 736-744.

Configuration	Version / model
Operating system	Windows 10
Deep learning framework	PyTorch 1.13.1
Computing platform	CUDA 11.1
Language	Python 3.8
CPU	AMD Ryzen 7 3700X 8-Core Processor
GPU	NVIDIA GeForce RTX 3090 Ti

Category	Images	Instances
Boat	4,325	8,179
Ship	1,832	3,423
Ball	652	2,609
Bridge	1,827	2,014
Rock	696	1,540
Person	357	695
Rubbish	461	669
Mast	177	354
Buoy	153	167
Platform	480	614
Harbor	1,211	1,224
Tree	72	219
Grass	103	110
Animal	50	94
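
As a quick check on how unevenly the categories above are represented, the following minimal Python sketch recomputes each class's share of all labeled instances from the instance column of the table. The names `INSTANCES`, `class_shares`, and `imbalance_ratio` are illustrative assumptions; they are not part of the paper or of any WSODD tooling.

```python
# Instance counts per category, copied from the table above.
INSTANCES = {
    "Boat": 8179, "Ship": 3423, "Ball": 2609, "Bridge": 2014,
    "Rock": 1540, "Person": 695, "Rubbish": 669, "Mast": 354,
    "Buoy": 167, "Platform": 614, "Harbor": 1224, "Tree": 219,
    "Grass": 110, "Animal": 94,
}


def class_shares(counts):
    """Return each class's percentage share of all instances."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio between the most and least frequent class (hypothetical helper)."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    shares = class_shares(INSTANCES)
    # Print classes from most to least frequent.
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:<10}{INSTANCES[name]:>7}{pct:>8.1f}%")
    print("total instances:", sum(INSTANCES.values()))
    print("max/min ratio:", round(imbalance_ratio(INSTANCES), 1))
```

With these counts the total is 21,911 instances, and Boat has roughly 87 times as many instances as Animal, so per-class results for the rarest categories likely rest on comparatively few examples.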

Model	mAP@0.5/%	mAP@0.5:0.95/%	Params/M	FLOPs/G	FPS
YOLOv8n (baseline)	77.5	45.6	3.2	8.1	60.24
YOLOv8+Slim-Neck	79.2	45.6	2.8	7.3	65.05
YOLOv8+ODConv	78.8	46.0	3.0	7.2	63.51
YOLOv8+ODConv+Slim-Neck	79.2	47.1	2.6	6.6	68.50
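
The relative changes quoted in the abstract can be recomputed from the rows above. The sketch below, in plain Python, compares each variant with the YOLOv8n baseline. The SOE-YOLO entry uses the figures given in the abstract (2.8 M parameters, 6.6 GFLOPs, 79.9% mAP@0.5, 47.2% mAP@0.5:0.95, 64.25 FPS). The names `ROWS` and `compare_to_baseline` are hypothetical helpers introduced only for illustration.

```python
# Each row: (mAP@0.5 %, mAP@0.5:0.95 %, Params M, FLOPs G, FPS).
# The first four rows are copied from the table above; the SOE-YOLO row
# takes its values from the abstract.
ROWS = {
    "YOLOv8n (baseline)": (77.5, 45.6, 3.2, 8.1, 60.24),
    "YOLOv8+Slim-Neck": (79.2, 45.6, 2.8, 7.3, 65.05),
    "YOLOv8+ODConv": (78.8, 46.0, 3.0, 7.2, 63.51),
    "YOLOv8+ODConv+Slim-Neck": (79.2, 47.1, 2.6, 6.6, 68.50),
    "SOE-YOLO (abstract)": (79.9, 47.2, 2.8, 6.6, 64.25),
}


def compare_to_baseline(row, base):
    """Return accuracy gains (percentage points) and cost reductions (%)."""
    map50, map5095, params, flops, fps = row
    b_map50, b_map5095, b_params, b_flops, b_fps = base
    return {
        "map50_gain_pp": round(map50 - b_map50, 1),
        "map5095_gain_pp": round(map5095 - b_map5095, 1),
        "params_reduction_pct": round(100 * (b_params - params) / b_params, 1),
        "flops_reduction_pct": round(100 * (b_flops - flops) / b_flops, 1),
        "fps_gain": round(fps - b_fps, 2),
    }


if __name__ == "__main__":
    baseline = ROWS["YOLOv8n (baseline)"]
    for name, row in ROWS.items():
        if row is baseline:
            continue  # skip comparing the baseline with itself
        print(name, compare_to_baseline(row, baseline))
```

Computed from the rounded values in the table, the FLOPs reduction for SOE-YOLO comes out at about 18.5%, slightly below the 18.6% quoted in the abstract; the gap is presumably due to rounding in the published figures.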