A water surface target detection algorithm based on SOE-YOLO lightweight network

Journal of Graphics ›› 2024, Vol. 45 ›› Issue (4): 736-744. DOI: 10.11996/JG.j.2095-302X.2024040736

• Image Processing and Computer Vision •

ZENG Zhichao¹, XU Yue¹, WANG Jingyu¹, YE Yuanlong¹, HUANG Zhikai¹, WANG Huan²

1. School of Information Engineering, Nanchang Institute of Technology, Nanchang, Jiangxi 330000, China
2. School of Mechanical Engineering, Nanchang Institute of Technology, Nanchang, Jiangxi 330000, China

Received: 2024-01-15 Accepted: 2024-04-12 Published: 2024-08-31 Online: 2024-09-03
Corresponding author: HUANG Zhikai (1969-), male, professor, Ph.D. His main research interests include graphics and image processing and computer vision. E-mail: 1625305627@qq.com
First author: ZENG Zhichao (1998-), male, master's student. His main research interests include image processing and object detection. E-mail: z2c0828@163.com
Supported by:
National Key Research and Development Program of China (2019YFB1704502); National Natural Science Foundation of China (61472173); Jiangxi Provincial Graduate Innovation Special Fund Project (yc2023-s995); Jiangxi Provincial Graduate Innovation Special Fund Project (YJSCX202312)

Abstract

To address missed and false detections of small targets in complex, changing water surface environments, together with the limited computing resources of detection platforms, a lightweight water surface object detection algorithm based on YOLOv8, named SOE-YOLO, is proposed. First, the Neck is made lighter using the Slim-Neck design paradigm built on GSConv. Second, the Backbone is rebuilt with ODConv (omni-dimensional dynamic convolution) modules, used here as a lightweight convolution, to reduce the number of parameters and raise the detection speed of the network. Finally, the EMA (efficient multi-scale attention) mechanism is introduced to strengthen multi-scale feature extraction and improve small-target detection. Experiments on the WSODD (water surface object detection dataset) test set show that SOE-YOLO has 2.8 M parameters and 6.6 GFLOPs, which are 12.5% and 18.6% lower than the original model. Its mAP@0.5 and mAP@0.5~0.95 reach 79.9% and 47.2%, which are 2.4 and 1.6 percentage points higher than the original model, and the missed detection rate drops markedly, outperforming current popular object detection algorithms. At 64.25 FPS, the model meets the real-time requirement of water surface object detection. SOE-YOLO thus delivers better detection performance while being lighter, meeting deployment needs in computing-resource-constrained environments.

Key words: water surface object detection, YOLOv8, lightweight improvement, Slim-Neck design paradigm, attention mechanism
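
To make the lightweighting argument for the Neck more concrete, the sketch below compares the weight count of a standard convolution with that of a GSConv-style layer. It assumes the layout described in the Slim-Neck work: a dense convolution that produces half of the output channels, a 5×5 depthwise convolution over those channels, then concatenation and channel shuffle. Bias and normalization parameters are ignored, and the function names and channel sizes are illustrative choices, not values taken from this paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a dense k x k convolution (bias and normalization ignored)."""
    return c_in * c_out * k * k


def gsconv_params(c_in: int, c_out: int, k: int, dw_kernel: int = 5) -> int:
    """Weights of a GSConv-style layer under the layout assumed above."""
    half = c_out // 2
    dense = c_in * half * k * k                # dense conv to c_out / 2 channels
    depthwise = half * dw_kernel * dw_kernel   # one depthwise filter per channel
    return dense + depthwise


# Illustrative layer size (an assumption, not a layer from SOE-YOLO).
c_in, c_out, k = 128, 128, 3
std = standard_conv_params(c_in, c_out, k)
gs = gsconv_params(c_in, c_out, k)
print(f"standard: {std}, GSConv-style: {gs}, ratio: {gs / std:.3f}")
```

Under these assumptions the GSConv-style layer needs roughly half the weights of the dense layer, because the depthwise branch adds only a few parameters per channel. The channel shuffle itself has no weights.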

CLC number:

TP391
U665

ZENG Zhichao, XU Yue, WANG Jingyu, YE Yuanlong, HUANG Zhikai, WANG Huan. A water surface target detection algorithm based on SOE-YOLO lightweight network[J]. Journal of Graphics, 2024, 45(4): 736-744.

Figures/Tables: 11

Environment	Version/Model
Operating system	Windows 10
Deep learning framework	PyTorch 1.13.1
Compute platform	CUDA 11.1
Language	Python 3.8
CPU	AMD Ryzen 7 3700X 8-Core Processor
GPU	NVIDIA GeForce RTX 3090 Ti
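
When reproducing results, it can help to record the local environment next to the configuration listed above. The following is a minimal sketch, assuming PyTorch may or may not be installed; the helper name `describe_environment` is hypothetical and not part of the paper.

```python
import platform


def describe_environment() -> dict:
    """Collect OS, Python, PyTorch, CUDA, and GPU information if available."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "gpu": None,
    }
    try:
        import torch  # optional: fields stay None when PyTorch is absent
    except ImportError:
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

FPS figures depend heavily on the GPU and software stack, so values measured on other hardware are not directly comparable with the ones reported here.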

Category	Images	Instances
Boat	4 325	8 179
Ship	1 832	3 423
Ball	652	2 609
Bridge	1 827	2 014
Rock	696	1 540
Person	357	695
Rubbish	461	669
Mast	177	354
Buoy	153	167
Platform	480	614
Harbor	1 211	1 224
Tree	72	219
Grass	103	110
Animal	50	94
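
The category counts above are strongly imbalanced, which is relevant when reading per-class or small-target results. The sketch below, with the numbers copied from the table, computes each category's share of all instances and its average instances per image. The variable names are illustrative.

```python
# category: (images containing the category, labeled instances)
wsodd = {
    "Boat": (4325, 8179),
    "Ship": (1832, 3423),
    "Ball": (652, 2609),
    "Bridge": (1827, 2014),
    "Rock": (696, 1540),
    "Person": (357, 695),
    "Rubbish": (461, 669),
    "Mast": (177, 354),
    "Buoy": (153, 167),
    "Platform": (480, 614),
    "Harbor": (1211, 1224),
    "Tree": (72, 219),
    "Grass": (103, 110),
    "Animal": (50, 94),
}

total_instances = sum(instances for _, instances in wsodd.values())
largest = max(wsodd, key=lambda c: wsodd[c][1])
smallest = min(wsodd, key=lambda c: wsodd[c][1])
imbalance = wsodd[largest][1] / wsodd[smallest][1]

for name, (images, instances) in sorted(wsodd.items(), key=lambda kv: -kv[1][1]):
    share = 100 * instances / total_instances
    density = instances / images
    print(f"{name:<9}{share:6.2f}%  {density:5.2f} instances/image")

print(f"{largest} has {imbalance:.1f}x the instances of {smallest}")
```

The image column is not summed, since a single image can contain several categories and would be counted more than once.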

模型	mAP@0.5/%	mAP@0.5~0.95/%	Params/M	FLOPs/G	FPS
YOLOv8n(baseline)	77.5	45.6	3.2	8.1	60.24
YOLOv8+Slim-Neck	79.2	45.6	2.8	7.3	65.05
YOLOv8+ODConv	78.8	46.0	3.0	7.2	63.51
YOLOv8+ODConv+Slim-Neck	79.2	47.1	2.6	6.6	68.50
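
The ablation rows above are easier to compare as changes against the baseline. As a small worked example using only the table's values, the sketch below reports the mAP@0.5 gain in percentage points and the relative change in parameters and FLOPs for each variant. The short row labels and dictionary keys are illustrative.

```python
# model: (mAP@0.5 %, mAP@0.5~0.95 %, Params M, FLOPs G, FPS), from the table above
rows = {
    "baseline": (77.5, 45.6, 3.2, 8.1, 60.24),
    "+Slim-Neck": (79.2, 45.6, 2.8, 7.3, 65.05),
    "+ODConv": (78.8, 46.0, 3.0, 7.2, 63.51),
    "+ODConv+Slim-Neck": (79.2, 47.1, 2.6, 6.6, 68.50),
}

base = rows["baseline"]
changes = {}
for name, (map50, map5095, params, flops, fps) in rows.items():
    if name == "baseline":
        continue
    changes[name] = {
        "map50_points": map50 - base[0],                   # percentage points
        "params_pct": 100 * (params - base[2]) / base[2],  # relative change
        "flops_pct": 100 * (flops - base[3]) / base[3],    # relative change
        "fps_delta": fps - base[4],
    }

for name, c in changes.items():
    print(
        f"{name:<20} mAP@0.5 {c['map50_points']:+.1f} pt, "
        f"params {c['params_pct']:+.1f}%, FLOPs {c['flops_pct']:+.1f}%, "
        f"FPS {c['fps_delta']:+.2f}"
    )
```

Because the table values are rounded to one decimal place, the relative changes computed here can differ slightly from figures derived from unrounded measurements.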

Model	mAP@0.5/%	mAP@0.5~0.95/%	Params/M	FLOPs/G	FPS
SSD	44.8	-	26.3	31.1	32.40
Faster R-CNN	34.1	-	41.2	41.2	30.60
YOLOv3	56.8	27.3	61.6	27.9	40.50
YOLOv5s	80.1	43.0	7.0	16.6	57.13
YOLOv8n(baseline)	77.5	45.6	3.2	8.1	60.24
YOLOv8-ShuffleNetV2	74.0	41.8	1.9	5.2	77.17
YOLOv8-MobileNetV3	75.1	41.6	2.3	5.7	71.81
YOLOv8-VanillaNet	73.5	40.8	2.0	5.7	70.92
Bi-YOLO	64.8	36.5	2.9	63.5	42.37
RT-DETR-r18	70.2	40.1	20.0	60.0	55.30
YOLOv7-tiny	70.5	37.0	4.2	7.0	53.47
SOE-YOLO (ours)	79.9	47.2	2.8	6.6	64.25
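
One way to read the accuracy/cost trade-off in the comparison above is to look for models that no other model beats on both mAP@0.5 and FLOPs at once (a Pareto front). The sketch below applies that idea to the table's values. It is an illustrative analysis added here, not a procedure from the paper, and the choice of these two axes is an assumption.

```python
# model: (mAP@0.5 %, FLOPs G), copied from the table above
models = {
    "SSD": (44.8, 31.1),
    "Faster R-CNN": (34.1, 41.2),
    "YOLOv3": (56.8, 27.9),
    "YOLOv5s": (80.1, 16.6),
    "YOLOv8n": (77.5, 8.1),
    "YOLOv8-ShuffleNetV2": (74.0, 5.2),
    "YOLOv8-MobileNetV3": (75.1, 5.7),
    "YOLOv8-VanillaNet": (73.5, 5.7),
    "Bi-YOLO": (64.8, 63.5),
    "RT-DETR-r18": (70.2, 60.0),
    "YOLOv7-tiny": (70.5, 7.0),
    "SOE-YOLO": (79.9, 6.6),
}


def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


pareto = sorted(
    name
    for name, metrics in models.items()
    if not any(dominates(other, metrics) for other in models.values())
)
print(pareto)
```

On these two axes, SOE-YOLO sits on the front together with YOLOv5s, which has the highest mAP@0.5 at a much higher FLOPs cost, and the two lightest backbone variants, which trade accuracy for lower FLOPs.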

模型	mAP@0.5/%	mAP@0.5~0.95/%	Params/M	FLOPs/G	FPS
YOLOv8n(baseline)	77.5	45.6	3.2	8.1	60.24
YOLOv8n+CA	77.7	45.3	2.8	7.4	54.64
YOLOv8n+SE	76.2	44.5	3.0	8.0	55.13
YOLOv8n+NAM	78.5	45.8	3.0	8.1	56.82
YOLOv8n+SimAM	78.0	45.3	3.0	8.1	58.14
YOLOv8n+ECA	78.7	45.6	3.0	8.1	52.91
YOLOv8n+EMA	78.8	45.7	3.0	8.3	55.87
YOLOv8+ODConv+Slim-Neck	79.2	47.1	2.6	6.6	68.50
YOLOv8+ODConv+Slim-Neck+ECA	79.1	45.7	2.8	6.4	61.00
YOLOv8+ODConv+Slim-Neck+NAM	78.6	45.8	2.8	6.4	63.86
YOLOv8+ODConv+Slim-Neck+C2f_EMA	78.3	45.6	2.8	6.5	65.52
YOLOv8+ODConv+Slim-Neck+EMA (ours)	79.9	47.2	2.8	6.6	64.25
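
To isolate what each attention module adds on top of the lightweight ODConv+Slim-Neck model, the sketch below computes the change in both mAP metrics relative to that model and converts FPS to per-frame latency (1000 / FPS). All values come from the last five rows of the table above. The variable names are illustrative.

```python
# (mAP@0.5 %, mAP@0.5~0.95 %, FPS) for YOLOv8+ODConv+Slim-Neck
base = (79.2, 47.1, 68.50)

# attention variant added to the model above
variants = {
    "ECA": (79.1, 45.7, 61.00),
    "NAM": (78.6, 45.8, 63.86),
    "C2f_EMA": (78.3, 45.6, 65.52),
    "EMA": (79.9, 47.2, 64.25),
}

report = {}
for name, (map50, map5095, fps) in variants.items():
    report[name] = {
        "d_map50": round(map50 - base[0], 1),      # percentage points
        "d_map5095": round(map5095 - base[1], 1),  # percentage points
        "latency_ms": round(1000 / fps, 2),        # time per frame
    }

improves_both = [
    name for name, r in report.items() if r["d_map50"] > 0 and r["d_map5095"] > 0
]

for name, r in report.items():
    print(name, r)
print("improves both mAP metrics:", improves_both)
```

In these rows, the EMA variant is the only one that raises both mAP metrics over the lightweight model, at the cost of a somewhat longer per-frame latency.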
