Journal of Graphics, 2024, Vol. 45, Issue (4): 736-744. DOI: 10.11996/JG.j.2095-302X.2024040736
ZENG Zhichao1, XU Yue1, WANG Jingyu1, YE Yuanlong1, HUANG Zhikai1, WANG Huan2
Received: 2024-01-15
Accepted: 2024-04-12
Published: 2024-08-31
Online: 2024-09-03
Contact: HUANG Zhikai (1969-), professor, Ph.D. His main research interests cover graphic image processing, computer vision, etc. E-mail: 1625305627@qq.com
First author: ZENG Zhichao (1998-), master student. His main research interests cover image processing and object detection. E-mail: z2c0828@163.com
Abstract:
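To address missed and false detections of small targets in complex and changeable water surface environments, together with the limited computing resources of detection platforms, a lightweight water surface target detection algorithm based on YOLOv8, named SOE-YOLO, is proposed. First, the Neck is made lighter by adopting the Slim-Neck design paradigm, which includes GSConv. Second, the Backbone is rebuilt with the lightweight convolution module ODConv to reduce the number of parameters and thereby increase detection speed. Finally, the multi-scale attention mechanism EMA is introduced to strengthen multi-scale feature extraction and improve small-target detection. Experiments on the WSODD test set show that SOE-YOLO has 2.8 M parameters and a computational cost of 6.6 GFLOPs, which are 12.5% and 18.6% lower than those of the original model, respectively. Its mAP@0.5 and mAP@0.5-0.95 reach 79.9% and 47.2%, which are 2.4 and 1.6 percentage points higher than those of the original model, respectively. The missed detection rate drops markedly, and the model outperforms currently popular object detection algorithms. The FPS reaches 64.25, which meets the real-time requirement of water surface target detection. The model therefore delivers better detection performance while remaining lightweight, meeting the deployment requirements of environments with limited computing resources.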
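The relative changes quoted in the abstract can be checked against the figures in the results tables. The following is a minimal sketch, assuming the YOLOv8n baseline values listed in the ablation table (3.2 M parameters, 8.1 GFLOPs, 77.5% mAP@0.5, 45.6% mAP@0.5-0.95); the helper name `relative_reduction` is illustrative and not part of the paper. With these rounded inputs the FLOPs reduction evaluates to roughly 18.5%, so the quoted 18.6% presumably comes from unrounded measurements.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the drop from baseline to improved as a percentage of baseline."""
    return (baseline - improved) / baseline * 100.0


# Values copied from the results tables (rounded as published).
baseline = {"params_m": 3.2, "gflops": 8.1, "map50": 77.5, "map50_95": 45.6}
soe_yolo = {"params_m": 2.8, "gflops": 6.6, "map50": 79.9, "map50_95": 47.2}

params_cut = relative_reduction(baseline["params_m"], soe_yolo["params_m"])
flops_cut = relative_reduction(baseline["gflops"], soe_yolo["gflops"])

# mAP gains are differences between percentages, i.e. percentage points.
map50_gain = soe_yolo["map50"] - baseline["map50"]
map50_95_gain = soe_yolo["map50_95"] - baseline["map50_95"]

print(f"Parameter reduction: {params_cut:.1f}%")
print(f"FLOPs reduction:     {flops_cut:.1f}%")
print(f"mAP@0.5 gain:        {map50_gain:.1f} points")
print(f"mAP@0.5-0.95 gain:   {map50_95_gain:.1f} points")
```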
ZENG Zhichao, XU Yue, WANG Jingyu, YE Yuanlong, HUANG Zhikai, WANG Huan. A water surface target detection algorithm based on SOE-YOLO lightweight network[J]. Journal of Graphics, 2024, 45(4): 736-744.
| Environment | Version/Model |
|---|---|
| Operating system | Windows 10 |
| Deep learning framework | PyTorch 1.13.1 |
| Compute platform | CUDA 11.1 |
| Language | Python 3.8 |
| CPU | AMD Ryzen 7 3700X 8-Core Processor |
| GPU | NVIDIA GeForce RTX 3090 Ti |
Table 1 Experimental platform and setup
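When reproducing results such as these, it can help to record the local environment in the same terms as Table 1. The sketch below is one hedged way to do that; the function name `describe_environment` is hypothetical, and PyTorch is treated as optional so the script still runs where it is not installed.

```python
import platform


def describe_environment() -> dict:
    """Collect the kind of software and hardware details listed in Table 1."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "cpu": platform.processor(),
    }
    try:
        import torch  # optional: only reported when PyTorch is installed
    except ImportError:
        info["pytorch"] = None
        return info
    info["pytorch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    info["gpu"] = (
        torch.cuda.get_device_name(0) if torch.cuda.is_available() else None
    )
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```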
| Category | Images | Instances |
|---|---|---|
| Boat | 4,325 | 8,179 |
| Ship | 1,832 | 3,423 |
| Ball | 652 | 2,609 |
| Bridge | 1,827 | 2,014 |
| Rock | 696 | 1,540 |
| Person | 357 | 695 |
| Rubbish | 461 | 669 |
| Mast | 177 | 354 |
| Buoy | 153 | 167 |
| Platform | 480 | 614 |
| Harbor | 1,211 | 1,224 |
| Tree | 72 | 219 |
| Grass | 103 | 110 |
| Animal | 50 | 94 |
Table 2 Dataset categories
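The category counts in Table 2 are strongly imbalanced, which matters when reading per-class detection results. The following sketch computes each category's share of all annotated instances from the table; the 2% cut-off used to flag rare categories is an arbitrary assumption for illustration, not a threshold from the paper.

```python
# (images, instances) per category, copied from Table 2.
DATASET = {
    "Boat": (4325, 8179),
    "Ship": (1832, 3423),
    "Ball": (652, 2609),
    "Bridge": (1827, 2014),
    "Rock": (696, 1540),
    "Person": (357, 695),
    "Rubbish": (461, 669),
    "Mast": (177, 354),
    "Buoy": (153, 167),
    "Platform": (480, 614),
    "Harbor": (1211, 1224),
    "Tree": (72, 219),
    "Grass": (103, 110),
    "Animal": (50, 94),
}

RARE_SHARE = 0.02  # hypothetical cut-off: below 2% of all instances

# Only instance counts are summed here.
total_instances = sum(instances for _, instances in DATASET.values())

# Fraction of all annotated instances that belongs to each category.
shares = {
    name: instances / total_instances
    for name, (_, instances) in DATASET.items()
}

largest = max(shares, key=shares.get)
rare = [name for name, share in shares.items() if share < RARE_SHARE]

print(f"Total instances: {total_instances}")
print(f"Largest category: {largest} ({shares[largest]:.1%})")
print(f"Categories below {RARE_SHARE:.0%}: {rare}")
```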
| Model | mAP@0.5/% | mAP@0.5-0.95/% | Params/M | FLOPs/G | FPS |
|---|---|---|---|---|---|
| YOLOv8n (baseline) | 77.5 | 45.6 | 3.2 | 8.1 | 60.24 |
| YOLOv8+Slim-Neck | 79.2 | 45.6 | 2.8 | 7.3 | 65.05 |
| YOLOv8+ODConv | 78.8 | 46.0 | 3.0 | 7.2 | 63.51 |
| YOLOv8+ODConv+Slim-Neck | 79.2 | 47.1 | 2.6 | 6.6 | 68.50 |
Table 3 Results of the lightweight ablation experiments
| Model | mAP@0.5/% | mAP@0.5-0.95/% | Params/M | FLOPs/G | FPS |
|---|---|---|---|---|---|
| YOLOv8n (baseline) | 77.5 | 45.6 | 3.1 | 8.1 | 60.24 |
| YOLOv8n+CA | 77.7 | 45.3 | 2.8 | 7.4 | 54.64 |
| YOLOv8n+SE | 76.2 | 44.5 | 3.0 | 8.0 | 55.13 |
| YOLOv8n+NAM | 78.5 | 45.8 | 3.0 | 8.1 | 56.82 |
| YOLOv8n+SimAM | 78.0 | 45.3 | 3.0 | 8.1 | 58.14 |
| YOLOv8n+ECA | 78.7 | 45.6 | 3.0 | 8.1 | 52.91 |
| YOLOv8n+EMA | 78.8 | 45.7 | 3.0 | 8.3 | 55.87 |
| YOLOv8+ODConv+Slim-Neck | 79.2 | 47.1 | 2.6 | 6.6 | 68.50 |
| YOLOv8+ODConv+Slim-Neck+ECA | 79.1 | 45.7 | 2.8 | 6.4 | 61.00 |
| YOLOv8+ODConv+Slim-Neck+NAM | 78.6 | 45.8 | 2.8 | 6.4 | 63.86 |
| YOLOv8+ODConv+Slim-Neck+C2f_EMA | 78.3 | 45.6 | 2.8 | 6.5 | 65.52 |
| YOLOv8+ODConv+Slim-Neck+EMA (ours) | 79.9 | 47.2 | 2.8 | 6.6 | 64.25 |
Table 4 Comparison of attention mechanisms
| Model | mAP@0.5/% | mAP@0.5-0.95/% | Params/M | FLOPs/G | FPS |
|---|---|---|---|---|---|
| SSD | 44.8 | - | 26.3 | 31.1 | 32.40 |
| Faster R-CNN | 34.1 | - | 41.2 | 41.2 | 30.60 |
| YOLOv3 | 56.8 | 27.3 | 61.6 | 27.9 | 40.50 |
| YOLOv5s | 80.1 | 43.0 | 7.0 | 16.6 | 57.13 |
| YOLOv8n (baseline) | 77.5 | 45.6 | 3.2 | 8.1 | 60.24 |
| YOLOv8-ShuffleNetV2 | 74.0 | 41.8 | 1.9 | 5.2 | 77.17 |
| YOLOv8-MobileNetV3 | 75.1 | 41.6 | 2.3 | 5.7 | 71.81 |
| YOLOv8-VanillaNet | 73.5 | 40.8 | 2.0 | 5.7 | 70.92 |
| Bi-YOLO | 64.8 | 36.5 | 2.9 | 63.5 | 42.37 |
| RT-DETR-r18 | 70.2 | 40.1 | 20.0 | 60.0 | 55.30 |
| YOLOv7-tiny | 70.5 | 37.0 | 4.2 | 7.0 | 53.47 |
| SOE-YOLO (ours) | 79.9 | 47.2 | 2.8 | 6.6 | 64.25 |
Table 5 Comparison with other models
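Table 5 can also be read as a multi-objective trade-off between accuracy, model size, and speed. The sketch below finds the models that no other model beats on all three of mAP@0.5, parameter count, and FPS at once (a Pareto front). The choice of these three criteria is an assumption made for illustration; the paper itself does not present this analysis, and the `Result` and `dominates` names are hypothetical.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    map50: float     # mAP@0.5 in %, higher is better
    params_m: float  # parameters in millions, lower is better
    fps: float       # frames per second, higher is better


# Values copied from Table 5.
RESULTS = [
    Result("SSD", 44.8, 26.3, 32.40),
    Result("Faster R-CNN", 34.1, 41.2, 30.60),
    Result("YOLOv3", 56.8, 61.6, 40.50),
    Result("YOLOv5s", 80.1, 7.0, 57.13),
    Result("YOLOv8n", 77.5, 3.2, 60.24),
    Result("YOLOv8-ShuffleNetV2", 74.0, 1.9, 77.17),
    Result("YOLOv8-MobileNetV3", 75.1, 2.3, 71.81),
    Result("YOLOv8-VanillaNet", 73.5, 2.0, 70.92),
    Result("Bi-YOLO", 64.8, 2.9, 42.37),
    Result("RT-DETR-r18", 70.2, 20.0, 55.30),
    Result("YOLOv7-tiny", 70.5, 4.2, 53.47),
    Result("SOE-YOLO", 79.9, 2.8, 64.25),
]


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b on every criterion and better on one."""
    no_worse = (
        a.map50 >= b.map50 and a.params_m <= b.params_m and a.fps >= b.fps
    )
    better = a.map50 > b.map50 or a.params_m < b.params_m or a.fps > b.fps
    return no_worse and better


# A model is on the front when no other model dominates it.
pareto_front = [
    r.name for r in RESULTS if not any(dominates(o, r) for o in RESULTS)
]
print(pareto_front)
```

Under these three criteria, SOE-YOLO dominates the YOLOv8n baseline and sits on the front alongside YOLOv5s, which has slightly higher mAP@0.5 but more parameters, and the ShuffleNetV2 and MobileNetV3 variants, which are smaller and faster but less accurate.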
Fig. 6 Comparison of the detection results of SOE-YOLO and YOLOv8n ((a) Original image; (b) YOLOv8n; (c) SOE-YOLO)