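Journal of Graphics, 2024, Vol. 45, Issue (4): 770-778. DOI: 10.11996/JG.j.2095-302X.2024040770
Received: 2024-04-26
Accepted: 2024-06-28
Published: 2024-08-31
Online: 2024-09-03
Corresponding author: TIAN Ying (1971-), female, professor, Ph.D. Her main research interests are pattern recognition and digital image processing. E-mail: t_tianying@126.com
First author: WU Bing (1999-), male, master's student. His main research interests are computer vision and deep learning. E-mail: 2258860606@qq.com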
Abstract:
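Road damage detection is an important task in road maintenance and repair. Existing road damage inspection relies mainly on traditional manual surveys, which demand substantial labor and material resources, are inefficient, and cannot keep pace with current road development. To address this, an improved multi-scale road damage detection algorithm, YOLOv8-RDD, is proposed. First, YOLOv8-RDD uses deformable convolution (DCN) inside the C2f module to build a new C2f_DCN module, which enlarges the effective receptive field and locates object boundaries and positions more accurately, improving both recognition and localization. Second, a new SPPF_GS module is designed at the end of the network: a self-attention mechanism (SA) and the Ghost convolution module are introduced into the SPPF module, and the pooling kernel sizes are re-optimized, so that long-range dependencies are handled better and global information is captured. Finally, a coordinate attention mechanism (CA) is introduced in the Neck to strengthen feature extraction and reduce redundant information. Experimental results show that, on the RDD2022 dataset, the improved algorithm achieves a precision of 61.1%, a recall of 55.5%, and a mean average precision (mAP) of 56.2%, which are 4.6%, 4.7%, and 5.2% higher than YOLOv8n, respectively, giving excellent results for road damage object detection.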
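The parameter saving that motivates the Ghost module in SPPF_GS can be illustrated with simple arithmetic. The sketch below follows the general Ghost-module formulation (a primary convolution produces a fraction of the output channels, and cheap depthwise operations generate the rest). The channel counts, kernel sizes, and ratio are illustrative assumptions, not the configuration used in YOLOv8-RDD.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias and BatchNorm ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, dw_k=3):
    """Weights in a Ghost module: primary conv plus depthwise 'cheap' ops.

    ratio: how many output maps are derived from each intrinsic map (assumed 2).
    dw_k:  kernel size of the depthwise cheap operation (assumed 3).
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio                      # maps from the primary conv
    primary = conv_params(c_in, intrinsic, k)       # ordinary convolution
    cheap = (ratio - 1) * intrinsic * dw_k * dw_k   # one depthwise filter per ghost map
    return primary + cheap


# Hypothetical layer: 256 -> 256 channels with a 3 x 3 kernel.
standard = conv_params(256, 256, 3)
ghost = ghost_params(256, 256, 3)
print(standard, ghost, round(standard / ghost, 2))
```

With a ratio of 2 the Ghost variant needs roughly half the weights of the standard convolution, which is in line with the small parameter reduction reported for SPPF+Ghost in Table 2.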
WU Bing, TIAN Ying. Research on multi-scale road damage detection algorithm based on attention mechanism[J]. Journal of Graphics, 2024, 45(4): 770-778.
| Algorithm | P/% | R/% | mAP50/% | mAP50~95/% | Params/10^6 | GFLOPs |
|---|---|---|---|---|---|---|
| YOLOv8n | 57.8 | 53.0 | 53.4 | 24.1 | 3.0 | 8.1 |
| YOLOv8+CBAM | 59.1 | 54.3 | 53.8 | 23.9 | 3.3 | 8.3 |
| YOLOv8+SE | 58.8 | 52.3 | 52.5 | 23.8 | 3.0 | 8.2 |
| YOLOv8+CA (Ours) | 63.0 | 54.5 | 54.3 | 24.1 | 3.0 | 8.2 |
Table 1 Comparison of experimental results for attention mechanisms
| Algorithm | mAP50/% | mAP50~95/% | Params/10^6 | GFLOPs |
|---|---|---|---|---|
| SPPF | 53.4 | 24.1 | 3.0 | 8.1 |
| SPPF+SA | 53.9 | 24.0 | 3.1 | 8.3 |
| SPPF+Ghost | 53.4 | 23.9 | 2.9 | 8.0 |
| SPPF_GS | 54.6 | 24.1 | 3.0 | 8.2 |
Table 2 Experimental results of the SPPF module improvements
| YOLOv8n | C2f_DCN | CA | SPPF_GS | mAP50/% | mAP50~95/% | Params/10^6 | GFLOPs |
|---|---|---|---|---|---|---|---|
| √ | | | | 53.4 | 24.1 | 3.0 | 8.1 |
| √ | √ | | | 54.3 | 24.5 | 3.2 | 7.7 |
| √ | | √ | | 54.3 | 24.0 | 3.0 | 8.2 |
| √ | | | √ | 54.6 | 24.8 | 3.0 | 8.1 |
| √ | √ | √ | | 55.4 | 25.3 | 3.2 | 7.8 |
| √ | √ | | √ | 54.2 | 24.7 | 3.2 | 7.8 |
| √ | √ | √ | √ | 56.2 | 25.0 | 3.2 | 7.8 |
Table 3 Results of the ablation experiment
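The ablation figures in Table 3 can be turned into per-module gains with a few lines of arithmetic. The sketch below uses the mAP50 values from the table; the assignment of the three single-module rows to C2f_DCN, CA, and SPPF_GS is an assumption based on the column order and the parameter counts, since the check-mark positions are not unambiguous in the table as given.

```python
# mAP50 (%) values copied from Table 3. The mapping of single-module rows to
# module names is an assumption (column order plus Params/GFLOPs values).
BASELINE = 53.4                      # YOLOv8n
SINGLE_MODULE = {"C2f_DCN": 54.3, "CA": 54.3, "SPPF_GS": 54.6}
FULL_MODEL = 56.2                    # YOLOv8-RDD with all three modules


def gain(value, reference=BASELINE):
    """Absolute gain in percentage points, rounded to one decimal place."""
    return round(value - reference, 1)


single_gains = {name: gain(v) for name, v in SINGLE_MODULE.items()}
sum_of_single = round(sum(single_gains.values()), 1)
full_gain = gain(FULL_MODEL)
# Relative improvement of the full model over the baseline, in percent.
relative_gain = round(100 * (FULL_MODEL - BASELINE) / BASELINE, 1)

print(single_gains)
print(sum_of_single, full_gain, relative_gain)
```

Under these assumptions the modules add 0.9, 0.9, and 1.2 points individually (3.0 in total), while the full model gains 2.8 points, so the contributions are close to additive. The 2.8-point gain is about 5.2% relative to the baseline, which matches the mAP figure quoted in the abstract if that figure is read as a relative improvement.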
Fig. 9 Detection comparison in well-lit scenes ((a) Original image; (b) YOLOv8n; (c) YOLOv8-RDD)
Fig. 10 Detection comparison in poorly lit scenes ((a) Original image; (b) YOLOv8n; (c) YOLOv8-RDD)
| Algorithm | mAP50/% | mAP50~95/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|---|
| Faster-RCNN | 43.5 | 17.2 | 137.5 | 370.3 | 28 |
| YOLOv3tiny | 42.4 | 17.2 | 12.1 | 19.1 | 176 |
| YOLOv4tiny | 40.0 | 16.5 | 6.1 | 16.5 | 156 |
| YOLOv5s | 51.6 | 23.7 | 7.0 | 16.0 | 85 |
| YOLOv6 | 52.1 | 23.8 | 4.2 | 11.9 | 87 |
| YOLOv7tiny | 48.4 | 19.4 | 6.0 | 13.2 | 117 |
| YOLOv8n | 53.4 | 24.0 | 3.0 | 8.2 | 105 |
| YOLOv9c | 59.1 | 29.6 | 156.0 | 68.1 | 103 |
| RT-DETR | 56.7 | 25.6 | 20.1 | 58.3 | 124 |
| Swin Transformer | 55.8 | 25.1 | 96.8 | 17.1 | 99 |
| YOLOv8-RDD | 56.2 | 25.0 | 3.2 | 7.8 | 95 |
Table 4 Results of the comparative experiment
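One way to read Table 4 is to ask which detectors are not outperformed on accuracy and cost at the same time. The sketch below computes a Pareto front over mAP50 (higher is better), parameter count, and GFLOPs (lower is better), using the values from the table. The choice of these three criteria is an assumption made for illustration; including FPS or mAP50~95 would change the result.

```python
# (mAP50 %, Params in millions, GFLOPs) copied from Table 4.
MODELS = {
    "Faster-RCNN": (43.5, 137.5, 370.3),
    "YOLOv3tiny": (42.4, 12.1, 19.1),
    "YOLOv4tiny": (40.0, 6.1, 16.5),
    "YOLOv5s": (51.6, 7.0, 16.0),
    "YOLOv6": (52.1, 4.2, 11.9),
    "YOLOv7tiny": (48.4, 6.0, 13.2),
    "YOLOv8n": (53.4, 3.0, 8.2),
    "YOLOv9c": (59.1, 156.0, 68.1),
    "RT-DETR": (56.7, 20.1, 58.3),
    "Swin Transformer": (55.8, 96.8, 17.1),
    "YOLOv8-RDD": (56.2, 3.2, 7.8),
}


def dominates(a, b):
    """True if a is no worse than b on every criterion and better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and better


def pareto_front(models):
    """Names of models that no other model dominates, in table order."""
    return [
        name
        for name, metrics in models.items()
        if not any(dominates(other, metrics) for other in models.values())
    ]


front = pareto_front(MODELS)
print(front)
```

On these three criteria the front contains YOLOv8n, YOLOv9c, RT-DETR, and YOLOv8-RDD. YOLOv8-RDD stays on the front because no model with equal or higher mAP50 is as small or as cheap to run.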
| Type | YOLOv8n mAP50/% | YOLOv8n mAP50~95/% | YOLOv8-RDD mAP50/% | YOLOv8-RDD mAP50~95/% |
|---|---|---|---|---|
| D00 | 58.2 | 26.5 | 60.1 (+1.9) | 27.4 (+0.9) |
| D10 | 50.6 | 24.1 | 54.2 (+3.6) | 25.1 (+1.0) |
| D20 | 59.4 | 25.9 | 61.0 (+1.6) | 27.7 (+1.8) |
| D40 | 60.2 | 27.5 | 62.8 (+2.6) | 29.4 (+1.9) |
| All | 57.2 | 25.3 | 59.5 (+2.3) | 27.2 (+1.9) |
Table 5 Results of the generalization ability experiment
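The gains shown in parentheses in Table 5 can be recomputed directly from the two mAP50 columns. The short check below does that and reports which damage type benefits most; all values are copied from the table and nothing else is assumed.

```python
# (YOLOv8n, YOLOv8-RDD) mAP50 in percent per damage type, copied from Table 5.
MAP50 = {
    "D00": (58.2, 60.1),
    "D10": (50.6, 54.2),
    "D20": (59.4, 61.0),
    "D40": (60.2, 62.8),
    "All": (57.2, 59.5),
}
# Gains printed in parentheses in the table.
REPORTED_GAIN = {"D00": 1.9, "D10": 3.6, "D20": 1.6, "D40": 2.6, "All": 2.3}

# Recompute each gain in percentage points, rounded to one decimal place.
gains = {k: round(new - old, 1) for k, (old, new) in MAP50.items()}
mismatches = [k for k in gains if gains[k] != REPORTED_GAIN[k]]

# Largest per-class gain, excluding the aggregate "All" row.
per_class = {k: v for k, v in gains.items() if k != "All"}
best = max(per_class, key=per_class.get)

print(gains)
print(mismatches, best)
```

The recomputed values agree with the reported ones, and the largest mAP50 gain is for type D10 (+3.6 points).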