Research on multi-scale road damage detection algorithm based on attention mechanism

doi:10.11996/JG.j.2095-302X.2024040770

Abstract

Road damage detection is an important task in road maintenance and repair. Existing road damage detection methods rely primarily on manual inspection, which consumes considerable manpower and material resources, yields low detection efficiency, and cannot keep pace with current road development. To address these problems, an improved multi-scale road damage detection algorithm, YOLOv8-RDD, was proposed. Firstly, Deformable Convolutional Networks (DCN) were incorporated into the C2f module to build a new C2f_DCN module. This expanded the effective receptive field and located the boundaries and positions of target objects more accurately, thus enhancing the ability to identify and locate targets. Secondly, a new SPPF_GS module was designed at the end of the backbone network by introducing the Self-Attention (SA) mechanism and the Ghost convolution module into the SPPF module and re-optimizing the pooling kernel size, so as to better handle long-range dependencies and capture global information. Finally, Coordinate Attention (CA) was introduced into the Neck to strengthen the feature extraction ability of the model and reduce redundant information. Experimental results demonstrated that the improved algorithm achieved a precision of 61.1%, a recall of 55.5%, and a mean average precision (mAP) of 56.2% on the RDD2022 dataset. Compared with the YOLOv8n algorithm, these results were improved by 4.6%, 4.7%, and 5.2%, respectively, indicating strong performance in road damage detection.
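As background for the SPPF_GS module, the pooling cascade of the standard SPPF block that it modifies can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it covers only the three chained stride-1 max-pooling steps, and omits the 1×1 convolutions as well as the SA and Ghost components. The kernel size of 5 is the usual YOLOv8 default and is an assumption here, since the abstract does not state the re-optimized size. The helper names `max_pool2d` and `sppf_pool` are hypothetical.

```python
NEG_INF = float("-inf")


def max_pool2d(fmap, k):
    """Stride-1 max pooling with 'same' padding on a 2D list.

    Out-of-bounds positions are ignored, which is equivalent to padding
    the map with negative infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            best = NEG_INF
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        best = max(best, fmap[y][x])
            row.append(best)
        out.append(row)
    return out


def sppf_pool(fmap, k=5):
    """Return the four maps a standard SPPF block concatenates channel-wise:
    the input plus three successively pooled copies."""
    y1 = max_pool2d(fmap, k)
    y2 = max_pool2d(y1, k)
    y3 = max_pool2d(y2, k)
    return [fmap, y1, y2, y3]


# Small deterministic single-channel feature map, for illustration only.
demo = [[(7 * i + 13 * j) % 17 for j in range(12)] for i in range(12)]
branches = sppf_pool(demo, k=5)

# Chaining k=5 pools widens the receptive field to 9 and then 13.
print(branches[2] == max_pool2d(demo, 9))
print(branches[3] == max_pool2d(demo, 13))
```

Because each chained pool reuses the previous result, the cascade covers 5×5, 9×9, and 13×13 neighbourhoods at the cost of three small pools, which is why changing the kernel size directly changes how much context the block aggregates.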

Key words: road damage detection, YOLOv8, deformable convolutional networks, attention mechanism, Ghost module

WU Bing, TIAN Ying. Research on multi-scale road damage detection algorithm based on attention mechanism[J]. Journal of Graphics, 2024, 45(4): 770-778.

Algorithm	P/%	R/%	mAP50/%	mAP50~95/%	Params/10⁶	GFLOPs
YOLOv8n	57.8	53.0	53.4	24.1	3.0	8.1
YOLOv8+CBAM	59.1	54.3	53.8	23.9	3.3	8.3
YOLOv8+SE	58.8	52.3	52.5	23.8	3.0	8.2
YOLOv8+CA (Ours)	63.0	54.5	54.3	24.1	3.0	8.2
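
To compare the attention variants with a single number, the precision and recall columns above can be combined into an F1 score. The short sketch below does this for the four rows. F1 is a derived quantity that the source does not report, so it is shown only as an illustration of the arithmetic; the names `ROWS` and `f1_score` are hypothetical.

```python
# (precision %, recall %) copied from the attention comparison table.
ROWS = {
    "YOLOv8n": (57.8, 53.0),
    "YOLOv8+CBAM": (59.1, 54.3),
    "YOLOv8+SE": (58.8, 52.3),
    "YOLOv8+CA (Ours)": (63.0, 54.5),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


for name, (p, r) in ROWS.items():
    print(f"{name}: F1 = {f1_score(p, r):.1f}%")
```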

Algorithm	mAP50/%	mAP50~95/%	Params/10⁶	GFLOPs
SPPF	53.4	24.1	3.0	8.1
SPPF+SA	53.9	24.0	3.1	8.3
SPPF+Ghost	53.4	23.9	2.9	8.0
SPPF_GS	54.6	24.1	3.0	8.2

YOLOv8n	C2f_DCN	CA	SPPF_GS	mAP50/%	mAP50~95/%	Params/10⁶	GFLOPs
√				53.4	24.1	3.0	8.1
√	√			54.3	24.5	3.2	7.7
√		√		54.3	24.0	3.0	8.2
√			√	54.6	24.8	3.0	8.1
√	√	√		55.4	25.3	3.2	7.8
√	√		√	54.2	24.7	3.2	7.8
√	√	√	√	56.2	25.0	3.2	7.8
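
The ablation rows above can be read more easily as gains over the YOLOv8n baseline. The sketch below computes, for each configuration, the absolute mAP50 gain in percentage points and the relative gain in percent, using only the values in the table. The names `BASELINE_MAP50`, `ABLATION`, and `gains` are hypothetical.

```python
BASELINE_MAP50 = 53.4  # YOLOv8n row of the ablation table

# mAP50 (%) for each module combination, copied from the table.
ABLATION = {
    "C2f_DCN": 54.3,
    "CA": 54.3,
    "SPPF_GS": 54.6,
    "C2f_DCN+CA": 55.4,
    "C2f_DCN+SPPF_GS": 54.2,
    "C2f_DCN+CA+SPPF_GS": 56.2,
}


def gains(value, baseline=BASELINE_MAP50):
    """Return (absolute gain in points, relative gain in %), rounded to 0.1."""
    diff = value - baseline
    return round(diff, 1), round(100 * diff / baseline, 1)


for name, map50 in ABLATION.items():
    absolute, relative = gains(map50)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

Under this reading, the full model gains 2.8 points of mAP50, which corresponds to a relative gain of about 5.2% over the baseline and is consistent with the mAP improvement quoted in the abstract.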