Steel surface defect detection based on improved YOLOv5 algorithm

doi:10.11996/JG.j.2095-302X.2023020335

Abstract

To address the insufficient feature extraction ability, limited receptive field, and inadequate feature fusion of the one-stage detection network YOLOv5, an improved YOLOv5 algorithm for steel surface defect detection is proposed. An SPP_Res feature pyramid structure with a residual edge is constructed to speed up model training and strengthen feature extraction. A multi-head self-attention module (C3_MHSA) is added to optimize the network structure, attend to the global receptive field, and extract richer target features. A multi-layer feature fusion mechanism is introduced to further fuse shallow and deep features, taking into account more location, semantic, and detail information and thereby improving detection accuracy for steel surface defects. Experimental results show that the improved YOLOv5 model achieves an mAP of 74.1% on the NEU-DET dataset, which is 3.4 percentage points higher than the original YOLOv5, 4.0 points higher than YOLOX, 8.6 points higher than YOLOv3, and 23.4 points higher than SSD. The improved network detects steel surface defects more accurately than the original YOLOv5 at essentially the same detection speed, while outperforming other mainstream algorithms in both accuracy and speed.

Key words: YOLOv5, SPP_Res, multi-head self-attention mechanism, multi-layer fusion mechanism, defect detection
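The abstract names a multi-head self-attention module (C3_MHSA) but this page does not describe its internals. As a rough illustration of the general mechanism only, the sketch below applies scaled dot-product self-attention per head to a short list of token vectors. It is not the authors' C3_MHSA block: the function names are hypothetical, identity projections are assumed in place of learned query/key/value weights, and pure Python lists stand in for feature maps.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: each query mixes all values by similarity."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


def multi_head_self_attention(tokens, num_heads):
    """Split channels into heads, attend within each head, then concatenate.

    Assumption: identity projections replace the learned Q/K/V matrices.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    head_dim = dim // num_heads
    merged = [[] for _ in tokens]
    for h in range(num_heads):
        part = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        for i, vec in enumerate(attention(part, part, part)):
            merged[i].extend(vec)
    return merged


if __name__ == "__main__":
    # Three made-up tokens with four channels each, split into two heads.
    tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
    for row in multi_head_self_attention(tokens, num_heads=2):
        print([round(x, 3) for x in row])
```

Because every output token is a weighted average over all input tokens, each position can draw on the whole input, which is the sense in which attention offers a global receptive field.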

CLC number:

TP391

CAO Yi-qin, WU Ming-lin, XU Lu. Steel surface defect detection based on improved YOLOv5 algorithm[J]. Journal of Graphics, 2023, 44(2): 335-345.

Figures/Tables 16

Fig. 1 Training images after Mosaic processing ((a) Mosaic-processed training image 1; (b) Mosaic-processed training image 2)

Fig. 2 Network structure of the improved YOLOv5

Fig. 3 Schematic diagram of prediction box encoding

Fig. 4 SPP_Res network structure

Fig. 5 Information extraction based on attention scores

Fig. 6 C3 network structure

Fig. 7 C3_MHSA network structure

Fig. 8 Structure diagram of the multi-layer fusion mechanism

Fig. 9 Defect categories with ground truth boxes ((a) Crazing (CR); (b) Inclusion (IN); (c) Patches (PA); (d) Pitted surface (PS); (e) Rolled-in scale (RS); (f) Scratches (SC))

Table 1 Label distribution of the training dataset

Class	Count
CR	479
IN	741
PA	649
PS	292
RS	435
SC	395
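To make the class balance in Table 1 easier to read, the following minimal sketch totals the label counts and reports each class's share. The counts are copied from the table; the variable names are illustrative only.

```python
# Label counts copied from Table 1.
label_counts = {"CR": 479, "IN": 741, "PA": 649, "PS": 292, "RS": 435, "SC": 395}

total = sum(label_counts.values())
shares = {name: count / total for name, count in label_counts.items()}
# Ratio between the most and least frequent classes, a simple imbalance measure.
imbalance_ratio = max(label_counts.values()) / min(label_counts.values())

for name, count in sorted(label_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count:4d} labels ({shares[name]:.1%})")
print(f"total labels: {total}")
print(f"largest/smallest class ratio: {imbalance_ratio:.2f}")
```

The counts sum to 2991 labels, with IN the most frequent class and PS the least frequent.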

Table 2 Hardware and software configuration of the experimental environment

Item	Specification
GPU	RTX3060Ti-12 G
CPU	Intel(R) Core(TM) i7-10875H CPU @2.30 GHz
Operating system	Windows 10
Deep learning framework	PyTorch 1.9.1 + CUDA 10.2
IDE	PyCharm

Fig. 10 Comparison of training results before and after improving the YOLOv5 model ((a) Original network; (b) Improved network)

Table 3 Comparative experimental results

Algorithm	mAP	FPS	Params (M)
SSD	0.507	63.30	24.4
Cascade R-CNN	0.596	37.00	107.0
RetinaNet	0.617	42.85	28.5
YOLOv3	0.655	55.00	63.0
Ref. [16]	0.676	51.60	-
YOLOX(s)	0.695	102.00	9.0
YOLOX(m)	0.701	87.90	25.3
YOLOv5(m)	0.707	75.10	21.2
Ours	0.741	75.00	23.9
YOLOv6(s)	0.706	121.00	17.2
YOLOv7(tiny)	0.735	165.00	6.2
YOLOv7	0.768	138.00	36.9

Table 4 Experimental results of YOLOv5 variants of different sizes on the VOC2012 dataset

Algorithm	mAP	FPS
YOLOv5s	0.840	96.2
Ours (s)	0.881	96.3
YOLOv5m	0.892	75.1
Ours (m)	0.919	75.0
YOLOv5l	0.923	61.6
Ours (l)	0.936	61.4
YOLOv5x	0.945	51.9
Ours (x)	0.959	51.6

Fig. 11 Detection results for each defect category ((a) Crazing (CR); (b) Inclusion (IN); (c) Patches (PA); (d) Pitted surface (PS); (e) Rolled-in scale (RS); (f) Scratches (SC))

Table 5 Ablation experiments for the improvement strategies

SPP_Res	C3_MHSA	Multi-layer fusion	AP (CR)	AP (IN)	AP (PA)	AP (PS)	AP (RS)	AP (SC)	mAP
-	-	-	0.225	0.848	0.888	0.780	0.631	0.869	0.707
√	-	-	0.248	0.853	0.891	0.782	0.628	0.872	0.712
-	√	-	0.284	0.862	0.911	0.809	0.639	0.881	0.731
-	-	√	0.303	0.856	0.893	0.795	0.633	0.876	0.726
√	√	-	0.298	0.863	0.915	0.815	0.641	0.882	0.735
√	-	√	0.315	0.869	0.894	0.799	0.639	0.881	0.733
-	√	√	0.314	0.861	0.911	0.819	0.642	0.883	0.738
√	√	√	0.323	0.873	0.896	0.827	0.643	0.884	0.741
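Since mAP is the mean of the per-class AP values, the last column of Table 5 can be recomputed from the six class columns. The sketch below does this for every ablation row, using values copied from the table; the names are illustrative. Differences of up to about 0.001 are expected because the per-class values are themselves rounded to three decimals.

```python
# Each entry: (SPP_Res, C3_MHSA, fusion, per-class AP [CR, IN, PA, PS, RS, SC], reported mAP)
ablation_rows = [
    (False, False, False, [0.225, 0.848, 0.888, 0.780, 0.631, 0.869], 0.707),
    (True,  False, False, [0.248, 0.853, 0.891, 0.782, 0.628, 0.872], 0.712),
    (False, True,  False, [0.284, 0.862, 0.911, 0.809, 0.639, 0.881], 0.731),
    (False, False, True,  [0.303, 0.856, 0.893, 0.795, 0.633, 0.876], 0.726),
    (True,  True,  False, [0.298, 0.863, 0.915, 0.815, 0.641, 0.882], 0.735),
    (True,  False, True,  [0.315, 0.869, 0.894, 0.799, 0.639, 0.881], 0.733),
    (False, True,  True,  [0.314, 0.861, 0.911, 0.819, 0.642, 0.883], 0.738),
    (True,  True,  True,  [0.323, 0.873, 0.896, 0.827, 0.643, 0.884], 0.741),
]


def mean_ap(per_class_ap):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


for spp_res, c3_mhsa, fusion, per_class, reported in ablation_rows:
    recomputed = mean_ap(per_class)
    flags = "".join("Y" if on else "-" for on in (spp_res, c3_mhsa, fusion))
    print(f"{flags}: recomputed {recomputed:.4f}, reported {reported:.3f}, "
          f"difference {abs(recomputed - reported):.4f}")
```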

References 25

[1]	MAO T Q, REN L R, YUAN F Q, et al. Defect recognition method based on HOG and SVM for drone inspection images of power transmission line[C]// 2019 International Conference on High Performance Big Data and Intelligent Systems. New York: IEEE Press, 2019: 254-257.
[2]	CHU M X, GONG R F, GAO S, et al. Steel surface defects recognition based on multi-type statistical features and enhanced twin support vector machine[J]. Chemometrics and Intelligent Laboratory Systems, 2017, 171: 140-150.
[3]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2014: 580-587.
[4]	GIRSHICK R. Fast R-CNN[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2016: 1440-1448.
[5]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[6]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 779-788.
[7]	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6517-6525.
[8]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-05-17]. https://arxiv.org/abs/1804.02767.
[9]	FU G Z, SUN P Z, ZHU W B, et al. A deep-learning-based approach for fast and robust steel surface defects classification[J]. Optics and Lasers in Engineering, 2019, 121: 397-405.
[10]	LV X M, DUAN F J, JIANG J J, et al. Deep metallic surface defect detection: the new benchmark and detection network[J]. Sensors (Basel, Switzerland), 2020, 20(6): 1562.
[11]	VANNOCCI M, RITACCO A, CASTELLANO A, et al. Flatness defect detection and classification in hot rolled steel strips using convolutional neural networks[M]// Advances in Computational Intelligence. Cham: Springer International Publishing, 2019: 220-234.
[12]	HAN C J, LI G Y, LIU Z. Two-stage edge reuse network for salient object detection of strip steel surface defects[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1-12.
[13]	WANG H Y, WANG J P, LUO F H. Study on surface defect detection of metal sheet and strip using Faster R-CNN with multilevel feature[J]. Mechanical Science and Technology for Aerospace Engineering, 2021, 40(2): 262-269. (in Chinese)
[14]	DAI X H, CHEN H J, ZHU C P. Surface defect detection and realization of metal workpiece based on improved Faster RCNN[J]. Surface Technology, 2020, 49(10): 362-371. (in Chinese)
[15]	LI J Y, SU Z F, GENG J H, et al. Real-time detection of steel strip surface defects based on improved YOLO detection network[J]. IFAC-PapersOnLine, 2018, 51(21): 76-81.
[16]	CHENG J Y, DUAN X H, ZHU W. Research on metal surface defect detection by improved YOLOv3[J]. Computer Engineering and Applications, 2021, 57(19): 252-258. (in Chinese)
[17]	KOU X P, LIU S J, CHENG K Q, et al. Development of a YOLO-V3-based model for detecting defects on steel strip surface[J]. Measurement, 2021, 182: 109454.
[18]	WANG Y, CAO T Y, YANG J B, et al. Camouflaged object detection based on improved YOLO v5 algorithm[J]. Computer Science, 2021, 48(10): 226-232. (in Chinese)
[19]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 936-944.
[20]	LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 8759-8768.
[21]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[22]	RAMACHANDRAN P, PARMAR N, VASWANI A, et al. Stand-alone self-attention in vision models[EB/OL]. [2022-05-17]. https://arxiv.org/abs/1906.05909.
[23]	ZHAO H S, JIA J Y, KOLTUN V. Exploring self-attention for image recognition[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 10073-10082.
[24]	SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck transformers for visual recognition[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 16514-16524.
[25]	XIE E Z, WANG W H, YU Z D, et al. SegFormer: simple and efficient design for semantic segmentation with transformers[J]. Advances in Neural Information Processing Systems, 2021, 34: 12077-12090.