Steel surface defect detection algorithm based on MCB-FAH-YOLOv8

doi:10.11996/JG.j.2095-302X.2024010112

Abstract

To address the false detections, missed detections, and low accuracy of existing deep learning-based algorithms for detecting defects on steel surfaces, a YOLOv8-based steel surface defect detection algorithm, abbreviated as MCB-FAH-YOLOv8, was proposed. It combined a modified CBAM (MCB) with a replaceable four-head ASFF prediction head (FAH). Integrating the modified convolutional block attention module (CBAM) improved the discrimination of densely distributed targets. Replacing the FPN structure with BiFPN allowed context information to be extracted more efficiently, and adaptive spatial feature fusion (ASFF) was incorporated to automatically identify the most suitable features for fusion. Precision was further improved by replacing the SPPF module with the SimCSPSPPF module. For tiny-object detection, a four-head ASFF prediction head was proposed, designed to be replaceable according to the characteristics of the dataset. The experimental results demonstrated that MCB-FAH-YOLOv8 achieved a detection accuracy (mAP) of 88.8% on the VOC2007 dataset and 81.8% on the NEU-DET steel defect detection dataset, outperforming the baseline model by 5.1 and 3.4 percentage points, respectively. The algorithm thus achieved higher detection accuracy with only a limited loss of detection speed, maintaining a good balance between accuracy and speed.
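
The adaptive fusion idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature maps from all levels have already been resized to a common resolution, treats each map as single-channel, and takes the per-position weight logits as given inputs (in an ASFF-style layer these would typically be produced by small learned convolutions). The names `softmax` and `adaptive_fuse` are hypothetical.

```python
import math
from typing import List

Grid = List[List[float]]  # one single-channel feature map, H x W


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(features: List[Grid], weight_logits: List[Grid]) -> Grid:
    """Fuse same-sized feature maps with per-position softmax weights.

    features[i][y][x]      : value of level i at position (y, x)
    weight_logits[i][y][x] : unnormalised importance of level i at (y, x)
    At every position the weights over levels sum to 1.
    """
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([w[y][x] for w in weight_logits])
            fused[y][x] = sum(a * f[y][x] for a, f in zip(weights, features))
    return fused


if __name__ == "__main__":
    # Four levels, as in a four-head setting; 1 x 2 maps for brevity.
    feats = [[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]], [[4.0, 4.0]]]
    # Position 0: all levels equally weighted. Position 1: level 3 dominates.
    logits = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 50.0]]]
    print(adaptive_fuse(feats, logits))
```

With equal logits the fusion reduces to a plain average of the levels; a strongly dominant logit makes the output follow that level alone, which is the sense in which the weighting selects the most suitable features at each position.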

Key words: MCB-FAH-YOLOv8, defect detection, attention mechanism, four-head ASFF prediction head, feature fusion

CLC Number: TP391

CUI Kebin, JIAO Jingyi. Steel surface defect detection algorithm based on MCB-FAH-YOLOv8[J]. Journal of Graphics, 2024, 45(1): 112-125.

Figures/Tables 28

Model	Size/MB	Parameters/M	Computation/GFLOPs	mAP@0.5:0.95/%	FPS
YOLOv8n	5.96	3.01	8.9	83.7	116
YOLOv8n-CA	6.34	3.06	8.3	84.2	103
YOLOv8n-CBAM	6.81	3.29	8.5	85.5	93
YOLOv8n-MCB	9.07	4.41	9.6	86.5	101

Model	Size/MB	Parameters/M	Computation/GFLOPs	mAP@0.5:0.95/%	FPS
YOLOv8n-MCB	9.07	4.41	9.6	86.5	101
YOLOv8n-MCB-B2	9.07	4.41	9.6	87.0	94
YOLOv8n-MCB-B3	9.10	4.43	9.7	87.2	97

Model	Size/MB	Parameters/M	Computation/GFLOPs	mAP@0.5:0.95/%	FPS
YOLOv8n-MCB-B3	9.10	4.43	9.7	87.2	97
YOLOv8n-MCB-BA	11.70	5.73	11.8	88.5	81
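
To make the accuracy/speed trade-off in the tables easier to read, the short sketch below computes each variant's change in mAP and FPS relative to YOLOv8n. The figures are copied from the table rows above; the dictionary `RESULTS` and the helper `delta_vs_baseline` are hypothetical names introduced only for this illustration.

```python
from typing import Dict, Tuple

# (mAP@0.5:0.95 in %, FPS), copied from the table rows above.
RESULTS: Dict[str, Tuple[float, int]] = {
    "YOLOv8n": (83.7, 116),
    "YOLOv8n-CA": (84.2, 103),
    "YOLOv8n-CBAM": (85.5, 93),
    "YOLOv8n-MCB": (86.5, 101),
    "YOLOv8n-MCB-B2": (87.0, 94),
    "YOLOv8n-MCB-B3": (87.2, 97),
    "YOLOv8n-MCB-BA": (88.5, 81),
}


def delta_vs_baseline(model: str, baseline: str = "YOLOv8n") -> Tuple[float, int]:
    """Return (mAP change in percentage points, FPS change) against the baseline."""
    model_map, model_fps = RESULTS[model]
    base_map, base_fps = RESULTS[baseline]
    # Round to one decimal to match the precision reported in the tables.
    return round(model_map - base_map, 1), model_fps - base_fps


if __name__ == "__main__":
    for name in RESULTS:
        d_map, d_fps = delta_vs_baseline(name)
        print(f"{name}: {d_map:+.1f} points mAP, {d_fps:+d} FPS")
```

Read this way, the last row (YOLOv8n-MCB-BA) gains 4.8 percentage points of mAP over YOLOv8n while dropping 35 FPS.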