An edge and semantic-aware segmentation network for defect detection

doi:10.11996/JG.j.2095-302X.2025030578

Abstract

To address weak defect features, blurred boundaries, and large scale variations in surface defect images, an edge and semantic-aware segmentation network for defect detection (ESNet) was proposed. A dual-branch network was employed to learn the semantic and detailed information of the image separately. To exploit the complementary information of the two branches, a bilateral attention guidance module (BAGM) was proposed: the channel attention of the semantic branch guided the detailed branch to learn contextual information, while the spatial attention of the detailed branch guided the semantic branch to capture low-level detailed information. In the semantic branch, a multi-scale pyramid pooling module (MPPM) was designed to learn and encode multi-level contextual information. In the detailed branch, an edge-aware module (EAM) used the boundary map predicted by the lower layers to guide the higher-level feature maps in learning boundary information. Finally, a semantic-aware module (SAM) was proposed to fuse high-level and low-level feature maps while alleviating the semantic misalignment that arises in cross-scale feature fusion. Experiments on the public defect segmentation datasets NEU-Seg, MT-Defect, and MSD demonstrated the effectiveness of the proposed method, with ESNet reaching mIoU of 85.1%, 80.0%, and 91.0%, respectively.
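The bilateral guidance described for BAGM can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers inside BAGM, so the parameter-free attention used here (global average pooling followed by a sigmoid for the channel attention, a channel-wise mean followed by a sigmoid for the spatial attention), the function name `bilateral_attention_guidance`, and the assumption that both branches have already been brought to the same shape are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def bilateral_attention_guidance(semantic, detail):
    """Cross-guide two feature maps of shape (C, H, W).

    Assumption: both inputs share the same shape. A real network would
    resample one branch and use learned convolutions; this sketch uses
    parameter-free pooling only.
    """
    if semantic.shape != detail.shape:
        raise ValueError("semantic and detail features must share a shape")

    # Channel attention from the semantic branch: one weight per channel.
    channel_att = sigmoid(semantic.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)

    # Spatial attention from the detailed branch: one weight per pixel.
    spatial_att = sigmoid(detail.mean(axis=0, keepdims=True))  # (1, H, W)

    # Semantic channels re-weight the detailed branch (context guidance);
    # detailed-branch pixels re-weight the semantic branch (detail guidance).
    guided_detail = detail * channel_att
    guided_semantic = semantic * spatial_att
    return guided_semantic, guided_detail


rng = np.random.default_rng(0)
semantic = rng.standard_normal((8, 16, 16))
detail = rng.standard_normal((8, 16, 16))
guided_semantic, guided_detail = bilateral_attention_guidance(semantic, detail)
print(guided_semantic.shape, guided_detail.shape)
```

Because both attention maps lie in (0, 1), each branch is only attenuated, never amplified, by the other; a learned variant would typically add convolutions before the sigmoid.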

Key words: surface defect, semantic segmentation, edge information, semantic information, attention

CLC Number: TP391; TP18

CUI Lisha, SONG Zhiwen, JIANG Xiaoheng, MA Xin, CHEN Enqing, XU Mingliang. An edge and semantic-aware segmentation network for defect detection[J]. Journal of Graphics, 2025, 46(3): 578-587.

References

[1]	LIU J H, FU M R, LIU F L, et al. Window feature-based two-stage defect identification using magnetic flux leakage measurements[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(1): 12-23.
[2]	ZHANG H, JIN X T, WU Q M J, et al. Automatic visual detection system of railway surface defects with curvature filter and improved Gaussian mixture model[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(7): 1593-1608.
[3]	LUO Q W, FANG X X, SUN Y C, et al. Surface defect classification for hot-rolled steel strips by selectively dominant local binary patterns[J]. IEEE Access, 2019, 7: 23488-23499.
[4]	MA J X, WANG Y X, SHI C, et al. Fast surface defect detection using improved Gabor filters[C]// The 25th IEEE International Conference on Image Processing. New York: IEEE Press, 2018: 1508-1512.
[5]	LIU W H, YANG X Q, YANG X B, et al. A novel industrial chip parameters identification method based on cascaded region segmentation for surface-mount equipment[J]. IEEE Transactions on Industrial Electronics, 2022, 69(5): 5247-5256.
[6]	HE Y, SONG K C, DONG H W, et al. Semi-supervised defect classification of steel surface based on multi-training and generative adversarial network[J]. Optics and Lasers in Engineering, 2019, 122: 294-302.
[7]	MASCI J, MEIER U, FRICOUT G, et al. Multi-scale pyramidal pooling network for generic steel defect classification[C]// 2013 International Joint Conference on Neural Networks. New York: IEEE Press, 2013: 1-8.
[8]	ZHAO Y D, HAO K R, HE H B, et al. A visual long-short-term memory based integrated CNN model for fabric defect image classification[J]. Neurocomputing, 2020, 380: 259-270.
[9]	ZHANG X S, YANG X. Defect detection method of rubber seal ring based on improved YOLOv7-tiny[J]. Journal of Graphics, 2024, 45(3): 446-453 (in Chinese).
[10]	BLOCK S B, DA SILVA R D, DORINI L B, et al. Inspection of imprint defects in stamped metal surfaces using deep learning and tracking[J]. IEEE Transactions on Industrial Electronics, 2021, 68(5): 4498-4507.
[11]	HE Y, SONG K C, MENG Q G, et al. An end-to-end steel surface defect detection approach via fusing multiple hierarchical features[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 69(4): 1493-1504.
[12]	WANG S Q, REN Q, SHI M, et al. Product surface defect detection and segmentation based on anomaly detection[J]. Journal of Graphics, 2022, 43(3): 377-386 (in Chinese).
[13]	DONG H W, SONG K C, HE Y, et al. PGA-Net: pyramid feature fusion and global context attention network for automated surface defect detection[J]. IEEE Transactions on Industrial Informatics, 2020, 16(12): 7448-7458.
[14]	ZHANG J, DING R W, BAN M J, et al. FDSNet: an accurate real-time surface defect segmentation network[C]// 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2022: 3803-3807.
[15]	ZHANG T P, WEI X M, WU X M, et al. DBRNet: dual-branch real-time segmentation network for metal defect detection[C]// The 6th Chinese Conference on Pattern Recognition and Computer Vision. Cham: Springer, 2023: 422-434.
[16]	LIU T H, HE Z S. TAS²-Net: triple-attention semantic segmentation network for small surface defect detection[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 5004512.
[17]	CHEN X D, FU C, TIE M, et al. AFFNet: an attention-based feature-fused network for surface defect segmentation[J]. Applied Sciences, 2023, 13(11): 6428.
[18]	HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1314-1324.
[19]	MILLETARI F, NAVAB N, AHMADI S A. V-Net: fully convolutional neural networks for volumetric medical image segmentation[C]// The 4th International Conference on 3D Vision. New York: IEEE Press, 2016: 565-571.
[20]	HUANG Y B, QIU C Y, YUAN K. Surface defect saliency of magnetic tile[J]. The Visual Computer, 2020, 36(1): 85-96.
[21]	CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[22]	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 3431-3440.
[23]	CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 801-818.
[24]	ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2881-2890.
[25]	ZHAO H S, QI X J, SHEN X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 405-420.
[26]	YU C Q, WANG J B, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 325-341.
[27]	YU C Q, GAO C X, WANG J B, et al. BiSeNet V2: bilateral network with guided aggregation for real-time semantic segmentation[J]. International Journal of Computer Vision, 2021, 129(11): 3051-3068.
[28]	FAN M Y, LAI S Q, HUANG J S, et al. Rethinking BiSeNet for real-time semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 9716-9725.
[29]	PASZKE A, CHAURASIA A, KIM S, et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2024-06-22]. http://arxiv.org/abs/1606.02147.
[30]	HONG Y D, PAN H H, SUN W C, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes[EB/OL]. [2024-06-22]. https://arxiv.org/abs/2101.06085.
[31]	SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 618-626.
[32]	MA X, DAI X Y, BAI Y, et al. Rewrite the stars[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 5694-5703.
[33]	TANG Y H, HAN K, GUO J Y, et al. GhostNetV2: enhance cheap operation with long-range attention[EB/OL]. [2024-06-22]. https://arxiv.org/abs/2211.12905.
[34]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.

Category	Method	Backbone	Params/M	NEU-Seg mIoU/%	NEU-Seg FLOPs/G	NEU-Seg FPS	MT-Defect mIoU/%	MT-Defect FLOPs/G	MT-Defect FPS	MSD mIoU/%	MSD FLOPs/G	MSD FPS
General segmentation	FCN-8s [22]	VGG16	30.02	81.3	320.87	95.78	64.9	320.87	95.78	89.5	1423.97	28.18
General segmentation	DeepLabV3+ [23]	Xception	55.94	83.1	248.98	26.07	77.1	248.98	26.07	90.0	1115.41	6.23
General segmentation	PSPNet [24]	ResNet50	46.70	82.6	184.73	47.17	61.4	184.73	47.17	90.1	827.54	11.94
General segmentation	ICNet [25]	ResNet50	26.24	81.1	36.97	98.54	60.1	36.97	98.54	77.4	166.17	37.21
General segmentation	BiSeNetV1 [26]	ResNet18	12.79	81.1	13.04	324.57	68.7	13.04	324.57	88.2	58.57	120.30
General segmentation	BiSeNetV2 [27]	-	5.19	82.0	17.85	245.95	66.5	17.85	245.95	89.0	79.99	81.21
General segmentation	STDCNet [28]	STDC1	14.23	83.4	23.52	255.45	69.1	23.52	255.45	90.0	105.69	98.85
General segmentation	ENet [29]	-	0.33	82.5	2.05	301.19	38.2	2.05	301.19	87.0	9.11	97.16
General segmentation	DDRNet [30]	-	5.73	82.6	4.73	421.89	75.3	4.73	421.89	88.8	21.27	213.63
Defect segmentation	FDSNet [14]	-	0.96	81.0	1.04	513.51	66.0	1.04	513.51	90.2	4.67	377.13
Defect segmentation	DBRNet [15]	-	3.34	83.1	3.44	404.03	70.5	3.44	404.30	89.1	15.57	188.12
Defect segmentation	ESNet	MobileNetV3	5.11	85.1	6.43	231.09	80.0	6.43	231.09	91.0	28.85	75.83
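The mIoU column in the table above is the mean intersection over union across classes. The pure-Python sketch below shows one common way to compute it from flattened label maps. It is only an illustration: the function name `mean_iou` is hypothetical, and the source does not state whether the background class is included or how classes absent from an image are handled, so the choices made here (skip classes that occur in neither prediction nor ground truth) are assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length sequences of integer labels.

    Assumption: classes that occur in neither `pred` nor `target`
    are skipped instead of being counted as IoU = 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # A correctly labelled pixel lies in both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # A mislabelled pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.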

Device	GPU memory/GB	Model	mIoU/%	FPS
TitanX	12	Baseline	82.6	59.12
TitanX	12	ESNet	85.0	28.73
RTX 3090	24	Baseline	82.6	421.89
RTX 3090	24	ESNet	85.1	231.09
RTX 4090	24	Baseline	82.7	440.47
RTX 4090	24	ESNet	85.3	304.66
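The accuracy-speed trade-off in the device table above can be made explicit with a little arithmetic. The sketch below copies the mIoU and FPS values from the table and computes, per device, the mIoU gain of ESNet over the baseline and the fraction of the baseline's frame rate that ESNet retains. The dictionary layout, the shortened device keys, and the helper name `tradeoff` are illustrative choices, not part of the source.

```python
# (device, model) -> (mIoU in %, FPS), copied from the device table.
results = {
    ("TitanX", "Baseline"): (82.6, 59.12),
    ("TitanX", "ESNet"): (85.0, 28.73),
    ("3090", "Baseline"): (82.6, 421.89),
    ("3090", "ESNet"): (85.1, 231.09),
    ("4090", "Baseline"): (82.7, 440.47),
    ("4090", "ESNet"): (85.3, 304.66),
}


def tradeoff(device):
    """Return (mIoU gain in percentage points, fraction of baseline FPS kept)."""
    base_miou, base_fps = results[(device, "Baseline")]
    es_miou, es_fps = results[(device, "ESNet")]
    return round(es_miou - base_miou, 1), es_fps / base_fps


for device in ("TitanX", "3090", "4090"):
    gain, fps_ratio = tradeoff(device)
    print(f"{device}: +{gain} mIoU points at {fps_ratio:.0%} of baseline FPS")
```

On these numbers ESNet gains roughly 2.4 to 2.6 mIoU points on every device, while the share of baseline throughput it retains rises from about half on the TitanX to about two thirds on the 4090.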

Row	Baseline	M3	BAGM	EAM	SAM	MPPM	mIoU/%	Params/M
1	√	-	-	-	-	-	82.6	5.73
2	√	√	-	-	-	-	82.7	4.45
3	√	√	√	-	-	-	83.3	4.28
4	√	√	-	√	-	-	83.8	5.56
5	√	√	-	-	√	-	83.2	5.43
6	√	√	-	-	-	√	83.0	3.87
7	√	√	√	-	√	-	83.9	5.26
8	√	√	√	√	-	-	84.2	5.39
9	√	√	√	√	√	-	84.9	5.70
10	√	√	√	√	√	√	85.1	5.11