An edge and semantic-aware segmentation network for defect detection

doi:10.11996/JG.j.2095-302X.2025030578

Journal of Graphics, 2025, Vol. 46, Issue 3: 578-587.

• Image Processing and Computer Vision •

CUI Lisha¹, SONG Zhiwen¹, JIANG Xiaoheng¹, MA Xin¹, CHEN Enqing², XU Mingliang¹

1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan 450001, China
2. School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China

Received: 2024-08-22 Accepted: 2025-01-12 Published: 2025-06-30 Online: 2025-06-13
Contact: XU Mingliang (1981-), professor, Ph.D. His main research interests include big data and artificial intelligence. E-mail: iexumingliang@zzu.edu.cn
First author: CUI Lisha (1988-), associate professor, Ph.D. Her main research interests include artificial intelligence, object detection, and industrial quality inspection. E-mail: ielscui@zzu.edu.cn
Supported by: National Natural Science Foundation of China (62106232, 62172371, 62036010, U21B2037); China Postdoctoral Science Foundation (2021TQ0301)

Abstract

To address the weak features, blurred boundaries, and large scale variations of some surface defects, an edge and semantic-aware segmentation network for defect detection (ESNet) was proposed. First, a dual-branch network was employed to learn the semantic information and the detail information of the image separately. To make effective use of the complementary information in the two branches, a bilateral attention guidance module (BAGM) was proposed: the channel attention of the semantic branch guided the detail branch to learn contextual information, while the spatial attention of the detail branch guided the semantic branch to capture low-level detail information. In the semantic branch, a multi-scale pyramid pooling module (MPPM) was designed to learn and encode multi-level contextual information. In the detail branch, an edge-aware module (EAM) was introduced, which used the boundary map predicted from low-level features to guide the high-level feature maps in learning and enhancing boundary information. Finally, to fuse detail features and semantic features effectively, a semantic-aware module (SAM) was proposed to alleviate semantic misalignment in cross-scale feature fusion. Extensive experiments on the public defect segmentation datasets NEU-Seg, MT-Defect, and MSD verified the effectiveness of the proposed method.

Key words: surface defect, semantic segmentation, edge information, semantic information, attention
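The cross-guidance idea described for BAGM can be illustrated with a deliberately simplified sketch. It assumes both branches already share the same C x H x W shape and uses plain average pooling followed by a sigmoid as the attention function; the function names, the tensor layout, and the absence of convolutions or resampling are all assumptions made for illustration, not the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat):
    """feat: C x H x W nested lists -> one weight in (0, 1) per channel."""
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        # Global average pooling over the spatial dimensions, then sigmoid.
        weights.append(sigmoid(sum(values) / len(values)))
    return weights

def spatial_attention(feat):
    """feat: C x H x W nested lists -> H x W map of weights in (0, 1)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    # Average over channels at each pixel, then sigmoid.
    return [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]

def bilateral_guidance(semantic, detail):
    """Cross-guide two branches (assumed to have identical shapes)."""
    ch_w = channel_attention(semantic)  # computed from the semantic branch
    sp_w = spatial_attention(detail)    # computed from the detail branch
    # Channel weights from the semantic branch rescale the detail branch.
    guided_detail = [[[v * ch_w[k] for v in row] for row in detail[k]]
                     for k in range(len(detail))]
    # Spatial weights from the detail branch rescale the semantic branch.
    guided_semantic = [[[semantic[k][i][j] * sp_w[i][j]
                         for j in range(len(sp_w[0]))]
                        for i in range(len(sp_w))]
                       for k in range(len(semantic))]
    return guided_semantic, guided_detail

# Toy 2-channel, 2 x 2 feature maps (hypothetical values).
semantic = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
detail = [[[1.0, -1.0], [0.0, 0.0]], [[1.0, -1.0], [0.0, 0.0]]]
guided_semantic, guided_detail = bilateral_guidance(semantic, detail)
```

In this toy case the semantic branch is amplified where the detail branch responds strongly (top-left pixel) and damped where it responds negatively (top-right pixel), which is the qualitative behaviour the abstract attributes to spatial guidance.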

CLC number: TP391; TP18

CUI Lisha, SONG Zhiwen, JIANG Xiaoheng, MA Xin, CHEN Enqing, XU Mingliang. An edge and semantic-aware segmentation network for defect detection[J]. Journal of Graphics, 2025, 46(3): 578-587.

Figures and tables (15)

Fig. 1 Examples of defects in the NEU-Seg segmentation dataset ((a) Low contrast of defects and their boundaries; (b) Large variation in defect scale)

Fig. 2 Overall network structure of the ESNet defect segmentation model

Fig. 3 Structure of the bilateral attention guidance module (BAGM)

Fig. 4 Structure of the multi-scale pyramid pooling module (MPPM)

Fig. 5 Structure of the edge-aware module (EAM)

Fig. 6 Structure of the semantic-aware module (SAM)

Fig. 7 Sample images from the datasets ((a) NEU-Seg; (b) MT-Defect; (c) MSD)

Table 1 Comparison of experimental results between ESNet and other methods on defect datasets

Category	Method	Backbone	Param/M	NEU-Seg mIoU/%	NEU-Seg FLOPs/G	NEU-Seg FPS	MT-Defect mIoU/%	MT-Defect FLOPs/G	MT-Defect FPS	MSD mIoU/%	MSD FLOPs/G	MSD FPS
General segmentation models	FCN-8s [22]	VGG16	30.02	81.3	320.87	95.78	64.9	320.87	95.78	89.5	1423.97	28.18
	DeepLabV3+ [23]	Xception	55.94	83.1	248.98	26.07	77.1	248.98	26.07	90.0	1115.41	6.23
	PSPNet [24]	ResNet50	46.70	82.6	184.73	47.17	61.4	184.73	47.17	90.1	827.54	11.94
	ICNet [25]	ResNet50	26.24	81.1	36.97	98.54	60.1	36.97	98.54	77.4	166.17	37.21
	BiSeNetV1 [26]	ResNet18	12.79	81.1	13.04	324.57	68.7	13.04	324.57	88.2	58.57	120.30
	BiSeNetV2 [27]	-	5.19	82.0	17.85	245.95	66.5	17.85	245.95	89.0	79.99	81.21
	STDCNet [28]	STDC1	14.23	83.4	23.52	255.45	69.1	23.52	255.45	90.0	105.69	98.85
	ENet [29]	-	0.33	82.5	2.05	301.19	38.2	2.05	301.19	87.0	9.11	97.16
	DDRNet [30]	-	5.73	82.6	4.73	421.89	75.3	4.73	421.89	88.8	21.27	213.63
Defect segmentation models	FDSNet [14]	-	0.96	81.0	1.04	513.51	66.0	1.04	513.51	90.2	4.67	377.13
	DBRNet [15]	-	3.34	83.1	3.44	404.03	70.5	3.44	404.30	89.1	15.57	188.12
	ESNet	MobileNetV3	5.11	85.1	6.43	231.09	80.0	6.43	231.09	91.0	28.85	75.83
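The mIoU columns in Table 1 report mean intersection over union. As a reminder of how this metric is commonly computed, here is a minimal sketch over flattened label lists; the function name and the choice to skip classes absent from both prediction and ground truth are assumptions, and the paper may instead accumulate a confusion matrix over the whole test set.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either map (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")

# Toy example: 4 pixels, class 0 = background, class 1 = defect.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

For the toy labels, class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their mean, 7/12.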

Fig. 8 Comparison of visual segmentation results between ESNet and other methods on NEU-Seg ((a) Input image; (b) FCN-8s; (c) DeepLabV3; (d) PSPNet; (e) ICNet; (f) BiSeNetV1; (g) BiSeNetV2; (h) STDCNet; (i) ENet; (j) FDSNet; (k) DBRNet; (l) DDRNet; (m) Ours; (n) Ground truth)

Table 2 Experimental results on different devices

Device	VRAM/GB	Model	mIoU/%	FPS
TitanX	12	Baseline	82.6	59.12
TitanX	12	ESNet	85.0	28.73
RTX 3090	24	Baseline	82.6	421.89
RTX 3090	24	ESNet	85.1	231.09
RTX 4090	24	Baseline	82.7	440.47
RTX 4090	24	ESNet	85.3	304.66
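The accuracy/speed trade-off in Table 2 is easier to read as ratios. The short calculation below uses only the numbers from the table; the dictionary layout and key names are illustrative.

```python
# (baseline FPS, ESNet FPS, baseline mIoU/%, ESNet mIoU/%) per device, from Table 2.
results = {
    "TitanX": (59.12, 28.73, 82.6, 85.0),
    "3090": (421.89, 231.09, 82.6, 85.1),
    "4090": (440.47, 304.66, 82.7, 85.3),
}

# Fraction of the baseline frame rate that ESNet retains on each device.
fps_retained = {dev: esnet / base for dev, (base, esnet, _, _) in results.items()}

# Absolute mIoU gain of ESNet over the baseline, in percentage points.
miou_gain = {dev: round(e - b, 1) for dev, (_, _, b, e) in results.items()}
```

ESNet retains roughly 49%, 55%, and 69% of the baseline frame rate on the three devices while gaining 2.4 to 2.6 mIoU points, so the relative speed penalty shrinks on the newer hardware.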

Fig. 9 Visualization results of feature maps on NEU-Seg ((a) Input image; (b) Ground truth; (c) FDSNet; (d) DBRNet; (e) DDRNet; (f) Ours)

Table 3 Ablation experiments of different modules on NEU-Seg

Row	Baseline	M3	BAGM	EAM	SAM	MPPM	mIoU/%	Param/M
1	√						82.6	5.73
2	√	√					82.7	4.45
3	√	√	√				83.3	4.28
4	√	√		√			83.8	5.56
5	√	√			√		83.2	5.43
6	√	√				√	83.0	3.87
7	√	√	√		√		83.9	5.26
8	√	√	√	√			84.2	5.39
9	√	√	√	√	√		84.9	5.70
10	√	√	√	√	√	√	85.1	5.11

Table 4 Ablation experiments of different pooling kernel sizes on NEU-Seg

Kernel size	Output size	mIoU/%
GPA,17,9,5,1	1,2,4,8,16	84.9
GPA,11,7,3,1	1,3,5,7,16	85.1
GPA,9,5,3,1	1,3,5,7,16	84.5
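The output sizes in Table 4 follow from the standard pooling size formula, out = floor((in + 2p - k) / s) + 1. The sketch below shows one set of strides and paddings that reproduces the first two rows. The 16 x 16 input size is inferred from the kernel-1 entry, "GPA" is read as global average pooling with output size 1, and the strides and paddings are assumptions chosen to match the listed sizes; the table itself does not state them.

```python
def pooled_size(size, kernel, stride, padding=0):
    """Output length of a pooling window along one dimension (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

FEATURE_SIZE = 16  # assumption: a kernel of 1 yields output size 16 in Table 4

# Row "GPA,17,9,5,1": assumed (kernel, stride, padding) triples.
row1 = [pooled_size(FEATURE_SIZE, k, s, p)
        for k, s, p in ((17, 8, 8), (9, 4, 4), (5, 2, 2), (1, 1, 0))]

# Row "GPA,11,7,3,1": assumed stride 2 and no padding for the three large kernels.
row2 = [pooled_size(FEATURE_SIZE, k, s)
        for k, s in ((11, 2), (7, 2), (3, 2), (1, 1))]
```

Under these assumptions `row1` is `[2, 4, 8, 16]` and `row2` is `[3, 5, 7, 16]`, matching the non-global entries of the corresponding rows in Table 4.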

Table 5 Comparison of the proposed modules with existing modules

Module	mIoU/%	Param/M	FLOPs/G
BGA [27]	84.4	0.13	0.33
BF [30]	84.1	0.06	0.07
BAGM	85.1	0.07	0.07
PPM [24]	84.8	1.26	0.30
DAPPM [30]	84.9	0.82	0.19
MPPM	85.1	0.24	0.05

Table 6 Comparison of different backbone networks

Backbone	mIoU/%	Param/M	FLOPs/G
StarNet-s2 [32]	84.9	5.83	11.09
GhostNetV2 1.0× [33]	85.0	6.03	6.51
ResNet-18 [34]	85.1	6.27	8.30
MobileNetV3-Large [18]	85.1	5.11	6.43

References (34)

[1]	LIU J H, FU M R, LIU F L, et al. Window feature-based two-stage defect identification using magnetic flux leakage measurements[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(1): 12-23.
[2]	ZHANG H, JIN X T, WU Q M J, et al. Automatic visual detection system of railway surface defects with curvature filter and improved Gaussian mixture model[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(7): 1593-1608.
[3]	LUO Q W, FANG X X, SUN Y C, et al. Surface defect classification for hot-rolled steel strips by selectively dominant local binary patterns[J]. IEEE Access, 2019, 7: 23488-23499.
[4]	MA J X, WANG Y X, SHI C, et al. Fast surface defect detection using improved Gabor filters[C]// The 25th IEEE International Conference on Image Processing. New York: IEEE Press, 2018: 1508-1512.
[5]	LIU W H, YANG X Q, YANG X B, et al. A novel industrial chip parameters identification method based on cascaded region segmentation for surface-mount equipment[J]. IEEE Transactions on Industrial Electronics, 2022, 69(5): 5247-5256.
[6]	HE Y, SONG K C, DONG H W, et al. Semi-supervised defect classification of steel surface based on multi-training and generative adversarial network[J]. Optics and Lasers in Engineering, 2019, 122: 294-302.
[7]	MASCI J, MEIER U, FRICOUT G, et al. Multi-scale pyramidal pooling network for generic steel defect classification[C]// 2013 International Joint Conference on Neural Networks. New York: IEEE Press, 2013: 1-8.
[8]	ZHAO Y D, HAO K R, HE H B, et al. A visual long-short-term memory based integrated CNN model for fabric defect image classification[J]. Neurocomputing, 2020, 380: 259-270.
[9]	ZHANG X S, YANG X. Defect detection method of rubber seal ring based on improved YOLOv7-tiny[J]. Journal of Graphics, 2024, 45(3): 446-453 (in Chinese).
[10]	BLOCK S B, DA SILVA R D, DORINI L B, et al. Inspection of imprint defects in stamped metal surfaces using deep learning and tracking[J]. IEEE Transactions on Industrial Electronics, 2021, 68(5): 4498-4507.
[11]	HE Y, SONG K C, MENG Q G, et al. An end-to-end steel surface defect detection approach via fusing multiple hierarchical features[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 69(4): 1493-1504.
[12]	WANG S Q, REN Q, SHI M, et al. Product surface defect detection and segmentation based on anomaly detection[J]. Journal of Graphics, 2022, 43(3): 377-386 (in Chinese).
[13]	DONG H W, SONG K C, HE Y, et al. PGA-Net: pyramid feature fusion and global context attention network for automated surface defect detection[J]. IEEE Transactions on Industrial Informatics, 2020, 16(12): 7448-7458.
[14]	ZHANG J, DING R W, BAN M J, et al. FDSNet: an accurate real-time surface defect segmentation network[C]// 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2022: 3803-3807.
[15]	ZHANG T P, WEI X M, WU X M, et al. DBRNet: dual-branch real-time segmentation network for metal defect detection[C]// The 6th Chinese Conference on Pattern Recognition and Computer Vision. Cham: Springer, 2023: 422-434.
[16]	LIU T H, HE Z S. TAS²-Net: triple-attention semantic segmentation network for small surface defect detection[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 5004512.
[17]	CHEN X D, FU C, TIE M, et al. AFFNet: an attention-based feature-fused network for surface defect segmentation[J]. Applied Sciences, 2023, 13(11): 6428.
[18]	HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1314-1324.
[19]	MILLETARI F, NAVAB N, AHMADI S A. V-Net: fully convolutional neural networks for volumetric medical image segmentation[C]// The 4th International Conference on 3D Vision. New York: IEEE Press, 2016: 565-571.
[20]	HUANG Y B, QIU C Y, YUAN K. Surface defect saliency of magnetic tile[J]. The Visual Computer, 2020, 36(1): 85-96.
[21]	CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[22]	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 3431-3440.
[23]	CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 801-818.
[24]	ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2881-2890.
[25]	ZHAO H S, QI X J, SHEN X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 405-420.
[26]	YU C Q, WANG J B, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 325-341.
[27]	YU C Q, GAO C X, WANG J B, et al. BiSeNet V2: bilateral network with guided aggregation for real-time semantic segmentation[J]. International Journal of Computer Vision, 2021, 129(11): 3051-3068.
[28]	FAN M Y, LAI S Q, HUANG J S, et al. Rethinking BiSeNet for real-time semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 9716-9725.
[29]	PASZKE A, CHAURASIA A, KIM S, et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2024-06-22]. http://arxiv.org/abs/1606.02147.
[30]	HONG Y D, PAN H H, SUN W C, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes[EB/OL]. [2024-06-22]. https://arxiv.org/abs/2101.06085.
[31]	SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 618-626.
[32]	MA X, DAI X Y, BAI Y, et al. Rewrite the stars[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 5694-5703.
[33]	TANG Y H, HAN K, GUO J Y, et al. GhostNetV2: enhance cheap operation with long-range attention[EB/OL]. [2024-06-22]. https://arxiv.org/abs/2211.12905.
[34]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
