Journal of Graphics ›› 2025, Vol. 46 ›› Issue (3): 578-587. DOI: 10.11996/JG.j.2095-302X.2025030578
• Image Processing and Computer Vision •
CUI Lisha1, SONG Zhiwen1, JIANG Xiaoheng1, MA Xin1, CHEN Enqing2, XU Mingliang1
Received: 2024-08-22
Accepted: 2025-01-12
Online: 2025-06-30
Published: 2025-06-13
Contact: XU Mingliang
About author: CUI Lisha (1988-), associate professor, Ph.D. Her main research interests cover artificial intelligence, object detection, and industrial quality inspection. E-mail: ielscui@zzu.edu.cn
CUI Lisha, SONG Zhiwen, JIANG Xiaoheng, MA Xin, CHEN Enqing, XU Mingliang. An edge and semantic-aware segmentation network for defect detection[J]. Journal of Graphics, 2025, 46(3): 578-587.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025030578
Category | Method | Backbone | Param/M | NEU-Seg mIoU/% | NEU-Seg FLOPs/G | NEU-Seg FPS | MT-Defect mIoU/% | MT-Defect FLOPs/G | MT-Defect FPS | MSD mIoU/% | MSD FLOPs/G | MSD FPS
---|---|---|---|---|---|---|---|---|---|---|---|---
General segmentation models | FCN-8s | VGG16 | 30.02 | 81.3 | 320.87 | 95.78 | 64.9 | 320.87 | 95.78 | 89.5 | 1423.97 | 28.18
General segmentation models | DeepLabV3+ | Xception | 55.94 | 83.1 | 248.98 | 26.07 | 77.1 | 248.98 | 26.07 | 90.0 | 1115.41 | 6.23
General segmentation models | PSPNet | ResNet50 | 46.70 | 82.6 | 184.73 | 47.17 | 61.4 | 184.73 | 47.17 | 90.1 | 827.54 | 11.94
General segmentation models | ICNet | ResNet50 | 26.24 | 81.1 | 36.97 | 98.54 | 60.1 | 36.97 | 98.54 | 77.4 | 166.17 | 37.21
General segmentation models | BiSeNetV1 | ResNet18 | 12.79 | 81.1 | 13.04 | 324.57 | 68.7 | 13.04 | 324.57 | 88.2 | 58.57 | 120.30
General segmentation models | BiSeNetV2 | - | 5.19 | 82.0 | 17.85 | 245.95 | 66.5 | 17.85 | 245.95 | 89.0 | 79.99 | 81.21
General segmentation models | STDCNet | STDC1 | 14.23 | 83.4 | 23.52 | 255.45 | 69.1 | 23.52 | 255.45 | 90.0 | 105.69 | 98.85
General segmentation models | ENet | - | 0.33 | 82.5 | 2.05 | 301.19 | 38.2 | 2.05 | 301.19 | 87.0 | 9.11 | 97.16
General segmentation models | DDRNet | - | 5.73 | 82.6 | 4.73 | 421.89 | 75.3 | 4.73 | 421.89 | 88.8 | 21.27 | 213.63
Defect segmentation models | FDSNet | - | 0.96 | 81.0 | 1.04 | 513.51 | 66.0 | 1.04 | 513.51 | 90.2 | 4.67 | 377.13
Defect segmentation models | DBRNet | - | 3.34 | 83.1 | 3.44 | 404.03 | 70.5 | 3.44 | 404.30 | 89.1 | 15.57 | 188.12
Defect segmentation models | ESNet | MobileNetV3 | 5.11 | 85.1 | 6.43 | 231.09 | 80.0 | 6.43 | 231.09 | 91.0 | 28.85 | 75.83
Table 1 Comparison of experimental results between ESNet and other methods on defect datasets
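The accuracy columns in Table 1 report mIoU. As a reminder of how that metric is usually defined, the following is a minimal sketch using NumPy. It assumes the standard confusion-matrix definition and averages over the classes that appear in either the prediction or the ground truth; the exact evaluation protocol used for the table (for example, whether the background class is counted) is not stated on this page, and the function name `mean_iou` is illustrative only.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both masks
    return float((intersection[valid] / union[valid]).mean())


# Toy 2x2 masks (assumed data, not taken from any dataset in the table).
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {100 * score:.1f}%")
```

In the toy example the per-class IoU values are 1/2 and 2/3, so the mean is 7/12.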
Fig. 8 Comparison of visual segmentation results between ESNet and other methods on NEU-Seg ((a) Input image; (b) FCN-8s; (c) DeepLabV3+; (d) PSPNet; (e) ICNet; (f) BiSeNetV1; (g) BiSeNetV2; (h) STDCNet; (i) ENet; (j) FDSNet; (k) DBRNet; (l) DDRNet; (m) Ours; (n) Ground truth)
Device | Memory/GB | Model | mIoU/% | FPS
---|---|---|---|---
Titan X | 12 | Baseline | 82.6 | 59.12
Titan X | 12 | ESNet | 85.0 | 28.73
RTX 3090 | 24 | Baseline | 82.6 | 421.89
RTX 3090 | 24 | ESNet | 85.1 | 231.09
RTX 4090 | 24 | Baseline | 82.7 | 440.47
RTX 4090 | 24 | ESNet | 85.3 | 304.66
Table 2 Experimental results on different devices
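To make the speed figures in Table 2 easier to compare across devices, the sketch below converts them into two derived quantities: the fraction of baseline throughput that ESNet retains, and the per-image latency implied by the FPS value. The FPS numbers are copied from the table; the variable names and the derived quantities are illustrative additions, not values reported by the authors.

```python
# (device, baseline FPS, ESNet FPS), copied from Table 2.
rows = [
    ("Titan X", 59.12, 28.73),
    ("RTX 3090", 421.89, 231.09),
    ("RTX 4090", 440.47, 304.66),
]

# Fraction of the baseline throughput that ESNet keeps on each device.
retained = {dev: esnet / base for dev, base, esnet in rows}
# Per-image latency in milliseconds implied by the ESNet FPS.
latency_ms = {dev: 1000.0 / esnet for dev, _, esnet in rows}

for dev in retained:
    print(f"{dev}: {100 * retained[dev]:.1f}% of baseline FPS, "
          f"{latency_ms[dev]:.2f} ms per image")
```

Under these numbers the retained fraction rises from roughly half on the Titan X to roughly two thirds on the RTX 4090, while mIoU stays about 2.5 points above the baseline on every device.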
Row | Components enabled (out of Baseline, M3, BAGM, EAM, SAM, MPPM) | mIoU/% | Param/M
---|---|---|---
1 | 1 (Baseline) | 82.6 | 5.73
2 | 2 | 82.7 | 4.45
3 | 3 | 83.3 | 4.28
4 | 3 | 83.8 | 5.56
5 | 3 | 83.2 | 5.43
6 | 3 | 83.0 | 3.87
7 | 4 | 83.9 | 5.26
8 | 4 | 84.2 | 5.39
9 | 5 | 84.9 | 5.70
10 | 6 (all) | 85.1 | 5.11
Table 3 Ablation experiments of different modules on NEU-Seg
Kernel sizes | Output sizes | mIoU/%
---|---|---
GPA,17,9,5,1 | 1,2,4,8,16 | 84.9
GPA,11,7,3,1 | 1,3,5,7,16 | 85.1
GPA,9,5,3,1 | 1,3,5,7,16 | 84.5
Table 4 Ablation experiments of different pooling kernel sizes on NEU-Seg
Module | mIoU/% | Param/M | FLOPs/G
---|---|---|---
BGA | 84.4 | 0.13 | 0.33
BF | 84.1 | 0.06 | 0.07
BAGM | 85.1 | 0.07 | 0.07
PPM | 84.8 | 1.26 | 0.30
DAPPM | 84.9 | 0.82 | 0.19
MPPM | 85.1 | 0.24 | 0.05
Table 5 Comparison with existing modules
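The relative cost of the modules in Table 5 can be worked out directly from the reported numbers. The sketch below assumes, from the row order of the table, that BAGM is being compared with BGA and BF, and MPPM with PPM and DAPPM; that pairing and all names in the code are assumptions made for illustration.

```python
# module: (mIoU/%, Param/M, FLOPs/G), copied from Table 5.
modules = {
    "BGA": (84.4, 0.13, 0.33),
    "BF": (84.1, 0.06, 0.07),
    "BAGM": (85.1, 0.07, 0.07),
    "PPM": (84.8, 1.26, 0.30),
    "DAPPM": (84.9, 0.82, 0.19),
    "MPPM": (85.1, 0.24, 0.05),
}

# Assumed pairing (proposed module, existing module), read off the row order.
pairs = [("BAGM", "BGA"), ("BAGM", "BF"), ("MPPM", "PPM"), ("MPPM", "DAPPM")]

report = {}
for new, old in pairs:
    new_miou, new_params, new_flops = modules[new]
    old_miou, old_params, old_flops = modules[old]
    report[(new, old)] = {
        "miou_gain": new_miou - old_miou,              # percentage points
        "param_saving": 1.0 - new_params / old_params,  # negative means more
        "flops_saving": 1.0 - new_flops / old_flops,
    }

for (new, old), r in report.items():
    print(f"{new} vs {old}: {r['miou_gain']:+.1f} mIoU, "
          f"{100 * r['param_saving']:+.0f}% params saved, "
          f"{100 * r['flops_saving']:+.0f}% FLOPs saved")
```

A negative saving means the proposed module costs more on that axis; with these numbers that happens only for BAGM against BF in parameter count, where the FLOPs are equal and the mIoU is 1.0 point higher.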
Backbone | mIoU/% | Param/M | FLOPs/G
---|---|---|---
StarNet-s2 | 84.9 | 5.83 | 11.09
GhostNetV2 1.0× | 85.0 | 6.03 | 6.51
ResNet-18 | 85.1 | 6.27 | 8.30
MobileNetV3-Large | 85.1 | 5.11 | 6.43
Table 6 Comparison of backbone networks
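One way to read Table 6 is as a three-way trade-off between accuracy, parameter count, and FLOPs. The sketch below applies a plain Pareto-dominance check to the four rows; the values are copied from the table, while the `Backbone` type and the helper functions are hypothetical names introduced here, and this is not presented as the authors' selection procedure.

```python
from typing import NamedTuple


class Backbone(NamedTuple):
    name: str
    miou: float    # mIoU/%, higher is better
    params: float  # Param/M, lower is better
    flops: float   # FLOPs/G, lower is better


# Values copied from Table 6.
BACKBONES = [
    Backbone("StarNet-s2", 84.9, 5.83, 11.09),
    Backbone("GhostNetV2 1.0x", 85.0, 6.03, 6.51),
    Backbone("ResNet-18", 85.1, 6.27, 8.30),
    Backbone("MobileNetV3-Large", 85.1, 5.11, 6.43),
]


def dominates(a: Backbone, b: Backbone) -> bool:
    """True if a is no worse than b on every axis and better on at least one."""
    no_worse = a.miou >= b.miou and a.params <= b.params and a.flops <= b.flops
    better = a.miou > b.miou or a.params < b.params or a.flops < b.flops
    return no_worse and better


def pareto_front(items):
    """Names of entries that no other entry dominates."""
    return [x.name for x in items if not any(dominates(y, x) for y in items)]


front = pareto_front(BACKBONES)
print(front)
```

With the reported numbers, MobileNetV3-Large is the only entry left on the front: it ties ResNet-18 on mIoU and is at least as good as every other backbone on all three axes.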
[1] | LIU J H, FU M R, LIU F L, et al. Window feature-based two-stage defect identification using magnetic flux leakage measurements[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(1): 12-23. |
[2] | ZHANG H, JIN X T, WU Q M J, et al. Automatic visual detection system of railway surface defects with curvature filter and improved Gaussian mixture model[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(7): 1593-1608. |
[3] | LUO Q W, FANG X X, SUN Y C, et al. Surface defect classification for hot-rolled steel strips by selectively dominant local binary patterns[J]. IEEE Access, 2019, 7: 23488-23499. |
[4] | MA J X, WANG Y X, SHI C, et al. Fast surface defect detection using improved Gabor filters[C]// The 25th IEEE International Conference on Image Processing. New York: IEEE Press, 2018: 1508-1512. |
[5] | LIU W H, YANG X Q, YANG X B, et al. A novel industrial chip parameters identification method based on cascaded region segmentation for surface-mount equipment[J]. IEEE Transactions on Industrial Electronics, 2022, 69(5): 5247-5256. |
[6] | HE Y, SONG K C, DONG H W, et al. Semi-supervised defect classification of steel surface based on multi-training and generative adversarial network[J]. Optics and Lasers in Engineering, 2019, 122: 294-302. |
[7] | MASCI J, MEIER U, FRICOUT G, et al. Multi-scale pyramidal pooling network for generic steel defect classification[C]// 2013 International Joint Conference on Neural Networks. New York: IEEE Press, 2013: 1-8. |
[8] | ZHAO Y D, HAO K R, HE H B, et al. A visual long-short-term memory based integrated CNN model for fabric defect image classification[J]. Neurocomputing, 2020, 380: 259-270. |
[9] | ZHANG X S, YANG X. Defect detection method of rubber seal ring based on improved YOLOv7-tiny[J]. Journal of Graphics, 2024, 45(3): 446-453 (in Chinese). |
[10] | BLOCK S B, DA SILVA R D, DORINI L B, et al. Inspection of imprint defects in stamped metal surfaces using deep learning and tracking[J]. IEEE Transactions on Industrial Electronics, 2021, 68(5): 4498-4507. |
[11] | HE Y, SONG K C, MENG Q G, et al. An end-to-end steel surface defect detection approach via fusing multiple hierarchical features[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 69(4): 1493-1504. |
[12] | WANG S Q, REN Q, SHI M, et al. Product surface defect detection and segmentation based on anomaly detection[J]. Journal of Graphics, 2022, 43(3): 377-386 (in Chinese). |
[13] | DONG H W, SONG K C, HE Y, et al. PGA-Net: pyramid feature fusion and global context attention network for automated surface defect detection[J]. IEEE Transactions on Industrial Informatics, 2020, 16(12): 7448-7458. |
[14] | ZHANG J, DING R W, BAN M J, et al. FDSNeT: an accurate real-time surface defect segmentation network[C]// 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2022: 3803-3807. |
[15] | ZHANG T P, WEI X M, WU X M, et al. DBRNet: dual-branch real-time segmentation network for metal defect detection[C]// The 6th Chinese Conference on Pattern Recognition and Computer Vision. Cham: Springer, 2023: 422-434. |
[16] | LIU T H, HE Z S. TAS2-Net: Triple-attention semantic segmentation network for small surface defect detection[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 5004512. |
[17] | CHEN X D, FU C, TIE M, et al. AFFNet: an attention-based feature-fused network for surface defect segmentation[J]. Applied Sciences, 2023, 13(11): 6428. |
[18] | HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1314-1324. |
[19] | MILLETARI F, NAVAB N, AHMADI S A. V-Net: fully convolutional neural networks for volumetric medical image segmentation[C]// The 4th International Conference on 3D Vision. New York: IEEE Press, 2016: 565-571. |
[20] | HUANG Y B, QIU C Y, YUAN K. Surface defect saliency of magnetic tile[J]. The Visual Computer, 2020, 36(1): 85-96. |
[21] | CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. |
[22] | LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 3431-3440. |
[23] | CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 801-818. |
[24] | ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2881-2890. |
[25] | ZHAO H S, QI X J, SHEN X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 405-420. |
[26] | YU C Q, WANG J B, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 325-341. |
[27] | YU C Q, GAO C X, WANG J B, et al. BiSeNet V2: bilateral network with guided aggregation for real-time semantic segmentation[J]. International Journal of Computer Vision, 2021, 129(11): 3051-3068. |
[28] | FAN M Y, LAI S Q, HUANG J S, et al. Rethinking BiSeNet for real-time semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 9716-9725. |
[29] | PASZKE A, CHAURASIA A, KIM S, et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2024-06-22] http://arxiv.org/abs/1606.02147. |
[30] | HONG Y D, PAN H H, SUN W C, et al. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes[EB/OL]. [2024-06-22] https://arxiv.org/abs/2101.06085. |
[31] | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 618-626. |
[32] | MA X, DAI X Y, BAI Y, et al. Rewrite the stars[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 5694-5703. |
[33] | TANG Y H, HAN K, GUO J Y, et al. GhostNetv2: enhance cheap operation with long-range attention[EB/OL]. [2024-06-22] https://arxiv.org/abs/2211.12905. |
[34] | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778. |