SDENet: a synthetic defect data evaluation network based on multi-scale attention quality perception

doi:10.11996/JG.j.2095-302X.2025010094

Abstract

Evaluating the quality of defect data synthesized through data augmentation supports high-quality expansion of defect datasets, thereby mitigating the poor detection performance caused by insufficient defect data. Existing quality evaluation algorithms focus primarily on the distortion characteristics of synthetic defect data and tend to overlook its defect attributes. To address this issue, SDENet, a model based on attention feature enhancement (AFE) and multi-scale attention quality perception (MAQP), was proposed; it considered both the distortion characteristics and the defect attributes of synthesized defect data when evaluating quality. First, the AFE module improved the model's generalization to defects of different sizes and positions through a dual-branch pooling operation, and used an attention mechanism to strengthen the model's feature expression. Second, the MAQP module vectorized and fused the AFE-enhanced features to better perceive the quality of synthetic defect data. Finally, the fused features were fed into the quality evaluation section, which generated the final evaluation score. In experiments on a purpose-built synthetic defect dataset of road cracks, SDENet achieved the best results on the RMSE, RMAE, PLCC, and SROCC metrics, with improvements of 10.7%, 5.0%, 1.8%, and 1.8%, respectively, over the second-best model, verifying the effectiveness of the model. On the TID2013 distortion dataset, SDENet also produced competitive results, reaching 0.902 and 0.876 on PLCC and SROCC, respectively.
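
The three stages described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: "dual-branch pooling" is assumed to mean global average pooling plus global max pooling per channel, the attention is reduced to a sigmoid over the summed pooled descriptors (no learned layers), MAQP is assumed to vectorize each scale by global average pooling and fuse by concatenation, and the linear quality head with its weights is hypothetical.

```python
import math

# A feature map is a list of channels; each channel is a 2-D list (H x W).

def global_avg_pool(fmap):
    """Return one average value per channel."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]

def global_max_pool(fmap):
    """Return one maximum value per channel."""
    return [max(map(max, ch)) for ch in fmap]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def afe_like(fmap):
    """Assumed AFE stand-in: two pooling branches -> channel weights -> rescale."""
    avg = global_avg_pool(fmap)
    mx = global_max_pool(fmap)
    weights = [sigmoid(a + m) for a, m in zip(avg, mx)]
    return [[[w * v for v in row] for row in ch] for w, ch in zip(weights, fmap)]

def maqp_like(fmaps):
    """Assumed MAQP stand-in: vectorize each scale, then concatenate."""
    fused = []
    for fmap in fmaps:
        fused.extend(global_avg_pool(fmap))
    return fused

def quality_head(vec, weights, bias=0.0):
    """Hypothetical linear regressor that maps the fused vector to a score."""
    return sum(w * v for w, v in zip(weights, vec)) + bias

# Toy input: two scales, 2 channels at 4x4 and 3 channels at 2x2.
scale_a = [[[float(c + 1)] * 4 for _ in range(4)] for c in range(2)]
scale_b = [[[0.5 * (c + 1)] * 2 for _ in range(2)] for c in range(3)]

enhanced = [afe_like(scale_a), afe_like(scale_b)]
fused = maqp_like(enhanced)          # one value per channel, across both scales
score = quality_head(fused, weights=[0.1] * len(fused))
print(len(fused), score)
```

Because each scale is pooled to a fixed-length vector before fusion, the regressor input size depends only on the channel counts, not on the spatial size of the feature maps.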

Key words: attention mechanism, feature enhancement, feature fusion, synthetic defect data, quality evaluation

LU Yang, CHEN Linhui, JIANG Xiaoheng, XU Mingliang. SDENet: a synthetic defect data evaluation network based on multi-scale attention quality perception[J]. Journal of Graphics, 2025, 46(1): 94-103.

References

[1]	崔克彬, 焦静颐. 基于MCB-FAH-YOLOv8的钢材表面缺陷检测算法[J]. 图学学报, 2024, 45(1): 112-125.
	CUI K B, JIAO J Y. Steel surface defect detection algorithm based on MCB-FAH-YOLOv8[J]. Journal of Graphics, 2024, 45(1): 112-125 (in Chinese).
[2]	鄢杰斌, 方玉明, 刘学林. 图像质量评价研究综述—从失真的角度[J]. 中国图象图形学报, 2022, 27(5): 1430-1466.
	YAN J B, FANG Y M, LIU X L. The review of distortion-related image quality assessment[J]. Journal of Image and Graphics, 2022, 27(5): 1430-1466 (in Chinese).
[3]	李娜, 顾庆, 姜枫, 等. 一种基于卷积神经网络的砂岩显微图像特征表示方法[J]. 软件学报, 2020, 31(11): 3621-3639.
	LI N, GU Q, JIANG F, et al. Feature representation method of microscopic sandstone images based on convolutional neural network[J]. Journal of Software, 2020, 31(11): 3621-3639 (in Chinese).
[4]	易令, 吕忠元, 丁进良, 等. 面向原油总氢物性预测的数据扩增预处理方法[J]. 控制与决策, 2018, 33(12): 2153-2160.
	YI L, LYU Z Y, DING J L, et al. Data pretreatment approach for crude oil hydrogen properties prediction[J]. Control and Decision, 2018, 33(12): 2153-2160 (in Chinese).
[5]	李琦. 基于深度学习的布匹缺陷检测算法研究[D]. 大连: 大连理工大学, 2022.
	LI Q. Research on fabric defect detection algorithm based on deep learning[D]. Dalian: Dalian University of Technology, 2022 (in Chinese).
[6]	柴伟佳, 王连明. 卷积神经网络的多字体汉字识别[J]. 中国图象图形学报, 2018, 23(3): 410-417.
	CHAI W J, WANG L M. Recognition of Chinese characters using deep convolutional neural network[J]. Journal of Image and Graphics, 2018, 23(3): 410-417 (in Chinese).
[7]	KINGMA D P, WELLING M. Auto-encoding variational bayes[EB/OL]. (2022-12-10)[2024-04-16]. https://www.arxiv.org/abs/1312.6114v10.
[8]	CRESWELL A, WHITE T, DUMOULIN V, et al. Generative adversarial networks: an overview[J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65.
[9]	KAMATA H, MUKUTA Y, HARADA T. Fully spiking variational autoencoder[C]// The 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 7059-7067.
[10]	DENTON E, CHINTALA S, SZLAM A, et al. Deep generative image models using a Laplacian pyramid of adversarial networks[C]// The 28th International Conference on Neural Information Processing Systems. New York: ACM, 2015: 1486-1494.
[11]	杨春玲, 杨雅静. 基于多尺度特征逐层融合深度神经网络的无参考图像质量评价方法[J]. 华南理工大学学报(自然科学版), 2022, 50(4): 81-89, 141.
	YANG C L, YANG Y J. A deep neural network based on layer-by-layer fusion of multi-scale features for no-reference image quality assessment[J]. Journal of South China University of Technology (Natural Science Edition), 2022, 50(4): 81-89, 141 (in Chinese).
[12]	JIANG W, LI L T, MA Y, et al. Image quality assessment with transformers and multi-metric fusion modules[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 1804-1808.
[13]	SAHA A, MISHRA S, BOVIK A C. Re-IQA: unsupervised learning for image quality assessment in the wild[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 5846-5855.
[14]	史再峰, 佟博文, 孔凡宁, 等. 基于双重注意力和分层感知表征的IQA方法[J]. 天津大学学报 (自然科学与工程技术版), 2024, 57(3): 234-243.
	SHI Z F, TONG B W, KONG F N, et al. Image quality assessment method based on dual attention and hierarchical perceptual representation[J]. Journal of Tianjin University (Science and Technology), 2024, 57(3): 234-243 (in Chinese).
[15]	HE G, WANG Y, XU L, et al. Focused feature differentiation network for image quality assessment[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. New York: IEEE Press, 2022: 1799-1803.
[16]	ZHANG G J, CUI K W, HUNG T Y, et al. Defect-GAN: high-fidelity defect synthesis for automated defect inspection[C]// 2021 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2021: 2523-2533.
[17]	NIU S L, LI B, WANG X G, et al. Defect image sample generation with GAN for improving defect recognition[J]. IEEE Transactions on Automation Science and Engineering, 2020, 17(3): 1611-1622.
[18]	NIU S L, LI B, WANG X G, et al. Region- and strength-controllable GAN for defect generation and segmentation in industrial images[J]. IEEE Transactions on Industrial Informatics, 2022, 18(7): 4531-4541.
[19]	YANG B Y, LIU Z Y, DUAN G F, et al. Mask2Defect: a prior knowledge-based data augmentation method for metal surface defect inspection[J]. IEEE Transactions on Industrial Informatics, 2022, 18(10): 6743-6755.
[20]	ZHANG Y L, WANG Y L, JIANG Z Q, et al. Diversifying tire-defect image generation based on generative adversarial network[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 5007312.
[21]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[22]	YU F, KOLTUN V, FUNKHOUSER T. Dilated residual networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 636-644.
[23]	XIE S N, GIRSHICK R, DOLLÁR P, et al. Aggregated residual transformations for deep neural networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 5987-5995.
[24]	郭琪周, 袁春. 基于空间语义信息特征融合的目标检测与分割[J]. 软件学报, 2023, 34(6): 2776-2788.
	GUO Q Z, YUAN C. Leveraging spatial-semantic information in object detection and segmentation[J]. Journal of Software, 2023, 34(6): 2776-2788 (in Chinese).
[25]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 936-944.
[26]	FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23) [2024-04-16]. https://arxiv.org/abs/1701.06659.
[27]	李利霞, 王鑫, 王军, 等. 基于特征融合与注意力机制的无人机图像小目标检测算法[J]. 图学学报, 2023, 44(4): 658-666.
	LI L X, WANG X, WANG J, et al. Small object detection algorithm in UAV image based on feature fusion and attention mechanism[J]. Journal of Graphics, 2023, 44(4): 658-666 (in Chinese).
[28]	吕佳, 孙亚南, 许鹏程. 条带池化注意力的实时语义分割算法[J]. 计算机辅助设计与图形学学报, 2023, 35(9): 1395-1404.
	LYU J, SUN Y N, XU P C. Stripe pooling attention for real-time semantic segmentation[J]. Journal of Computer-Aided Design & Computer Graphics, 2023, 35(9): 1395-1404 (in Chinese).
[29]	FU J, LIU J, TIAN H J, et al. Dual attention network for scene segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 3141-3149.
[30]	HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023.
[31]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 3-19.
[32]	CHEN Y P, KALANTIDIS Y, LI J S, et al. A²-Nets: double attention networks[C]// The 32nd International Conference on Neural Information Processing Systems. New York: ACM, 2018: 350-359.
[33]	唐祎玲, 江顺亮, 徐少平, 等. 考虑双目竞争视觉现象的非对称失真立体图像质量评价方法[J]. 中国图象图形学报, 2023, 28(10): 3049-3063.
	TANG Y L, JIANG S L, XU S P, et al. Binocular rivalry-based stereoscopic images quality assessment relevant to its asymmetric and distorted contexts[J]. Journal of Image and Graphics, 2023, 28(10): 3049-3063 (in Chinese).
[34]	ZHANG W X, ZHAI G T, WEI Y, et al. Blind image quality assessment via vision-language correspondence: a multitask learning perspective[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 14071-14081.
[35]	杨岸霖, 蔡永香, 胡华科, 等. MR-GA: 一种基于实例分割的地下排水管道缺陷评估方法[J]. 给水排水, 2024, 60(6): 137-145.
	YANG A L, CAI Y X, HU H K, et al. MR-GA: an instance segmentation-based method for assessing underground drainage pipe defects[J]. Water & Wastewater Engineering, 2024, 60(6): 137-145 (in Chinese).
[36]	倪放翊. 基于生成对抗网络的缺陷样本生成与缺陷检测方法研究[D]. 哈尔滨: 哈尔滨工业大学, 2021.
	NI F Y. Research on GAN-based defect sample generation and defect detection[D]. Harbin: Harbin Institute of Technology, 2021 (in Chinese).
[37]	郭逸汀. 光谱与工艺协同感知的弧焊质量在线预测技术[D]. 南京: 南京理工大学, 2019.
	GUO Y T. Online prediction technology for arc welding quality based on spectral and process collaborative perception[D]. Nanjing: Nanjing University of Science & Technology, 2019 (in Chinese).
[38]	SHI Y, CUI L M, QI Z Q, et al. Automatic road crack detection using random structured forests[J]. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(12): 3434-3445.
[39]	褚江, 陈强, 杨曦晨. 全参考图像质量评价综述[J]. 计算机应用研究, 2014, 31(1): 13-22.
	CHU J, CHEN Q, YANG X C. Review on full reference image quality assessment algorithms[J]. Application Research of Computers, 2014, 31(1): 13-22 (in Chinese).
[40]	SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 4510-4520.
[41]	SU S L, YAN Q S, ZHU Y, et al. Blindly assess image quality in the wild guided by a self-adaptive hyper network[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3664-3673.
[42]	ZHANG W X, MA K D, YAN J, et al. Blind image quality assessment using a deep bilinear convolutional neural network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(1): 36-47.
[43]	GOLESTANEH S A, DADSETAN S, KITANI K M. No-reference image quality assessment via transformers, relative ranking, and self-consistency[C]// 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2022: 3989-3999.
[44]	HE S, ZHANG Y C, XIE R, et al. Rethinking image aesthetics assessment: models, datasets and benchmarks[C]// The 31st International Joint Conference on Artificial Intelligence. Vienna: IJCAI.Org, 2022: 942-948.
[45]	YU L, LI J Y, PAKDAMAN F, et al. MAMIQA: no-reference image quality assessment based on multiscale attention mechanism with natural scene statistics[J]. IEEE Signal Processing Letters, 2023, 30: 588-592.
[46]	WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 11531-11539.

Model	Param/M	FLOPs/G	Speed/FPS	Synthetic defect dataset RMSE↓	Synthetic defect dataset RMAE↓	Synthetic defect dataset PLCC↑	Synthetic defect dataset SROCC↑	TID2013 PLCC↑	TID2013 SROCC↑
TANet^[44]	13.88	2.16	125.6	1.195	0.992	0.786	0.786	0.510	0.502
MobileNetV2^[40]	2.23	0.33	225.6	1.170	0.970	0.807	0.805	0.665	0.548
MAMIQA^[45]	39.22	5.45	68.4	0.975	0.880	0.865	0.869	0.937	0.928
TReSNet^[43]	34.46	8.39	43.3	0.944	0.858	0.878	0.881	0.883	0.863
DBCNN^[42]	15.31	16.50	175.1	0.808	0.802	0.886	0.888	0.865	0.816
ResNet18^[21]	11.44	1.82	380.0	0.768	0.776	0.904	0.904	0.843	0.810
HyperIQA^[41]	27.38	4.34	171.1	0.774	0.767	0.904	0.905	0.858	0.840
SDENet (ours)	11.46	1.83	351.5	0.667	0.717	0.922	0.923	0.902	0.876
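
The improvement figures quoted in the abstract (10.7%, 5.0%, 1.8%, 1.8%) appear to match the absolute differences, multiplied by 100, between SDENet and HyperIQA on the synthetic defect dataset, rather than relative changes. That reading is an inference from the table values, not something the source states; note also that ResNet18, not HyperIQA, has the second-lowest RMSE (0.768). The short check below reproduces the arithmetic under that assumption.

```python
# Values copied from the comparison table (synthetic defect dataset columns).
sdenet = {"RMSE": 0.667, "RMAE": 0.717, "PLCC": 0.922, "SROCC": 0.923}
hyperiqa = {"RMSE": 0.774, "RMAE": 0.767, "PLCC": 0.904, "SROCC": 0.905}
lower_is_better = {"RMSE", "RMAE"}

def gap_x100(metric):
    """Absolute gap in SDENet's favour, scaled by 100 (assumed convention)."""
    if metric in lower_is_better:
        diff = hyperiqa[metric] - sdenet[metric]
    else:
        diff = sdenet[metric] - hyperiqa[metric]
    return round(diff * 100, 1)

gaps = {m: gap_x100(m) for m in sdenet}
print(gaps)  # {'RMSE': 10.7, 'RMAE': 5.0, 'PLCC': 1.8, 'SROCC': 1.8}
```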

Method	RMSE↓	RMAE↓	PLCC↑	SROCC↑
Baseline	0.768	0.776	0.904	0.904
Baseline+MQP	0.730	0.747	0.909	0.911
Baseline+MAQP	0.667	0.717	0.922	0.923
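
For readers who want to reproduce the four columns, the metrics can be computed with the standard library alone, as in the sketch below. RMSE, PLCC (Pearson), and SROCC (Spearman, computed as Pearson on average ranks) follow their usual definitions. RMAE is not defined in this excerpt; it is assumed here to be the square root of the mean absolute error, which is at least consistent with the tables, where RMAE exceeds RMSE for values below 1. The sample scores are made up for illustration.

```python
import math

def rmse(pred, target):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def rmae(pred, target):
    """Assumption: RMAE = sqrt(mean absolute error)."""
    return math.sqrt(sum(abs(p - t) for p, t in zip(pred, target)) / len(pred))

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))

# Hypothetical predicted scores and ground-truth quality labels.
predicted = [1.0, 2.5, 2.0, 4.0, 5.5]
labels = [1.0, 2.0, 3.0, 4.0, 5.0]

print(rmse(predicted, labels), rmae(predicted, labels))
print(plcc(predicted, labels), srocc(predicted, labels))
```

On this toy data the assumed RMAE is larger than the RMSE, the same ordering seen in most rows of the tables.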

Operation	AFE L₁	AFE L₂	AFE L₃	AFE L₄	RMSE↓	RMAE↓	PLCC↑	SROCC↑
NULL-NULL-NULL-NULL	-	-	-	-	0.730	0.747	0.909	0.911
NULL-A-NULL-NULL	-	√	-	-	0.716	0.743	0.914	0.914
NULL-A-A-NULL	-	√	√	-	0.691	0.726	0.918	0.918
NULL-A-A-A (ours)	-	√	√	√	0.667	0.717	0.922	0.923
A-A-A-A	√	√	√	√	0.676	0.717	0.919	0.919
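
The AFE placement ablation can be read mechanically by picking the best configuration per metric. The snippet below is a small convenience sketch: the values are copied from the table above, and the helper name `best_configs` is illustrative.

```python
# (RMSE, RMAE, PLCC, SROCC) per AFE placement, copied from the ablation table.
rows = {
    "NULL-NULL-NULL-NULL": (0.730, 0.747, 0.909, 0.911),
    "NULL-A-NULL-NULL": (0.716, 0.743, 0.914, 0.914),
    "NULL-A-A-NULL": (0.691, 0.726, 0.918, 0.918),
    "NULL-A-A-A": (0.667, 0.717, 0.922, 0.923),
    "A-A-A-A": (0.676, 0.717, 0.919, 0.919),
}
metrics = ("RMSE", "RMAE", "PLCC", "SROCC")
lower_is_better = {"RMSE", "RMAE"}

def best_configs(metric):
    """Return every configuration that attains the best value for a metric."""
    idx = metrics.index(metric)
    pick = min if metric in lower_is_better else max
    target = pick(v[idx] for v in rows.values())
    return [name for name, v in rows.items() if v[idx] == target]

winners = {m: best_configs(m) for m in metrics}
for metric, names in winners.items():
    print(metric, names)
```

NULL-A-A-A is the sole winner on RMSE, PLCC, and SROCC and ties with A-A-A-A on RMAE, so on these numbers adding AFE at L₁ does not improve any metric.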