Journal of Graphics ›› 2026, Vol. 47 ›› Issue (1): 17-28. DOI: 10.11996/JG.j.2095-302X.2026010017
ZHAI Yongjie, WANG Zixuan, ZHANG Zhenqi, ZHOU Xunqi, WANG Qianming
Corresponding author: WANG Qianming, E-mail: qianmingwang@ncepu.edu.cn
Received:2025-02-28
Accepted:2025-06-23
Published:2026-02-28
Online:2026-03-16
Abstract: Vehicle damage images uploaded by auto-insurance claimants often contain damage types with similar appearances, which makes them difficult to classify. To address this problem, a vehicle damage classification model named ResAWDNet is proposed. First, to strengthen the model's ability to extract damage features, the original downsampling operations are replaced with weighted dynamic convolution, which adjusts the convolution kernel weights according to the input features; this improves the model's adaptability to features of different scales and orientations, allowing it to capture subtle differences between damage types more accurately. Second, to direct the model toward the discriminative regions and feature channels of an image, a dual attention mechanism is embedded after the convolutional layers of the backbone, learning importance weights along both the spatial and channel dimensions; this strengthens the model's ability to capture key information and further improves its decision accuracy in the damage classification task. Finally, experiments were conducted on a vehicle damage image dataset built from real accident cases. The results show that ResAWDNet is practical and clearly advantageous for vehicle damage classification, reaching an overall accuracy of 73.79%. Compared with the baseline model, ResAWDNet achieves higher accuracy across multiple damage types, demonstrating the model's effectiveness.
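The weighted dynamic convolution described above replaces a fixed downsampling kernel with an input-conditioned mixture of candidate kernels. The following NumPy sketch shows the kernel-aggregation idea in the general dynamic-convolution style; the function names, the single-layer attention projection, and the single-channel kernels are illustrative assumptions, not the authors' exact design:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

def weighted_dynamic_kernel(x, kernels, w_att):
    """Aggregate K candidate kernels into one input-dependent kernel.

    x       : (C, H, W) input feature map
    kernels : (K, kh, kw) candidate convolution kernels
    w_att   : (K, C) projection mapping a channel descriptor to K scores
    """
    descriptor = x.mean(axis=(1, 2))         # global average pooling -> (C,)
    pi = softmax(w_att @ descriptor)         # per-input kernel weights, sum to 1
    agg = np.tensordot(pi, kernels, axes=1)  # weighted kernel sum -> (kh, kw)
    return agg, pi
```

Because the weights `pi` depend on the pooled input, two different images select two different effective kernels, which is what lets such a layer adapt to damage patterns of varying scale and orientation.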
ZHAI Yongjie, WANG Zixuan, ZHANG Zhenqi, ZHOU Xunqi, WANG Qianming. A vehicle damage classification model incorporating dual attention and weighted dynamic convolution[J]. Journal of Graphics, 2026, 47(1): 17-28.
Fig. 1 Examples of the ten damage types ((a) Loss; (b) Glass breakage; (c) Glass scratches; (d) Mild deformation; (e) Moderate deformation; (f) Severe deformation; (g) Misalignment; (h) Tearing; (i) Body scuffing; (j) Body scratches)
Table 1 Dataset composition
| Damage type | Training images | Test images | Total images |
|---|---|---|---|
| Misalignment | 2 749 | 916 | 3 665 |
| Glass breakage | 913 | 304 | 1 217 |
| Glass cracks | 933 | 311 | 1 244 |
| Moderate deformation | 1 493 | 497 | 1 990 |
| Mild deformation | 1 488 | 496 | 1 984 |
| Loss | 2 683 | 894 | 3 577 |
| Body scratches | 4 336 | 1 445 | 5 781 |
| Body scuffing | 4 116 | 1 371 | 5 487 |
| Severe deformation | 1 455 | 485 | 1 940 |
| Tearing | 2 684 | 894 | 3 578 |
Table 2 Results of learning rate tuning
| Learning rate | Acc_1/% |
|---|---|
| 0.01 | 51.67 |
| 0.005 | 53.58 |
| 0.001 | 62.45 |
| 0.000 5 | 69.50 |
| 0.000 1 | 73.79 |
| 0.000 05 | 73.19 |
| 0.000 01 | 72.81 |
Table 3 Ablation experiment results
| Model | Acc_1/% | Acc_5/% |
|---|---|---|
| Baseline | 71.88 | 97.24 |
| Baseline+WDConv | 73.05 | 97.16 |
| Baseline+DAM | 72.97 | 97.14 |
| ResAWDNet (ours) | 73.79 | 97.68 |
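The Acc_1 and Acc_5 columns are the standard top-1 and top-5 accuracies: the percentage of test images whose true label is the single highest-scoring prediction, or among the five highest. A small self-contained sketch of the metric (the function name and toy data are illustrative):

```python
import numpy as np

def topk_accuracy(logits, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    logits : (N, num_classes) prediction scores
    labels : (N,) integer ground-truth class indices
    """
    topk = np.argsort(-logits, axis=1)[:, :k]     # indices of the k best classes
    hits = (topk == labels[:, None]).any(axis=1)  # is the true label among them?
    return 100.0 * hits.mean()
```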
Table 4 Classification accuracy for each damage type during the ablation study
| Damage type | Baseline | +DAM | +WDConv | ResAWDNet |
|---|---|---|---|---|
| Misalignment | 79.26 | 85.37 | 82.97 | 82.10 |
| Glass breakage | 76.64 | 78.62 | 80.92 | 77.96 |
| Glass cracks | 64.95 | 68.81 | 67.85 | 71.06 |
| Moderate deformation | 25.75 | 47.89 | 23.94 | 30.99 |
| Mild deformation | 51.81 | 46.17 | 53.23 | 49.80 |
| Loss | 73.60 | 73.60 | 78.19 | 80.09 |
| Body scratches | 90.73 | 89.34 | 88.86 | 92.25 |
| Body scuffing | 73.89 | 73.01 | 73.52 | 75.13 |
| Severe deformation | 52.99 | 63.30 | 70.72 | 61.44 |
| Tearing | 63.87 | 66.89 | 70.13 | 67.90 |
Table 5 Comparison of attention mechanisms
| Attention mechanism | Acc_1/% | Acc_5/% |
|---|---|---|
| Baseline | 71.88 | 97.24 |
| Baseline+SE[23] | 72.32 | 97.48 |
| Baseline+CBAM[25] | 72.53 | 97.74 |
| Baseline+EMA[27] | 72.85 | 97.33 |
| Baseline+EPSA[28] | 72.61 | 97.36 |
| Baseline+ECA[29] | 72.93 | 97.62 |
| Baseline+RGA[33] | 72.49 | 97.35 |
| Baseline+CPCA[34] | 72.61 | 97.22 |
| Baseline+DAM | 72.97 | 97.14 |
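Like the CBAM-style modules compared above, a dual attention module weights features along the channel dimension and then the spatial dimension. The NumPy sketch below is a deliberate simplification: the bottleneck MLP for channel attention is kept, but the spatial branch uses plain avg/max pooling in place of the usual convolution, and all names are illustrative rather than the authors' exact module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dual_attention(x, w1, w2):
    """Apply channel attention then spatial attention to x: (C, H, W).

    w1 : (C//r, C) and w2 : (C, C//r) form the squeeze-excite bottleneck MLP.
    """
    # channel attention: squeeze spatial dims, excite through the bottleneck MLP
    c = x.mean(axis=(1, 2))                      # channel descriptor -> (C,)
    ca = sigmoid(w2 @ np.maximum(w1 @ c, 0.0))   # per-channel gates in (0, 1)
    x = x * ca[:, None, None]
    # spatial attention: pool across channels, gate each spatial location
    sa = sigmoid(x.mean(axis=0) + x.max(axis=0))  # (H, W) gates in (0, 1)
    return x * sa[None, :, :]
```

Because both gates lie in (0, 1), the module can only suppress features, never amplify them; the network learns to leave discriminative regions and channels close to 1.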
Table 6 Comparison of classification performance by damage type/%
| Damage type | Baseline Acc | Baseline Pre | ResAWDNet Acc | ResAWDNet Pre |
|---|---|---|---|---|
| Misalignment | 79.26 | 83.75 | 82.10 | 81.56 |
| Glass breakage | 76.64 | 75.08 | 77.96 | 81.72 |
| Glass cracks | 64.95 | 73.49 | 71.06 | 74.16 |
| Moderate deformation | 25.75 | 43.86 | 30.99 | 49.04 |
| Mild deformation | 51.81 | 49.68 | 49.80 | 49.60 |
| Loss | 73.60 | 72.78 | 80.09 | 74.11 |
| Body scratches | 90.73 | 88.09 | 92.25 | 86.33 |
| Body scuffing | 73.89 | 68.72 | 75.13 | 71.23 |
| Severe deformation | 52.99 | 67.95 | 61.44 | 62.47 |
| Tearing | 63.87 | 68.63 | 67.90 | 77.03 |
Table 7 Comparison with other models
| Model | Acc_1/% | Acc_5/% | FLOPs | Params/M |
|---|---|---|---|---|
| AlexNet[36] | 57.22 | 92.51 | 309.16 M | 14.60 |
| GoogLeNet[37] | 62.17 | 94.33 | 1.58 G | 6.99 |
| MobileNet[38] | 58.08 | 94.02 | 327.55 M | 3.50 |
| ShuffleNet[39] | 71.93 | 97.48 | 152.71 M | 2.28 |
| DenseNet[40] | 72.72 | 97.11 | 2.90 G | 7.98 |
| EfficientNet[41] | 69.80 | 96.97 | 412.83 M | 5.29 |
| RegNet[42] | 72.77 | 97.65 | 207.35 M | 2.32 |
| EfficientNetV2[43] | 71.97 | 97.01 | 2.89 G | 21.46 |
| FasterNet[44] | 73.36 | 97.74 | 4.45 G | 31.18 |
| RepLKNet[45] | 72.75 | 97.52 | - | 304.66 |
| StarNet[46] | 60.28 | 94.48 | 427.33 M | 2.87 |
| ResNet[17] | 71.88 | 97.24 | 4.13 G | 25.56 |
| ViT-B16[47] | 64.59 | 95.97 | 16.88 G | 103.03 |
| ViT-B32[47] | 68.53 | 97.02 | 4.37 G | 88.19 |
| ViT-L16[47] | 72.32 | 97.90 | 59.69 G | 304.12 |
| ViT-L32[47] | 66.08 | 96.64 | 15.28 G | 328.89 |
| SwinT-T[48] | 72.76 | 97.60 | 4.37 G | 28.27 |
| SwinT-S[48] | 73.11 | 97.20 | 8.55 G | 49.56 |
| SwinT-B[48] | 72.90 | 97.65 | 23.57 G | 109.07 |
| MobileViT[49] | 72.19 | 97.29 | 273.67 M | 1.27 |
| ResAWDNet | 73.79 | 97.68 | 3.94 G | 26.42 |
Table 8 Comparison on the CarDD dataset
| Model | Acc_1/% | Acc_5/% |
|---|---|---|
| ShuffleNet[39] | 58.77 | 99.60 |
| DenseNet[40] | 59.09 | 99.84 |
| FasterNet[44] | 54.81 | 99.75 |
| ResNet[17] | 59.18 | 99.51 |
| ViT-L16[47] | 58.85 | 99.76 |
| SwinT-S[48] | 59.82 | 99.68 |
| MobileViT[49] | 60.15 | 99.78 |
| ResAWDNet | 60.43 | 99.68 |
Fig. 7 Classification results ((a) Moderate deformation; (b) Glass scratches; (c) Loss; (d) Body scratches; (e) Glass breakage; (f) Mild deformation; (g) Tearing; (h) Body scuffing; (i) Severe deformation; (j) Misalignment)
[1] ZHAO Z H, SHEN Y, LI W. Application and value research about apps of vehicle survey and loss assessment based on image recognition[J]. Journal of Insurance Professional College, 2019, 33(3): 73-77 (in Chinese).
[2] ZHAI Y J, LI J W, CHEN N H, et al. The vehicle parts detection method enhanced with Transformer integration[J]. Journal of Graphics, 2024, 45(5): 930-940 (in Chinese).
[3] WU B, TIAN Y. Research on multi-scale road damage detection algorithm based on attention mechanism[J]. Journal of Graphics, 2024, 45(4): 770-778 (in Chinese).
[4] LIU Q, HUANG X H, SHAO X Y, et al. Industrial cylinder liner defect detection using a transformer with a block division and mask mechanism[J]. Scientific Reports, 2022, 12(1): 10689.
[5] PARK J K, KWON B K, PARK J H, et al. Machine learning-based imaging system for surface defect inspection[J]. International Journal of Precision Engineering and Manufacturing-Green Technology, 2016, 3(3): 303-310.
[6] WU S Q, ZHAO S Y, ZHANG Q Q, et al. Steel surface defect classification based on small sample learning[J]. Applied Sciences, 2021, 11(23): 11459.
[7] LIU W, QIU J, WANG Y, et al. Multiscale feature fusion convolutional neural network for surface damage detection in retired steel shafts[J]. Journal of Computing and Information Science in Engineering, 2024, 24(4): 041005.
[8] WANG R F. Research on image classification algorithm based on dictionary learning[D]. Chongqing: Chongqing University of Posts and Telecommunications, 2020 (in Chinese).
[9] ZHANG P F, SHI Z L, LI X Y, et al. Classification algorithm of main bearing cap based on deep learning[J]. Journal of Graphics, 2021, 42(4): 572-580 (in Chinese).
[10] DONG X. Research on image classification optimization algorithm of convolutional neural network[D]. Huainan: Anhui University of Science & Technology, 2020 (in Chinese).
[11] HE M X, YU Y, CHENG R Q. Vehicle logo recognition method based on feature enhancement[J]. Journal of Image and Graphics, 2021, 26(5): 1030-1040 (in Chinese).
[12] ANANDA B, PUTRI R A. K-nearest neighbor algorithm and case base reasoning on xenia car damage detection expert system[J]. Journal of Computer Networks, Architecture and High Performance Computing, 2024, 6(2): 633-646.
[13] MISHRA S, KAMAL D, SENTHIL KUMAR K. Vehicle damage identification using deep learning techniques[C]// 2024 IEEE International Students' Conference on Electrical, Electronics and Computer Science. New York: IEEE Press, 2024: 1-6.
[14] PENG J B, DONG S B, YUAN H, et al. Car damage detection based on multi-view fusion and alignment: dataset and method[J]. IEEE Transactions on Intelligent Transportation Systems, 2025, 26(4): 4717-4730.
[15] SHUBHAM, BANERJEE D. Robust car damage identification through CNN and SVM techniques[C]// The 4th International Conference on Technological Advancements in Computational Sciences. New York: IEEE Press, 2024: 101-107.
[16] WANG X K. Research on car exterior damage recognition and image generation based on deep learning[D]. Hefei: University of Science and Technology of China, 2024 (in Chinese).
[17] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[18] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[19] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs[EB/OL]. [2024-12-28]. https://arxiv.org/abs/1412.7062.
[20] KARRAS T, AILA T, LAINE S, et al. Progressive growing of GANs for improved quality, stability, and variation[EB/OL]. [2024-12-28]. https://openreview.net/forum?id=Hk99zCeAb.
[21] GU Z H, LIU G Q, SHAO C B, et al. Downsampling algorithm with fusion of different receptive field sizes in deep detection methods[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(10): 2727-2737 (in Chinese).
[22] XIE D S. Research on vehicle intelligent damage location algorithm based on deep learning[D]. Tianjin: Tianjin University, 2019 (in Chinese).
[23] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7132-7141.
[24] JADERBERG M, SIMONYAN K, ZISSERMAN A. Spatial transformer networks[C]// The 29th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 2017-2025.
[25] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 3-19.
[26] GAO Y, ZENG Z, DU D, et al. SeerAttention: learning intrinsic sparse attention in your LLM[EB/OL]. [2024-12-28]. https://arxiv.org/abs/2410.13276.
[27] OUYANG D L, HE S, ZHANG G Z, et al. Efficient multi-scale attention module with cross-spatial learning[C]// ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2023: 1-5.
[28] ZHANG H, ZU K K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network[C]// The 16th Asian Conference on Computer Vision. Cham: Springer, 2022: 541-557.
[29] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 11531-11539.
[30] CHEN L W, FU Y, WEI K X, et al. Instance segmentation in the dark[J]. International Journal of Computer Vision, 2023, 131(8): 2198-2218.
[31] WANG F, JIANG M Q, QIAN C, et al. Residual attention network for image classification[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 3156-3164.
[32] WANG X K, LI W J, WU Z C. CarDD: a new dataset for vision-based car damage detection[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(7): 7202-7214.
[33] ZHANG Z Z, LAN C L, ZENG W J, et al. Relation-aware global attention for person re-identification[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3183-3192.
[34] HUANG H J, CHEN Z G, ZOU Y, et al. Channel prior convolutional attention for medical image segmentation[J]. Computers in Biology and Medicine, 2024, 178: 108784.
[35] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 618-626.
[36] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]// The 26th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2012: 1097-1105.
[37] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1-9.
[38] HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2024-12-28]. https://arxiv.org/abs/1704.04861.
[39] ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 6848-6856.
[40] HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2261-2269.
[41] TAN M X, LE Q. EfficientNet: rethinking model scaling for convolutional neural networks[EB/OL]. [2024-12-28]. https://proceedings.mlr.press/v97/tan19a.html.
[42] XU J, PAN Y, PAN X L, et al. RegNet: self-regulated network for image classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(11): 9562-9567.
[43] TAN M X, LE Q. EfficientNetV2: smaller models and faster training[EB/OL]. [2024-12-28]. https://proceedings.mlr.press/v139/tan21a.
[44] CHEN J R, KAO S H, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 12021-12031.
[45] DING X H, ZHANG X Y, HAN J G, et al. Scaling up your kernels to 31×31: revisiting large kernel design in CNNs[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 11953-11965.
[46] MA X, DAI X Y, BAI Y, et al. Rewrite the stars[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 5694-5703.
[47] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[EB/OL]. [2024-12-28]. https://openreview.net/forum?id=YicbFdNTTy.
[48] LIU Z, LIN Y T, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 9992-10002.
[49] MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer[EB/OL]. [2024-12-28]. https://openreview.net/forum?id=vh-0sUt8HlG.