The vehicle parts detection method enhanced with Transformer integration

doi:10.11996/JG.j.2095-302X.2024050930

Abstract

To address the false detections and missed detections caused by insufficient feature extraction and inadequate use of candidate boxes in vehicle component detection models, an improved Transformer-based method for vehicle component detection is proposed. First, multi-head self-attention and bi-level routing attention are combined into a key region multi-head self-attention (KR-MHSA) mechanism. Second, KR-MHSA is integrated with the final layer of the ResNet in the baseline model (Mask R-CNN) through residual fusion, strengthening the model's basic feature extraction. Finally, an improved Swin Transformer performs feature learning on the candidate boxes generated by the model, so that the model better captures the differences and similarities among candidate boxes. Experiments on a purpose-built dataset of 59 vehicle component categories show that the proposed model outperforms other state-of-the-art instance segmentation models in both detection and segmentation. Compared with the baseline model, detection accuracy improves by 4.47% and segmentation accuracy by 4.4%. The method thereby alleviates insufficient feature extraction and inadequate use of candidate boxes in vehicle component detection, which can help insurance companies identify and replace damaged parts more accurately and process claims more efficiently.
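
The residual fusion step described in the abstract can be illustrated with a minimal sketch. It assumes plain multi-head self-attention with identity query/key/value projections; the key-region selection that bi-level routing attention adds in KR-MHSA is not reproduced here, and the function names (`multi_head_self_attention`, `residual_fusion`) and toy feature values are hypothetical, not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(tokens, num_heads):
    """Plain multi-head self-attention over a list of feature vectors.

    Assumption: Q, K and V are the input itself (no learned projections),
    and no key-region routing is applied, so this is NOT the paper's KR-MHSA.
    """
    channels = len(tokens[0])
    head_dim = channels // num_heads
    out = [[0.0] * channels for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads = [t[lo:hi] for t in tokens]  # this head's channel slice per token
        for i, query in enumerate(heads):
            # Scaled dot-product scores of token i against every token.
            scores = [
                sum(a * b for a, b in zip(query, key)) / math.sqrt(head_dim)
                for key in heads
            ]
            weights = softmax(scores)
            for j in range(head_dim):
                out[i][lo + j] = sum(w * v[j] for w, v in zip(weights, heads))
    return out


def residual_fusion(tokens, num_heads=2):
    """Add the attention output back onto the input features."""
    attended = multi_head_self_attention(tokens, num_heads)
    return [[x + a for x, a in zip(t, att)] for t, att in zip(tokens, attended)]


# Toy stand-in for flattened final-layer backbone features: 3 tokens, 4 channels.
features = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
fused = residual_fusion(features, num_heads=2)
```

Because the attention output is added to the input rather than replacing it, the original convolutional features are preserved and the attention branch only contributes a correction, which is the usual motivation for a residual connection.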

Key words: vehicle parts, deep learning, instance segmentation, Mask R-CNN, feature extraction, multi-head self-attention, bi-level routing attention

ZHAI Yongjie, LI Jiawei, CHEN Nianhao, WANG Qianming, WANG Xinying. The vehicle parts detection method enhanced with Transformer integration[J]. Journal of Graphics, 2024, 45(5): 930-940.

Part category	Part category	Part category
1. Roof outer panel (Roof_outer_panel)	21. Rear door outer handle, right (Back_door_handle (right))	41. Front door trim strip, right (Front_door_trim (right))
2. Side mirror, right (Outer_mirror (right))	22. Rear door outer handle, left (Back_door_handle (left))	42. Front door trim strip, left (Front_door_trim (left))
3. Side mirror, left (Outer_mirror (left))	23. Rear fender, right (Rear_fender (right))	43. Front door outer handle, right (Front_door_handle (right))
4. Side mirror cover, right (Mirror_cover (right))	24. Rear fender, left (Rear_fender (left))	44. Front door outer handle, left (Front_door_handle (left))
5. Side mirror cover, left (Mirror_cover (left))	25. Rear fender wheel arch trim, right (Rear_fender_wheel_eyebrow (right))	45. Front fog lamp, right (Fog_lamp (right))
6. Side sill, right (Bottom_edge (right))	26. Rear fender wheel arch trim, left (Rear_fender_wheel_eyebrow (left))	46. Front fog lamp, left (Fog_lamp (left))
7. Side sill, left (Bottom_edge (left))	27. Liftgate glass (liftgate_glass)	47. Front fender, right (Front_fender (right))
8. Wheel rim (Steel_ring)	28. Liftgate shell (liftgate_shell)	48. Front fender, left (Front_fender (left))
9. Trunk lid (Baggage_cover)	29. Tire (Tire)	49. Front fender wheel arch trim, right (Front_fender_wheel_eyebrow (right))
10. Rear bumper parking sensor (Rear_bumper_electric_eye)	30. Inner tail light, right (Inner_tail_light (right))	50. Front fender wheel arch trim, left (Front_fender_wheel_eyebrow (left))
11. Rear bumper cover (Rear_bumper_skin)	31. Inner tail light, left (Inner_tail_light (left))	51. Outer tail light, right (Exterior_tail_light (right))
12. Rear bumper decorative light, right (Rear_bumper_decorative_light (right))	32. Front bumper cover (Front_bumper_skin)	52. Outer tail light, left (Exterior_tail_light (left))
13. Rear bumper decorative light, left (Rear_bumper_decorative_light (left))	33. Front bumper lower grille (Front_bumper_lower_grille)	53. Tail light, right (Tail_light (right))
14. Rear windshield glass (Rear_window_glass)	34. Headlamp, right (Head_lamp (right))	54. Tail light, left (Tail_light (left))
15. Rear door glass, right (Rear_door_glass (right))	35. Headlamp, left (Head_lamp (left))	55. Fuel tank cap (Fuel_tank_cap)
16. Rear door glass, left (Rear_door_glass (left))	36. Front windshield glass (Front_window_glass)	56. Grille (Grille)
17. Rear door shell, right (Back_door_shell (right))	37. Front door glass, right (Front_door_glass (right))	57. Grille logo (Grille_logo)
18. Rear door shell, left (Back_door_shell (left))	38. Front door glass, left (Front_door_glass (left))	58. Engine hood (Engine_cover)
19. Rear door trim strip, right (Rear_door_trim (right))	39. Front door shell, right (Car_right_door)	59. License plate (License_plate)
20. Rear door trim strip, left (Rear_door_trim (left))	40. Front door shell, left (Car_left_door)

Method	KR-MHSA	WSWformer	Layer3	Layer4	AP50 (Bbox)/%	AP50 (Segm)/%
Baseline model					37.07	36.05
	√			√	40.70	39.37
		√			39.26	37.95
	√	√	√		38.50	36.90
Proposed model	√	√		√	41.54	40.45
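
As a quick check on the ablation results above, the following sketch computes each configuration's gain over the baseline as an absolute difference in AP50. The AP50 values are copied from the table; the labels for the three unnamed rows and the helper name `gains_over_baseline` are hypothetical and only used here for readability.

```python
# AP50 values (Bbox, Segm) in percent, copied from the ablation table.
# Labels for the unnamed rows are illustrative, not from the source.
ablation = {
    "Baseline model": (37.07, 36.05),
    "KR-MHSA at Layer4": (40.70, 39.37),
    "WSWformer only": (39.26, 37.95),
    "KR-MHSA at Layer3 + WSWformer": (38.50, 36.90),
    "Proposed model": (41.54, 40.45),
}


def gains_over_baseline(table, baseline="Baseline model"):
    """Return the absolute AP50 gain (Bbox, Segm) of each row over the baseline."""
    base_bbox, base_segm = table[baseline]
    return {
        name: (round(bbox - base_bbox, 2), round(segm - base_segm, 2))
        for name, (bbox, segm) in table.items()
        if name != baseline
    }


gains = gains_over_baseline(ablation)
for name, (bbox_gain, segm_gain) in gains.items():
    print(f"{name}: +{bbox_gain:.2f} Bbox, +{segm_gain:.2f} Segm")
```

The gains for the proposed model, 4.47 (Bbox) and 4.40 (Segm), match the improvements quoted in the abstract, which suggests those figures are absolute differences in AP50 rather than relative increases.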

Method	Bi-level routing attention	Multi-head self-attention	WSWformer	AP50 (Bbox)/%	AP50 (Segm)/%
Baseline model				37.07	36.05
	√		√	40.53	39.42
		√	√	40.51	39.41
Proposed model	√	√	√	41.54	40.45