Journal of Graphics ›› 2024, Vol. 45 ›› Issue (5): 930-940.DOI: 10.11996/JG.j.2095-302X.2024050930

• Image Processing and Computer Vision •

A vehicle parts detection method enhanced with Transformer integration

ZHAI Yongjie, LI Jiawei, CHEN Nianhao, WANG Qianming, WANG Xinying

  1. Department of Automation, North China Electric Power University, Baoding Hebei 071003, China
  • Received: 2024-05-30  Revised: 2024-07-23  Online: 2024-10-31  Published: 2024-10-31
  • Contact: WANG Qianming
  • About author:

    ZHAI Yongjie (1972-), professor, Ph.D. His main research interests cover power vision. E-mail: zhaiyongjie@ncepu.edu.cn

  • Supported by:
    National Natural Science Foundation of China Funded Project (62373151); Hebei Provincial Natural Science Foundation General Project (F2023502010); Fundamental Research Funds for the Central Universities (2023JC006, 2024MS136)

Abstract:

To address false detections and missed detections caused by insufficient feature extraction and inadequate utilization of candidate boxes in vehicle component detection models, an improved Transformer-based method for vehicle component detection was proposed. Firstly, by combining multi-head self-attention with bi-level routing attention, a key region multi-head self-attention (KR-MHSA) mechanism was introduced. Secondly, the final layer of ResNet in the baseline model (Mask R-CNN) was integrated with KR-MHSA through residual fusion, enhancing the model's basic feature extraction capability. Finally, an improved Swin Transformer was employed for feature learning on the candidate boxes generated by the model, enabling the model to better capture the differences and similarities among candidate boxes. Experiments on a constructed dataset of 59 vehicle component categories demonstrated that the proposed model outperformed other state-of-the-art instance segmentation models in both detection and segmentation. Compared with the baseline model, detection accuracy improved by 4.47% and segmentation accuracy by 4.4%. The method thus mitigates insufficient feature extraction and inadequate utilization of candidate boxes in vehicle component detection, enabling insurance companies to identify and replace damaged parts more accurately and efficiently and thereby improving claims processing.
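The key-region attention described in the abstract can be sketched as follows. This is a simplified, single-head NumPy illustration of the bi-level routing idea behind KR-MHSA, not the paper's implementation: region descriptors are mean-pooled tokens, the Q/K/V projections are taken as identity for brevity, and `num_regions`/`topk` are hypothetical parameters. Region-level affinities route each query region to its top-k key regions; token-level attention is then computed only over tokens in those regions, and the result is fused residually with the input, mirroring the residual fusion with the final ResNet stage.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def kr_mhsa(x, num_regions, topk):
    """Simplified key-region self-attention (single head).

    x: (N, C) array of tokens, with N divisible by num_regions.
    Returns an array of the same shape: routed attention output
    added residually to the input.
    """
    n, c = x.shape
    m = n // num_regions                 # tokens per region
    q = k = v = x                        # identity projections for the sketch
    # region descriptors: mean-pooled tokens per region
    qr = q.reshape(num_regions, m, c).mean(axis=1)      # (R, C)
    kr = k.reshape(num_regions, m, c).mean(axis=1)      # (R, C)
    affinity = qr @ kr.T                                # region-to-region scores
    routed = np.argsort(-affinity, axis=1)[:, :topk]    # top-k key regions
    out = np.empty_like(x)
    for i in range(num_regions):
        # gather tokens belonging to the routed key regions
        idx = np.concatenate(
            [np.arange(j * m, (j + 1) * m) for j in routed[i]]
        )
        qi = q[i * m:(i + 1) * m]
        attn = softmax(qi @ k[idx].T / np.sqrt(c))      # token-level attention
        out[i * m:(i + 1) * m] = attn @ v[idx]
    return x + out                       # residual fusion with the input
```

With `topk` equal to `num_regions` the routing step keeps every region, and the sketch reduces to ordinary full self-attention plus the residual connection; smaller `topk` values restrict each query region to its most relevant key regions, which is the sparsification that bi-level routing attention exploits.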

Key words: vehicle parts, deep learning, instance segmentation, Mask R-CNN, feature extraction, multi-head self-attention, bi-level routing attention
