
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 35-46.DOI: 10.11996/JG.j.2095-302X.2024010035

• Image Processing and Computer Vision •

Lightweight multi-modal pedestrian detection algorithm based on YOLO

YUAN Chao1, ZHAO Yadong1, ZHANG Yao1, WANG Jiaxuan1, XU Dawei1,2, ZHAI Yongjie1, ZHU Songsong3

    1. Department of Automation, North China Electric Power University, Baoding Hebei 071033, China
    2. The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    3. Tianjin Siasun Intelligent Technology Co., Ltd, Tianjin 301800, China
  • Received:2023-07-11 Accepted:2023-10-18 Online:2024-02-29 Published:2024-02-29
  • Contact: XU Dawei (1990-), lecturer, Ph.D. His main research interests cover modeling and control of rope-driven manipulators and motion planning for ultra-redundant robotic arms. E-mail:xudawei@ncepu.edu.cn
  • About author:

    YUAN Chao (1985-), lecturer, Ph.D. His main research interests cover robotics and sensor system design. E-mail:chaoyuan@ncepu.edu.cn

  • Supported by:
    Key Support Project of the Joint Funds of the National Natural Science Foundation of China (U21A20486); Project of the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences (20220102); Fundamental Research Funds for the Central Universities (2022MS100)

Abstract:

To address the low accuracy of pedestrian detection in low-light environments and the large number of model parameters, a lightweight multi-modal pedestrian detection algorithm, EF-DEM-YOLO, was proposed based on the YOLO framework. The algorithm employed the lightweight ES-MobileNet as the backbone feature extraction network and integrated ECA and SE-ECA attention modules into it to enhance important channel features, thereby improving detection accuracy for small pedestrian targets. A DBL module based on depthwise separable convolution was designed in the neck network to further reduce the number of model parameters. In addition, to improve pedestrian detection accuracy under low-light conditions, a weighted fusion method for the visible and infrared modalities based on image entropy was proposed. The method exploited the complementary features of the two modalities under different lighting conditions, and the fusion module EWF was designed accordingly. Compared with the baseline methods, the proposed algorithm yielded significant improvements for pedestrian targets under different lighting conditions: the model’s mAP increased by 55.5%, the MR decreased by 85.9%, and the inference speed reached 33.4 frames per second, outperforming other classical object detection algorithms. The algorithm makes real-time pedestrian detection feasible in edge computing and low-light scenarios.
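
The abstract does not specify how the EWF module computes its weights, so the following is only a minimal sketch of the general idea of entropy-weighted fusion, not the authors' implementation. It assumes two aligned 8-bit grayscale images, uses the global Shannon entropy of each image's intensity histogram as its weight, and blends the images pixel by pixel. The function names `image_entropy` and `entropy_weighted_fusion` are hypothetical.

```python
import numpy as np


def image_entropy(img: np.ndarray) -> float:
    """Shannon entropy (in bits) of the intensity histogram of a uint8 image."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing to the entropy
    return float(np.sum(p * np.log2(1.0 / p)))


def entropy_weighted_fusion(visible: np.ndarray, infrared: np.ndarray):
    """Blend two aligned uint8 images with weights proportional to their entropies.

    Assumption: one global weight per modality. The paper's EWF module may
    instead operate on feature maps or local regions.
    """
    h_vis = image_entropy(visible)
    h_ir = image_entropy(infrared)
    total = h_vis + h_ir
    # If both images are flat (zero entropy), fall back to equal weights.
    w_vis = 0.5 if total == 0 else h_vis / total
    w_ir = 1.0 - w_vis
    fused = w_vis * visible.astype(np.float64) + w_ir * infrared.astype(np.float64)
    fused = np.clip(np.rint(fused), 0, 255).astype(np.uint8)
    return fused, w_vis, w_ir


if __name__ == "__main__":
    # Synthetic stand-ins: a nearly black visible frame and a textured infrared frame.
    rng = np.random.default_rng(0)
    visible = rng.integers(0, 4, size=(64, 64), dtype=np.uint8)
    infrared = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    _, w_vis, w_ir = entropy_weighted_fusion(visible, infrared)
    print(f"visible weight: {w_vis:.2f}, infrared weight: {w_ir:.2f}")
```

Under this scheme a dark visible frame, whose histogram is concentrated in a few intensity levels, receives a small weight, so the fused result leans on the infrared image. This matches the complementarity between modalities that the abstract describes.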

Key words: pedestrian detection, YOLO, lightweighting, multi-modality, depth separability, image entropy
