
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 35-46. DOI: 10.11996/JG.j.2095-302X.2024010035

• Image Processing and Computer Vision •

Lightweight multi-modal pedestrian detection algorithm based on YOLO

YUAN Chao1, ZHAO Yadong1, ZHANG Yao1, WANG Jiaxuan1, XU Dawei1,2, ZHAI Yongjie1, ZHU Songsong3

  1. Department of Automation, North China Electric Power University, Baoding, Hebei 071003, China
  2. The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  3. Tianjin Siasun Intelligent Technology Co., Ltd., Tianjin 301800, China
  • Received: 2023-07-11 Accepted: 2023-10-18 Published: 2024-02-29 Online: 2024-02-29
  • Corresponding author: XU Dawei (1990-), male, lecturer, Ph.D. His main research interests cover modeling and control of cable-driven manipulators and motion planning for hyper-redundant manipulators. E-mail: xudawei@ncepu.edu.cn
  • First author: YUAN Chao (1985-), male, lecturer, Ph.D. His main research interests cover robotics and sensor system design. E-mail: chaoyuan@ncepu.edu.cn
  • Supported by:
    Key Support Project of the Joint Funds of the National Natural Science Foundation of China (U21A20486); Open Project of the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences (20220102); Fundamental Research Funds for the Central Universities (2022MS100)

Abstract:

To address the low accuracy of pedestrian detection in low-light environments and the large number of model parameters, a lightweight multi-modal pedestrian detection algorithm named EF-DEM-YOLO was proposed based on the YOLO framework. The algorithm employed the lightweight ES-MobileNet as the backbone feature extraction network and introduced ECA and SE-ECA attention modules into this network to enhance important channel features, thereby improving the detection accuracy for small-target pedestrians. A DBL module based on depthwise separable convolution was designed in the neck network to further reduce the number of model parameters. In addition, to improve pedestrian detection accuracy under low-light conditions, a weighted fusion method for the visible and infrared modalities based on image entropy was proposed, exploiting the complementary features of the two modalities under different lighting conditions, and the fusion module EWF was designed. Compared with the baseline method, for pedestrian targets under different lighting conditions, the model's mAP increased by 55.5%, the MR decreased by 85.9%, and the inference speed reached 33.4 frames per second, all better than other classical object detection algorithms. The algorithm makes real-time detection of pedestrian targets possible in edge computing and low-light scenes.
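The abstract does not give the formula used by the EWF module, so the following is only a minimal sketch of one plausible reading of entropy-based weighted fusion. It assumes each modality is weighted by its share of the combined Shannon entropy of the two gray-level histograms, and that images are passed as flat sequences of pixel values. The function names `image_entropy`, `entropy_weights`, and `fuse` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level histogram of a flat pixel sequence."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum of p * log2(1 / p) over all gray levels that occur in the image.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_weights(visible, infrared):
    """Assumed weighting rule: each modality's share of the combined entropy."""
    h_vis = image_entropy(visible)
    h_ir = image_entropy(infrared)
    total = h_vis + h_ir
    if total == 0:
        # Both images are constant, so neither is more informative.
        return 0.5, 0.5
    return h_vis / total, h_ir / total


def fuse(visible, infrared):
    """Pixel-wise weighted sum of two equally sized, flattened images."""
    w_vis, w_ir = entropy_weights(visible, infrared)
    return [w_vis * v + w_ir * r for v, r in zip(visible, infrared)]


if __name__ == "__main__":
    # A completely dark visible frame carries no information (entropy 0),
    # so under this rule the fused result follows the infrared frame.
    dark_visible = [0] * 8
    infrared = [0, 255] * 4
    print(entropy_weights(dark_visible, infrared))
    print(fuse(dark_visible, infrared))
```

Under this assumed rule, a well-lit visible image with a rich histogram receives more weight, while at night the weight shifts toward the infrared image, which matches the complementarity argument in the abstract. The actual EWF module may operate on feature maps rather than raw pixels.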

Key words: pedestrian detection, YOLO, lightweight, multi-modal, depthwise separable convolution, image entropy
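To illustrate why a module built on depthwise separable convolution, such as the DBL module named in the abstract, reduces the parameter count, the sketch below compares the standard weight-count formulas for a regular convolution and for a depthwise convolution followed by a pointwise 1x1 convolution. The kernel size and channel counts are illustrative assumptions, not values from the paper, and bias terms and normalization layers are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer shape only; not taken from the paper.
    k, c_in, c_out = 3, 256, 512
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(std, sep, sep / std)
```

For this example shape the separable form needs roughly 11% of the weights of the standard convolution, since the ratio equals 1/c_out + 1/k^2.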

CLC number: