Journal of Graphics (图学学报) ›› 2022, Vol. 43 ›› Issue (2): 230-238. DOI: 10.11996/JG.j.2095-302X.2022020230

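  1. Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, Liaoning 116622;
  2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024
  • Publication date: 2022-04-30 Release date: 2022-05-07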
Efficient pedestrian detector combining depthwise separable convolution and standard convolution

  1. Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, Liaoning 116622, China;
  2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:

    Key Program of Natural Science Foundation of China (U1908214);
    Program for the Liaoning Distinguished Professor; Special Project of Central Government Guiding Local Science and Technology Development (2021JH6/10500140);
    Program for Innovative Research Team in University of Liaoning Province; Dalian and Dalian University, and in part by the Science and Technology Innovation Fund of Dalian (2020JJ25CY001)

Abstract: Pedestrian detection demands algorithms that are both fast and accurate. Pedestrian detectors based on deep convolutional neural networks (DCNN) achieve high detection accuracy, but they also require substantial computing power, so they are difficult to deploy on lightweight systems such as mobile devices, embedded devices, and autonomous driving systems. To address this problem, a lightweight and effective pedestrian detector (EPDNet) is proposed that better balances speed and accuracy. First, the shallow convolution layers of the backbone network use depthwise separable convolution to reduce the number of model parameters, while the deeper convolution layers use standard convolution to extract high-level semantic features. In addition, to further improve performance, the backbone network adopts a feature fusion method to strengthen the representational power of its output features. In comparative experiments, EPDNet shows superior performance on two challenging pedestrian datasets, Caltech and CityPersons. Compared with the benchmark model, EPDNet achieves a better trade-off between speed and accuracy, improving both at the same time.

Key words: standard convolution, depthwise separable convolution, feature fusion, lightweight, pedestrian detection
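
The abstract states that depthwise separable convolution is used in the shallow layers to reduce the number of model parameters. As a rough illustration of why that helps, the sketch below compares the weight counts of a standard k x k convolution and a depthwise separable one (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution). The layer sizes are hypothetical values chosen for illustration; they are not taken from EPDNet, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only (not EPDNet's configuration).
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)

print("standard:", std)                    # 73728
print("depthwise separable:", sep)         # 8768
print("ratio:", round(sep / std, 3))       # 0.119, equal to 1/c_out + 1/k**2
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why the substitution shrinks a layer by roughly an order of magnitude. How this trades off against accuracy in EPDNet is something only the paper's experiments can answer.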
