
Journal of Graphics ›› 2022, Vol. 43 ›› Issue (2): 230-238. DOI: 10.11996/JG.j.2095-302X.2022020230

• Image Processing and Computer Vision •

Efficient pedestrian detector combining depthwise separable convolution and standard convolution

  1. Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, Liaoning 116622, China;
    2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:

    Key Program of the National Natural Science Foundation of China (U1908214);

    Program for the Liaoning Distinguished Professor; Special Project of Central Government Guiding Local Science and Technology Development (2021JH6/10500140);

    Program for Innovative Research Teams in Universities of Liaoning Province, Dalian, and Dalian University; Science and Technology Innovation Fund of Dalian (2020JJ25CY001)

Abstract: Pedestrian detection demands algorithms that are both fast and accurate. Although pedestrian detectors based on deep convolutional neural networks (DCNN) achieve high detection accuracy, they require substantial computing power from the hardware, so they cannot be deployed well on lightweight systems such as mobile devices, embedded devices, and autonomous driving systems. To address this, a lightweight pedestrian detector (EPDNet) that better balances speed and accuracy is proposed. First, the shallow convolution layers of the backbone network use depthwise separable convolution to reduce the number of model parameters, while the deeper convolution layers use standard convolution to extract high-level semantic features. In addition, to further improve model performance, the backbone network adopts a feature fusion method to enhance the expressive power of its output features. In comparative experiments, EPDNet showed superior performance on two challenging pedestrian datasets, Caltech and CityPersons. Compared with the baseline model, EPDNet achieved a better trade-off between speed and accuracy, improving both at the same time.

Key words: standard convolution, depthwise separable convolution, feature fusion, lightweight, pedestrian detection
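
The abstract states that depthwise separable convolution reduces the number of model parameters relative to standard convolution. The following is a minimal sketch of that arithmetic, not the authors' implementation: the function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumptions chosen only for illustration, and bias terms are omitted.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
C_IN, C_OUT, K = 64, 128, 3

std = standard_conv_params(C_IN, C_OUT, K)
sep = depthwise_separable_params(C_IN, C_OUT, K)
ratio = sep / std  # equals 1 / C_OUT + 1 / K**2

print(std, sep, round(ratio, 3))  # prints: 73728 8768 0.119
```

For these assumed sizes the separable layer keeps about 12% of the weights of the standard layer. In general the ratio is 1/c_out + 1/k², so a 3 x 3 kernel cannot save more than a factor of about 9 however many output channels there are.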

CLC Number: