
Journal of Graphics ›› 2022, Vol. 43 ›› Issue (2): 239-246. DOI: 10.11996/JG.j.2095-302X.2022020239

• Image Processing and Computer Vision •

Face detection based on a lightweight network and its embedded implementation

  

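  1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China;
    2. Shenzhen Bryture Co. Ltd., Shenzhen, Guangdong 518000, China;
    3. School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:

    Fundamental Research Funds for the Central Universities (2021YJS025);

    National Natural Science Foundation of China (62062021, 61872034, 62011530042);

    Beijing Municipal Natural Science Foundation (4202055);

    Guangxi Natural Science Foundation (2018GXNSFBA281086)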

Face detection and embedded implementation of lightweight network

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
    2. Shenzhen Bryture Co. Ltd., Shenzhen, Guangdong 518000, China;
    3. School of Mechanical Engineering, Guizhou University, Guiyang, Guizhou 550025, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:

    Fundamental Research Funds for the Central Universities (2021YJS025); 

    National Natural Science Foundation of China under Grant (62062021, 61872034, 62011530042); 

    Beijing Municipal Natural Science Foundation under Grant (4202055); 

    Guangxi Natural Science Foundation under Grant (2018GXNSFBA281086)


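Abstract: Although face detectors based on convolutional neural networks (CNN) have improved greatly in accuracy, their computational cost and model complexity keep growing, and deploying a face detection model on embedded devices with limited computing power remains a major challenge. To address face detection on 320×240 input images in embedded systems, a low-resolution face detection algorithm based on a lightweight network is proposed. The algorithm uses an attention mechanism, combines Distance-IoU (DIoU) with non-maximum suppression (NMS), adopts the Mish activation function, and sets prior boxes suited to the aspect ratio of faces, which balances accuracy and speed; the model is deployed on an embedded platform. Specifically, depthwise separable convolution replaces standard convolution, and a convolutional block attention module (CBAM) is added after the convolution block so that the network focuses more on the target objects; the Mish activation function is used in place of ReLU to improve inference speed; and combining DIoU with NMS improves the detection of small faces. Experiments on the WIDER FACE dataset show that the method detects faces in real time with high accuracy and, on low-resolution input, is more accurate than traditional algorithms. After the dataset is expanded, the model generalizes better under complex illumination.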

Key words: face detection, lightweight network, attention mechanism, activation function, non-maximum suppression
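
As a rough illustration of two building blocks named in the abstract, the sketch below evaluates the Mish activation, Mish(x) = x · tanh(softplus(x)), and compares the weight count of a standard convolution with that of a depthwise separable one. The function names and the example layer size (32 input channels, 64 output channels, 3×3 kernel) are assumptions chosen for illustration, biases are ignored, and none of this is taken from the authors' network.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    # A k x k standard convolution has one k x k x c_in filter per output channel.
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    # Depthwise step: one k x k filter per input channel.
    # Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    print(standard_conv_params(32, 64, 3))        # 18432
    print(depthwise_separable_params(32, 64, 3))  # 2336
    print(mish(1.0))
```

For this hypothetical layer the depthwise separable form needs 2336 weights instead of 18432, roughly 7.9 times fewer, which is the kind of saving that motivates the substitution on embedded hardware.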

Abstract: In recent years, face detectors based on convolutional neural networks (CNN) have dominated the field, and
results on public benchmarks have improved significantly. However, computational cost and model complexity keep
rising, and it remains a challenge to deploy face detection models on embedded devices with limited computing power
and memory. To address face detection on 320×240 input images in embedded systems, a low-resolution face detection
algorithm based on a lightweight network was proposed. The backbone network employed an attention module, combined
Distance-IoU (DIoU) with non-maximum suppression (NMS), and adopted the Mish activation function. In addition, prior
boxes were set to match the aspect ratio of faces. In this way, a balance between accuracy and speed was achieved, and
the model was deployed on an embedded platform. Specifically, depthwise separable convolution was used in place of
standard convolution, and a convolutional block attention module (CBAM) was added after the convolution block to
keep the network focused on the target objects. The Mish activation function was used instead of ReLU to improve
inference speed. Combining DIoU with NMS improved the detection accuracy for small faces. Experiments on the
WIDER FACE dataset show that the proposed method detects faces in real time with high accuracy and is more accurate
than traditional algorithms on low-resolution input. After the dataset was expanded, the detection accuracy of the
proposed model under complex illumination also improved.

Key words: face detection, lightweight network, attention module, activation function, non-maximum suppression
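
The abstract does not spell out how DIoU and NMS are combined. A common formulation, sketched below as an assumption, replaces the IoU test in greedy NMS with DIoU = IoU − d²/c², where d is the distance between the two box centres and c is the diagonal of the smallest box enclosing both. The `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are illustrative choices, not the authors' settings.

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    penalty = d2 / c2 if c2 > 0 else 0.0
    return iou(a, b) - penalty


def diou_nms(boxes, scores, threshold=0.5):
    # Greedy NMS: keep the highest-scoring box, then drop every remaining
    # box whose DIoU with it reaches the threshold (an assumed value).
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(diou_nms(boxes, scores))  # [0, 2]
```

Because the centre-distance penalty lowers the score of boxes whose centres are far apart, two overlapping detections that belong to different nearby faces are less likely to suppress each other than under plain IoU, which is one plausible reason the combination helps with small, crowded faces.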

CLC number: