Journal of Graphics ›› 2022, Vol. 43 ›› Issue (2): 333-341. DOI: 10.11996/JG.j.2095-302X.2022020333

• Computer Graphics and Virtual Reality •

Lightweight human pose estimation with global pose perception

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao Shandong 266580, China;
    2. Shengli College of China University of Petroleum, Dongying Shandong 257061, China;
    3. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China;
    4. College of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:

    National Key Research and Development Program of China (2019YFF0301800); 

    National Natural Science Foundation of China (61379106); 

    Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011)

Abstract: Human pose estimation has been a hot topic in human-computer interaction in recent years. Most current
methods improve accuracy by increasing network complexity while ignoring cost-effectiveness, so the resulting models
are accurate but consume large amounts of computational resources in practice. To address this problem, a lightweight
human pose estimation model with global pose perception was designed. It achieves 68.2% AP on the MSCOCO dataset at
255 fps, with a parameter count and FLOPS that are only 10% and 0.9% of those of the OpenPose method, respectively.
In human pose estimation, the number of output channels of the network is set according to the number of keypoints
to be predicted, so each keypoint is detected independently. Global information, such as the relative positions of
keypoints and the overall layout, is very important for pose estimation on difficult samples, but was not considered
in previous studies. To exploit this global pose information, a global pose perception module was designed to extract
global pose features, and a two-branch network was employed to fuse the global and local pose features. Experiments
show that adding the global pose perception module to the lightweight human pose estimation network improves accuracy
by 1.5% and 1.3% on the MPII and MSCOCO datasets, respectively.
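
The independence of per-keypoint detection noted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a heatmap-style output with one channel per keypoint, which the abstract suggests but does not spell out, and the name `decode_keypoints` and the toy values are hypothetical.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # one H x W score map for a single keypoint (assumed output format)


def decode_keypoints(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return (x, y, score) for each channel by taking that channel's peak.

    Each channel is decoded on its own: no channel looks at any other,
    so relative positions between keypoints play no role here.
    """
    keypoints = []
    for hm in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(hm):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        keypoints.append(best)
    return keypoints


# Toy example: 2 keypoint channels on a 3 x 4 grid (values are made up).
toy = [
    [[0.0, 0.1, 0.0, 0.0],
     [0.0, 0.9, 0.2, 0.0],
     [0.0, 0.1, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.3],
     [0.0, 0.0, 0.0, 0.1],
     [0.0, 0.0, 0.0, 0.0]],
]
result = decode_keypoints(toy)
```

Because every channel is reduced separately, an occluded or ambiguous keypoint gets no help from the positions of the others. That is the gap the global pose perception module is meant to address.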

Key words: human pose estimation, lightweight, global pose perception, two-branch network, feature fusion
