Journal of Graphics, 2022, Vol. 43, Issue (1): 44-52. DOI: 10.11996/JG.j.2095-302X.2022010044

• Image Processing and Computer Vision •

Acquisition method of specific motion frame based on human attitude estimation and clustering

  1. School of Information and Electromechanical Engineering, Shanghai Normal University, Shanghai 200234, China
  • Online: 2022-02-28 Published: 2022-02-16
  • Supported by:
    National Natural Science Foundation of China (61775139); Shanghai Local Capacity Building Project (19070502900)

Abstract: Acquiring specific motion frames from sports videos is an important step toward intelligent sports teaching. To obtain such frames for further video analysis, this paper proposes an extraction method based on pose estimation and clustering. First, the HRNet pose estimation model is adopted as the basis; it is highly accurate but too large for practical use. The model is therefore made lightweight and combined with the DARK data encoding to give the Small-HRNet network model, which has 82.0% fewer parameters while its accuracy remains essentially unchanged. Next, Small-HRNet extracts human joint points from the video, and the human skeleton features of each video frame serve as the sample points for clustering. Finally, with the skeleton features of the standard motion frames as the cluster centers, the whole video is clustered to obtain its specific motion frames. In experiments on a martial arts dataset, the method extracts martial arts action frames with an accuracy of 87.5%, which shows that it can extract them effectively.

Key words: specific motion frame, pose estimation, data encoding and decoding, motion features, clustering
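
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the skeleton feature or the distance measure, so the translation- and scale-normalized joint coordinates, the Euclidean distance, the single nearest-center assignment, and the names `normalize_skeleton` and `pick_specific_frames` are all assumptions made for illustration. The joint coordinates are toy values standing in for the output of a pose estimator such as Small-HRNet.

```python
import numpy as np


def normalize_skeleton(joints):
    """Turn (J, 2) joint coordinates into a flat feature vector.

    Assumed feature: coordinates centered on the mean joint and scaled
    to unit norm, so position and size in the image do not matter.
    """
    joints = np.asarray(joints, dtype=float)
    centered = joints - joints.mean(axis=0)
    scale = np.linalg.norm(centered)
    if scale > 0:
        centered = centered / scale
    return centered.ravel()


def pick_specific_frames(frame_joints, standard_joints):
    """Assign each frame to its nearest standard pose (fixed cluster center)
    and return, for each center, the index of the closest assigned frame."""
    feats = np.stack([normalize_skeleton(j) for j in frame_joints])       # (T, 2J)
    centers = np.stack([normalize_skeleton(j) for j in standard_joints])  # (K, 2J)
    # Pairwise Euclidean distances between frames and centers, shape (T, K).
    dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    picks = {}
    for k in range(len(centers)):
        members = np.flatnonzero(labels == k)
        if members.size:  # a center may attract no frames at all
            picks[k] = int(members[dists[members, k].argmin()])
    return labels, picks


# Toy data: two hypothetical standard poses with three joints each.
pose_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
pose_b = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])

frames = [
    2.0 * pose_a + 5.0,                          # pose A, scaled and shifted
    np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.0]]),  # pose B with noise
    pose_b + 3.0,                                # pose B, shifted
    np.array([[0.0, 0.1], [1.0, 0.0], [0.0, 1.0]]),  # pose A with noise
]

labels, picks = pick_specific_frames(frames, [pose_a, pose_b])
print(labels.tolist(), picks)
```

Keeping the centers fixed at the standard poses, rather than updating them as k-means would, matches the abstract's description of using standard motion frames as cluster centers; whether the published method iterates further is not stated there.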
