Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 393-401. DOI: 10.11996/JG.j.2095-302X.2025020393

• Computer Graphics and Virtual Reality •

3D human pose and shape estimation from single-view point clouds based on semi-supervised learning

FANG Chenghao, WANG Kangkan

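  1. Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China
  • Received: 2024-07-05 Accepted: 2024-11-27 Published: 2025-04-30 Online: 2025-04-24
  • Corresponding author: WANG Kangkan (1988-), male, associate professor, Ph.D. His main research interests cover computer vision, virtual reality, and 3D reconstruction. E-mail: wangkangkan@njust.edu.cn
  • First author: FANG Chenghao (1999-), male, master's student. His main research interests cover computer graphics, computer vision, and 3D reconstruction. E-mail: 121106022661@njust.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62472224); Fundamental Research Funds for the Central Universities (NJ2023032); Open Project Program of the State Key Laboratory of CAD&CG, Zhejiang University (A2311); Open Project Program of the State Key Laboratory of Novel Software Technology, Nanjing University (KFKT2024B37)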

3D human pose and shape estimation from single-view point clouds with semi-supervised learning

FANG Chenghao, WANG Kangkan

  1. Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing Jiangsu 210094, China
  • Received:2024-07-05 Accepted:2024-11-27 Published:2025-04-30 Online:2025-04-24
  • First author: FANG Chenghao (1999-), master's student. His main research interests cover computer graphics, computer vision, and 3D reconstruction. E-mail: 121106022661@njust.edu.cn
  • Supported by:
    The Natural Science Foundation of China(62472224);The Fundamental Research Funds for the Central Universities(NJ2023032);The Open Project Program of the State Key Laboratory of CAD&CG of Zhejiang University(A2311);The Open Project Program of the State Key Laboratory of Novel Software Technology of Nanjing University(KFKT2024B37)

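Abstract:

With limited labeled samples, 3D human pose and shape estimation from single-view point clouds has long suffered from low estimation accuracy and weak generalization. Existing methods usually optimize the model by fine-tuning, but the fine-tuning step on new samples greatly increases runtime complexity and does not fundamentally improve the model's generalization. To address these problems, a semi-supervised learning-based method for 3D human pose and shape estimation is proposed, which uses a large amount of unlabeled human point cloud data to improve estimation accuracy and generalization when labeled data are limited. Specifically, weak and strong augmentations are first applied to the unlabeled data, and a 3D parametric human model is estimated for both augmented samples. The predictions for the weakly augmented samples are then checked for pseudo-label accuracy and, following the idea of consistency regularization, used to constrain the predictions for the strongly augmented samples. Iterating this process gradually improves pseudo-label quality and increases the number of pseudo-labels used for training, which in turn improves the model's estimation accuracy. Extensive quantitative and qualitative experiments on several public datasets show that, with limited labeled samples, the algorithm improves the accuracy of 3D human pose and shape estimation and strengthens the model's generalization performance.

Key words: 3D human pose and shape estimation, single-view point clouds, semi-supervised learning, pseudo-label, point cloud data augmentation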

Abstract:

When labeled samples are limited, estimating 3D human pose and shape from single-view point clouds has consistently suffered from low estimation accuracy and weak generalization capability. Existing methods typically use a fine-tuning step to optimize the models for limited labeled samples, but this fine-tuning process significantly increases computational complexity without fundamentally enhancing model generalization. To address these issues, a semi-supervised learning-based method was proposed for 3D human pose and shape estimation. With limited labeled data, the proposed method utilized a large amount of unlabeled human point clouds to improve model accuracy and generalization capability. Specifically, weak and strong augmentations were applied to the unlabeled data, and 3D parametric human models were estimated for both types of augmented samples. Then, the accuracy of the pseudo-labels for the weakly augmented samples was evaluated, and the predictions for the strongly augmented samples were constrained based on consistency regularization. This procedure was applied iteratively to gradually refine the quality of the pseudo-labels and increase the number of pseudo-labels used for training, thereby enhancing the model's estimation accuracy. Extensive quantitative and qualitative experiments on various public datasets demonstrated that the proposed method improved the accuracy of 3D human pose and shape estimation with limited labeled samples and strengthened the model's generalization performance.

Key words: 3D human pose and shape estimation, single-view point clouds, semi-supervised learning, pseudo-label, point cloud data augmentation
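
The unlabeled-data step outlined in the abstract (weak and strong augmentation, a pseudo-label accuracy check, then a consistency constraint) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the jitter and dropout augmentations, the one-sided Chamfer threshold `tau` used as the pseudo-label check, and the toy `model` and `decode`, which stand in for a real network and a parametric body model.

```python
import numpy as np

def weak_augment(points, rng):
    """Weak augmentation (assumed): small Gaussian jitter on every point."""
    return points + rng.normal(scale=0.005, size=points.shape)

def strong_augment(points, rng, drop_ratio=0.3):
    """Strong augmentation (assumed): random point dropout plus larger jitter."""
    keep = rng.random(points.shape[0]) > drop_ratio
    keep[0] = True  # guarantee a non-empty cloud
    kept = points[keep]
    return kept + rng.normal(scale=0.02, size=kept.shape)

def one_sided_chamfer(src, dst):
    """Mean distance from each point in src to its nearest neighbour in dst."""
    dists = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    return dists.min(axis=1).mean()

def unsupervised_loss(model, decode, clouds, rng, tau=0.05):
    """Consistency loss over unlabeled clouds, using only accepted pseudo-labels.

    model:  maps an (N, 3) cloud to a parameter vector (hypothetical stand-in).
    decode: maps a parameter vector to (M, 3) surface points (hypothetical).
    tau:    acceptance threshold on the fit error (assumed criterion).
    """
    total, n_used = 0.0, 0
    for pts in clouds:
        pseudo = model(weak_augment(pts, rng))     # pseudo-label from weak view
        pred = model(strong_augment(pts, rng))     # prediction from strong view
        # Assumed accuracy check: the decoded pseudo-label must fit the input.
        if one_sided_chamfer(pts, decode(pseudo)) < tau:
            # The pseudo-label is a fixed target; with an autodiff framework
            # one would stop gradients through it.
            total += np.mean((pred - pseudo) ** 2)
            n_used += 1
    return (total / n_used if n_used else 0.0), n_used

# Toy setup: the "body" is a fixed template and the only parameter is a translation.
rng = np.random.default_rng(0)
template = rng.normal(scale=0.3, size=(256, 3))
template -= template.mean(axis=0)

def model(points):
    return points.mean(axis=0)     # toy "network": predicts the centroid

def decode(theta):
    return template + theta        # toy "body model": translated template

clouds = [
    template + np.array([1.0, 2.0, 3.0]),        # explained well by the toy model
    3.0 * template + np.array([0.0, 1.0, 0.0]),  # wrong scale, poor fit
]
loss, n_used = unsupervised_loss(model, decode, clouds, rng)
print(f"consistency loss = {loss:.6f}, accepted pseudo-labels = {n_used}")
```

In this toy setup the second cloud fails the fit check, so it contributes no pseudo-label. In an iterative scheme like the one the abstract describes, such samples could be accepted in later rounds as the model improves, which is how the number of usable pseudo-labels would grow.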

CLC Number: