Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 415-424.DOI: 10.11996/JG.j.2095-302X.2025020415
• Computer Graphics and Virtual Reality •
QIU Jiaxin, SONG Qianyun, XU Dan
Received: 2024-08-17
Accepted: 2025-01-21
Online: 2025-04-30
Published: 2025-04-24
Contact: XU Dan
About author: QIU Jiaxin (1998-), master student. Her main research interest is 3D reconstruction. E-mail: 12022215169@mail.ynu.edu.cn
QIU Jiaxin, SONG Qianyun, XU Dan. A neural radiation field-based approach to ethnic dance reconstruction[J]. Journal of Graphics, 2025, 46(2): 415-424.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025020415
Fig. 3 Pose comparison after optimization ((a) The original pose; (b) VIBE pose estimation; (c) NeuMan-optimized pose estimation; (d) Our optimized pose estimation)
| Method | Ethnic group | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|---|
| HumanNeRF | Sani | 28.53 | 0.9758 | 21.05 |
| | Bai | 27.36 | 0.9740 | 24.02 |
| | Yi | 28.58 | 0.9749 | 23.56 |
| Ours | Sani | 29.05 | 0.9773 | 17.89 |
| | Bai | 27.86 | 0.9768 | 20.20 |
| | Yi | 28.83 | 0.9756 | 21.17 |

Table 1 Comparison of metrics with HumanNeRF
| Method | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| HumanNeRF | 28.15 | 0.9749 | 22.88 |
| Ours (without attention module) | 28.42 | 0.9760 | 21.78 |
| Ours (full model) | 28.58 | 0.9766 | 19.75 |

Table 2 Comparison of average metrics on the ethnic dance dataset
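The tables above rank methods by PSNR, SSIM, and LPIPS. As a point of reference, the following is a minimal sketch of the standard PSNR computation (10·log10(MAX²/MSE)); it is illustrative only, not the paper's evaluation code, and SSIM and LPIPS [37] require their own dedicated implementations.

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

# Two toy "images": a constant offset of 0.1 gives MSE = 0.01,
# so PSNR = 10 * log10(1 / 0.01) = 20 dB.
a = np.zeros((4, 4))
b = np.full((4, 4), 0.1)
print(round(psnr(a, b), 2))
```

Higher PSNR and SSIM indicate better fidelity, while lower LPIPS indicates the rendering is perceptually closer to the reference image, which is why the arrows in the table headers point in opposite directions.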
Fig. 5 Ablation experiment ((a) Ground-truth photos of the subject; (b) Rendering results of the full model; (c) Rendering results without the attention module)
Fig. 6 Comparison of rendering results on ZJU-MoCap ((a) Neural Body rendering results; (b) HumanNeRF rendering results; (c) Rendering results of our method)
| Subject | Method | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|---|
| Subject 387 | Neural Body | 31.36 | 0.9760 | 43.35 |
| | HumanNeRF | 33.17 | 0.9847 | 21.24 |
| | Ours | 33.21 | 0.9859 | 19.09 |
| Subject 393 | Neural Body | 32.43 | 0.9613 | 53.12 |
| | HumanNeRF | 33.75 | 0.9861 | 21.69 |
| | Ours | 33.72 | 0.9864 | 21.55 |
| Subject 313 | Neural Body | 27.37 | 0.9600 | 41.92 |
| | HumanNeRF | 29.00 | 0.9813 | 19.23 |
| | Ours | 29.07 | 0.9841 | 18.20 |
| Subject 377 | Neural Body | 32.11 | 0.9735 | 40.40 |
| | HumanNeRF | 33.95 | 0.9807 | 22.44 |
| | Ours | 33.97 | 0.9841 | 19.74 |

Table 3 Comparison of metrics on the ZJU-MoCap dataset
[1] | PARK K, SINHA U, HEDMAN P, et al. HyperNeRF: a higher-dimensional representation for topologically varying neural radiance fields[J]. ACM Transactions on Graphics, 2021, 40(6): 238. |
[2] | PENG S D, ZHANG Y Q, XU Y H, et al. Neural body: implicit neural representations with structured latent codes for novel view synthesis of dynamic humans[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 9050-9059. |
[3] | WENG C Y, CURLESS B, SRINIVASAN P P, et al. HumanNeRF: free-viewpoint rendering of moving people from monocular video[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 16189-16199. |
[4] | JIANG W, YI K M, SAMEI G, et al. NeuMan: neural human radiance field from a single video[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 402-418. |
[5] | LIU L J, HABERMANN M, RUDNEV V, et al. Neural actor: neural free-view synthesis of human actors with pose control[J]. ACM Transactions on Graphics, 2021, 40(6): 219. |
[6] | PENG S D, DONG J T, WANG Q Q, et al. Animatable neural radiance fields for modeling dynamic human bodies[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14294-14303. |
[7] | LOPER M, MAHMOOD N, ROMERO J, et al. SMPL: a skinned multi-person linear model[J]. Seminal Graphics Papers: Pushing the Boundaries, 2023, 2: 88. |
[8] | ZHANG J K, LIU X H, YE X Y, et al. Editable free-viewpoint video using a layered neural representation[J]. ACM Transactions on Graphics, 2021, 40(4): 149. |
[9] | KANADE T, RANDER P, NARAYANAN P J. Virtualized reality: constructing virtual worlds from real scenes[J]. IEEE Multimedia, 1997, 4(1): 34-47. |
[10] | CARRANZA J, THEOBALT C, MAGNOR M A, et al. Free-viewpoint video of human actors[J]. ACM Transactions on Graphics, 2003, 22(3): 569-577. |
[11] | SU S Y, YU F, ZOLLHÖFER M, et al. A-NeRF: articulated neural radiance fields for learning human shape, appearance, and pose[C]// The 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2021: 939. |
[12] | HU S K, HU T, LIU Z W. GauHuman: articulated Gaussian splatting from monocular human videos[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 20418-20431. |
[13] | KERBL B, KOPANAS G, LEIMKÜHLER T, et al. 3D Gaussian splatting for real-time radiance field rendering[J]. ACM Transactions on Graphics, 2023, 42(4): 139. |
[14] | ANGUELOV D, SRINIVASAN P, KOLLER D, et al. SCAPE: shape completion and animation of people[J]. ACM Transactions on Graphics, 2005, 24(3): 408-416. |
[15] | PAVLAKOS G, CHOUTAS V, GHORBANI N, et al. Expressive body capture: 3D hands, face, and body from a single image[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 10967-10977. |
[16] | BOGO F, KANAZAWA A, LASSNER C, et al. Keep it SMPL: automatic estimation of 3D human pose and shape from a single image[C]// The 14th European Conference on Computer Vision. Cham: Springer, 2016: 561-578. |
[17] | LASSNER C, ROMERO J, KIEFEL M, et al. Unite the people: closing the loop between 3D and 2D human representations[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 4704-4713. |
[18] | GÜLER R A, KOKKINOS I. HoloPose: holistic 3D human reconstruction in-the-wild[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 10876-10886. |
[19] | KANAZAWA A, BLACK M J, JACOBS D W, et al. End-to-end recovery of human shape and pose[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7122-7131. |
[20] | OMRAN M, LASSNER C, PONS-MOLL G, et al. Neural body fitting:unifying deep learning and model based human pose and shape estimation[C]// 2018 International Conference on 3D Vision (3DV). New York: IEEE Press, 2018: 484-494. |
[21] | PAVLAKOS G, ZHU L Y, ZHOU X W, et al. Learning to estimate 3D human pose and shape from a single color image[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 459-468. |
[22] | PISHCHULIN L, INSAFUTDINOV E, TANG S Y, et al. DeepCut: joint subset partition and labeling for multi person pose estimation[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 4929-4937. |
[23] | KOLOTOUROS N, PAVLAKOS G, BLACK M, et al. Learning to reconstruct 3D human pose and shape via model-fitting in the loop[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 2252-2261. |
[24] | KOCABAS M, ATHANASIOU N, BLACK M J. VIBE: video inference for human body pose and shape estimation[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 5252-5262. |
[25] | MAHMOOD N, GHORBANI N, TROJE N F, et al. AMASS: archive of motion capture as surface shapes[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 5441-5450. |
[26] | DONG J T, SHUAI Q, ZHANG Y Q, et al. Motion capture from internet videos[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 210-227. |
[27] | ZENG A L, JU X, YANG L, et al. DeciWatch: a simple baseline for 10× efficient 2D and 3D pose estimation[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 607-624. |
[28] | SONG Q Y, ZHANG H, LIU Y N, et al. Hybrid attention adaptive sampling network for human pose estimation in videos[J]. Computer Animation & Virtual Worlds, 2024, 35(4): e2244. |
[29] | ZHANG Y X, WANG Y, CAMPS O, et al. Key frame proposal network for efficient pose estimation in videos[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 609-625. |
[30] | MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2021, 65(1): 99-106. |
[31] | CHEN X, ZHENG Y F, BLACK M J, et al. SNARF: differentiable forward skinning for animating non-rigid neural implicit shapes[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 11574-11584. |
[32] | TANCIK M, SRINIVASAN P P, MILDENHALL B, et al. Fourier features let networks learn high frequency functions in low dimensional domains[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 632. |
[33] | PARK K, SINHA U, BARRON J T, et al. Nerfies: deformable neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5845-5854. |
[34] | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7132-7141. |
[35] | NAIR V, HINTON G E. Rectified linear units improve restricted boltzmann machines[C]// The 27th International Conference on Machine Learning (ICML-10). Madison: Omnipress, 2010: 807-814. |
[36] | MAX N. Optical models for direct volume rendering[J]. IEEE Transactions on Visualization and Computer Graphics, 1995, 1(2): 99-108. |
[37] | ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 586-595. |
[38] | IONESCU C, PAPAVA D, OLARU V, et al. Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1325-1339. |
[39] | SCHÖNBERGER J L, PRICE T, SATTLER T, et al. A vote-and-verify strategy for fast spatial verification in image retrieval[C]// The 13th Asian Conference on Computer Vision. Cham: Springer, 2016: 321-337. |