Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 1-13. DOI: 10.11996/JG.j.2095-302X.2024010001
WANG Zhiru1, CHANG Yuan2, LU Peng3, PAN Chengwei1
Received: 2023-09-26
Accepted: 2023-12-11
Published: 2024-02-29
Online: 2024-02-29
First author: WANG Zhiru (2001-), male, master student. His main research interests cover computer graphics and deep learning. E-mail: 19241085@buaa.edu.cn
Corresponding author: PAN Chengwei (1989-), male, associate professor, PhD. His main research interests cover computer graphics and computer vision. E-mail: pancw@buaa.edu.cn
Supported by:
Abstract: In recent years, neural radiance fields (NeRF) have become an important research direction in computer graphics and computer vision. Owing to their highly realistic view synthesis, they have been widely applied to photorealistic rendering, virtual reality, human body modeling, urban mapping, and other fields. NeRF uses a neural network to learn an implicit representation of a 3D scene from a set of input images and synthesizes highly realistic images from novel viewpoints. However, both training and inference of the original NeRF model are slow, which makes it difficult to deploy in real-world applications. To accelerate NeRF, researchers have investigated scene modeling methods, ray sampling strategies, and other aspects. This line of work can be roughly divided into the following directions: baking models, combining NeRF with discrete representations, improving sampling efficiency, reducing MLP complexity with hash encoding, introducing scene generalization, introducing depth supervision, and decomposition methods. After reviewing the background of NeRF, this paper discusses and analyzes the advantages and characteristics of representative methods along each of these lines, and concludes with a summary of the progress in NeRF acceleration and an outlook on future work.
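As background for the acceleration directions listed above, the following is a minimal sketch of the discrete volume-rendering quadrature at the core of NeRF, i.e., C = Σᵢ Tᵢ(1 − exp(−σᵢδᵢ))cᵢ. This is PyTorch-style Python written purely for illustration; the function name and tensor shapes are assumptions, not code from any cited work.

```python
import torch

def composite(sigmas, rgbs, deltas):
    """Composite per-sample densities and colors along each ray into a pixel color.

    sigmas: (N_rays, N_samples)     volume density at each sample
    rgbs:   (N_rays, N_samples, 3)  radiance (color) at each sample
    deltas: (N_rays, N_samples)     distance between adjacent samples
    """
    # Opacity of each ray segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - torch.exp(-sigmas * deltas)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j): exclusive cumulative product
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[..., :1]), trans[..., :-1]], dim=-1)
    # Per-sample contribution weights; most acceleration methods try to evaluate
    # fewer samples or cheaper networks while keeping these weights accurate.
    weights = trans * alphas
    return (weights[..., None] * rgbs).sum(dim=-2)  # (N_rays, 3)
```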
CLC number:
WANG Zhiru, CHANG Yuan, LU Peng, PAN Chengwei. A review on neural radiance fields acceleration[J]. Journal of Graphics, 2024, 45(1): 1-13.
Fig. 7 Different sampling approaches: (a) uniform sampling; (b) importance sampling; (c) sampling based on sparse voxels
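To make the contrast between panels (a) and (b) concrete, the sketch below draws depths along a ray either uniformly or by inverting the CDF of coarse-pass weights, so that fine samples concentrate where the scene actually contributes. It is a NumPy illustration with hypothetical helper names, not code from any of the surveyed systems.

```python
import numpy as np

def uniform_samples(near, far, n):
    """(a) Uniform sampling: evenly spaced depths in [near, far]."""
    return np.linspace(near, far, n)

def importance_samples(bin_edges, weights, n):
    """(b) Importance sampling: inverse-CDF draws from coarse-pass weights.

    bin_edges: (m+1,) depths bounding m coarse bins
    weights:   (m,)   per-bin contribution estimated by a coarse pass
    """
    pdf = weights / (weights.sum() + 1e-10)
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])
    u = np.random.rand(n)  # uniform draws in [0, 1)
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(weights) - 1)
    # Linear interpolation inside the selected bin
    t = (u - cdf[idx]) / np.maximum(cdf[idx + 1] - cdf[idx], 1e-10)
    return bin_edges[idx] + t * (bin_edges[idx + 1] - bin_edges[idx])
```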
| Method | PSNR↑ /dB | SSIM↑ | LPIPS↓ | Training iterations /k | Training time | Inference speed (relative) |
|---|---|---|---|---|---|---|
| Baseline NeRF[10] | 31.01 | 0.947 | 0.081 | 100~300 | >12 h | 1 |
| SNeRG[19] | 30.38 | 0.950 | 0.050 | 250 | >12 h | ~9 000 |
| PlenOctree[25] | 31.71 | 0.958 | 0.053 | 2 000 | >12 h | ~3 000 |
| NSVF[31] | 31.74 | 0.953 | 0.047 | 100~150 | - | ~10 |
| FastNeRF[21] | 29.97 | 0.941 | 0.053 | 300 | >12 h | ~4 000 |
| Plenoxels[23] | 31.71 | 0.958 | 0.049 | 128 | ~20 min | 45 |
| Instant-NGP[33] | 33.18 | - | - | 256 | ~5 min | - |
| MVSNeRF[34] | 27.07 | 0.931 | 0.163 | 10 | ~15 min | ~1 |
| DS-NeRF[42] | 24.90 | 0.72 | 0.34 | 150~200 | - | ~1 |
| TensoRF[44] | 33.14 | 0.963 | - | 30 | 17 min | ~100 |
| KiloNeRF[45] | 31.00 | 0.95 | 0.03 | 1 750 | >12 h | ~2 000 |
| 3D-Gaussian[29] | 33.32 | - | - | 30 | 1 h | ~550 |
Table 1 Comparison of some of the NeRF models mentioned in the paper on the NeRF synthetic dataset
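For reference on the quality columns, PSNR is the usual log-scale measure of mean squared error in decibels (higher is better), while SSIM and LPIPS measure structural and perceptual similarity. A minimal NumPy sketch of PSNR, assuming images normalized to [0, 1] (the helper name is illustrative):

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio in dB, as reported in Table 1."""
    mse = np.mean((np.asarray(pred, np.float64) - np.asarray(gt, np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```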
[1] | SCHÖNBERGER J L, FRAHM J M. Structure-from-motion revisited[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 4104-4113. |
[2] | SEITZ S M, CURLESS B, DIEBEL J, et al. A comparison and evaluation of multi-view stereo reconstruction algorithms[C]// 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2006: 519-528. |
[3] | LOMBARDI S, SIMON T, SARAGIH J, et al. Neural volumes: learning dynamic renderable volumes from images[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1906.07751.pdf. |
[4] | NIEMEYER M, MESCHEDER L, OECHSLE M, et al. Differentiable volumetric rendering: learning implicit 3D representations without 3D supervision[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3501-3512. |
[5] | GENOVA K, COLE F, SUD A, et al. Local deep implicit functions for 3D shape[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 4856-4865. |
[6] | PARK J J, FLORENCE P, STRAUB J, et al. DeepSDF: learning continuous signed distance functions for shape representation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 165-174. |
[7] | CHEN W Z, GAO J, LING H, et al. Learning to predict 3D objects with an interpolation-based differentiable renderer[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1908.01210.pdf. |
[8] | CHEN W Z, GAO J, LING H, et al. Learning to predict 3D objects with an interpolation-based differentiable renderer[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1908.01210.pdf. |
[9] | LOPER M M, BLACK M J. OpenDR: an approximate differentiable renderer[M]// FLEET D, PAJDLA T, SCHIELE B, et al., Eds. Computer Vision - ECCV 2014. Cham: Springer International Publishing, 2014: 154-169. |
[10] | MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[C]// European Conference on Computer Vision. Cham: Springer, 2020: 405-421. |
[11] | CORONA-FIGUEROA A, FRAWLEY J, TAYLOR S B, et al. MedNeRF: medical neural radiance fields for reconstructing 3D-aware CT-projections from a single X-ray[C]// 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society. New York: IEEE Press, 2022: 3843-3848. |
[12] | ZHAO F Q, YANG W, ZHANG J K, et al. HumanNeRF: efficiently generated human radiance field from sparse inputs[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 7733-7743. |
[13] | ZHANG J K, LIU X H, YE X Y, et al. Editable free-viewpoint video using a layered neural representation[J]. ACM Transactions on Graphics, 2021, 40(4): 149:1-149:18. |
[14] | ZHU Z H, PENG S Y, LARSSON V, et al. NICE-SLAM: neural implicit scalable encoding for SLAM[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12776-12786. |
[15] | LI Z P, LI L, ZHU J K. READ: large-scale neural scene rendering for autonomous driving[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(2): 1522-1529. |
[16] | TANCIK M, CASSER V, YAN X C, et al. Block-NeRF: scalable large scene neural view synthesis[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8238-8248. |
[17] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010. |
[18] | ZHANG K, RIEGLER G, SNAVELY N, et al. NeRF++: analyzing and improving neural radiance fields[EB/OL]. [2023-08-27]. http://arxiv.org/abs/2010.07492.pdf. |
[19] | HEDMAN P, SRINIVASAN P P, MILDENHALL B, et al. Baking neural radiance fields for real-time view synthesis[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5855-5864. |
[20] | REISER C, SZELISKI R, VERBIN D, et al. MERF: memory-efficient radiance fields for real-time view synthesis in unbounded scenes[J]. ACM Transactions on Graphics, 2023, 42(4): 89:1-89:12. |
[21] | GARBIN S J, KOWALSKI M, JOHNSON M, et al. FastNeRF: high-fidelity neural rendering at 200FPS[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14326-14335. |
[22] | WADHWANI K, KOJIMA T. SqueezeNeRF: further factorized FastNeRF for memory-efficient inference[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. New York: IEEE Press, 2022: 2716-2724. |
[23] | FRIDOVICH-KEIL S, YU A, TANCIK M, et al. Plenoxels: radiance fields without neural networks[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5491-5500. |
[24] | XU Q G, XU Z X, PHILIP J, et al. Point-NeRF: point-based neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5428-5438. |
[25] | YU A, LI R L, TANCIK M, et al. PlenOctrees for real-time rendering of neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5732-5741. |
[26] | CHEN Z Q, FUNKHOUSER T, HEDMAN P, et al. MobileNeRF: exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16569-16578. |
[27] | WIZADWONGSA S, PHONGTHAWEE P, YENPHRAPHAI J, et al. NeX: real-time view synthesis with neural basis expansion[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 8530-8539. |
[28] | TUCKER R, SNAVELY N. Single-view view synthesis with multiplane images[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 548-557. |
[29] | KERBL B, KOPANAS G, LEIMKUEHLER T, et al. 3D Gaussian splatting for real-time radiance field rendering[J]. ACM Transactions on Graphics, 2023, 42(4): 139:1-139:14. |
[30] | KNAPITSCH A, PARK J, ZHOU Q Y, et al. Tanks and temples: benchmarking large-scale scene reconstruction[J]. ACM Transactions on Graphics, 2017, 36(4): 78:1-78:13. |
[31] | LIU L J, GU J T, LIN K Z, et al. Neural sparse voxel fields[EB/OL]. [2023-08-27]. https://arxiv.org/abs/2007.11571. |
[32] | HU T, LIU S, CHEN Y L, et al. EfficientNeRF - efficient neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12892-12901. |
[33] | MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM Transactions on Graphics, 2022, 41(4): 1-15. |
[34] | CHEN A P, XU Z X, ZHAO F Q, et al. MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14104-14113. |
[35] | YAO Y, LUO Z X, LI S W, et al. MVSNet: depth inference for unstructured multi-view stereo[C]// European Conference on Computer Vision. Cham: Springer, 2018: 785-801. |
[36] | JENSEN R, DAHL A, VOGIATZIS G, et al. Large scale multi-view stereopsis evaluation[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2014: 406-413. |
[37] | ZHANG X S, BI S, SUNKAVALLI K, et al. NeRFusion: fusing radiance fields for large-scale scene reconstruction[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5439-5448. |
[38] | DAI A, CHANG A X, SAVVA M, et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2432-2443. |
[39] | WANG Q Q, WANG Z C, GENOVA K, et al. IBRNet: learning multi-view image-based rendering[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4688-4697. |
[40] | LIN H T, PENG S D, XU Z, et al. Efficient neural radiance fields for interactive free-viewpoint video[C]// SA '22: SIGGRAPH Asia 2022 Conference Papers. New York: ACM, 2022: 1-9. |
[41] | ZHU H Y. X-NeRF: explicit neural radiance field for multi-scene 360° insufficient RGB-D views[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 5755-5764. |
[42] | DENG K L, LIU A, ZHU J Y, et al. Depth-supervised NeRF: fewer views and faster training for free[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12872-12881. |
[43] | WEI Y, LIU S H, RAO Y M, et al. NerfingMVS: guided optimization of neural radiance fields for indoor multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5590-5599. |
[44] | CHEN A P, XU Z X, GEIGER A, et al. TensoRF: tensorial radiance fields[C]// European Conference on Computer Vision. Cham: Springer, 2022: 333-350. |
[45] | REISER C, PENG S Y, LIAO Y Y, et al. KiloNeRF: speeding up neural radiance fields with thousands of tiny MLPs[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14315-14325. |
[46] | CAO A, JOHNSON J. HexPlane: a fast representation for dynamic scenes[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 130-141. |
[47] | PUMAROLA A, CORONA E, PONS-MOLL G, et al. D-NeRF: neural radiance fields for dynamic scenes[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10313-10322. |
[48] | FRIDOVICH-KEIL S, MEANTI G, WARBURG F R, et al. K-planes: explicit radiance fields in space, time, and appearance[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 12479-12488. |
[49] | JANG H, KIM D. D-TensoRF: tensorial radiance fields for dynamic scenes[EB/OL]. [2023-08-27]. http://arxiv.org/abs/2212.02375.pdf. |
[50] | SHAO R Z, ZHENG Z R, TU H Z, et al. Tensor4D: efficient neural 4D decomposition for high-fidelity dynamic reconstruction and rendering[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16632-16642. |
[51] | PENG S D, YAN Y Z, SHUAI Q, et al. Representing volumetric videos as dynamic MLP maps[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 4252-4262. |
[52] | SHADE J, GORTLER S, HE L W, et al. Layered depth images[C]// The 25th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1998: 231-242. |