Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 1-13.DOI: 10.11996/JG.j.2095-302X.2024010001
• Review •
WANG Zhiru1, CHANG Yuan2, LU Peng3, PAN Chengwei1
Received: 2023-09-26
Accepted: 2023-12-11
Online: 2024-02-29
Published: 2024-02-29
Contact: PAN Chengwei (1989-), associate professor, Ph.D. His main research interests cover computer graphics, computer vision, etc.
About author: WANG Zhiru (2001-), master student. His main research interests cover computer graphics and deep learning. E-mail: 19241085@buaa.edu.cn
WANG Zhiru, CHANG Yuan, LU Peng, PAN Chengwei. A review on neural radiance fields acceleration[J]. Journal of Graphics, 2024, 45(1): 1-13.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2024010001
Table 1 Comparison of some of the NeRF models mentioned in the paper on the NeRF synthetic dataset

Method | PSNR↑ /dB | SSIM↑ | LPIPS↓ | Training iterations /k | Training time | Inference speed (relative) |
---|---|---|---|---|---|---|
Baseline NeRF | 31.01 | 0.947 | 0.081 | 100~300 | >12 h | 1 |
SNeRG | 30.38 | 0.950 | 0.050 | 250 | >12 h | ~9 000 |
PlenOctree | 31.71 | 0.958 | 0.053 | 2 000 | >12 h | ~3 000 |
NSVF | 31.74 | 0.953 | 0.047 | 100~150 | - | ~10 |
FastNeRF | 29.97 | 0.941 | 0.053 | 300 | >12 h | ~4 000 |
Plenoxels | 31.71 | 0.958 | 0.049 | 128 | ~20 min | 45 |
Instant-NGP | 33.18 | - | - | 256 | ~5 min | - |
MVSNeRF | 27.07 | 0.931 | 0.163 | 10 | ~15 min | ~1 |
DS-NeRF | 24.90 | 0.72 | 0.34 | 150~200 | - | ~1 |
TensoRF | 33.14 | 0.963 | - | 30 | 17 min | ~100 |
KiloNeRF | 31.00 | 0.95 | 0.03 | 1 750 | >12 h | ~2 000 |
3D-Gaussian | 33.32 | - | - | 30 | 1 h | ~550 |
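The table's primary image-quality metric, PSNR, is a simple function of the mean squared error between a rendered view and the ground-truth image. As a minimal sketch (the function name `psnr` and the toy pixel data are illustrative, not from the paper; images are assumed to be flat lists of values in [0, max_val]):

```python
import math

def psnr(reference, rendered, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((r - o) ** 2 for r, o in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images: error is zero, PSNR is unbounded
    return 10.0 * math.log10(max_val ** 2 / mse)

# A uniform per-pixel error of 0.1 gives MSE = 0.01,
# so PSNR = 10 * log10(1 / 0.01) = 20 dB.
reference = [0.0] * 16
rendered = [0.1] * 16
print(round(psnr(reference, rendered), 3))
```

Higher is better: each +10 dB corresponds to a 10× reduction in mean squared error, so the roughly 2 dB gap between baseline NeRF and Instant-NGP in Table 1 reflects a substantially lower reconstruction error.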
[1] | SCHÖNBERGER J L, FRAHM J M. Structure-from-motion revisited[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 4104-4113. |
[2] | SEITZ S M, CURLESS B, DIEBEL J, et al. A comparison and evaluation of multi-view stereo reconstruction algorithms[C]// 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2006: 519-528. |
[3] | LOMBARDI S, SIMON T, SARAGIH J, et al. Neural volumes: learning dynamic renderable volumes from images[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1906.07751.pdf. |
[4] | NIEMEYER M, MESCHEDER L, OECHSLE M, et al. Differentiable volumetric rendering: learning implicit 3D representations without 3D supervision[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3501-3512. |
[5] | GENOVA K, COLE F, SUD A, et al. Local deep implicit functions for 3D shape[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 4856-4865. |
[6] | PARK J J, FLORENCE P, STRAUB J, et al. DeepSDF: learning continuous signed distance functions for shape representation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 165-174. |
[7] | CHEN W Z, GAO J, LING H, et al. Learning to predict 3D objects with an interpolation-based differentiable renderer[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1908.01210.pdf. |
[8] | CHEN W Z, GAO J, LING H, et al. Learning to predict 3D objects with an interpolation-based differentiable renderer[EB/OL]. [2023-08-27]. http://arxiv.org/abs/1908.01210.pdf. |
[9] | LOPER M M, BLACK M J. OpenDR: an approximate differentiable renderer[M]// FLEET D, PAJDLA T, SCHIELE B, et al., Eds. Computer Vision - ECCV 2014. Cham: Springer International Publishing, 2014: 154-169. |
[10] | MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[C]// European Conference on Computer Vision. Cham: Springer, 2020: 405-421. |
[11] | CORONA-FIGUEROA A, FRAWLEY J, TAYLOR S B, et al. MedNeRF: medical neural radiance fields for reconstructing 3D-aware CT-projections from a single X-ray[C]// 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society. New York: IEEE Press, 2022: 3843-3848. |
[12] | ZHAO F Q, YANG W, ZHANG J K, et al. HumanNeRF: efficiently generated human radiance field from sparse inputs[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 7733-7743. |
[13] | ZHANG J K, LIU X H, YE X Y, et al. Editable free-viewpoint video using a layered neural representation[J]. ACM Transactions on Graphics, 2021, 40(4): 149:1-149:18. |
[14] | ZHU Z H, PENG S Y, LARSSON V, et al. NICE-SLAM: neural implicit scalable encoding for SLAM[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12776-12786. |
[15] | LI Z P, LI L, ZHU J K. READ: large-scale neural scene rendering for autonomous driving[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(2): 1522-1529. |
[16] | TANCIK M, CASSER V, YAN X C, et al. Block-NeRF: scalable large scene neural view synthesis[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8238-8248. |
[17] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010. |
[18] | ZHANG K, RIEGLER G, SNAVELY N, et al. NeRF++: analyzing and improving neural radiance fields[EB/OL]. [2023-08-27]. http://arxiv.org/abs/2010.07492.pdf. |
[19] | HEDMAN P, SRINIVASAN P P, MILDENHALL B, et al. Baking neural radiance fields for real-time view synthesis[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5855-5864. |
[20] | REISER C, SZELISKI R, VERBIN D, et al. MERF: memory-efficient radiance fields for real-time view synthesis in unbounded scenes[J]. ACM Transactions on Graphics, 2023, 42(4): 89:1-89:12. |
[21] | GARBIN S J, KOWALSKI M, JOHNSON M, et al. FastNeRF: high-fidelity neural rendering at 200FPS[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 14326-14335. |
[22] | WADHWANI K, KOJIMA T. SqueezeNeRF: further factorized FastNeRF for memory-efficient inference[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. New York: IEEE Press, 2022: 2716-2724. |
[23] | FRIDOVICH-KEIL S, YU A, TANCIK M, et al. Plenoxels: radiance fields without neural networks[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5491-5500. |
[24] | XU Q G, XU Z X, PHILIP J, et al. Point-NeRF: point-based neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5428-5438. |
[25] | YU A, LI R L, TANCIK M, et al. PlenOctrees for real-time rendering of neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5732-5741. |
[26] | CHEN Z Q, FUNKHOUSER T, HEDMAN P, et al. MobileNeRF: exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16569-16578. |
[27] | WIZADWONGSA S, PHONGTHAWEE P, YENPHRAPHAI J, et al. NeX: real-time view synthesis with neural basis expansion[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 8530-8539. |
[28] | TUCKER R, SNAVELY N. Single-view view synthesis with multiplane images[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 548-557. |
[29] | KERBL B, KOPANAS G, LEIMKUEHLER T, et al. 3D Gaussian splatting for real-time radiance field rendering[J]. ACM Transactions on Graphics, 2023, 42(4): 139:1-139:14. |
[30] | KNAPITSCH A, PARK J, ZHOU Q Y, et al. Tanks and temples: benchmarking large-scale scene reconstruction[J]. ACM Transactions on Graphics, 2017, 36(4): 78:1-78:13. |
[31] | LIU L J, GU J T, LIN K Z, et al. Neural sparse voxel fields[EB/OL]. [2023-08-27]. https://arxiv.org/abs/2007.11571. |
[32] | HU T, LIU S, CHEN Y L, et al. EfficientNeRF - efficient neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12892-12901. |
[33] | MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM Transactions on Graphics, 2022, 41(4): 1-15. |
[34] | CHEN A P, XU Z X, ZHAO F Q, et al. MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 14104-14113. |
[35] | YAO Y, LUO Z X, LI S W, et al. MVSNet: depth inference for unstructured multi-view stereo[C]// European Conference on Computer Vision. Cham: Springer, 2018: 785-801. |
[36] | JENSEN R, DAHL A, VOGIATZIS G, et al. Large scale multi-view stereopsis evaluation[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2014: 406-413. |
[37] | ZHANG X S, BI S, SUNKAVALLI K, et al. NeRFusion: fusing radiance fields for large-scale scene reconstruction[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5439-5448. |
[38] | DAI A, CHANG A X, SAVVA M, et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2432-2443. |
[39] | WANG Q Q, WANG Z C, GENOVA K, et al. IBRNet: learning multi-view image-based rendering[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4688-4697. |
[40] | LIN H T, PENG S D, XU Z, et al. Efficient neural radiance fields for interactive free-viewpoint video[C]// SA '22: SIGGRAPH Asia 2022 Conference Papers. New York: ACM, 2022: 1-9. |
[41] | ZHU H Y. X-NeRF: explicit neural radiance field for multi-scene 360° insufficient RGB-D views[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 5755-5764. |
[42] | DENG K L, LIU A, ZHU J Y, et al. Depth-supervised NeRF: fewer views and faster training for free[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12872-12881. |
[43] | WEI Y, LIU S H, RAO Y M, et al. NerfingMVS: guided optimization of neural radiance fields for indoor multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5590-5599. |
[44] | CHEN A P, XU Z X, GEIGER A, et al. TensoRF: tensorial radiance fields[C]// European Conference on Computer Vision. Cham: Springer, 2022: 333-350. |
[45] | REISER C, PENG S Y, LIAO Y Y, et al. KiloNeRF: speeding up neural radiance fields with thousands of tiny MLPs[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 14315-14325. |
[46] | CAO A, JOHNSON J. HexPlane: a fast representation for dynamic scenes[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 130-141. |
[47] | PUMAROLA A, CORONA E, PONS-MOLL G, et al. D-NeRF: neural radiance fields for dynamic scenes[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10313-10322. |
[48] | FRIDOVICH-KEIL S, MEANTI G, WARBURG F R, et al. K-planes: explicit radiance fields in space, time, and appearance[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 12479-12488. |
[49] | JANG H, KIM D. D-TensoRF: tensorial radiance fields for dynamic scenes[EB/OL]. [2023-08-27]. http://arxiv.org/abs/2212.02375.pdf. |
[50] | SHAO R Z, ZHENG Z R, TU H Z, et al. Tensor4D: efficient neural 4D decomposition for high-fidelity dynamic reconstruction and rendering[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16632-16642. |
[51] | PENG S D, YAN Y Z, SHUAI Q, et al. Representing volumetric videos as dynamic MLP maps[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 4252-4262. |
[52] | SHADE J, GORTLER S, HE L W, et al. Layered depth images[C]// The 25th annual conference on Computer graphics and interactive techniques. New York: ACM, 1998: 231-242. |