Large scene reconstruction method based on voxel grid feature of NeRF

doi:10.11996/JG.j.2095-302X.2025030502

Abstract

To address blurred rendering and missing detail in neural radiance fields (NeRF) for large scenes, a rendering method for large scenes is proposed that is guided by voxel grid features and driven by ray sampling. The method improves the accuracy of the reconstructed 3D models, which is particularly important for large-scale scene reconstruction, and it is applicable to scenarios such as architectural design and urban planning. First, the scene to be reconstructed is gridded: scene boundaries are allocated according to scene size, and the voxel units are refined. Second, tensor decomposition is applied to the information stored in the voxels to extract gridded scene features, and the neural radiance field concentrates its sampling according to the extracted features. Finally, the sampling results are fed into a neural network, where a multilayer perceptron (MLP) renderer converts the features into color and density and synthesizes rendered views from novel viewpoints. The method was validated on multiple datasets. Compared with other methods, it achieved an average improvement of approximately 11% in PSNR, an average increase of about 12% in SSIM, and an average reduction of around 15% in LPIPS, with visibly better rendering quality.
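The steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the boolean `occupancy` array, and `toy_field` (a stand-in for the tensor-decomposed feature lookup plus MLP renderer) are assumptions made for illustration. Only the final compositing step follows the standard NeRF volume rendering quadrature.

```python
import numpy as np

def grid_resolution(bounds_min, bounds_max, voxel_size):
    """Voxels per axis needed to cover the scene bounds (at least 1)."""
    extent = np.asarray(bounds_max, float) - np.asarray(bounds_min, float)
    return np.maximum(np.ceil(extent / voxel_size).astype(int), 1)

def voxel_index(points, bounds_min, voxel_size):
    """Map points of shape (N, 3) to integer voxel indices."""
    return np.floor((points - np.asarray(bounds_min, float)) / voxel_size).astype(int)

def composite(sigma, rgb, deltas):
    """Standard NeRF quadrature: alpha-composite the samples of one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance reaching each sample: product of (1 - alpha) of earlier samples.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0), weights

def render_ray(origin, direction, t_vals, occupancy, field, bounds_min, voxel_size):
    """Query the field only at samples that fall in occupied voxels, then composite."""
    bounds_min = np.asarray(bounds_min, float)
    bounds_max = bounds_min + np.array(occupancy.shape) * voxel_size
    pts = origin + t_vals[:, None] * direction
    inside = np.all((pts >= bounds_min) & (pts < bounds_max), axis=1)
    occupied = np.zeros(len(t_vals), dtype=bool)
    idx = voxel_index(pts[inside], bounds_min, voxel_size)
    occupied[inside] = occupancy[idx[:, 0], idx[:, 1], idx[:, 2]]
    sigma = np.zeros(len(t_vals))
    rgb = np.zeros((len(t_vals), 3))
    if occupied.any():
        sigma[occupied], rgb[occupied] = field(pts[occupied])
    # Segment lengths; the last sample reuses the previous step size.
    deltas = np.diff(t_vals, append=2 * t_vals[-1] - t_vals[-2])
    return composite(sigma, rgb, deltas)

def toy_field(pts):
    """Hypothetical stand-in for feature lookup + MLP: constant density and color."""
    return np.full(len(pts), 5.0), np.tile([0.8, 0.2, 0.1], (len(pts), 1))
```

In this sketch, samples in empty or out-of-bounds voxels contribute zero density, so the field is evaluated only where the grid marks the scene as occupied. This is one simple way to concentrate sampling; the paper's feature-guided sampling may differ.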

Key words: neural radiance fields, large scene, 3D reconstruction, deep learning, image rendering

CLC Number: TP391

WANG Daolei, DING Zijian, YANG Jun, ZHENG Shaokai, ZHU Rui, ZHAO Wenbin. Large scene reconstruction method based on voxel grid feature of NeRF[J]. Journal of Graphics, 2025, 46(3): 502-509.

Figures/Tables 11

Fig. 1 VoxelNeRF flow diagram

Fig. 2 Camera poses ((a) Public dataset; (b) Custom dataset)

Fig. 3 Occupancy grid prediction model

Fig. 4 Comparison of network structures ((a) NeRF network structure; (b) VoxelNeRF network structure)

Fig. 5 MatrixCity public dataset

Fig. 6 Custom dataset

Fig. 7 Reconstruction results of VoxelNeRF on different datasets ((a)~(b) Public datasets; (c)~(d) Custom datasets)

Fig. 8 Comparison of visual effects between VoxelNeRF and other methods ((a) Ground truth; (b) MipNeRF; (c) TensoRF; (d) Instant-NGP; (e) Ours)

Table 1 Comparison of evaluation indicators between VoxelNeRF and other methods

Method	PSNR↑	SSIM↑	LPIPS(VGG) ↓	LPIPS(Alex) ↓	Size	Time
NeRF	23.15	0.56	0.65	0.64	8 M	>36 h
MipNeRF	24.64	0.69	0.55	0.53	5 M	>25 h
TensoRF	25.96	0.72	0.46	0.49	412 M	18 h
Instant-NGP	27.21	0.79	0.38	0.36	15.9 G	10 min
Ours	28.17	0.76	0.37	0.33	2.3 G	6 h
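As a worked example of how relative changes can be read off Table 1, the sketch below averages the per-baseline relative change of the proposed method for one metric. The source does not state how its average percentages were computed or which datasets they cover, so this is only one plausible reading, and its results need not match the percentages quoted in the abstract. The helper names are hypothetical.

```python
# Values copied from Table 1: (PSNR, SSIM, LPIPS(VGG)).
baselines = {
    "NeRF": (23.15, 0.56, 0.65),
    "MipNeRF": (24.64, 0.69, 0.55),
    "TensoRF": (25.96, 0.72, 0.46),
    "Instant-NGP": (27.21, 0.79, 0.38),
}
ours = (28.17, 0.76, 0.37)

def relative_change(new, old):
    """Signed relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0

def mean_relative_change(metric_index):
    """Average the relative change of `ours` over all baselines for one metric."""
    changes = [
        relative_change(ours[metric_index], values[metric_index])
        for values in baselines.values()
    ]
    return sum(changes) / len(changes)

if __name__ == "__main__":
    for name, i in (("PSNR", 0), ("SSIM", 1), ("LPIPS(VGG)", 2)):
        print(f"{name}: {mean_relative_change(i):+.1f}%")
```

A positive value is an improvement for PSNR and SSIM, while a negative value is an improvement for LPIPS, where lower is better.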

Table 2 Comparison of ablation experiment performance

Model	PSNR/dB	SSIM	LPIPS	Time/h
NeRF	16.54	0.52	0.65	8
VoxelFeature+NeRF	22.53	0.67	0.53	3
VoxelFeature+NeRF+Occupancy	24.07	0.69	0.50	3

Fig. 9 Visual comparison of ablation experiments ((a) Ground truth; (b) NeRF; (c) Voxel feature and NeRF; (d) Voxel feature and Occupancy prediction and NeRF)

References 16

[1]	MA H S, ZHU Y H, LI Z H, et al. Survey of neural radiance fields for multi-view synthesis technologies[J]. Computer Engineering and Applications, 2024, 60(4): 21-38 (in Chinese).
[2]	DONG X T, MA X, PAN C W, et al. A review of neural radiance fields for outdoor large scenes[J]. Journal of Graphics, 2024, 45(4): 631-649 (in Chinese).
[3]	FAN T, YANG H, YIN W, et al. Multi-scale view synthesis based on neural radiance field[J]. Journal of Graphics, 2023, 44(6): 1140-1148 (in Chinese).
[4]	LIU L J, GU J T, LIN K Z, et al. Neural sparse voxel fields[C]// The 34th International Conference on Neural Information Processing Systems. New York: ACM, 2020: 1313.
[5]	TANCIK M, CASSER V, YAN X C, et al. Block-NeRF: scalable large scene neural view synthesis[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8238-8248.
[6]	TURKI H, RAMANAN D, SATYANARAYANAN M. Mega-NeRF: scalable construction of large-scale NeRFs for virtual fly-throughs[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12912-12921.
[7]	CHEN A P, XU Z X, GEIGER A, et al. TensoRF: tensorial radiance fields[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 333-350.
[8]	XU L N, XIANGLI Y B, PENG S D, et al. Grid-guided neural radiance fields for large urban scenes[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 8296-8306.
[9]	KERBL B, KOPANAS G, LEIMKÜHLER T, et al. 3D Gaussian splatting for real-time radiance field rendering[J]. ACM Transactions on Graphics, 2023, 42(4): 139.
[10]	LIN J Q, LI Z H, TANG X, et al. VastGaussian: vast 3D Gaussians for large scene reconstruction[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 5166-5175.
[11]	LIU Y, LUO C C, FAN L, et al. CityGaussian: real-time high-quality large-scale scene rendering with Gaussians[C]// The 18th European Conference on Computer Vision. Cham: Springer, 2025: 265-282.
[12]	MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2022, 65(1): 99-106.
[13]	BARRON J T, MILDENHALL B, VERBIN D, et al. Mip-NeRF 360: unbounded anti-aliased neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5460-5469.
[14]	MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM Transactions on Graphics, 2022, 41(4): 102.
[15]	LIU Z, ZHANG F. BALM: Bundle adjustment for lidar mapping[J]. IEEE Robotics and Automation Letters, 2021, 6(2): 3184-3191.
[16]	LI Y X, JIANG L H, XU L N, et al. MatrixCity: a large-scale city dataset for city-scale neural rendering and beyond[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 3182-3192.