
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (3): 502-509. DOI: 10.11996/JG.j.2095-302X.2025030502

• Image Processing and Computer Vision •


Large scene reconstruction method based on voxel grid feature of NeRF

WANG Daolei, DING Zijian, YANG Jun, ZHENG Shaokai, ZHU Rui, ZHAO Wenbin

  1. College of Energy and Mechanical Engineering, Shanghai University of Electric Power, Shanghai 201306, China
  • Received: 2024-09-17 Accepted: 2024-12-16 Published: 2025-06-30 Online: 2025-06-13
  • Contact: ZHAO Wenbin (1978-), associate professor, Ph.D. His main research interests cover deep learning, image processing, and CAD/CAM. E-mail: zhaowenbin@shiep.edu.cn
  • First author: WANG Daolei (1981-), professor, Ph.D. His main research interests cover computer vision, image processing, and CAD/CAM. E-mail: alfredwdl@shiep.edu.cn


Abstract:

To address the problems of blurred rendering and missing details in neural radiance fields (NeRF) for large scenes, a rendering method suitable for large scenes was proposed, guided by voxel grid features that drive ray sampling. The method can effectively enhance the accuracy of 3D models, which is particularly important for large-scale scene reconstruction, and is applicable to scenarios such as architectural design and urban planning. Firstly, the scene to be reconstructed was partitioned into a grid: scene boundaries were allocated according to scene size, and the voxel units were refined. Secondly, tensor decomposition was applied to the information contained in the voxels, and features of the gridded scene were extracted; NeRF then concentrated its sampling according to the extracted features. Finally, the sampling results were fed into a neural network, where a multilayer perceptron (MLP) renderer converted the features into color and density information and synthesized rendered views from novel viewpoints. The method was validated on multiple datasets. The experimental results demonstrated that, compared with other methods, the proposed approach improved PSNR and SSIM by averages of approximately 11% and 12%, respectively, reduced LPIPS by approximately 15% on average, and produced clearly better visual quality.
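The pipeline the abstract outlines (per-voxel features, feature-guided concentration of ray samples, and an MLP-style renderer composited by standard NeRF volume rendering) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the grid size, nearest-voxel lookup, sampling heuristic, and the toy `toy_mlp` renderer are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

GRID = 16                       # voxels per axis; scene assumed inside [0, 1]^3
FEAT = 8                        # feature channels stored per voxel
voxel_feat = rng.normal(size=(GRID, GRID, GRID, FEAT)).astype(np.float32)

def lookup(points):
    """Nearest-voxel feature lookup for (N, 3) points in [0, 1]^3."""
    idx = np.clip((points * GRID).astype(int), 0, GRID - 1)
    return voxel_feat[idx[:, 0], idx[:, 1], idx[:, 2]]

def guided_samples(origin, direction, n_coarse=32, n_fine=64):
    """Score coarse samples by voxel-feature magnitude, then place
    extra fine samples where the score (a stand-in for 'the scene is
    interesting here') is high."""
    t = np.linspace(0.0, 1.0, n_coarse)
    pts = np.clip(origin + t[:, None] * direction, 0.0, 1.0)
    score = np.linalg.norm(lookup(pts), axis=1) + 1e-8
    cdf = np.cumsum(score / score.sum())
    u = rng.uniform(size=n_fine)
    t_fine = t[np.minimum(np.searchsorted(cdf, u), n_coarse - 1)]
    return np.sort(np.concatenate([t, t_fine]))

def toy_mlp(feats):
    """Toy stand-in for the MLP renderer: features -> (density, color)."""
    sigma = np.log1p(np.exp(feats[:, 0]))        # softplus -> nonnegative density
    rgb = 1.0 / (1.0 + np.exp(-feats[:, 1:4]))   # sigmoid -> colors in (0, 1)
    return sigma, rgb

def render_ray(origin, direction, mlp):
    """Standard NeRF compositing: alpha_i = 1 - exp(-sigma_i * delta_i),
    weighted by accumulated transmittance along the ray."""
    t = guided_samples(origin, direction)
    pts = np.clip(origin + t[:, None] * direction, 0.0, 1.0)
    sigma, rgb = mlp(lookup(pts))
    delta = np.diff(t, append=t[-1] + 1e-2)
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return ((trans * alpha)[:, None] * rgb).sum(axis=0)   # composited RGB
```

Rendering one ray, e.g. `render_ray(np.array([0.0, 0.5, 0.5]), np.array([1.0, 0.0, 0.0]), toy_mlp)`, returns a length-3 RGB color in [0, 1]; a full image repeats this per pixel, and in the real method the grid features additionally come from a tensor decomposition rather than a dense random array.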

Key words: neural radiance fields, large scene, 3D reconstruction, deep learning, image rendering
