
Journal of Graphics (图学学报), 2025, Vol. 46, Issue (3): 551-557. DOI: 10.11996/JG.j.2095-302X.2025030551

• Image Processing and Computer Vision •

The next best view navigation technology based on RGB features

ZHOU Zheng, DAI Yaqiao, YI Renjiao, LAN Long, ZHU Chenyang

  1. School of Computer Science, National University of Defense Technology, Changsha, Hunan 410000, China
  • Received: 2024-08-23 Accepted: 2025-03-03 Published: 2025-06-30 Online: 2025-06-13
  • Contact: ZHU Chenyang (1990-), associate professor, Ph.D. His main research interests include computer graphics and computer vision. E-mail: zhuchenyang07@nudt.edu.cn
  • First author: ZHOU Zheng (1997-), master's student. His main research interest is computer graphics. E-mail: zhouzheng@nudt.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62325221, 62132021, 62372457); Young Elite Scientists Sponsorship Program by CAST (2023QNRC001); Natural Science Foundation of Hunan Province of China (2021RC3071, 2022RC1104); NUDT Research Grants (ZK22-52); State Key Laboratory of High Performance Computing Foundation (2023KJWHPCL02)

Abstract:

Neural radiance fields (NeRF) perform very well at reconstructing 3D scenes from 2D images: trained on 2D images alone, a NeRF recovers the 3D structure of a scene and renders high-quality novel views. However, NeRF trains slowly and is slow at inference, and the quality of the reconstruction depends closely on the quality of the training samples. To achieve high-quality 3D reconstruction when sample quality is low, this paper trains two NeRFs with different hash encodings on the same scene and evaluates the gap in information gain between candidate views to guide view sampling. A new next best view (NBV) navigation framework based on RGB features is proposed. The framework is robust to sparse training data, identifies the next best view with high information gain through RGB feature evaluation, and optimizes NeRF training, so that the quality of novel view synthesis improves with the fewest additional views. With the optimized NeRF training pipeline, network convergence is approximately 10 times faster and GPU memory usage is reduced by 39.8%. Extensive experiments verify the effectiveness and robustness of the proposed model.

Keywords: neural radiance field, hash encoding, sparse reconstruction, information gain, active learning
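
The view-selection idea in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the disagreement between the two NeRFs on a candidate view is measured as the mean squared difference of their rendered RGB values, and that the candidate with the largest disagreement is taken as the next best view. The abstract does not specify the paper's actual information-gain measure, so this is a proxy rather than the authors' method. All names (`rgb_disagreement`, `select_next_best_view`, `make_toy_renderer`, `TRAINED_VIEWS`) and the toy renderers are hypothetical stand-ins for two trained models.

```python
import math

# Hypothetical setup: azimuths (degrees) of views already used for training.
TRAINED_VIEWS = [0.0, 90.0]


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def make_toy_renderer(bias):
    """Build a stand-in for a trained NeRF (an assumption, not a real model).

    Both stand-ins share the same base colour; `bias` pushes them apart in
    proportion to the distance from the nearest training view, mimicking two
    models that agree where data exists and diverge elsewhere.
    """
    def render(view_deg):
        gap = min(angular_distance(view_deg, t) for t in TRAINED_VIEWS) / 180.0
        value = 0.5 + 0.2 * math.cos(math.radians(view_deg)) + bias * gap
        return [(value, value, value)] * 4  # a tiny 4-pixel grey image
    return render


def rgb_disagreement(img_a, img_b):
    """Mean squared RGB difference between two images (lists of (r, g, b))."""
    if not img_a or len(img_a) != len(img_b):
        raise ValueError("images must be non-empty and of equal size")
    total = 0.0
    for pixel_a, pixel_b in zip(img_a, img_b):
        total += sum((ca - cb) ** 2 for ca, cb in zip(pixel_a, pixel_b))
    return total / (3 * len(img_a))


def select_next_best_view(candidates, render_a, render_b):
    """Return (view, score) for the candidate where the two renderers differ most."""
    if not candidates:
        raise ValueError("need at least one candidate view")
    scores = [rgb_disagreement(render_a(v), render_b(v)) for v in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


if __name__ == "__main__":
    render_a = make_toy_renderer(+0.3)
    render_b = make_toy_renderer(-0.3)
    view, score = select_next_best_view(
        [10.0, 45.0, 180.0, 225.0, 300.0], render_a, render_b
    )
    print(f"next best view: {view} deg, disagreement: {score:.4f}")
```

In a real pipeline the two renderers would be NeRFs trained on the same sparse image set with different hash encodings, and the selected view would be captured, added to the training set, and the selection repeated.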

CLC number: