Journal of Graphics, 2023, 44(6): 1091-1103. DOI: 10.11996/JG.j.2095-302X.2023061091
A review of neural radiance field for autonomous driving scene
CHENG Huan1, WANG Shuo2, LI Meng2, QIN Lun-ming1, ZHAO Fang3
Received: 2023-06-27
Accepted: 2023-09-08
Online: 2023-12-31
Published: 2023-12-17
Contact: QIN Lun-ming (1983-), associate professor, Ph.D. His main research interests cover computer vision and image segmentation.
About author: CHENG Huan (1999-), master's student. Her main research interests cover computer vision and computer graphics. E-mail: chenghuan0116@gmail.com
Abstract: Neural radiance field (NeRF) is a key technique for reconstructing realistic visual effects and synthesizing novel views. It renders and synthesizes three-dimensional scenes from two-dimensional images captured by cameras, inferring unknown viewpoints from known ones so that users can observe the synthesized views from different perspectives, thereby enhancing the sense of human-computer interaction. As a three-dimensional reconstruction method for novel view synthesis, NeRF has significant research and application value in fields such as robotics, autonomous driving, virtual reality, and digital twins. Combining NeRF with autonomous driving scenes enables high-quality reconstruction of complex driving scenes and the simulation of diverse scenarios under adverse conditions, which enriches the training data for autonomous driving, improves the accuracy and safety of autonomous driving systems at relatively low cost, and helps verify the effectiveness of autonomous driving algorithms. Given the promising applications of NeRF in autonomous driving scenes and the scarcity of related surveys, this review first starts from traditional explicit three-dimensional scene representation methods, introduces the implicit scene representation method, NeRF, and explains its underlying principle; it then discusses and analyzes the challenges of applying NeRF to autonomous driving scenes, including sparse-view reconstruction, large-scale scene reconstruction, dynamic scenes, training acceleration, and the synthesis of autonomous driving scenes; finally, it summarizes NeRF techniques and offers an outlook on future research directions.
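For reference, the rendering principle mentioned above can be stated compactly: NeRF [1] models a scene as an MLP that maps a 3D position and viewing direction to a density σ and a color c, and renders each pixel with the standard discrete volume-rendering quadrature below, where δ_i is the distance between adjacent samples along the camera ray.

```latex
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\!\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right)
```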
CHENG Huan, WANG Shuo, LI Meng, QIN Lun-ming, ZHAO Fang. A review of neural radiance field for autonomous driving scene[J]. Journal of Graphics, 2023, 44(6): 1091-1103.
Fig. 1 Different scene representation methods ((a) Voxel[6]; (b) Point cloud[7]; (c) Mesh[8]; (d) Occupancy networks[9])
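In contrast to the explicit representations in Fig. 1, NeRF-style methods store a scene implicitly in the weights of a neural network. The following minimal sketch (an illustrative PyTorch example, not code from the cited works) shows an occupancy-network-style MLP that maps a 3D coordinate to an occupancy probability and can be queried at arbitrary resolution.

```python
import torch
import torch.nn as nn

class ImplicitOccupancy(nn.Module):
    """Minimal implicit scene representation: f(x, y, z) -> occupancy in [0, 1].

    The geometry is encoded in the MLP weights instead of an explicit
    voxel grid, point cloud, or mesh (cf. occupancy networks [9]).
    """
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) query points; returns (N, 1) occupancy probabilities.
        return torch.sigmoid(self.net(xyz))

# Querying the continuous field at arbitrary resolution:
model = ImplicitOccupancy()
points = torch.rand(1024, 3) * 2 - 1   # query points in [-1, 1]^3
occupancy = model(points)              # (1024, 1)
```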
Table 1 Performance comparison of NeRF and its extensions in driving scenes

Method | PSNR (dB) | SSIM | LPIPS | Dataset
---|---|---|---|---
NeRF[1] | 18.56 | 0.557 | 0.554 | KITTI
NSG[17] | 21.53 | 0.673 | 0.254 | KITTI
pixelNeRF[18] | 20.10 | 0.761 | 0.175 | KITTI
SUDS[19] | 22.77 | 0.797 | 0.171 | KITTI
MARS[20] | 24.23 | 0.845 | 0.160 | KITTI
Urban-NeRF[21] | 21.49 | 0.661 | 0.491 | NuScenes
Mip-NeRF[22] | 18.22 | 0.655 | 0.421 | NuScenes
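To help interpret Table 1: higher PSNR and SSIM and lower LPIPS indicate better reconstruction quality. The sketch below (an illustrative implementation, not code from the compared methods) shows how PSNR between a rendered view and the ground-truth frame is computed; SSIM and LPIPS are usually obtained from standard packages such as scikit-image and lpips.

```python
import numpy as np

def psnr(rendered: np.ndarray, ground_truth: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images scaled to [0, max_val]."""
    mse = np.mean((rendered.astype(np.float64) - ground_truth.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example: a rendered view compared against a slightly perturbed ground truth.
img_gt = np.random.rand(256, 256, 3)
img_render = np.clip(img_gt + 0.05 * np.random.randn(256, 256, 3), 0.0, 1.0)
print(f"PSNR = {psnr(img_render, img_gt):.2f} dB")   # higher is better
```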
Table 2 Performance comparison of NeRF and its extensions

Method | Encoding | PSNR (dB) | SSIM | LPIPS | Train time | Iterations (K)
---|---|---|---|---|---|---
NeRF[1] | Positional encoding | 31.01 | 0.947 | 0.081 | >12 h | 300
pixelNeRF[18] | Positional encoding | - | - | - | >12 h | 400
Mip-NeRF[22] | Integrated positional encoding | 33.09 | 0.961 | 0.043 | ≈6 h | 612
GRF[23] | Positional encoding | 27.07 | 0.924 | 0.090 | - | -
Point-NeRF[24] | Positional encoding | 33.00 | 0.978 | 0.055 | ≈7 h | 200
Instant NGP[25] | Hash encoding | 33.18 | - | - | ≈5 min | 256
Plenoxels[26] | Positional encoding | 31.71 | 0.958 | 0.050 | ≈11 min | 10
DVGO[27] | Positional encoding | 31.95 | 0.957 | 0.053 | ≈15 min | 20
PlenOctree[28] | Positional encoding | 31.71 | 0.958 | 0.053 | >12 h | -
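The encoding column of Table 2 distinguishes mainly between the frequency-based positional encoding of the original NeRF [1] and the hash encoding of Instant NGP [25]. Below is a minimal sketch of the former, following the standard sin/cos formulation; the function name and defaults are illustrative.

```python
import numpy as np

def positional_encoding(x: np.ndarray, num_freqs: int = 10) -> np.ndarray:
    """NeRF-style positional encoding gamma(x).

    Maps each coordinate p to (sin(2^0 pi p), cos(2^0 pi p), ...,
    sin(2^(L-1) pi p), cos(2^(L-1) pi p)), lifting the low-dimensional input
    into a higher-dimensional space so the MLP can fit high-frequency detail.
    """
    freqs = 2.0 ** np.arange(num_freqs) * np.pi           # (L,)
    scaled = x[..., None] * freqs                          # (..., D, L)
    encoded = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return encoded.reshape(*x.shape[:-1], -1)              # (..., D * 2L)

pts = np.random.rand(4, 3)                 # four 3D sample points
print(positional_encoding(pts).shape)      # (4, 60) for L = 10
```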
Fig. 3 Camera trajectories in different scenes ((a) Forward-facing scene; (b) Surrounding scene; (c) Driving scene)
Fig. 4 Reconstruction of the autonomous driving scene in different viewpoints ((a) Fixed viewpoint; (b) Sparse viewpoint)
Fig. 7 Performance of NeRF and its extensions in reconstructing dynamic scenes ((a) Input image; (b) NeRF; (c) NeRF+Time; (d) NSG; (e) SUDS)
Fig. 8 Reconstruction of autonomous driving scenes by NSG[17] ((a) Input scene; (b) Scene foreground; (c) Scene background; (d) Scene reconstruction)
Fig. 9 Multiresolution Hash encoding[25] ((a) Hashing of voxel vertices; (b) Lookup; (c) Linear interpolation; (d) Concatenation; (e) Neural network)
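Figure 9 summarizes the lookup pipeline of Instant NGP's multiresolution hash encoding [25]. The sketch below (a simplified NumPy illustration with assumed table sizes and the spatial-hash constants used in that paper) walks through the same steps: hash the voxel vertices at each resolution level, look up their feature vectors, trilinearly interpolate, and concatenate the per-level results before feeding them to a small MLP.

```python
import numpy as np

# Assumed hyperparameters (illustrative, roughly following Instant NGP [25]).
NUM_LEVELS = 4          # resolution levels
FEATURES   = 2          # feature dims per hash entry
TABLE_SIZE = 2 ** 14    # entries per level
BASE_RES   = 16         # coarsest grid resolution
GROWTH     = 1.5        # per-level resolution growth factor
PRIMES     = np.array([1, 2654435761, 805459861], dtype=np.uint64)

rng = np.random.default_rng(0)
hash_tables = rng.normal(0, 1e-4, size=(NUM_LEVELS, TABLE_SIZE, FEATURES))

def hash_index(corners: np.ndarray) -> np.ndarray:
    """(a) Spatial hash of integer voxel vertices -> hash-table indices."""
    h = np.zeros(corners.shape[:-1], dtype=np.uint64)
    for d in range(3):
        h ^= corners[..., d].astype(np.uint64) * PRIMES[d]
    return (h % TABLE_SIZE).astype(np.int64)

def encode(x: np.ndarray) -> np.ndarray:
    """Multiresolution hash encoding of points x in [0, 1]^3 -> (N, NUM_LEVELS * FEATURES)."""
    offsets = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)])  # 8 voxel corners
    outputs = []
    for level in range(NUM_LEVELS):
        res = int(BASE_RES * GROWTH ** level)
        pos = x * res
        cell = np.floor(pos).astype(np.int64)              # lower corner of the enclosing voxel
        frac = pos - cell                                   # position inside the voxel
        corners = cell[:, None, :] + offsets[None, :, :]    # (N, 8, 3)
        feats = hash_tables[level][hash_index(corners)]     # (b) lookup -> (N, 8, FEATURES)
        # (c) Trilinear interpolation weights for the 8 corners.
        w = np.prod(np.where(offsets[None, :, :] == 1,
                             frac[:, None, :], 1 - frac[:, None, :]), axis=-1)
        outputs.append(np.sum(w[..., None] * feats, axis=1))
    return np.concatenate(outputs, axis=-1)                 # (d) concatenation across levels

pts = rng.random((5, 3))
print(encode(pts).shape)   # (5, 8): per-level features concatenated, ready for the MLP (e)
```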
[1] | MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[C]// European Conference on Computer Vision. Cham: Springer, 2020: 405-421. |
[2] | MILDENHALL B, SRINIVASAN P P, ORTIZ-CAYON R, et al. Local light field fusion[J]. ACM Transactions on Graphics, 2019, 38(4): 1-14. |
[3] | SITZMANN V, ZOLLHÖFER M, WETZSTEIN G. Scene representation networks: continuous 3D-structure-aware neural scene representations[EB/OL]. [2022-12-12]. https://arxiv.org/abs/1906.01618. |
[4] | KAJIYA J T, VON HERZEN B P. Ray tracing volume densities[C]// The 11th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1984: 165-174. |
[5] | KALRA N, PADDOCK S M. Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability?[J]. Transportation Research Part A: Policy and Practice, 2016, 94: 182-193. |
[6] | SITZMANN V, ZOLLHÖFER M, WETZSTEIN G. Scene representation networks: continuous 3D-structure-aware neural scene representations[EB/OL]. [2022-12-12]. https://arxiv.org/abs/1906.01618. |
[7] | LIN C H, KONG C, LUCEY S. Learning efficient point cloud generation for dense 3D object reconstruction[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 32(1): 7114-7121. |
[8] | KANAZAWA A, TULSIANI S, EFROS A A, et al. Learning category-specific mesh reconstruction from image collections[C]// European Conference on Computer Vision. Cham: Springer, 2018: 371-386. |
[9] | MESCHEDER L, OECHSLE M, NIEMEYER M, et al. Occupancy networks: learning 3D reconstruction in function space[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 4460-4470. |
[10] | PARK J J, FLORENCE P R, STRAUB J, et al. DeepSDF: learning continuous signed distance functions for shape representation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 165-174. |
[11] | SHAN Q, SUSSKIND J, SANKAR A, et al. Neural rendering: US20210248811[P]. 2021-08-12. |
[12] | TANCIK M, SRINIVASAN P P, MILDENHALL B, et al. Fourier features let networks learn high frequency functions in low dimensional domains[EB/OL]. [2022-12-22]. https://arxiv.org/abs/2006.10739. |
[13] | TEWARI A, THIES J, MILDENHALL B, et al. Advances in neural rendering[J]. Computer Graphics Forum, 2022, 41(2). |
[14] | SCHÖNBERGER J L, FRAHM J M. Structure-from-motion revisited[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 4104-4113. |
[15] | GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]// 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2012: 3354-3361. |
[16] | CAESAR H, BANKITI V, LANG A H, et al. nuScenes: a multimodal dataset for autonomous driving[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 11621-11631. |
[17] | OST J, MANNAN F, THUEREY N, et al. Neural scene graphs for dynamic scenes[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2855-2864. |
[18] | YU A, YE V, TANCIK M, et al. pixelNeRF: neural radiance fields from one or few images[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4576-4585. |
[19] | TURKI H, ZHANG J Y, FERRONI F, et al. SUDS: scalable urban dynamic scenes[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 12375-12385. |
[20] | WU Z R, LIU T Y, LUO L Y, et al. MARS: an instance-aware, modular and realistic simulator for autonomous driving[EB/OL]. [2023-02-13]. https://arxiv.org/abs/2307.15058. |
[21] | REMATAS K, LIU A, SRINIVASAN P, et al. Urban radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12932-12942. |
[22] | BARRON J T, MILDENHALL B, TANCIK M, et al. Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5835-5844. |
[23] | TREVITHICK A, YANG B. GRF: learning a general radiance field for 3D representation and rendering[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 15182-15192. |
[24] | XU Q G, XU Z X, PHILIP J, et al. Point-NeRF: point-based neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5428-5438. |
[25] | MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[EB/OL]. [2022-12-12]. https://arxiv.org/abs/2201.05989. |
[26] | YU A, FRIDOVICH-KEIL S, TANCIK M, et al. Plenoxels: radiance fields without neural networks[EB/OL]. [2022-12-12]. https://arxiv.org/abs/2112.05131. |
[27] | SUN C, SUN M, CHEN H T. Direct voxel grid optimization: super-fast convergence for radiance fields reconstruction[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5459-5469. |
[28] | YU A, LI R L, TANCIK M, et al. PlenOctrees for real-time rendering of neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5752-5761. |
[29] | BARRON J T, MILDENHALL B, VERBIN D, et al. Mip-NeRF 360: unbounded anti-aliased neural radiance fields[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5460-5469. |
[30] | CHEN A P, XU Z X, ZHAO F Q, et al. MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 14104-14113. |
[31] | GAO K, GAO Y N, HE H J, et al. NeRF: neural radiance field in 3D vision, A comprehensive review[EB/OL]. [2022-12-22]. https://arxiv.org/abs/2210.00379. |
[32] | TURKI H, RAMANAN D, SATYANARAYANAN M. Mega-NeRF: scalable construction of large-scale NeRFs for virtual fly- throughs[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12912-12921. |
[33] | TANCIK M, CASSER V, YAN X C, et al. Block-NeRF: scalable large scene neural view synthesis[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8238-8248. |
[34] | MARTIN-BRUALLA R, RADWAN N, SAJJADI M S M, et al. NeRF in the wild: neural radiance fields for unconstrained photo collections[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 7210-7219. |
[35] | MEULEMAN A, LIU Y L, GAO C, et al. Progressively optimized local radiance fields for robust view synthesis[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16539-16548. |
[36] | LI Z P, LI L, MA Z Y, et al. READ: large-scale neural scene rendering for autonomous driving[EB/OL]. [2022-12-22]. https://arxiv.org/abs/2205.05509. |
[37] | XIE Z Y, ZHANG J G, LI W Y, et al. S-NeRF: neural radiance fields for street views[EB/OL]. [2023-01-22]. https://arxiv.org/abs/2303.00749. |
[38] | WANG Z A, SHEN T C, GAO J, et al. Neural fields meet explicit geometric representation for inverse rendering of urban scenes[EB/OL]. [2023-01-12]. https://arxiv.org/abs/2304.03266. |
[39] | XIANGLI Y B, XU L N, PAN X G, et al. BungeeNeRF: progressive neural radiance field for extreme multi-scale scene rendering[C]// European Conference on Computer Vision. Cham: Springer, 2022: 106-122. |
[40] | XU L N, XIANGLI Y B, PENG S D, et al. Grid-guided neural radiance fields for large urban scenes[EB/OL]. [2023-01-12]. https://arxiv.org/abs/2303.14001. |
[41] | PUMAROLA A, CORONA E, PONS-MOLL G, et al. D-NeRF: neural radiance fields for dynamic scenes[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021:10318-10327. |
[42] | TRETSCHK E, TEWARI A, GOLYANIK V, et al. Non-rigid neural radiance fields: reconstruction and novel view synthesis of a dynamic scene from monocular video[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 12939-12950. |
[43] | LI Z Q, NIKLAUS S, SNAVELY N, et al. Neural scene flow fields for space-time view synthesis of dynamic scenes[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 6498-6508. |
[44] | DU Y L, ZHANG Y N, YU H X, et al. Neural radiance flow for 4D view synthesis and video processing[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14304-14314. |
[45] | GAO C, SARAF A, KOPF J, et al. Dynamic view synthesis from dynamic monocular video[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5712-5721. |
[46] | LIU Y L, GAO C, MEULEMAN A, et al. Robust dynamic radiance fields[EB/OL]. [2023-01-22]. https://arxiv.org/abs/2301.02239. |
[47] | TEED Z, DENG J. RAFT: recurrent all-pairs field transforms for optical flow[C]// European Conference on Computer Vision. Cham: Springer, 2020: 402-419. |
[48] | PARK K, SINHA U, BARRON J T, et al. Nerfies: deformable neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5845-5854. |
[49] | KULIS B, JAIN P, GRAUMAN K. Fast similarity search for learned metrics[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(12): 2143-2157. |
[50] | RAMAMOORTHI R, HANRAHAN P. An efficient representation for irradiance environment maps[C]// The 28th Annual Conference on COMPUTER GRAPHICS and Interactive Techniques. New York: ACM, 2001: 497-500. |
[51] | CHEN Z Q, FUNKHOUSER T, HEDMAN P, et al. MobileNeRF: exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16569-16578. |
[52] | WANG F, TAN S N, LI X H, et al. Mixed neural voxels for fast multi-view video synthesis[EB/OL]. [2022-12-12]. https://arxiv.org/abs/2212.00190. |
[53] | FRIDOVICH-KEIL S, MEANTI G, WARBURG F, et al. K-planes: explicit radiance fields in space, time, and appearance[EB/OL]. [2023-01-22]. https://arxiv.org/abs/2301.10241. |
[54] | CAO A, JOHNSON J. HexPlane: a fast representation for dynamic scenes[EB/OL]. [2023-01-22]. https://arxiv.org/abs/2301.09632. |
[55] | DOSOVITSKIY A, ROS G, CODEVILLA F, et al. CARLA: an open urban driving simulator[EB/OL]. [2022-12-12]. https://arxiv.org/abs/1711.03938. |
[56] | YANG Z, CHEN Y, WANG J K, et al. UniSim: a neural closed-loop sensor simulator[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 1389-1399. |
[57] | LU F, XU Y, CHEN G, et al. Urban radiance field representation with deformable neural mesh primitives[EB/OL]. [2023-01-22]. https://arxiv.org/abs/2307.10776. |