Journal of Graphics ›› 2023, Vol. 44 ›› Issue (6): 1140-1148. DOI: 10.11996/JG.j.2095-302X.2023061140
FAN Teng, YANG Hao, YIN Wen, ZHOU Dong-ming
Received:
2023-06-27
Accepted:
2023-09-12
Online:
2023-12-31
Published:
2023-12-17
Contact:
ZHOU Dong-ming (1963-), professor, Ph.D. His main research interests cover image processing based on deep learning, biological information processing based on machine learning, and computer vision.
About author:
FAN Teng (1995-), master student. His main research interests cover computer graphics and image processing based on deep learning. E-mail: fanteng@mail.ynu.edu.cn
Abstract:
To address the blurring and aliasing that the neural radiance field (NeRF) produces in multi-scale view synthesis, this paper proposes a multi-scale neural radiance field (MS-NeRF) that fuses view features and viewpoint features at different scales as priors to improve the quality of synthesized target views. First, for target views at different scales, a multi-level wavelet convolutional neural network extracts target-view features, which serve as priors to supervise the network's synthesis of views of the target scene. Second, the sampling area of the rays cast from the viewpoint camera onto each pixel of the target view is enlarged, avoiding the blur and aliasing caused by sampling only a single ray per pixel. Finally, view features and viewpoint features at different scales are added during training to improve the network's generalization across scales, and a deep neural network with a progressive structure fits the mapping from view and viewpoint features to the target view. Experimental results show that, compared with related methods, MS-NeRF reduces training cost and improves the visual quality of synthesized target views.
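The abstract's second step, enlarging the per-pixel sampling area instead of casting a single center ray, can be illustrated with a short sketch. This is not the paper's released code; it is a minimal area-sampling illustration in NumPy, and the function and parameter names (`pixel_rays`, `render_pixel`, `n_samples`) are assumptions for exposition.

```python
import numpy as np

def pixel_rays(cam_origin, pixel_center, pixel_size, n_samples=4, rng=None):
    """Cast several jittered rays across one pixel's footprint.

    Instead of a single ray through the pixel center, ray targets are
    spread uniformly over the pixel's square footprint; averaging the
    colors rendered along these rays approximates area sampling and
    suppresses aliasing. Illustrative sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random (x, y) offsets uniformly covering the pixel footprint.
    offsets = (rng.random((n_samples, 2)) - 0.5) * pixel_size
    targets = pixel_center + np.concatenate(
        [offsets, np.zeros((n_samples, 1))], axis=1)
    dirs = targets - cam_origin
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit directions
    return dirs

def render_pixel(ray_colors):
    """Average the per-ray colors into the final pixel value."""
    return np.mean(ray_colors, axis=0)
```

Averaging many jittered rays trades extra render cost per pixel for smoother results at distant scales, which is the trade-off the multi-scale setting targets.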
FAN Teng, YANG Hao, YIN Wen, ZHOU Dong-ming. Multi-scale view synthesis based on neural radiance field[J]. Journal of Graphics, 2023, 44(6): 1140-1148.
Fig. 4 Image quality comparisons between MipNeRF[13], BungeeNeRF[14] and MS-NeRF ((a)~(c) close-up view; (d)~(e) remote view)
| Method (Transamerica) | Stage Ⅰ PSNR↑ | Stage Ⅱ PSNR↑ | Stage Ⅲ PSNR↑ | Stage Ⅳ PSNR↑ | Avg PSNR↑ | Avg SSIM↑ | Avg LPIPS↓ |
|---|---|---|---|---|---|---|---|
| NeRF[5] | 22.71 | 22.81 | 22.97 | 21.58 | 22.64 | 0.69 | 0.59 |
| MipNeRF[13] | 23.25 | 23.37 | 22.70 | 21.56 | 20.22 | 0.46 | 0.60 |
| BungeeNeRF[14] | 23.36 | 23.37 | 23.11 | 23.57 | 22.61 | 0.67 | 0.46 |
| Ours | 24.19 | 23.99 | 24.18 | 24.75 | 23.53 | 0.74 | 0.39 |
Table 1 Evaluation metrics comparisons between NeRF[5], MipNeRF[13], BungeeNeRF[14] and MS-NeRF
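Table 1 reports PSNR, SSIM, and LPIPS. PSNR is a simple closed-form metric and can be computed directly from the mean squared error, as sketched below; SSIM and LPIPS require structural and learned-feature comparisons and are typically taken from libraries rather than written by hand. The helper name `psnr` here is illustrative, not from the paper's code.

```python
import numpy as np

def psnr(reference, rendered, max_val=255.0):
    """Peak signal-to-noise ratio in dB (higher is better).

    PSNR = 10 * log10(MAX^2 / MSE); identical images give +inf.
    """
    mse = np.mean((reference.astype(np.float64)
                   - rendered.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform one-level error on 8-bit images yields MSE = 1 and hence PSNR = 20·log10(255) ≈ 48.13 dB, which puts the ~22-25 dB scores in Table 1 in perspective.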
| Network | Avg PSNR↑ | Avg SSIM↑ | Avg LPIPS↓ |
|---|---|---|---|
| NeRF[5] | 28.99 | 0.86 | 0.18 |
| MipNeRF[13] | 28.26 | 0.80 | 0.20 |
| BungeeNeRF[14] | 28.78 | 0.84 | 0.18 |
| MS-NeRF | 29.05 | 0.84 | 0.18 |
Table 2 Evaluation metrics on the Blender Synthetic Ship dataset
| Network | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| NeRF[5] | 11.79 | 0.58 | 0.39 |
| NeRF (view features/residual blocks) | 19.84 | 0.81 | 0.21 |
| BungeeNeRF[14] | 22.61 | 0.66 | 0.45 |
| BungeeNeRF (view features) | 23.54 | 0.74 | 0.39 |
Table 3 Evaluation metrics with view features and residual blocks added
| Number of residual blocks | Stage Ⅰ PSNR↑ | Stage Ⅱ PSNR↑ | Stage Ⅲ PSNR↑ | Stage Ⅳ PSNR↑ |
|---|---|---|---|---|
| 2 | 22.63 | 23.49 | 23.75 | 24.12 |
| 3 | 23.09 | 24.00 | 24.03 | 24.18 |
| 4 | 23.54 | 24.19 | 24.19 | 24.75 |
Table 4 Evaluation metrics for different numbers of residual blocks (Transamerica)
[1] | SHUM H Y, HE L W. Rendering with concentric mosaics[C]// The 26th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1999: 299-306. |
[2] | DEBEVEC P, DOWNING G, BOLAS M, et al. Spherical light field environment capture for virtual reality using a motorized pan/tilt head and offset camera[EB/OL]. (2021-01-20) [2023-01-08]. http://dx.doi.org/10.1145/2787626.2787648. |
[3] | SZELISKI R, SHUM H Y. Creating full view panoramic image mosaics and environment maps[C]// The 24th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1997: 251-258. |
[4] | CHANG Y, GAI M. A review on neural radiance fields based view synthesis[J]. Journal of Graphics, 2021, 42(3): 376-384 (in Chinese). |
[5] | MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[C]// European Conference on Computer Vision. Cham: Springer, 2020: 405-421. |
[6] | MÜLLER T, EVANS A, SCHIED C, et al. Instant neural graphics primitives with a multiresolution hash encoding[J]. ACM Transactions on Graphics, 2022, 41(4): 1-15. |
[7] | REISER C, PENG S Y, LIAO Y Y, et al. KiloNeRF: speeding up neural radiance fields with thousands of tiny MLPs[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14315-14325. |
[8] | TOLSTIKHIN I, HOULSBY N, KOLESNIKOV A, et al. MLP-mixer: an all-MLP architecture for vision[EB/OL]. [2023-01-08]. https://arxiv.org/abs/2105.01601.pdf. |
[9] | GARBIN S J, KOWALSKI M, JOHNSON M, et al. FastNeRF: high-fidelity neural rendering at 200FPS[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14326-14335. |
[10] | LIU L J, GU J T, LIN K Z, et al. Neural sparse voxel fields[EB/OL]. [2023-01-08]. https://arxiv.org/abs/2007.11571. |
[11] | YU A, LI R L, TANCIK M, et al. PlenOctrees for real-time rendering of neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5732-5741. |
[12] | FRIDOVICH-KEIL S, YU A, TANCIK M, et al. Plenoxels: radiance fields without neural networks[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5491-5500. |
[13] | BARRON J T, MILDENHALL B, TANCIK M, et al. Mip-NeRF: a multiscale representation for anti-aliasing neural radiance fields[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 5835-5844. |
[14] | XIANGLI Y B, XU L N, PAN X G, et al. BungeeNeRF: progressive neural radiance field for extreme multi-scale scene rendering[C]// European Conference on Computer Vision. Cham: Springer, 2022: 106-122. |
[15] | YU A, YE V, TANCIK M, et al. pixelNeRF: neural radiance fields from one or few images[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4576-4585. |
[16] | GUARNERA D, GUARNERA G C, GHOSH A, et al. BRDF representation and acquisition[J]. Computer Graphics Forum, 2016, 35(2): 625-650. |
[17] | ASMAIL C. Bidirectional scattering distribution function (BSDF): a systematized bibliography[J]. Journal of Research of the National Institute of Standards and Technology, 1991, 96(2): 215-223. |
[18] | RIBARDIÈRE M, BRINGIER B, SIMONOT L, et al. Microfacet BSDFs generated from NDFs and explicit microgeometry[J]. ACM Transactions on Graphics, 2019, 38(5): 143:1-143:15. |
[19] | WANG Q Q, WANG Z C, GENOVA K, et al. IBRNet: learning multi-view image-based rendering[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4688-4697. |
[20] | CHEN A P, XU Z X, ZHAO F Q, et al. MVSNeRF: fast generalizable radiance field reconstruction from multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14104-14113. |
[21] | YAO Y, LUO Z X, LI S W, et al. MVSNet: depth inference for unstructured multi-view stereo[C]// European Conference on Computer Vision. Cham: Springer, 2018: 767-783. |
[22] | XU D J, JIANG Y F, WANG P H, et al. SinNeRF: training neural radiance fields on complex scenes from a single image[EB/OL]. [2023-01-13]. https://arxiv.org/abs/2204.00928.pdf. |
[23] | HUANG B C, YI H W, HUANG C, et al. M3VSNET: unsupervised multi-metric multi-view stereo network[C]// 2021 IEEE International Conference on Image Processing. New York: IEEE Press, 2021: 3163-3167. |
[24] | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. (2020-10-22) [2023-01-08]. https://arxiv.org/abs/2010.11929.pdf. |
[25] | XU L N, XIANGLI Y B, PENG S D, et al. Grid-guided neural radiance fields for large urban scenes[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 8296-8306. |
[26] | BARRON J T, MILDENHALL B, VERBIN D, et al. Zip-NeRF: anti-aliased grid-based neural radiance fields[EB/OL]. (2023-04-13) [2023-05-08]. https://arxiv.org/abs/2304.06706.pdf. |
[27] | LIU P J, ZHANG H Z, LIAN W, et al. Multi-level wavelet convolutional neural networks[J]. IEEE Access, 2019, 7: 74973-74985. |
[28] | SRIVASTAVA N, HINTON G E, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15: 1929-1958. |
[29] | KINGMA D P, BA J. Adam: a method for stochastic optimization[EB/OL]. [2023-01-13]. https://arxiv.org/pdf/1412.6980.pdf. |
[30] | HORÉ A, ZIOU D. Image quality metrics:PSNR vs. SSIM[C]// 2010 20th International Conference on Pattern Recognition. New York: IEEE Press, 2010: 2366-2369. |
[31] | WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612. |
[32] | ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 586-595. |