Journal of Graphics ›› 2024, Vol. 45 ›› Issue (5): 1030-1039. DOI: 10.11996/JG.j.2095-302X.2024051030
Received: 2024-07-04
Revised: 2024-08-10
Published: 2024-10-31
Online: 2024-10-31
Contact: SONG Ying (1981-), associate professor, Ph.D. Her main research interests cover photorealistic graphics, intelligent graphics computing, etc. E-mail: ysong@zstu.edu.cn
First author: ZHU Jie (1998-), master student. His main research interest covers photorealistic graphics. E-mail: 1904867640@qq.com
Abstract: Free-viewpoint synthesis in uncontrolled environments is easily affected by highly variable illumination conditions, camera parameters, and other factors. To address this problem, an approximately differentiable deferred inverse rendering pipeline (ADDIRP) is proposed, which incorporates a physically based camera model into the deferred inverse rendering pipeline to accurately simulate the optical imaging process of a camera. First, a photometric camera model and a geometric camera model are created from the input images and their corresponding poses: the photometric camera model is represented by learnable parameters such as exposure and white balance, while the geometric camera model is represented by learnable intrinsic and extrinsic parameters. Second, each component of the pipeline is optimized with an image-space loss between the rendered image and the target image, making the deferred inverse rendering pipeline robust to complex, varying illumination and casually captured images. Finally, the pipeline produces 3D content reconstructions compatible with traditional graphics engines. Experimental results show that, compared with existing methods, ADDIRP achieves better performance on real-world datasets; on synthetic datasets, it delivers superior perceptual visual consistency while maintaining comparable synthesis quality.
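To make the photometric camera model concrete, the following PyTorch sketch illustrates the idea of per-image learnable exposure and white balance applied to a rendered image before the image-space loss. It is an illustrative reconstruction under stated assumptions, not the paper's implementation: the class name, the log-exposure/RGB-gain parameterization, and the L1 loss are all assumptions.

```python
import torch

class PhotometricCamera(torch.nn.Module):
    """Hypothetical per-image photometric model: learnable exposure and
    white-balance gains applied to a linear-space rendering."""
    def __init__(self, num_images: int):
        super().__init__()
        self.log_exposure = torch.nn.Parameter(torch.zeros(num_images))
        self.wb_gain = torch.nn.Parameter(torch.ones(num_images, 3))

    def forward(self, rendered: torch.Tensor, idx: int) -> torch.Tensor:
        # rendered: (H, W, 3) linear radiance from the deferred renderer.
        out = rendered * torch.exp(self.log_exposure[idx]) * self.wb_gain[idx]
        return out.clamp(min=0.0)

# Image-space loss between the camera-modulated rendering and the target;
# gradients flow to the camera parameters (and, in the full pipeline,
# to geometry, materials, and lighting as well).
camera = PhotometricCamera(num_images=100)
rendered = torch.rand(256, 256, 3, requires_grad=True)  # stand-in for renderer output
target = torch.rand(256, 256, 3)
loss = torch.nn.functional.l1_loss(camera(rendered, idx=0), target)
loss.backward()
```

A geometric camera model would analogously expose intrinsics and extrinsics as learnable tensors, so that the same image-space loss also refines the poses.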
ZHU Jie, SONG Ying. A free viewpoint synthesis method based on differentiable rendering[J]. Journal of Graphics, 2024, 45(5): 1030-1039.
Table 1 Experimental environment

| Name | Specification or model |
|---|---|
| CPU | Intel(R) Xeon(R) Gold 6226R CPU @ 2.90 GHz |
| GPU | NVIDIA GeForce RTX 3090 24 GB |
| Operating system | Ubuntu 20.04.1 LTS |
| PyTorch | 1.13.1 |
| NVDIFFRAST | 0.3.1 |
| Tiny-cuda-nn | 1.7 |
Table 2 Learning rate settings

| Name | Value |
|---|---|
| lr_pos | 3e-2 |
| lr_material | 1e-2 |
| lr_light | 1e-2 |
| lr_pose | 5e-3 |
| lr_intrinsic | 1e-2 |
| lr_exposure | 1e-4 |
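With PyTorch (the framework listed in Table 1), per-component learning rates like these are typically assigned through parameter groups in a single optimizer. The sketch below mirrors the values of Table 2; the parameter tensors are hypothetical placeholders for the pipeline's actual components, and the choice of Adam is an assumption.

```python
import torch

# Hypothetical placeholders for the pipeline's optimizable components.
pos = torch.nn.Parameter(torch.zeros(1000, 3))          # vertex positions
material = torch.nn.Parameter(torch.zeros(64, 64, 3))   # material textures
light = torch.nn.Parameter(torch.zeros(6, 16, 16, 3))   # environment lighting
pose = torch.nn.Parameter(torch.zeros(100, 6))          # camera extrinsics
intrinsic = torch.nn.Parameter(torch.zeros(4))          # camera intrinsics
exposure = torch.nn.Parameter(torch.zeros(100))         # photometric parameters

# One optimizer with the per-component learning rates of Table 2.
optimizer = torch.optim.Adam([
    {"params": [pos], "lr": 3e-2},
    {"params": [material], "lr": 1e-2},
    {"params": [light], "lr": 1e-2},
    {"params": [pose], "lr": 5e-3},
    {"params": [intrinsic], "lr": 1e-2},
    {"params": [exposure], "lr": 1e-4},
])
```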
Table 3 Quantitative comparisons with NVDIFFREC per scene

| Scene | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| GoldCape (NVDIFFREC) | 24.027 | 0.857 | 0.110 |
| GoldCape (Ours) | 23.128 | 0.824 | 0.133 |
| EthiopianHead (NVDIFFREC) | 25.738 | 0.915 | 0.109 |
| EthiopianHead (Ours) | 25.854 | 0.923 | 0.096 |
| Gnome (NVDIFFREC) | 15.699 | 0.783 | 0.217 |
| Gnome (Ours) | 24.185 | 0.863 | 0.143 |
| Statue (NVDIFFREC) | 18.464 | 0.820 | 0.187 |
| Statue (Ours) | 20.746 | 0.845 | 0.162 |
| MotherChild (NVDIFFREC) | 17.369 | 0.914 | 0.121 |
| MotherChild (Ours) | 27.665 | 0.954 | 0.061 |
Table 4 Quantitative comparisons of real-world scenes

| Method | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| NeRD | 22.508 | 0.829 | 0.159 |
| NeROIC | 25.776 | 0.892 | 0.132 |
| NVDIFFREC | 20.259 | 0.858 | 0.149 |
| Ours | 24.316 | 0.882 | 0.119 |
Table 5 Quantitative comparisons of synthetic scenes

| Method | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| NeRD | 25.573 | 0.895 | 0.116 |
| NVDIFFREC | 26.046 | 0.936 | 0.083 |
| Ours | 25.580 | 0.926 | 0.103 |
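For reference, the PSNR, SSIM, and LPIPS scores reported in Tables 3-5 can be computed with standard packages. The snippet below is a conventional evaluation sketch using scikit-image and the lpips package, not the paper's evaluation code; the function name and preprocessing are assumptions.

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net="alex")  # perceptual metric network

def evaluate(pred: np.ndarray, gt: np.ndarray):
    """pred, gt: (H, W, 3) float arrays in [0, 1]. Returns (PSNR, SSIM, LPIPS)."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=1.0)
    # LPIPS expects NCHW tensors scaled to [-1, 1].
    to_t = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None].float() * 2 - 1
    lp = lpips_fn(to_t(pred), to_t(gt)).item()
    return psnr, ssim, lp
```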
[1] MUNKBERG J, CHEN W Z, HASSELGREN J, et al. Extracting triangular 3D models, materials, and lighting from images[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8270-8280.
[2] HORRY Y, ANJYO K I, ARAI K. Tour into the picture: using a spidery mesh interface to make animation from a single image[C]// The 24th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1997: 225-232.
[3] OH B M, CHEN M, DORSEY J, et al. Image-based modeling and photo editing[C]// The 28th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 2001: 433-442.
[4] ZHANG L, DUGAS-PHOCION G, SAMSON J S, et al. Single-view modelling of free-form scenes[J]. The Journal of Visualization and Computer Animation, 2002, 13(4): 225-235.
[5] KHOLGADE N, SIMON T, EFROS A, et al. 3D object manipulation in a single photograph using stock 3D models[J]. ACM Transactions on Graphics (TOG), 2014, 33(4): 127.
[6] MCMILLAN L. An image-based approach to three-dimensional computer graphics[M]. Chapel Hill: University of North Carolina at Chapel Hill, 1997: 30-59.
[7] SUTHERLAND I E, SPROULL R F, SCHUMACKER R A. A characterization of ten hidden-surface algorithms[J]. ACM Computing Surveys (CSUR), 1974, 6(1): 1-55.
[8] LEE P J, EFFENDI. Nongeometric distortion smoothing approach for depth map preprocessing[J]. IEEE Transactions on Multimedia, 2011, 13(2): 246-254.
[9] CHEN S E, WILLIAMS L. View interpolation for image synthesis[C]// The 20th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1993: 279-288.
[10] YANG J Z, LIU Z K, YU N H, et al. An image warping method based on control points and its applications[J]. Journal of Image and Graphics, 2001, 6A(11): 1070-1074 (in Chinese).
[11] CHEN S E. QuickTime VR: an image-based approach to virtual environment navigation[C]// The 22nd Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1995: 29-38.
[12] SHUM H Y, HE L W. Rendering with concentric mosaics[C]// The 26th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 1999: 299-306.
[13] GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// The 27th International Conference on Neural Information Processing Systems. New York: ACM, 2014: 2672-2680.
[14] CHENG H, WANG S, LI M, et al. A review of neural radiance field for autonomous driving scene[J]. Journal of Graphics, 2023, 44(6): 1091-1103 (in Chinese).
[15] WANG Z R, CHANG Y, LU P, et al. A review on neural radiance fields acceleration[J]. Journal of Graphics, 2024, 45(1): 1-13 (in Chinese).
[16] CHOI J, JUNG D, LEE T, et al. TMO: textured mesh acquisition of objects with a mobile device by using differentiable rendering[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 16674-16684.
[17] WU T, WANG J Q, PAN X G, et al. Voxurf: voxel-based efficient and accurate neural surface reconstruction[EB/OL]. (2023-08-13) [2024-06-06]. https://dblp.uni-trier.de/db/conf/iclr/iclr2023.html#WuWPXTLL23.
[18] XU Q G, XU Z X, PHILIP J, et al. Point-NeRF: point-based neural radiance fields[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 5428-5438.
[19] HU T, XU X G, LIU S, et al. Point2Pix: photo-realistic point cloud rendering via neural radiance fields[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 8349-8358.
[20] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: representing scenes as neural radiance fields for view synthesis[J]. Communications of the ACM, 2021, 65(1): 99-106.
[21] RÜCKERT D, FRANKE L, STAMMINGER M. ADOP: approximate differentiable one-pixel point rendering[J]. ACM Transactions on Graphics (TOG), 2022, 41(4): 99.
[22] MESCHEDER L, OECHSLE M, NIEMEYER M, et al. Occupancy networks: learning 3D reconstruction in function space[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 4455-4465.
[23] FENG Q, LIU Y B, LAI Y K, et al. FOF: learning Fourier occupancy field for monocular real-time human reconstruction[C]// The 36th International Conference on Neural Information Processing Systems. New York: ACM, 2022: 537.
[24] JIANG H C, XU Y M, ZENG Y H, et al. OpenOcc: open vocabulary 3D scene reconstruction via occupancy representation[EB/OL]. (2024-05-18) [2024-06-06]. https://arxiv.org/abs/2403.11796.
[25] SHIM J, KANG C, JOO K. Diffusion-based signed distance fields for 3D shape generation[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 20887-20897.
[26] YENAMANDRA T, TEWARI A, YANG N, et al. FIRe: fast inverse rendering using directional and signed distance functions[C]// IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2024: 3065-3075.
[27] LIU W X, WU Y W, RUAN S P, et al. Marching-Primitives: shape abstraction from signed distance function[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 8771-8780.
[28] SHEN T C, GAO J, YIN K X, et al. Deep marching tetrahedra: a hybrid representation for high-resolution 3D shape synthesis[C]// The 35th International Conference on Neural Information Processing Systems. New York: ACM, 2021: 466.
[29] LAINE S, HELLSTEN J, KARRAS T, et al. Modular primitives for high-performance differentiable rendering[J]. ACM Transactions on Graphics (TOG), 2020, 39(6): 194.
[30] ENGEL J, KOLTUN V, CREMERS D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611-625.
[31] RÜCKERT D, STAMMINGER M. Snake-SLAM: efficient global visual inertial SLAM using decoupled nonlinear optimization[C]// 2021 International Conference on Unmanned Aircraft Systems. New York: IEEE Press, 2021: 219-228.
[32] BOSS M, BRAUN R, JAMPANI V, et al. NeRD: neural reflectance decomposition from image collections[C]// IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 12664-12674.
[33] KUANG Z F, OLSZEWSKI K, CHAI M L, et al. NeROIC: neural rendering of objects from online image collections[J]. ACM Transactions on Graphics (TOG), 2022, 41(4): 56.