Journal of Graphics, 2024, Vol. 45, Issue (5): 1008-1016. DOI: 10.11996/JG.j.2095-302X.2024051008
XIONG Chao1, WANG Yunyan1,2, LUO Yuhao1
Received: 2024-04-09
Revised: 2024-07-22
Published: 2024-10-31
Online: 2024-10-31
Contact: WANG Yunyan (1981-), associate professor, Ph.D. Her main research interests cover image processing and remote sensing interpretation. E-mail: helen9224@126.com
First author: XIONG Chao (1998-), master's student. His main research interests cover image processing and 3D reconstruction. E-mail: 102210294@hbut.edu.cn
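Abstract:
To address the poor reconstruction of fine features and edge regions in 3D reconstruction, a multi-view 3D reconstruction network based on feature alignment and context guidance, named AGA-MVSNet, is proposed. First, a feature alignment (FA) module and a feature selection (FS) module are constructed; they align features from different levels of the feature pyramid before fusing them, which improves feature extraction for small objects and edge regions. Then, a context-guided module is added to the cost volume regularization; at the cost of a slight increase in runtime memory, it makes full use of surrounding information, strengthens the correlation between cost volumes, and improves the accuracy and completeness of the reconstruction. Finally, experiments on the DTU dataset show that, compared with the baseline network CasMVSNet, the method improves accuracy by 2.2% and overall reconstruction quality by 2.5%. The method also performs well on the Tanks and Temples dataset relative to several known methods, and it produces good point clouds on the BlendedMVS dataset.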
XIONG Chao, WANG Yunyan, LUO Yuhao. Multi-view stereo network reconstruction with feature alignment and context-guided[J]. Journal of Graphics, 2024, 45(5): 1008-1016.
Method | Acc | Comp | Overall |
---|---|---|---|
Gipuma | 0.2831 | 0.8732 | 0.578 |
MVSNet | 0.4562 | 0.6462 | 0.551 |
R-MVSNet | 0.3874 | 0.4435 | 0.415 |
P-MVSNet | 0.3962 | 0.4145 | 0.405 |
CVP-MVSNet | 0.3345 | 0.4075 | 0.371 |
CasMVSNet | 0.3832 | 0.3788 | 0.381 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 1 Quantitative evaluation results on the DTU dataset (mm)
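The Overall column in Table 1 is consistent with the mean of Acc and Comp, which is the usual DTU convention. The sketch below recomputes it for CasMVSNet and the proposed method and reports the reduction in both absolute and relative terms. The helper names are illustrative and not taken from the paper. Read this way, the absolute reductions (0.022 mm in Acc, 0.025 mm in Overall) appear to correspond to the 2.2% and 2.5% figures quoted in the abstract, while the relative reductions come out at roughly 5.7% and 6.6%.

```python
# Acc and Comp errors (mm) copied from Table 1; names are illustrative.
TABLE1 = {
    "CasMVSNet": (0.3832, 0.3788),
    "Ours": (0.3612, 0.3508),
}


def overall(acc, comp):
    """Mean of the accuracy and completeness errors, in mm."""
    return (acc + comp) / 2


def reduction(baseline, value):
    """Return the (absolute, relative) reduction of an error metric."""
    diff = baseline - value
    return diff, diff / baseline


base_acc, base_comp = TABLE1["CasMVSNet"]
our_acc, our_comp = TABLE1["Ours"]

acc_abs, acc_rel = reduction(base_acc, our_acc)
ov_abs, ov_rel = reduction(overall(base_acc, base_comp),
                           overall(our_acc, our_comp))

print(f"Acc:     -{acc_abs:.3f} mm ({acc_rel:.1%} relative)")
print(f"Overall: -{ov_abs:.3f} mm ({ov_rel:.1%} relative)")
```

Lower is better for all three columns, so a positive reduction means an improvement over the baseline.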
Method | Memory consumption/MB | Runtime/s | Overall error/mm |
---|---|---|---|
CasMVSNet | 6208 | 0.784 | 0.381 |
Ours | 7339 | 0.927 | 0.356 |
Table 2 Training cost analysis
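To put the cost figures in Table 2 in proportion, the following sketch computes the relative change of each column with respect to CasMVSNet. The dictionary keys and the helper name are assumptions made for illustration. With these numbers, memory and runtime each rise by roughly 18%, while the overall error falls by roughly 6.6%.

```python
# Values copied from Table 2; key names are illustrative.
baseline = {"memory_mb": 6208, "runtime_s": 0.784, "overall_mm": 0.381}
ours = {"memory_mb": 7339, "runtime_s": 0.927, "overall_mm": 0.356}


def relative_change(old, new):
    """Signed relative change from old to new (positive means an increase)."""
    return (new - old) / old


changes = {key: relative_change(baseline[key], ours[key]) for key in baseline}

for key, change in changes.items():
    print(f"{key}: {change:+.1%}")
```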
Method | Acc | Comp | Overall |
---|---|---|---|
NeuS | 1.2027 | 0.8743 | 1.038 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 3 Quantitative comparison with NeuS (mm)
Methods | Fam | Fra | Hor | Lig | M60 | Pan | Pla | Tra | Mean |
---|---|---|---|---|---|---|---|---|---|
MVSNet | 55.89 | 28.65 | 25.14 | 51.77 | 54.01 | 50.55 | 47.85 | 35.21 | 43.62 |
P-MVSNet | 69.95 | 44.54 | 40.16 | 65.33 | 54.98 | 55.21 | 60.33 | 54.32 | 55.60 |
CVP-MVSNet | 74.15 | 47.68 | 36.44 | 54.18 | 57.33 | 53.28 | 57.44 | 46.21 | 53.33 |
CasMVSNet | 73.33 | 54.45 | 43.17 | 52.43 | 53.16 | 51.07 | 54.25 | 43.36 | 53.15 |
Ours | 76.33 | 58.21 | 46.32 | 55.23 | 55.96 | 54.18 | 58.27 | 46.47 | 56.37 |
Table 4 Quantitative results on the Tanks and Temples dataset
Method | Acc | Comp | Overall |
---|---|---|---|
CasMVSNet | 0.3832 | 0.3788 | 0.381 |
CasMVSNet+FA&FS | 0.3732 | 0.3695 | 0.371 |
CasMVSNet+Guide | 0.3654 | 0.3667 | 0.366 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 5 Comparison of ablation evaluation metrics (mm)
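The ablation in Table 5 can be read as the reduction in overall error that each variant achieves over the CasMVSNet baseline. The sketch below computes those reductions from the Overall column. The variable names are illustrative, and rounding to three decimals mirrors the precision of the table. In these figures the gains of the two single-module variants happen to add up to the gain of the full model.

```python
# Overall error (mm) copied from Table 5; keys are the table's row labels.
ABLATION = {
    "CasMVSNet": 0.381,
    "CasMVSNet+FA&FS": 0.371,
    "CasMVSNet+Guide": 0.366,
    "Ours": 0.356,
}

baseline = ABLATION["CasMVSNet"]

# Reduction in overall error relative to the baseline, at table precision.
gains = {
    name: round(baseline - err, 3)
    for name, err in ABLATION.items()
    if name != "CasMVSNet"
}

for name, gain in gains.items():
    print(f"{name}: -{gain:.3f} mm")

# Sum of the two single-module gains, for comparison with the full model.
single_sum = round(gains["CasMVSNet+FA&FS"] + gains["CasMVSNet+Guide"], 3)
print("single-module gains sum to full gain:", single_sum == gains["Ours"])
```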