Journal of Graphics ›› 2024, Vol. 45 ›› Issue (5): 1008-1016. DOI: 10.11996/JG.j.2095-302X.2024051008

• Computer Graphics and Virtual Reality •

Multi-view stereo network reconstruction with feature alignment and context-guided

XIONG Chao1, WANG Yunyan1,2, LUO Yuhao1

  1. College of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
  2. Xiangyang Industrial Institute of Hubei University of Technology, Xiangyang, Hubei 441100, China
  • Received: 2024-04-09 Revised: 2024-07-22 Published: 2024-10-31 Online: 2024-10-31
  • Contact: WANG Yunyan (1981-), associate professor, Ph.D. Her main research interests are image processing and remote sensing interpretation. E-mail: helen9224@126.com
  • First author: XIONG Chao (1998-), master's student. His main research interests are image processing and 3D reconstruction. E-mail: 102210294@hbut.edu.cn
  • Supported by:
    National Natural Science Foundation of China (41601394)

Abstract:

To address the poor reconstruction of fine features and edge regions in 3D reconstruction, a multi-view 3D reconstruction network based on feature alignment and context guidance, AGA-MVSNet (alignment and context guidance MVSNet), was proposed. First, a feature alignment (FA) module and a feature selection (FS) module were constructed so that features from different levels of the feature pyramid were aligned before being fused, which improved feature extraction for small objects and edge regions. Second, a context guidance module was incorporated into cost volume regularization. With only a slight increase in memory consumption, this module made full use of surrounding information and strengthened the correlation between cost volumes, thereby improving the accuracy and completeness of the 3D reconstruction. Finally, experiments on the DTU dataset showed that, compared with the baseline network CasMVSNet, the proposed method improved accuracy by 2.2% and overall reconstruction quality by 2.5%. In addition, the method compared favorably with several existing methods on the Tanks and Temples dataset, and it also generated good point clouds on the BlendedMVS dataset.

Key words: deep learning, multi-view 3D reconstruction, feature alignment, context guidance, 3D attention mechanism
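
The abstract does not describe how the FA and FS modules are implemented, so the following is only a rough NumPy sketch of the general "select, align, then fuse" idea for two adjacent feature pyramid levels. Everything in it is an assumption made for illustration rather than the authors' design: the function names, the sigmoid channel gate used for selection, and the single integer shift that stands in for whatever learned alignment the paper uses.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def select(feat, weights):
    """Hypothetical feature selection: gate each channel with a sigmoid
    computed from the globally average-pooled features."""
    pooled = feat.mean(axis=(1, 2))                    # shape (C,)
    gate = 1.0 / (1.0 + np.exp(-(weights @ pooled)))   # shape (C,)
    return feat * gate[:, None, None]

def align(coarse_up, offset):
    """Hypothetical alignment: shift the upsampled coarse map by an integer
    (dy, dx) offset. np.roll wraps around at the borders; a learned,
    per-pixel offset field would replace this in a real network."""
    return np.roll(coarse_up, shift=offset, axis=(1, 2))

def fuse(fine, coarse, weights, offset):
    """Fuse a fine level with the aligned, upsampled coarse level."""
    return select(fine, weights) + align(upsample2x(coarse), offset)

# Illustrative shapes only: 4 channels, 8x8 fine level, 4x4 coarse level.
rng = np.random.default_rng(0)
fine = rng.standard_normal((4, 8, 8))
coarse = rng.standard_normal((4, 4, 4))
fused = fuse(fine, coarse, weights=np.zeros((4, 4)), offset=(1, 0))
print(fused.shape)
```

The point of the ordering is that the coarse features are moved into registration with the fine ones before the addition, so that the sum does not blur small structures and edges across misaligned positions.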

CLC number: