Journal of Graphics ›› 2024, Vol. 45 ›› Issue (5): 1008-1016. DOI: 10.11996/JG.j.2095-302X.2024051008

• Computer Graphics and Virtual Reality •

Multi-view stereo network reconstruction with feature alignment and context-guided

XIONG Chao¹, WANG Yunyan¹,², LUO Yuhao¹

  1. College of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
  2. Xiangyang Industrial Institute of Hubei University of Technology, Xiangyang, Hubei 441100, China
  • Received: 2024-04-09 Revised: 2024-07-22 Online: 2024-10-31 Published: 2024-10-31
  • Contact: WANG Yunyan
  • About the first author:

    XIONG Chao (1998-), master's student. His main research interests include image processing and 3D reconstruction. E-mail: 102210294@hbut.edu.cn

  • Supported by:
    National Natural Science Foundation of China (41601394)

Abstract:

To address the poor reconstruction of small features and edge regions in 3D reconstruction, a multi-view 3D reconstruction network based on feature alignment and context guidance, named AGA-MVSNet (alignment and context guidance MVSNet), was proposed. First, a feature alignment (FA) module and a feature selection (FS) module were constructed to combine different levels of the feature pyramid: features were aligned before being fused, which strengthened feature extraction for small objects and edge regions. Next, a context guidance module was incorporated into cost volume regularization to make full use of surrounding information and to address the weak correlation between cost volumes, improving the accuracy and completeness of 3D reconstruction with only a slight increase in memory consumption. Finally, experiments were conducted on the DTU dataset. The results showed that, compared with the baseline network CasMVSNet, the proposed method improved accuracy by 2.2% and overall reconstruction quality by 2.5%. The method also performed well against several existing methods on the Tanks and Temples dataset and produced good point clouds on the BlendedMVS dataset.

Key words: deep learning, multi-view 3D reconstruction, feature alignment, context guidance, 3D attention mechanism
