
Journal of Graphics (图学学报) ›› 2024, Vol. 45 ›› Issue (1): 230-239. DOI: 10.11996/JG.j.2095-302X.2024010230

• Computer Graphics and Virtual Reality •


Dense point cloud reconstruction network based on adaptive aggregation recurrent recursion

WANG Jiang’an, HUANG Le, PANG Dawei, QIN Linzhen, LIANG Wenqian

  1. School of Information Engineering, Chang’an University, Xi’an, Shaanxi 710064, China
  • Received: 2023-06-19  Accepted: 2023-12-04  Published: 2024-02-29  Online: 2024-02-29
  • First author: WANG Jiang’an (1981-), associate professor, Ph.D. His main research interests include computer vision and 3D modeling. E-mail: wangjiangan@126.com
  • Supported by:
    National Natural Science Foundation of China (61771075); Natural Science Foundation of Shaanxi Province (2017JQ6048); Teaching Reform Research Project of Colleges and Universities in Jiangxi Province (JXJG-22-24-6)


Associate Professor WANG Jiang’an of Chang’an University proposed a multi-stage dense point cloud reconstruction network based on adaptive aggregation recurrent recursive convolution. Its multi-scale recurrent feature extraction module extracts deep, rich semantic information, addressing the difficulty and poor quality of feature extraction in weakly textured and reflective regions. The residual regularization module strengthens context aggregation and noise resistance. The proposed network outperforms most algorithms in overall performance, occupies less GPU memory, and generalizes well.


Abstract:

To address problems such as the difficulty of reconstructing weakly textured surfaces, high resource consumption, and long reconstruction times, a multi-stage dense point cloud reconstruction network based on adaptive aggregation recurrent recursive convolution was proposed, namely A2R2-MVSNet (adaptive aggregation recurrent recursive multi-view stereo net). The method first introduced a feature extraction module based on multi-scale recurrent recursive residuals to aggregate contextual semantic information, addressing the difficulty of feature extraction in weakly textured or textureless regions. For cost volume regularization, a residual regularization module was proposed, which enhanced the ability of the 3D CNN to extract and aggregate contextual semantics at the cost of only a slight increase in memory consumption. Experimental results demonstrated that the proposed method ranked near the top in comprehensive metrics on the DTU dataset and reconstructed fine details more faithfully; it also generated good depth maps and point cloud results on the BlendedMVS dataset. Furthermore, the network's generalization was tested on a self-collected large-scale high-resolution dataset. Thanks to the coarse-to-fine multi-stage design and the proposed modules, the network generates depth maps with high accuracy and completeness while also supporting high-resolution reconstruction for practical applications.
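The coarse-to-fine multi-stage idea the abstract credits for enabling high-resolution reconstruction can be illustrated with a minimal, self-contained sketch: each stage samples depth hypotheses around the current estimate, picks the best one, and shrinks the search interval for the next stage. All function names and parameters below are invented for illustration and are not from the paper; a real MVS network would select the best hypothesis via the regularized matching cost, whereas this sketch uses the true depth only to make the interval contraction visible.

```python
def depth_hypotheses(center, search_range, num_samples):
    """Uniformly sample depth hypotheses across an interval around `center`."""
    step = search_range / (num_samples - 1)
    start = center - search_range / 2
    return [start + i * step for i in range(num_samples)]

def coarse_to_fine(true_depth, d_min, d_max, stages=3, num_samples=8, shrink=4.0):
    """Multi-stage refinement: each stage re-centers on the best hypothesis
    and shrinks the search range by `shrink`, mimicking coarse-to-fine
    depth estimation. In an actual network the "best" hypothesis comes from
    the regularized cost volume, not from the ground truth used here."""
    center = (d_min + d_max) / 2
    search = d_max - d_min
    for _ in range(stages):
        hyps = depth_hypotheses(center, search, num_samples)
        center = min(hyps, key=lambda d: abs(d - true_depth))
        search /= shrink
    return center
```

Because the interval contracts geometrically, later stages achieve fine depth resolution with the same small number of hypotheses per pixel, which is what keeps memory usage low enough to process high-resolution images.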

Key words: deep learning, computer vision, 3D reconstruction, dense reconstruction, multi-view stereo, recurrent neural network

CLC number: