Journal of Graphics, 2024, Vol. 45, Issue (1): 230-239. DOI: 10.11996/JG.j.2095-302X.2024010230
WANG Jiang'an, HUANG Le, PANG Dawei, QIN Linzhen, LIANG Wenqian
Received: 2023-06-19
Accepted: 2023-12-04
Published: 2024-02-29
Online: 2024-02-29
First author: WANG Jiang'an (1981-), male, associate professor, Ph.D. His main research interests cover computer vision and 3D modeling. E-mail: wangjiangan@126.com
Abstract: To address the difficulty of reconstructing weakly textured regions, high resource consumption, and long reconstruction time, this paper proposes A2R2-MVSNet (adaptive aggregation recurrent recursive multi-view stereo network), a multi-stage dense point cloud reconstruction network based on adaptive aggregation recurrent recursive convolution. The method first introduces a feature extraction module based on multi-scale recurrent recursive residuals, which aggregates contextual semantic information to ease feature extraction in weakly textured or textureless regions. For cost volume regularization, a residual regularization module is proposed that, at a slight increase in memory consumption, improves the ability of the 3D CNN to extract and aggregate contextual semantics. Experimental results show that the proposed method ranks among the top on the overall DTU benchmark metric and reproduces reconstruction details more faithfully, and that it produces good depth maps and point clouds on the BlendedMVS dataset; the network was further tested for generalization on a self-collected large-scale high-resolution dataset. Thanks to the coarse-to-fine multi-stage design and the proposed modules, the network generates depth maps of high accuracy and completeness while also supporting high-resolution reconstruction for practical applications.
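The residual regularization idea described in the abstract, i.e. a 3D CNN unit with an identity shortcut applied to the cost volume, can be sketched as follows. This is an illustrative reconstruction, not the paper's released module: the class name `Residual3DBlock`, the channel width, the GroupNorm group count, and the LeakyReLU slope are all our own assumptions.

```python
import torch
import torch.nn as nn


class Residual3DBlock(nn.Module):
    """Hedged sketch of a residual regularization unit: two 3x3x3 convolutions
    with GroupNorm + LeakyReLU and an identity shortcut over the cost volume.
    Hyper-parameters are assumptions, not taken from the paper."""

    def __init__(self, ch=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(ch, ch, 3, padding=1, bias=False),
            nn.GroupNorm(4, ch),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv3d(ch, ch, 3, padding=1, bias=False),
            nn.GroupNorm(4, ch),
        )
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, volume):
        # volume: (B, C, D, H, W) cost volume; shortcut keeps the shape.
        return self.act(self.body(volume) + volume)
```

Because the shortcut is an identity, the block adds context-aggregation capacity while only the two convolutions contribute extra memory, which matches the abstract's claim of a slight memory increase.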
WANG Jiang’an, HUANG Le, PANG Dawei, QIN Linzhen, LIANG Wenqian. Dense point cloud reconstruction network based on adaptive aggregation recurrent recursion[J]. Journal of Graphics, 2024, 45(1): 230-239.
| Input size | Structure | Output size |
|---|---|---|
| H×W×3 | Conv+GN+LeakyReLU, 3×3, stride=1 | H×W×8 |
| H×W×8 | Conv+GN+LeakyReLU, 3×3, stride=1 | H×W×8 |
| H×W×8 | Conv+GN+LeakyReLU, 3×3, stride=2 | H/2×W/2×16 |
| H/2×W/2×16 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/2×W/2×16 |
| H/2×W/2×16 | Conv+GN+LeakyReLU, 3×3, stride=2 | H/4×W/4×32 |
| H/4×W/4×32 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/4×W/4×32 |
| H/4×W/4×32 | Conv+GN+LeakyReLU, 3×3, stride=2 | H/8×W/8×64 |
| H/8×W/8×64 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/8×W/8×64 |
| H/8×W/8×64 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/8×W/8×64 |
| H/4×W/4×96 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/4×W/4×32 |
| H/2×W/2×48 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/2×W/2×16 |
| H×W×24 | Conv+GN+LeakyReLU, 3×3, stride=1 | H×W×8 |
| H/4×W/4×32 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/4×W/4×16 |
| H/2×W/2×16 | Conv+GN+LeakyReLU, 3×3, stride=1 | H/2×W/2×16 |
| H×W×8 | Conv+GN+LeakyReLU, 3×3, stride=1 | H×W×16 |
Table 1 Multi-scale feature extraction network composition
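The layer list in Table 1 can be read as a small U-shaped encoder-decoder that emits 16-channel feature maps at full, half, and quarter resolution. The PyTorch sketch below is reconstructed from the table alone, not from the authors' code: the recurrent-recursive residual connections described in the text are omitted, and the class name `MultiScaleFeatureNet` and the GroupNorm group count of 4 are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_gn_lrelu(in_ch, out_ch, stride=1):
    # The 3x3 Conv + GroupNorm + LeakyReLU unit used throughout Table 1.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.GroupNorm(4, out_ch),  # group count of 4 is an assumption
        nn.LeakyReLU(0.1, inplace=True),
    )


class MultiScaleFeatureNet(nn.Module):
    """Sketch of Table 1: encoder to 1/8 scale, decoder with skip concats,
    and three 16-channel output heads at 1x, 1/2, and 1/4 resolution."""

    def __init__(self):
        super().__init__()
        self.enc0 = nn.Sequential(conv_gn_lrelu(3, 8), conv_gn_lrelu(8, 8))
        self.enc1 = nn.Sequential(conv_gn_lrelu(8, 16, stride=2), conv_gn_lrelu(16, 16))
        self.enc2 = nn.Sequential(conv_gn_lrelu(16, 32, stride=2), conv_gn_lrelu(32, 32))
        self.enc3 = nn.Sequential(conv_gn_lrelu(32, 64, stride=2),
                                  conv_gn_lrelu(64, 64), conv_gn_lrelu(64, 64))
        self.dec2 = conv_gn_lrelu(96, 32)  # up(64) ++ skip(32) -> 96 in
        self.dec1 = conv_gn_lrelu(48, 16)  # up(32) ++ skip(16) -> 48 in
        self.dec0 = conv_gn_lrelu(24, 8)   # up(16) ++ skip(8)  -> 24 in
        self.out2 = conv_gn_lrelu(32, 16)
        self.out1 = conv_gn_lrelu(16, 16)
        self.out0 = conv_gn_lrelu(8, 16)

    def forward(self, x):
        f0 = self.enc0(x)
        f1 = self.enc1(f0)
        f2 = self.enc2(f1)
        f3 = self.enc3(f2)
        up = lambda t: F.interpolate(t, scale_factor=2, mode="bilinear",
                                     align_corners=False)
        d2 = self.dec2(torch.cat([up(f3), f2], dim=1))
        d1 = self.dec1(torch.cat([up(d2), f1], dim=1))
        d0 = self.dec0(torch.cat([up(d1), f0], dim=1))
        # Features at full, 1/2, and 1/4 resolution for the coarse-to-fine stages.
        return self.out0(d0), self.out1(d1), self.out2(d2)
```

The channel arithmetic of the three decoder rows (96→32, 48→16, 24→8) only works out if each upsampled coarse feature is concatenated with the encoder skip at the same scale, which is why the sketch uses `torch.cat` before each decoder convolution.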
输入尺寸 | 结构 | 输出尺寸 |
---|---|---|
H×W×3 | Conv+GN+LeakyReLU,3×3, stride=1 | H×W×8 |
H×W×8 | Conv+GN+LeakyReLU,3×3, stride=1 | H×W×8 |
H×W×8 | Conv+GN+LeakyReLU,3×3, stride=2 | H/2×W/2×16 |
H/2×W/2×16 | Conv+GN+LeakyReLU,3×3, stride=1 | H/2×W/2×16 |
H/2×W/2×16 | Conv+GN+LeakyReLU,3×3, stride=2 | H/4×W/4×32 |
H/4×W/4×32 | Conv+GN+LeakyReLU,3×3, stride=1 | H/4×W/4×32 |
H/4×W/4×32 | Conv+GN+LeakyReLU,3×3, stride=2 | H/8×W/8×64 |
H/8×W/8×64 | Conv+GN+LeakyReLU,3×3, stride=1 | H/8×W/8×64 |
H/8×W/8×64 | Conv+GN+LeakyReLU,3×3, stride=1 | H/8×W/8×64 |
H/4×W/4×96 | Conv+GN+LeakyReLU,3×3, stride=1 | H/4×W/4×32 |
H/2×W/2×48 | Conv+GN+LeakyReLU,3×3, stride=1 | H/2×W/2×16 |
H×W×24 | Conv+GN+LeakyReLU,3×3, stride=1 | H×W×8 |
H/4×W/4×32 | Conv+GN+LeakyReLU,3×3, stride=1 | H/4×W/4×16 |
H/2×W/2×16 | Conv+GN+LeakyReLU,3×3, stride=1 | H/2×W/2×16 |
H×W×8 | Conv+GN+LeakyReLU,3×3, stride=1 | H×W×16 |
Fig. 3 Depth-map comparison of different networks on the DTU dataset ((a) R-MVSNet; (b) UCSNet; (c) Vis-MVSNet; (d) Cas-MVSNet; (e) CVP-MVSNet; (f) Ours; (g) Ground truth)
| Method | Acc | Comp | Overall |
|---|---|---|---|
| Furu | 0.613 | 0.941 | 0.777 |
| Gipuma | 0.283 | 0.873 | 0.578 |
| COLMAP | 0.400 | 0.664 | 0.532 |
| MVSNet | 0.396 | 0.527 | 0.462 |
| R-MVSNet | 0.383 | 0.452 | 0.417 |
| D2HC-RMVSNet | 0.395 | 0.378 | 0.386 |
| IterMVS | 0.373 | 0.354 | 0.363 |
| EPP-MVSNet | 0.413 | 0.296 | 0.355 |
| Cas-MVSNet | 0.325 | 0.385 | 0.355 |
| PatchmatchNet | 0.427 | 0.277 | 0.352 |
| CVP-MVSNet | 0.296 | 0.406 | 0.351 |
| MG-MVSNet | 0.358 | 0.338 | 0.348 |
| UCSNet | 0.338 | 0.349 | 0.344 |
| LANet | 0.320 | 0.349 | 0.335 |
| UniMVSNet | 0.352 | 0.278 | 0.315 |
| Ours | 0.321 | 0.346 | 0.334 |
Table 2 Comparison of DTU dataset evaluation results (mm)
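On the DTU benchmark, the Overall score is conventionally the arithmetic mean of the accuracy and completeness distances, which is how the last column of Table 2 follows from the first two (up to rounding). A quick check on a few rows (values copied from Table 2):

```python
def overall(acc_mm, comp_mm):
    """DTU 'Overall' score: arithmetic mean of accuracy and completeness (mm)."""
    return (acc_mm + comp_mm) / 2


# (Acc, Comp) pairs taken from Table 2.
rows = {
    "MVSNet": (0.396, 0.527),       # table Overall: 0.462
    "Cas-MVSNet": (0.325, 0.385),   # table Overall: 0.355
    "Ours": (0.321, 0.346),         # table Overall: 0.334
}
for name, (acc, comp) in rows.items():
    print(f"{name}: overall = {overall(acc, comp):.4f}")
```

Lower is better for all three columns, so a method can trade accuracy against completeness (e.g. PatchmatchNet vs. CVP-MVSNet) while landing at nearly the same Overall score.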
Fig. 6 Point cloud results on the self-collected dataset ((a) R-MVSNet, overall; (b) R-MVSNet, block; (c) Ours, overall; (d) Ours, block)
| Method | Params/M | GPU memory/GB | Runtime/s | Acc/mm | Comp/mm | Overall/mm |
|---|---|---|---|---|---|---|
| Baseline | 0.44 | 7.545 | 2.872 | 0.348 | 0.357 | 0.353 |
| Baseline+FPN | 0.55 | 7.548 | 2.885 | 0.345 | 0.352 | 0.349 |
| Baseline+A2R2CNN | 0.56 | 7.548 | 3.165 | 0.337 | 0.342 | 0.340 |
| Baseline+RU-Net | 0.56 | 8.339 | 2.911 | 0.327 | 0.356 | 0.342 |
| Ours | 0.67 | 8.342 | 3.210 | 0.321 | 0.346 | 0.334 |
Table 3 Network module comparison
| Method | Input resolution | Output resolution | GPU memory/GB | Runtime/s | Acc/mm | Comp/mm |
|---|---|---|---|---|---|---|
| R-MVSNet | 1536×1152 | 384×288 | 9.800 | 2.518 | 0.383 | 0.452 |
| Vis-MVSNet | 1536×1152 | 768×576 | 5.583 | 3.902 | 0.369 | 0.361 |
| CVP-MVSNet | 1536×1152 | 1536×1152 | 8.335 | 3.118 | 0.296 | 0.406 |
| UniMVSNet | 1536×1152 | 1536×1152 | 9.991 | 1.466 | 0.352 | 0.278 |
| Ours | 1536×1152 | 1536×1152 | 8.342 | 3.210 | 0.321 | 0.346 |
| Ours | 1536×1152 | 768×576 | 4.043 | 0.862 | 0.367 | 0.381 |
Table 4 Effect of output resolution on network performance
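The two "Ours" rows of Table 4 quantify the cost of full-resolution output: halving each output dimension (1536×1152 → 768×576) cuts the pixel count by 4×, and in these runs GPU memory and runtime drop by roughly 2× and 3.7× respectively. The ratios can be checked directly (all values copied from Table 4):

```python
# Pixel counts at the two output resolutions of the "Ours" rows.
full_px = 1536 * 1152
half_px = 768 * 576
print(full_px // half_px)        # pixel-count ratio -> 4

# Resource ratios for "Ours" at full vs. half output resolution.
print(round(8.342 / 4.043, 2))   # GPU-memory ratio
print(round(3.210 / 0.862, 2))   # runtime ratio
```

Memory and time grow sublinearly in the pixel count here, which is consistent with per-stage costs (cost volume construction and regularization) dominating over the per-pixel output work.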
[1] AANAES H, JENSEN R R, VOGIATZIS G, et al. Large-scale data for multiple-view stereopsis[J]. International Journal of Computer Vision, 2016, 120(2): 153-168.
[2] FURUKAWA Y, HERNÁNDEZ C. Multi-view stereo: a tutorial[J]. Foundations and Trends® in Computer Graphics and Vision, 2015, 9(1-2): 1-148.
[3] WANG S Q, ZHANG J Q, LI L Y, et al. Application of MVSNet in 3D reconstruction of space objects[J]. Chinese Journal of Lasers, 2022, 49(23): 176-185 (in Chinese).
[4] SCHÖNBERGER J L, FRAHM J M. Structure-from-motion revisited[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 4104-4113.
[5] KANG S B, SZELISKI R, CHAI J X. Handling occlusions in dense multi-view stereo[C]// The 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2003: I-103-I-110.
[6] SCHÖNBERGER J L, ZHENG E L, FRAHM J M, et al. Pixelwise view selection for unstructured multi-view stereo[C]// European Conference on Computer Vision. Cham: Springer, 2016: 501-518.
[7] LIU W J, WANG J K, QU H C. Multi-scale cost volumes information sharing based multi-view stereo reconstructed model[J]. Journal of Image and Graphics, 2022, 27(11): 3331-3342 (in Chinese).
[8] WANG J A, PANG D W, HUANG L, et al. Dense point cloud reconstruction network using multi-scale feature recursive convolution[J]. Journal of Graphics, 2022, 43(5): 875-883 (in Chinese).
[9] NIRKIN Y, WOLF L, HASSNER T. HyperSeg: patch-wise hypernetwork for real-time semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4060-4069.
[10] LUO X D, WU Y Q, CHEN J L. Research progress on deep learning methods for object detection and semantic segmentation in UAV aerial images[J/OL]. Acta Aeronautica et Astronautica Sinica, 2023: 1-33. [2023-06-12]. https://kns.cnki.net/kcms/detail/11.1929.V.20230609.1350.008.html (in Chinese).
[11] WANG Y X, HU Y F, KONG Q Q, et al. 3D point cloud semantic segmentation: state of the art and challenges[J]. Chinese Journal of Engineering, 2023, 45(10): 1653-1665 (in Chinese).
[12] HAMID M S, MANAP N A, HAMZAH R A, et al. Stereo matching algorithm based on deep learning: a survey[J]. Journal of King Saud University - Computer and Information Sciences, 2022, 34(5): 1663-1673.
[13] ZHANG X Y, GAO H B, ZHAO J H, et al. Overview of deep learning intelligent driving methods[J]. Journal of Tsinghua University: Science and Technology, 2018, 58(4): 438-444 (in Chinese).
[14] KNAPITSCH A, PARK J, ZHOU Q Y, et al. Tanks and temples: benchmarking large-scale scene reconstruction[J]. ACM Transactions on Graphics, 2017, 36(4): 78:1-78:13.
[15] ZHU Q T, MIN C, WEI Z Z, et al. Deep learning for multi-view stereo via plane sweep: a survey[EB/OL]. [2023-06-22]. http://arxiv.org/abs/2106.15328v2.
[16] XU Y B, ZHANG J B, TAN N S. Improved algorithm for line buffering based on plane sweep technique[J]. Application Research of Computers, 2012, 29(11): 4364-4366, 4389 (in Chinese).
[17] YAO Y, LUO Z X, LI S W, et al. MVSNet: depth inference for unstructured multi-view stereo[C]// European Conference on Computer Vision. Cham: Springer, 2018: 785-801.
[18] YAO Y, LUO Z X, LI S W, et al. Recurrent MVSNet for high-resolution multi-view stereo depth inference[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 5520-5529.
[19] YU Z H, GAO S H. Fast-MVSNet: sparse-to-dense multi-view stereo with learned propagation and Gauss-Newton refinement[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 1946-1955.
[20] TANG J L, XIE J L, XUE C J. TDOA-FDOA passive location algorithm using Gauss-Newton iteration[J]. Journal of Xidian University, 2023, 50(1): 19-28, 47 (in Chinese).
[21] ZHANG J Y, YAO Y, LI S W, et al. Visibility-aware multi-view stereo network[EB/OL]. [2023-06-22]. https://arxiv.org/abs/2008.07928.pdf.
[22] YAN J F, WEI Z Z, YI H W, et al. Dense hybrid recurrent multi-view stereo net with dynamic consistency checking[C]// European Conference on Computer Vision. Cham: Springer, 2020: 674-689.
[23] WEI Z Z, ZHU Q T, MIN C, et al. AA-RMVSNet: adaptive aggregation recurrent multi-view stereo network[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 6167-6176.
[24] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]// The 28th International Conference on Neural Information Processing Systems - Volume 1. New York: ACM, 2015: 802-810.
[25] GU X D, FAN Z W, ZHU S Y, et al. Cascade cost volume for high-resolution multi-view stereo and stereo matching[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2492-2501.
[26] YANG J Y, MAO W, ALVAREZ J M, et al. Cost volume pyramid based depth inference for multi-view stereo[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 4876-4885.
[27] ZHANG X D, HU Y T, WANG H C, et al. Long-range attention network for multi-view stereo[C]// 2021 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2021: 3781-3790.
[28] WANG F, GALLIANI S, VOGEL C, et al. PatchmatchNet: learned multi-view patchmatch stereo[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 14189-14198.
[29] MA X J, GONG Y, WANG Q R, et al. EPP-MVSNet: epipolar-assembling based depth prediction for multi-view stereo[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5712-5720.
[30] WANG F, GALLIANI S, VOGEL C, et al. IterMVS: iterative probability estimation for efficient multi-view stereo[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8596-8605.
[31] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL]. [2023-06-22]. https://arxiv.org/abs/1406.1078.pdf.
[32] PENG R, WANG R J, WANG Z Y, et al. Rethinking depth estimation for multi-view stereo: a unified representation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8635-8644.
[33] XI J H, SHI Y F, WANG Y J, et al. RayMVSNet: learning ray-based 1D implicit fields for accurate multi-view stereo[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8585-8595.
[34] DING Y K, YUAN W T, ZHU Q T, et al. TransMVSNet: global context-aware multi-view stereo network with transformers[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 8575-8584.
[35] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010.
[36] MI Z X, DI C, XU D. Generalized binary search network for highly-efficient multi-view stereo[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 12981-12990.
[37] YAMASHITA K, ENYO Y, NOBUHARA S, et al. nLMVS-Net: deep non-Lambertian multi-view stereo[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 3036-3045.
[38] CHIU C Y, WU Y T, SHEN I C, et al. 360MVSNet: deep multi-view stereo network with 360° images for indoor scene reconstruction[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 3056-3065.
[39] ZHANG X D, YANG F Z, CHANG M, et al. MG-MVSNet: multiple granularities feature fusion network for multi-view stereo[J]. Neurocomputing, 2023, 528: 35-47.
[40] ZHANG Y, ZHU J K, LIN L X. Multi-view stereo representation revisit: region-aware MVSNet[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 17376-17385.
[41] QIAO S Y, CHEN L C, YUILLE A. DetectoRS: detecting objects with recursive feature pyramid and switchable atrous convolution[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10208-10219.
[42] HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 2261-2269.
[43] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[44] YAN H B, XU F Q, HUANG L E, et al. Review of multi-view stereo reconstruction methods based on deep learning[J]. Optics and Precision Engineering, 2023, 31(16): 2444-2464 (in Chinese).
[45] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[M]// Lecture Notes in Computer Science. Cham: Springer International Publishing, 2015: 234-241.
[46] YANG H, CHEN R, AN S P, et al. The growth of image-related three dimensional reconstruction techniques in deep learning-driven era: a critical summary[J]. Journal of Image and Graphics, 2023, 28(8): 2396-2409 (in Chinese).
[47] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]// The 32nd International Conference on Machine Learning - Volume 37. New York: ACM, 2015: 448-456.
[48] GLOROT X, BORDES A, BENGIO Y. Deep sparse rectifier neural networks[C]// The 14th International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 2011: 315-323.
[49] WU Y X, HE K M. Group normalization[C]// European Conference on Computer Vision. Cham: Springer, 2018: 3-19.
[50] XU B, WANG N Y, CHEN T Q, et al. Empirical evaluation of rectified activations in convolutional network[EB/OL]. [2023-06-22]. https://arxiv.org/abs/1505.00853.pdf.
[51] XU B, DONG Y Q, ZHANG L, et al. A hybrid SfM method based on partition optimization[J]. Acta Geodaetica et Cartographica Sinica, 2022, 51(1): 115-126 (in Chinese).
[52] YUAN Y T, LIN C Y, ZHAO Y, et al. A post processing algorithm for upsampling depth image based on boundary correction[J]. Journal of the China Railway Society, 2015, 37(12): 67-73 (in Chinese).
[53] YAO Y, LUO Z X, LI S W, et al. BlendedMVS: a large-scale dataset for generalized multi-view stereo networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 1787-1796.
[54] GALLIANI S, LASINGER K, SCHINDLER K. Massively parallel multiview stereopsis by surface normal diffusion[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2016: 873-881.
[55] FURUKAWA Y, PONCE J. Accurate, dense, and robust multiview stereopsis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(8): 1362-1376.
[56] CHENG S, XU Z X, ZHU S L, et al. Deep stereo using adaptive thin volume representation with uncertainty awareness[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2521-2531.