Journal of Graphics, 2024, Vol. 45, Issue (5): 1008-1016. DOI: 10.11996/JG.j.2095-302X.2024051008
Computer Graphics and Virtual Reality
XIONG Chao¹, WANG Yunyan¹,², LUO Yuhao¹
Received: 2024-04-09
Revised: 2024-07-22
Online: 2024-10-31
Published: 2024-10-31
Contact: WANG Yunyan
About the author:
First author: XIONG Chao (1998-), master's student. His main research interests are image processing and 3D reconstruction. E-mail: 102210294@hbut.edu.cn
XIONG Chao, WANG Yunyan, LUO Yuhao. Multi-view stereo network reconstruction with feature alignment and context-guided[J]. Journal of Graphics, 2024, 45(5): 1008-1016.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2024051008
Method | Acc | Comp | Overall |
---|---|---|---|
Gipuma | 0.2831 | 0.8732 | 0.578 |
MVSNet | 0.4562 | 0.6462 | 0.551 |
R-MVSNet | 0.3874 | 0.4435 | 0.415 |
P-MVSNet | 0.3962 | 0.4145 | 0.405 |
CVP-MVSNet | 0.3345 | 0.4075 | 0.371 |
CasMVSNet | 0.3832 | 0.3788 | 0.381 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 1 Quantitative evaluation results on the DTU dataset (mm)
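As a quick sanity check on Table 1, the sketch below assumes that the Overall column is the arithmetic mean of Acc and Comp (the usual convention on DTU, though this page does not state it) and reads the digit-grouped entries as four-decimal values, e.g. `0.283 1` as 0.2831. The names `TABLE1` and `overall` are illustrative and do not come from the paper.

```python
# Acc, Comp and reported Overall (all in mm), copied from Table 1.
TABLE1 = {
    "Gipuma": (0.2831, 0.8732, 0.578),
    "MVSNet": (0.4562, 0.6462, 0.551),
    "R-MVSNet": (0.3874, 0.4435, 0.415),
    "P-MVSNet": (0.3962, 0.4145, 0.405),
    "CVP-MVSNet": (0.3345, 0.4075, 0.371),
    "CasMVSNet": (0.3832, 0.3788, 0.381),
    "Ours": (0.3612, 0.3508, 0.356),
}


def overall(acc: float, comp: float) -> float:
    """Assumed definition: mean of accuracy and completeness errors."""
    return (acc + comp) / 2.0


for method, (acc, comp, reported) in TABLE1.items():
    computed = overall(acc, comp)
    # Reported values carry three decimals, so small rounding gaps are expected.
    print(f"{method:<11} computed={computed:.4f} reported={reported:.3f}")
```

Under that assumption, every computed value agrees with the reported Overall column to within 0.001 mm.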
Method | Memory usage/MB | Runtime/s | Overall error/mm |
---|---|---|---|
CasMVSNet | 6208 | 0.784 | 0.381 |
Ours | 7339 | 0.927 | 0.356 |
Table 2 Training cost
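To put the trade-off in Table 2 into relative terms, the following minimal sketch computes the percentage change of each column against the CasMVSNet baseline. The dictionary keys and the helper `relative_change` are hypothetical names chosen for illustration; only the numbers are taken from the table.

```python
# Values copied from Table 2 (memory in MB, runtime in s, overall error in mm).
CASMVSNET = {"memory_mb": 6208, "runtime_s": 0.784, "overall_mm": 0.381}
OURS = {"memory_mb": 7339, "runtime_s": 0.927, "overall_mm": 0.356}


def relative_change(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


changes = {key: relative_change(CASMVSNET[key], OURS[key]) for key in CASMVSNET}

for key, pct in changes.items():
    print(f"{key:<11} {pct:+.1f}%")
```

With these figures, the proposed network uses roughly 18% more memory and runtime than CasMVSNet in exchange for an overall error that is about 6.6% lower.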
Method | Acc | Comp | Overall |
---|---|---|---|
NeuS | 1.2027 | 0.8743 | 1.038 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 3 Quantitative comparison with NeuS (mm)
Methods | Fam | Fra | Hor | Lig | M60 | Pan | Pla | Tra | Mean |
---|---|---|---|---|---|---|---|---|---|
MVSNet | 55.89 | 28.65 | 25.14 | 51.77 | 54.01 | 50.55 | 47.85 | 35.21 | 43.62 |
P-MVSNet | 69.95 | 44.54 | 40.16 | 65.33 | 54.98 | 55.21 | 60.33 | 54.32 | 55.60 |
CVP-MVSNet | 74.15 | 47.68 | 36.44 | 54.18 | 57.33 | 53.28 | 57.44 | 46.21 | 53.33 |
CasMVSNet | 73.33 | 54.45 | 43.17 | 52.43 | 53.16 | 51.07 | 54.25 | 43.36 | 53.15 |
Ours | 76.33 | 58.21 | 46.32 | 55.23 | 55.96 | 54.18 | 58.27 | 46.47 | 56.37 |
Table 4 Quantitative results for the Tanks and Temples dataset
Method | Acc | Comp | Overall |
---|---|---|---|
CasMVSNet | 0.3832 | 0.3788 | 0.381 |
CasMVSNet+FA&FS | 0.3732 | 0.3695 | 0.371 |
CasMVSNet+Guide | 0.3654 | 0.3667 | 0.366 |
Ours | 0.3612 | 0.3508 | 0.356 |
Table 5 Ablation study results (mm)