Journal of Graphics ›› 2024, Vol. 45 ›› Issue (3): 548-557. DOI: 10.11996/JG.j.2095-302X.2024030548
Corresponding author: WU Xiaoqun (1984-), female, professor, Ph.D. Her main research interests cover computer graphics, digital geometry processing and image processing. E-mail: wuxiaoqun@btbu.edu.cn
ZHAO Sheng1,2, WU Xiaoqun1,2, LIU Xin1,2
Received: 2023-11-08
Accepted: 2024-02-21
Published: 2024-06-30
Online: 2024-06-12
First author: ZHAO Sheng (1996-), master student. His main research interests cover computer graphics, digital geometry processing and image processing. E-mail: winner_zs@163.com
Abstract: When depth information is captured with consumer-grade depth cameras, factors such as the device, the environment, and object materials often leave the acquired depth maps with missing regions and holes, which limits their use in downstream vision tasks. Existing depth completion algorithms handle large missing regions poorly and fail to preserve object boundaries. To address these two problems, this paper proposes a large-hole depth completion algorithm based on structure-guided boundary growth. First, using the boundary information provided by the RGB image, a structure-guided boundary-growth strategy completes the missing depth along object boundaries; then, a combination of large-hole cut-and-fill and mean filtering completes the large holes inside objects. Experiments show that the algorithm effectively preserves object boundaries under large-area missing depth and holes that span objects, while completing large missing regions; quantitative and qualitative results on several datasets demonstrate its effectiveness.
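The abstract describes a two-stage pipeline: structure-guided boundary growth steered by RGB edges, followed by cut-and-fill plus mean filtering for interior holes. The sketch below is a minimal illustration of such a pipeline, assuming zero-valued pixels mark holes and using OpenCV; the function name `complete_depth`, the max-dilation growth rule, and the normalized mean-filter fallback are our assumptions for illustration, not the authors' exact implementation.

```python
# Hypothetical sketch of the two-stage completion pipeline (not the paper's
# exact method): RGB edges block depth propagation across object boundaries,
# and a normalized mean filter fills what remains.
import cv2
import numpy as np

def complete_depth(depth, rgb, iters=50, ksize=5):
    """depth: float32 map where 0 marks holes; rgb: aligned 8-bit color image."""
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)              # structure cues from RGB
    filled = depth.copy()
    k3 = np.ones((3, 3), np.uint8)
    for _ in range(iters):                        # stage 1: boundary growth
        hole = (filled == 0).astype(np.uint8)
        if not hole.any():
            break
        # frontier: hole pixels adjacent to known depth, not crossing RGB edges
        front = cv2.dilate(1 - hole, k3) * hole
        front[edges > 0] = 0                      # edges block propagation
        grown = cv2.dilate(filled, k3)            # max of known neighbours
        filled[front > 0] = grown[front > 0]
    # stage 2: fill remaining interior holes with a normalized mean filter
    hole = filled == 0
    valid = (~hole).astype(np.float32)
    num = cv2.blur(filled * valid, (ksize, ksize))
    den = cv2.blur(valid, (ksize, ksize))
    mean = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    filled[hole] = mean[hole]
    return filled
```

Blocking the growth frontier at RGB edges is what keeps completed depth from bleeding across object boundaries, which is the failure mode the paper attributes to prior methods.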
ZHAO Sheng, WU Xiaoqun, LIU Xin. Depth completion with large holes based on structure-guided boundary propagation[J]. Journal of Graphics, 2024, 45(3): 548-557.
Fig. 2 Comparison of structure-guided fill directions ((a) Fill direction without guidance; (b) Fill direction with structural guidance; (c) Boundary growth result)
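Figure 2 contrasts fill directions with and without structure guidance. A plausible starting point, consistent with the paper's citation of Suzuki-Abe border following [13] (the algorithm behind OpenCV's findContours), is to trace hole boundaries and pair them with RGB edges that steer the growth direction; the Canny thresholds below are illustrative assumptions.

```python
# Hypothetical sketch: extract hole boundaries via Suzuki-Abe border following
# (ref. [13], cv2.findContours in OpenCV 4.x) and RGB structure edges that can
# steer the fill direction. Thresholds are illustrative assumptions.
import cv2
import numpy as np

def hole_boundaries_and_edges(depth, rgb):
    hole_mask = (depth == 0).astype(np.uint8) * 255
    contours, _ = cv2.findContours(hole_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)   # hole boundaries
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                        # structure cues
    return contours, edges
```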
Table 1 Effect of structure-guided large-hole cut-and-fill parameters

| h \ m | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 3 | 0.6076 | 0.6010 | 0.5943 | 0.5888 |
| 4 | 0.6121 | 0.6080 | 0.5781 | 0.5855 |
| 5 | 0.6109 | 0.6026 | 0.5810 | 0.5980 |
Fig. 6 Comparison of results of different algorithms on the Hypersim synthetic dataset ((a) Raw depth; (b) MCBR[7]; (c) DepthComp[6]; (d) Ours; (e) GT)
Table 2 Error statistics on the Hypersim synthetic dataset

| Scene (missing) | Method | SSIM | RMSE | PSNR |
|---|---|---|---|---|
| ai_001_001 (16.28%) | DepthComp[6] | 0.9926 | 1.1347 | 34.2141 |
| | MCBR[7] | 0.9823 | 4.0027 | 30.0327 |
| | Ours | 0.9989 | 0.5667 | 53.0065 |
| ai_001_003 (13.21%) | DepthComp[6] | 0.9970 | 0.9209 | 42.0958 |
| | MCBR[7] | 0.9834 | 3.3026 | 28.4292 |
| | Ours | 0.9991 | 0.4664 | 52.5938 |
| ai_001_004 (11.29%) | DepthComp[6] | 0.9937 | 1.2376 | 35.8421 |
| | MCBR[7] | 0.9782 | 3.2997 | 26.0742 |
| | Ours | 0.9987 | 1.0184 | 46.9116 |
| ai_001_006 (12.13%) | DepthComp[6] | 0.9957 | 1.2411 | 39.4530 |
| | MCBR[7] | 0.9825 | 3.3221 | 31.8881 |
| | Ours | 0.9991 | 0.6786 | 51.2484 |
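For context, SSIM/RMSE/PSNR figures of the kind reported in Tables 2-5 can be reproduced with standard implementations along these lines; treating the ground-truth value range as the data range is our assumption, since the paper does not state its normalization.

```python
# Hedged sketch of the evaluation metrics used in Tables 2-5; the data-range
# handling is an assumption, as the paper does not spell it out.
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def depth_metrics(pred, gt):
    pred = pred.astype(np.float64)
    gt = gt.astype(np.float64)
    rng = gt.max() - gt.min()                    # assumed data range
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    psnr = peak_signal_noise_ratio(gt, pred, data_range=rng)
    ssim = structural_similarity(gt, pred, data_range=rng)
    return ssim, rmse, psnr
```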
Table 3 Error statistics on synthetic datasets

| Method | Buddha (RMSE/PSNR) | LivingRoom (RMSE/PSNR) | Outdoor (RMSE/PSNR) | Table (RMSE/PSNR) |
|---|---|---|---|---|
| M-JBU[22] | 1.50 / 26.15 | 1.82 / 27.97 | 3.18 / 25.91 | 1.85 / 26.02 |
| FBS[23] | 1.73 / 26.15 | 2.16 / 30.99 | 2.94 / 27.09 | 2.04 / 29.77 |
| JBF[1] | 2.21 / 16.17 | 1.68 / 23.77 | 2.90 / 22.40 | 1.72 / 20.76 |
| M-SRF[4] | 1.12 / 32.17 | 1.56 / 33.31 | 2.44 / 28.01 | 1.28 / 31.71 |
| Ours | 1.11 / 34.92 | 1.24 / 34.45 | 2.01 / 32.49 | 1.17 / 31.80 |
Fig. 7 Comparison of results of different algorithms on synthetic datasets ((a) Raw depth; (b) M-JBU[22]; (c) FBS[23]; (d) JBF[1]; (e) M-SRF[4]; (f) Ours)
Table 4 Quantitative comparison of different algorithms on the Middlebury dataset

| Method | Adirondack | Bicycle1 | Classroom1 | Couch | Sword2 | Umbrella | Piano |
|---|---|---|---|---|---|---|---|
| M-JBU[22] | 31.99 | 34.61 | 29.04 | 31.87 | 34.29 | 31.00 | 33.54 |
| FBS[23] | 31.33 | 34.61 | 32.47 | 32.63 | 33.03 | 32.61 | 34.52 |
| M-SRF[4] | 36.13 | 35.07 | 32.80 | 33.83 | 36.78 | 35.90 | 37.55 |
| Ours | 40.59 | 38.25 | 40.05 | 35.83 | 39.19 | 38.17 | 38.04 |
Fig. 8 Results of different algorithms on the Middlebury dataset ((a) Raw depth; (b) M-JBU[22]; (c) FBS[23]; (d) M-SRF[4]; (e) Ours; (f) GT)
Table 5 Error statistics on the SUNRGBD dataset

| Method | 001 (RMSE/PSNR) | 002 (RMSE/PSNR) | 003 (RMSE/PSNR) | 004 (RMSE/PSNR) |
|---|---|---|---|---|
| MCBR[7] | 1.74 / 23.28 | 2.29 / 15.23 | 2.12 / 23.21 | 1.76 / 31.52 |
| DepthComp[6] | 1.05 / 28.17 | 2.09 / 20.40 | 1.77 / 25.60 | 1.45 / 30.88 |
| Ours | 0.95 / 28.49 | 1.91 / 25.80 | 1.70 / 33.01 | 1.09 / 38.63 |
Fig. 9 Results on the SUNRGBD dataset ((a) Raw depth; (b) MCBR[7]; (c) DepthComp[6]; (d) GT; (e) Ours)
Table 6 Comparison with deep learning algorithms

| Method | RMSE | Training required |
|---|---|---|
| RGB-Guidance[27] | 0.260 | Yes |
| MSG-CHN[25] | 0.190 | Yes |
| DM-LRN[26] | 0.205 | Yes |
| RGB-D Fusion GAN[11] | 0.139 | Yes |
| CompletionFormer[12] | 0.127 | Yes |
| Ours | 0.181 | No |
Table 7 Overview of the datasets

| Dataset | Training images | Validation images |
|---|---|---|
| NYUv2[17] | 50 000 | 654 |
| KITTI[21] | 86 898 | 1 000 |
| Mixed data | 136 898 | 600 |
Table 8 Comparison with deep learning algorithms on mixed data

| Method | RMSE | Training required |
|---|---|---|
| RGB-Guidance[27] | 0.327 | Yes |
| MSG-CHN[25] | 0.251 | Yes |
| DM-LRN[26] | 0.290 | Yes |
| RGB-D Fusion GAN[11] | 0.276 | Yes |
| CompletionFormer[12] | 0.293 | Yes |
| Ours | 0.195 | No |
[1] LE A V, JUNG S W, WON C S. Directional joint bilateral filter for depth images[J]. Sensors, 2014, 14(7): 11362-11378.
[2] WAN Q, ZHU X L, CHEN G Q, et al. Research on depth map restoration algorithm based on hierarchical joint bilateral filter[J]. Computer Engineering and Applications, 2021, 57(6): 184-190 (in Chinese).
[3] QI F, HAN J Y, WANG P J, et al. Structure guided fusion for depth map inpainting[J]. Pattern Recognition Letters, 2013, 34(1): 70-76.
[4] WU Y T, LI T M, SHEN I C, et al. Multi-resolution shared representative filtering for real-time depth completion[EB/OL]. [2023-06-07]. https://kevincosner.github.io/publications/Wu2021MSR/paper.pdf.
[5] PO L M, ZHANG S H, XU X Y, et al. A new multidirectional extrapolation hole-filling method for depth-image-based rendering[C]// 2011 18th IEEE International Conference on Image Processing. New York: IEEE Press, 2011: 2589-2592.
[6] ATAPOUR-ABARGHOUEI A, BRECKON T P. DepthComp: real-time depth image completion based on prior semantic scene segmentation[C]// The British Machine Vision Conference 2017. Guildford: British Machine Vision Association, 2017: 208.1-208.13.
[7] GARDUÑO-RAMÓN M A, TEROL-VILLALOBOS I R, OSORNIO-RIOS R A, et al. A new method for inpainting of depth maps from time-of-flight sensors based on a modified closing by reconstruction algorithm[J]. Journal of Visual Communication and Image Representation, 2017, 47: 36-47.
[8] CHENG X J, WANG P, YANG R G. Learning depth with convolutional spatial propagation network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 42(10): 2361-2379.
[9] CHENG X J, WANG P, GUAN C Y, et al. CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 10615-10622.
[10] LIN Y K, CHENG T, ZHONG Q, et al. Dynamic spatial propagation network for depth completion[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(2): 1638-1646.
[11] WANG H W, WANG M Y, CHE Z P, et al. RGB-depth fusion GAN for indoor depth completion[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 6209-6218.
[12] ZHANG Y M, GUO X D, POGGI M, et al. CompletionFormer: depth completion with convolutions and vision transformers[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 18527-18536.
[13] SUZUKI S, ABE K. Topological structural analysis of digitized binary images by border following[J]. Computer Vision, Graphics, and Image Processing, 1985, 30(1): 32-46.
[14] SCHARSTEIN D, HIRSCHMÜLLER H, KITAJIMA Y, et al. High-resolution stereo datasets with subpixel-accurate ground truth[M]// Lecture Notes in Computer Science. Cham: Springer International Publishing, 2014: 31-42.
[15] ROBERTS M, RAMAPURAM J, RANJAN A, et al. Hypersim: a photorealistic synthetic dataset for holistic indoor scene understanding[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 10912-10922.
[16] SILBERMAN N, FERGUS R. Indoor scene segmentation using a structured light sensor[C]// 2011 IEEE International Conference on Computer Vision Workshops. New York: IEEE Press, 2011: 601-608.
[17] SILBERMAN N, HOIEM D, KOHLI P, et al. Indoor segmentation and support inference from RGBD images[C]// European Conference on Computer Vision. Heidelberg: Springer, 2012: 746-760.
[18] SONG S R, LICHTENBERG S P, XIAO J X. SUN RGB-D: a RGB-D scene understanding benchmark suite[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 567-576.
[19] JANOCH A, KARAYEV S, JIA Y Q, et al. A category-level 3-D object dataset: putting the Kinect to work[C]// 2011 IEEE International Conference on Computer Vision Workshops. New York: IEEE Press, 2011: 1168-1174.
[20] XIAO J X, OWENS A, TORRALBA A. SUN3D: a database of big spaces reconstructed using SfM and object labels[C]// 2013 IEEE International Conference on Computer Vision. New York: IEEE Press, 2013: 1625-1632.
[21] UHRIG J, SCHNEIDER N, SCHNEIDER L, et al. Sparsity invariant CNNs[C]// 2017 International Conference on 3D Vision. New York: IEEE Press, 2017: 11-20.
[22] RICHARDT C, STOLL C, DODGSON N A, et al. Coherent spatiotemporal filtering, upsampling and rendering of RGBZ videos[J]. Computer Graphics Forum, 2012, 31(2pt1): 247-256.
[23] BARRON J T, POOLE B. The fast bilateral solver[C]// European Conference on Computer Vision. Cham: Springer, 2016: 617-632.
[24] FU J, LIU J, TIAN H J, et al. Dual attention network for scene segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 3141-3149.
[25] LI A, YUAN Z J, LING Y G, et al. A multi-scale guided cascade hourglass network for depth completion[C]// 2020 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2020: 32-40.
[26] SENUSHKIN D, ROMANOV M, BELIKOV I, et al. Decoder modulation for indoor depth completion[C]// 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York: IEEE Press, 2021: 2181-2188.
[27] VAN GANSBEKE W, NEVEN D, DE BRABANDERE B, et al. Sparse and noisy LiDAR completion with RGB guidance and uncertainty[C]// 2019 16th International Conference on Machine Vision Applications. New York: IEEE Press, 2019: 1-6.