Single Image Depth Estimation Based on Encoder-Decoder Convolution Neural Network

doi:10.11996/JG.j.2095-302X.2019040718

Journal of Graphics

(School of Information Science and Technology, North China University of Technology, Beijing 100144, China)

Online: 2019-08-31 Published: 2019-08-30

Funding: General Program of the Beijing Municipal Education Commission (KM201510009005); Student Science and Technology Activity Project of North China University of Technology (110051360007)

Abstract

Traditional methods for monocular depth estimation suffer from poor robustness and low accuracy. To address these problems, a method based on a convolutional neural network (CNN) is proposed for estimating depth from a single image. First, a fused-layers encoder-decoder network is presented, which improves on the end-to-end encoder-decoder network structure. A fused-layers block is introduced at the encoder; by fusing multi-level features, it improves the network's utilization of multi-scale information. Second, a multi-receptive field res-block is proposed. As the main component of the decoder, it estimates depth from high-level semantic information. The block can also flexibly adjust the size of the network's receptive field, which strengthens the network's ability to extract multi-scale features. The network is validated on the NYUD v2 dataset. Experimental results show that, compared with a multi-scale convolutional neural network, the proposed method improves accuracy at δ<1.25 by about 4.4% and reduces the average relative error by about 8.2%, demonstrating its feasibility for single-image depth estimation.

Keywords: CNN, encoder-decoder, depth estimation, monocular vision
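
The two evaluation measures named in the abstract, accuracy at δ<1.25 and average relative error, are not defined on this page. The sketch below assumes the definitions commonly used in the depth estimation literature: a pixel counts as accurate when max(pred/gt, gt/pred) is below the threshold, and the relative error is |pred - gt| / gt averaged over pixels. The function names and the flat-list input format are illustrative choices, not taken from the paper.

```python
from typing import Sequence


def _check_inputs(pred: Sequence[float], gt: Sequence[float]) -> None:
    """Validate that both depth lists are non-empty, aligned, and positive."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    if any(p <= 0 for p in pred) or any(g <= 0 for g in gt):
        raise ValueError("depth values must be strictly positive")


def threshold_accuracy(pred: Sequence[float], gt: Sequence[float],
                       threshold: float = 1.25) -> float:
    """Fraction of pixels with max(pred/gt, gt/pred) < threshold.

    Assumed definition of the 'delta < 1.25' accuracy measure.
    """
    _check_inputs(pred, gt)
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(pred)


def mean_relative_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean of |pred - gt| / gt over all pixels (assumed definition)."""
    _check_inputs(pred, gt)
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(pred)


if __name__ == "__main__":
    # Hypothetical depths in metres for four pixels, flattened to lists.
    predicted = [1.0, 2.2, 3.0, 4.0]
    ground_truth = [1.0, 2.0, 2.0, 4.0]
    print(threshold_accuracy(predicted, ground_truth))
    print(mean_relative_error(predicted, ground_truth))
```

In this toy input, three of the four pixels fall inside the 1.25 ratio band, while the third pixel (3.0 predicted against 2.0 measured) dominates the relative error. Higher is better for the threshold accuracy and lower is better for the relative error, which matches the direction of the improvements reported in the abstract.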

JIA Rui-ming, LIU Li-qiang, LIU Sheng-jie, CUI Jia-li. Single Image Depth Estimation Based on Encoder-Decoder Convolution Neural Network[J]. Journal of Graphics, DOI: 10.11996/JG.j.2095-302X.2019040718.
