
图学学报 (Journal of Graphics) ›› 2020, Vol. 41 ›› Issue (6): 922-929. DOI: 10.11996/JG.j.2095-302X.2020060922

• Image Processing and Computer Vision •

FANET: light field depth estimation with multi-channel information fusion

  (School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China)
  • Online: 2020-12-31  Published: 2021-01-08
  • Supported by: General Program of the National Natural Science Foundation of China (61876057, 61971177)

Abstract: A light field camera records both the spatial and angular information of a scene in a single shot. Compared with traditional two-dimensional images, the resulting images contain more information and offer clear advantages for depth estimation. To obtain high-quality scene depth from light field images, a feature fusion network whose structure efficiently fuses multi-channel information was proposed, based on the multi-view representation of the light field. On the basis of manually selected specific views, convolution kernels of different sizes were employed to cope with different baseline changes. Meanwhile, a feature fusion module was built for the multi-stream input of light field data, and a dual-channel network structure was used to integrate information from earlier and later layers, improving the learning efficiency of the network and reducing information loss. Experimental results on the new HCI dataset show that the network converges quickly on the training set, achieves accurate depth estimation in non-Lambertian scenes, and outperforms the compared state-of-the-art methods in terms of average MSE.

Key words: light field, depth estimation, convolutional neural network, feature fusion, attention, multi-view

CLC number:
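
The abstract names three structural ideas: per-view branches whose convolution kernel sizes vary with the baseline of the selected views, a fusion module over the resulting multi-channel features, and a dual-channel structure that carries early-layer information forward to later layers. The following is a minimal PyTorch sketch of that combination, not the authors' FANET implementation: the module names, channel widths, kernel sizes (3/5/7/9), and the choice of four hand-picked views are all assumptions made purely for illustration.

# Hedged sketch (not the published FANET code): per-view branches with different
# kernel sizes, a fusion module over concatenated branch features, and a dual-channel
# (early + deep) path feeding the depth head. All sizes and names are assumptions.
import torch
import torch.nn as nn


class MultiKernelBranch(nn.Module):
    """One branch per selected view; the kernel size is matched to that view's baseline."""
    def __init__(self, kernel_size: int, out_ch: int = 16):
        super().__init__()
        pad = kernel_size // 2
        self.net = nn.Sequential(
            nn.Conv2d(1, out_ch, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)


class FusionDepthNet(nn.Module):
    """Fuses multi-branch features and keeps a second (early-feature) channel alongside the deep path."""
    def __init__(self, kernel_sizes=(3, 5, 7, 9), out_ch: int = 16):
        super().__init__()
        self.branches = nn.ModuleList([MultiKernelBranch(k, out_ch) for k in kernel_sizes])
        fused_ch = out_ch * len(kernel_sizes)
        self.fuse = nn.Sequential(               # feature fusion over the concatenated branches
            nn.Conv2d(fused_ch, fused_ch, 1),
            nn.ReLU(inplace=True),
        )
        self.deep = nn.Sequential(               # deeper processing path
            nn.Conv2d(fused_ch, fused_ch, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(2 * fused_ch, 1, 3, padding=1)  # single-channel depth/disparity map

    def forward(self, views):                    # views: list of (B, 1, H, W) tensors, one per branch
        feats = [branch(v) for branch, v in zip(self.branches, views)]
        early = self.fuse(torch.cat(feats, dim=1))
        late = self.deep(early)
        # dual-channel idea: concatenate early- and late-layer information before the head
        return self.head(torch.cat([early, late], dim=1))


if __name__ == "__main__":
    views = [torch.randn(1, 1, 64, 64) for _ in range(4)]   # four hand-picked views (assumed)
    print(FusionDepthNet()(views).shape)                     # -> torch.Size([1, 1, 64, 64])

In this reading, "dual-channel" is interpreted as concatenating the fused early features with the output of a deeper path before regression, which is one common way to reduce information loss between front and back layers; the paper itself should be consulted for the exact fusion and attention design.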