
Journal of Graphics ›› 2021, Vol. 42 ›› Issue (5): 767-774. DOI: 10.11996/JG.j.2095-302X.2021050767

• Image Processing and Computer Vision •


A reverse fusion instance segmentation algorithm based on RGB-D  

  1. School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China
  • Online:2021-10-31 Published:2021-11-03
  • Supported by:
    National Natural Science Foundation of China (61876057, 61971177) 


Abstract: RGB-D images add Depth information to the RGB information of a scene and can therefore effectively describe both the color and the three-dimensional geometric information of the scene. Combining the characteristics of RGB and Depth images, this paper proposed a reverse fusion instance segmentation algorithm that fuses high-level semantic features back into low-level edge-detail features. The method extracted RGB and Depth image features separately with feature pyramid networks (FPN) of different depths, upsampled the high-level features to the same size as the lowest-level features, and then applied reverse fusion to merge the high-level features into the low-level ones; a mask refinement structure was also introduced into the mask branch, yielding RGB-D reverse fusion instance segmentation. The experimental results show that the reverse fusion feature model achieves better results on RGB-D instance segmentation and effectively fuses the two different kinds of features from the Depth and color images. With ResNet-101 as the backbone network, the average precision (AP) was 10.6% higher than that of Mask R-CNN without depth information and 4.5% higher than that of directly forward-fusing the two kinds of features.
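
To make the data flow described in the abstract concrete, the following is a minimal sketch, in PyTorch-style Python, of the reverse-fusion step: higher-level, more semantic pyramid maps are upsampled to the resolution of the lowest-level (edge-detail) map and merged into it. This is not the authors' implementation; the module name ReverseFusion, the 256-channel maps, the 1x1 alignment convolutions, element-wise addition as the fusion operator, and the way the RGB and Depth branches are combined at the end are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ReverseFusion(nn.Module):
    """Fuse higher-level (semantic) pyramid features back into the finest level."""

    def __init__(self, channels: int = 256, num_levels: int = 4):
        super().__init__()
        # 1x1 convolutions that align each level before fusion (assumed design choice).
        self.align = nn.ModuleList(nn.Conv2d(channels, channels, 1) for _ in range(num_levels))
        # A 3x3 convolution to smooth the fused map (also an assumption).
        self.smooth = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, pyramid):
        # `pyramid` is ordered fine -> coarse, e.g. [P2, P3, P4, P5] from an FPN.
        target = pyramid[0]
        fused = self.align[0](target)
        for lvl, feat in enumerate(pyramid[1:], start=1):
            # Upsample each coarser, more semantic level to the size of the finest
            # level, then add it in, pushing high-level semantics down to the
            # low-level edge-detail map ("reverse" fusion).
            up = F.interpolate(feat, size=target.shape[-2:], mode="bilinear",
                               align_corners=False)
            fused = fused + self.align[lvl](up)
        return self.smooth(fused)


if __name__ == "__main__":
    # Toy pyramids standing in for the outputs of the RGB and Depth FPN branches.
    sizes = [(200, 272), (100, 136), (50, 68), (25, 34)]
    rgb_pyramid = [torch.randn(1, 256, h, w) for h, w in sizes]
    depth_pyramid = [torch.randn(1, 256, h, w) for h, w in sizes]

    fuser = ReverseFusion(channels=256, num_levels=4)
    # One simple way to combine the two modalities is to reverse-fuse each branch
    # and sum the results; the paper's exact cross-modal fusion may differ.
    out = fuser(rgb_pyramid) + fuser(depth_pyramid)
    print(out.shape)  # torch.Size([1, 256, 200, 272])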

Key words: Depth images, instance segmentation, feature fusion, reverse fusion, mask refinement

CLC number: