Journal of Graphics ›› 2021, Vol. 42 ›› Issue (6): 899-907. DOI: 10.11996/JG.j.2095-302X.2021060899

• Image Processing and Computer Vision •

Attribute and graph semantic reinforcement based zero-shot learning for image recognition

  1. School of Software, Yunnan University, Kunming, Yunnan 650500, China
  • Online: 2022-01-18 Published: 2022-01-18
  • Supported by:
    China Association for Science and Technology “Youths Talents Support Project” (W8193209); Program of the Science and Technology Department of Yunnan Province (202001BB050035)

Abstract: Zero-shot learning (ZSL) is an important branch of transfer learning in image recognition and an active research topic in the field. Its main approach is to learn a mapping between the semantic attributes and the visual attributes of seen classes, without using any unseen classes during training, and then to use this mapping to recognize samples from unseen classes. Existing ZSL models suffer from an information asymmetry between semantic attributes and visual attributes: the semantic information does not describe the visual information well, which leads to the domain shift problem. In addition, when visual attributes are synthesized from the semantic attributes of unseen classes, part of the visual feature information is not synthesized, which lowers recognition accuracy. To address the lack of unseen-class semantic features and the matching and synthesis of unseen-class visual features, this paper designs a ZSL model that fuses attribute semantics with knowledge-graph semantics to improve ZSL accuracy. During learning, the model uses a knowledge graph to associate visual features and also considers the attribute relations among samples, thereby enhancing the semantic information of both seen-class and unseen-class samples; an adversarial learning process strengthens the synthesis of visual features. In experiments on four typical datasets, the method achieves good results, synthesizes relatively fine-grained visual features, and outperforms existing ZSL methods.

Key words: zero-shot learning, knowledge graph, generative adversarial networks, graph convolution, image recognition
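
The general idea the abstract describes (learn a semantic-to-visual mapping on seen classes, then apply it to unseen classes whose semantics are enriched by a class graph) can be illustrated with a toy sketch. This is not the paper's model: a closed-form ridge regression is swapped in for the adversarial feature generator, a single normalized-adjacency propagation step stands in for the graph convolutional network, and the data, the random graph, the function names, and all sizes are hypothetical choices made only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the propagation matrix of a basic graph convolution."""
    a = adj + np.eye(adj.shape[0])  # self-loops guarantee every node has degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def fuse_semantics(attributes, adj):
    """Concatenate per-class attributes with their one-hop graph-smoothed version."""
    return np.hstack([attributes, normalize_adjacency(adj) @ attributes])


def fit_semantic_to_visual(sem_seen, proto_seen, reg=1e-3):
    """Ridge regression: find W such that sem_seen @ W approximates proto_seen."""
    dim = sem_seen.shape[1]
    gram = sem_seen.T @ sem_seen + reg * np.eye(dim)
    return np.linalg.solve(gram, sem_seen.T @ proto_seen)


def classify_unseen(features, sem_unseen, weights):
    """Assign each feature vector to the nearest synthesized unseen-class prototype."""
    protos = sem_unseen @ weights
    dists = ((features[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def run_toy_example(seed=0, n_seen=20, n_unseen=5, n_attr=4, n_visual=16, per_class=20):
    """Synthetic data only: the sizes and the linear ground truth are assumptions."""
    rng = np.random.default_rng(seed)
    n_cls = n_seen + n_unseen
    attributes = rng.normal(size=(n_cls, n_attr))
    upper = np.triu(rng.random((n_cls, n_cls)) < 0.2, k=1).astype(float)
    adj = upper + upper.T  # random undirected graph over all classes
    sem = fuse_semantics(attributes, adj)

    true_map = rng.normal(size=(sem.shape[1], n_visual))
    class_means = sem @ true_map  # hypothetical ground-truth visual prototypes

    def sample(class_ids):
        labels = np.repeat(class_ids, per_class)
        noise = 0.05 * rng.normal(size=(labels.size, n_visual))
        return class_means[labels] + noise, labels

    # Training uses seen-class samples only.
    seen_x, seen_y = sample(np.arange(n_seen))
    proto_seen = np.stack([seen_x[seen_y == c].mean(axis=0) for c in range(n_seen)])
    weights = fit_semantic_to_visual(sem[:n_seen], proto_seen)

    # Unseen-class samples appear only at test time.
    unseen_x, unseen_y = sample(np.arange(n_seen, n_cls))
    preds = classify_unseen(unseen_x, sem[n_seen:], weights)
    return preds, unseen_y - n_seen  # labels re-indexed to 0..n_unseen-1


if __name__ == "__main__":
    preds, labels = run_toy_example()
    print("unseen-class accuracy:", (preds == labels).mean())
```

Because the synthetic visual prototypes are generated as a linear function of the fused semantics, the linear map is enough here; on real image features, the gap between semantic and visual information that the abstract calls domain shift is what motivates a richer generator.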
