Journal of Graphics (图学学报)

• Vision and Image •

Video Semantic Context Label Tree and Its Structural Analysis

  • Online: 2015-10-30 Published: 2015-11-05

Abstract: Video content has strong temporal correlation and logical structure, and shot semantics
are the basic units for understanding it. From the perspective of human cognition, shot semantics
are linked by several kinds of implicit context: temporal, semantic, and structural. Describing
this context information appropriately is therefore essential. First, this paper adopts a label
tree with context labels as the representation model for the hierarchical context structure of
shot semantics. The serialized shot semantic sequence forms the leaf nodes, and the context label
of each inner node represents the contextual association among its child nodes. Because this tree
structure matches the hierarchical representation of video content, it provides significant
information gain for video content understanding. Second, constructing the tree requires
transforming shot semantics from a sequence structure into the hierarchical structure of the label
tree. To this end, the paper applies a structural support vector machine (SVM-Struct): based on
the joint features of the shot semantic sequence and the video semantic context label tree, it
constructs a structural function and a loss function for the semantic context, and thereby
realizes the structural analysis of shot semantics. Experimental results show that the video
semantic context label tree has good representational ability in terms of temporal order,
hierarchy, domain specificity, and logic, and that the SVM-Struct based structural analysis
performs well in precision, recall, and F1 score for shot semantic context analysis.

Key words: video semantic context label tree, structural support vector machine (SVM-Struct),
semantic context, structured data, video semantic annotation
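
To make the label tree described in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a tree whose leaves are shot semantics and whose inner nodes carry context labels, and it compares a predicted tree with a reference tree by matching labeled spans, a convention borrowed from parser evaluation. The abstract does not say how precision, recall, and F1 are computed, so this matching scheme is an assumption. The names `Node`, `labeled_spans`, and `span_prf`, and the labels `serve`, `rally`, `replay`, `point`, and `game`, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    """Leaf: label is a shot semantic. Inner node: label is a context label."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaves(node: Node) -> List[str]:
    """Return the shot semantic sequence (leaf labels, left to right)."""
    if node.is_leaf():
        return [node.label]
    return [lab for child in node.children for lab in leaves(child)]


def labeled_spans(root: Node) -> Set[Tuple[str, int, int]]:
    """Collect (context_label, start, end) for every inner node.

    start/end are leaf indices (end exclusive) covered by that node.
    """
    spans: Set[Tuple[str, int, int]] = set()

    def visit(node: Node, start: int) -> int:
        if node.is_leaf():
            return start + 1
        end = start
        for child in node.children:
            end = visit(child, end)
        spans.add((node.label, start, end))
        return end

    visit(root, 0)
    return spans


def span_prf(pred: Node, gold: Node) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over labeled spans (assumed metric)."""
    p, g = labeled_spans(pred), labeled_spans(gold)
    hits = len(p & g)
    precision = hits / len(p) if p else 0.0
    recall = hits / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical reference tree: (game (point serve rally) replay)
gold = Node("game", [Node("point", [Node("serve"), Node("rally")]), Node("replay")])
# Hypothetical predicted tree: (game serve (point rally replay))
pred = Node("game", [Node("serve"), Node("point", [Node("rally"), Node("replay")])])

precision, recall, f1 = span_prf(pred, gold)
```

Both trees share the same leaf sequence, so they differ only in how the context labels group the shots. A quantity such as `1 - f1` is one possible choice of loss function for structural SVM training, but that is also an assumption rather than the loss defined in the paper.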