Journal of Graphics

• Image Processing and Computer Vision •

ForegroundNet: a semantic and motional feature based foreground detection algorithm

  (1. South China Branch of Sinopec Sales Co., Ltd., Guangzhou, Guangdong 510000, China;
    2. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China;
    3. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China)
  • Online: 2020-06-30 Published: 2020-08-18
  • Supported by:
    National Natural Science Foundation of China (61872346); National Key Research and Development Program of China (2018YFC0809303)

Abstract: Previous foreground detection methods depend heavily on scene information. To address
this problem, we propose ForegroundNet, a real-time deep learning model for foreground detection
that does not require iteratively updating a background model. ForegroundNet first uses a backbone
network to extract semantic features from the current image and an auxiliary image, where the
auxiliary image is either an adjacent frame or an automatically generated background image of the
video. These features are then fed into a deconvolution network with short connections, so that the
final feature map has the same size as the input image and contains semantic and motion features at
different scales. Finally, a softmax layer performs binary classification to produce the detection
result. Experiments on the CDNet dataset show that ForegroundNet achieves an F-measure of 0.94,
compared with 0.82 for the second-best method, and therefore has higher detection accuracy. It
also runs at 123 fps, which gives good real-time performance.

Key words: foreground detection, deep learning, computer vision, convolutional neural network, motion segmentation
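
The last step of the pipeline (a per-pixel softmax over two classes) and the F-measure used to report accuracy can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax_foreground_mask` and `f_measure`, the nested-list data layout, and the 0.5 decision threshold are assumptions made for illustration, and the exact CDNet evaluation protocol is not reproduced here.

```python
import math


def softmax_foreground_mask(scores):
    """Turn per-pixel (background, foreground) scores into a 0/1 mask.

    `scores` is an H x W nested list of 2-tuples (hypothetical layout).
    A pixel is labeled foreground (1) when its softmax foreground
    probability is strictly greater than 0.5; ties go to background.
    """
    mask = []
    for row in scores:
        out_row = []
        for bg, fg in row:
            m = max(bg, fg)  # subtract the max for numerical stability
            e_bg = math.exp(bg - m)
            e_fg = math.exp(fg - m)
            p_fg = e_fg / (e_bg + e_fg)
            out_row.append(1 if p_fg > 0.5 else 0)
        mask.append(out_row)
    return mask


def f_measure(pred, truth):
    """F-measure (harmonic mean of precision and recall) of a binary mask."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    if tp == 0:
        return 0.0  # no true positives: define F as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2 x 2 example with made-up scores and ground truth.
scores = [[(2.0, 0.5), (0.1, 1.3)],
          [(0.0, 3.0), (0.2, 1.0)]]
truth = [[0, 1],
         [1, 0]]
mask = softmax_foreground_mask(scores)
score = f_measure(mask, truth)
```

In this toy case the mask has two true positives, one false positive, and no false negatives, so precision is 2/3, recall is 1, and the F-measure is 0.8.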