
Journal of Graphics ›› 2022, Vol. 43 ›› Issue (1): 53-59. DOI: 10.11996/JG.j.2095-302X.2022010053

• Image Processing and Computer Vision •

Video super-resolution reconstruction based on multi-scale time-domain 3D convolution

  1. Qian Xuesen Space Technology Laboratory, Beijing 100086, China;  2. College of Software, Henan University, Kaifeng, Henan 475004, China;  3. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Published: 2022-02-28 Online: 2022-02-16
  • Supported by:
    Key R&D Program of the Ministry of Science and Technology (2020YFA0714100)

Abstract: Video super-resolution is a task of great practical value. To address the shortage of high-resolution content in the ultra-high-definition industry, and to make effective use of the rich temporal correlation and spatial information across video frames, a video super-resolution reconstruction algorithm based on multi-scale time-domain 3D convolution is proposed. The algorithm passes the input low-resolution frame sequence through 3D convolutions of different temporal scales to extract spatiotemporal features. Because 3D convolution models space and time simultaneously, it is better suited to video tasks than 2D convolution. The two kinds of spatiotemporal features extracted at the different temporal scales are used for adaptive motion compensation, after which a sub-pixel convolutional layer raises the resolution; its output is added to the upsampled input frame to obtain the final reconstructed high-resolution image. Experimental results on the standard dataset show that the algorithm significantly improves both visual quality and objective quality metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), outperforming algorithms such as FSRCNN and EDSR.

Key words: video super-resolution, deep learning, 3D convolution, multi-scale time-domain features, sub-pixel convolution
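The final reconstruction step described in the abstract (sub-pixel rearrangement followed by addition to the upsampled input frame) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `pixel_shuffle`, `upsample_nearest`, and `reconstruct` are hypothetical, the channel ordering follows the common pixel-shuffle convention, and nearest-neighbour upsampling is an assumption, since the abstract does not state which interpolation is used. The 3D convolution and motion compensation stages are omitted.

```python
def pixel_shuffle(feat, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r channels supplies the r x r block of output pixels
    that replaces one low-resolution pixel (sub-pixel convolution's
    rearrangement step; the convolution itself is not shown).
    """
    channels = len(feat)
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels // (r * r)
    h, w = len(feat[0]), len(feat[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # Source channel is chosen by the position inside the r x r block.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = feat[src][y // r][x // r]
    return out


def upsample_nearest(frame, r):
    """Nearest-neighbour upsampling of a (C, H, W) nested list (assumed method)."""
    return [[[row[x // r] for x in range(len(row) * r)]
             for row in chan for _ in range(r)]
            for chan in frame]


def reconstruct(residual_feat, lr_frame, r):
    """Add the pixel-shuffled residual to the upsampled low-resolution frame."""
    residual = pixel_shuffle(residual_feat, r)
    base = upsample_nearest(lr_frame, r)
    return [[[a + b for a, b in zip(res_row, base_row)]
             for res_row, base_row in zip(res_chan, base_chan)]
            for res_chan, base_chan in zip(residual, base)]


if __name__ == "__main__":
    # Four 1x1 feature channels become one 2x2 residual for scale factor 2.
    features = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
    low_res = [[[10.0]]]
    print(reconstruct(features, low_res, 2))
```

Adding the residual to an upsampled copy of the input means the network only has to predict the missing high-frequency detail rather than the whole image, which is the usual motivation for this kind of global skip connection.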
