
Journal of Graphics, 2025, Vol. 46, Issue (1): 35-46. DOI: 10.11996/JG.j.2095-302X.2025010035

• Image Processing and Computer Vision •

Image super-resolution reconstruction network based on transposed attention and CNN

CHEN Guanhao, XU Dan, HE Kangjian, SHI Hongzhen, ZHANG Hao

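  1. School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650091
  • Received: 2024-07-04 Accepted: 2024-10-07 Published: 2025-02-28 Online: 2025-02-14
  • Corresponding author: XU Dan (1968-), female, professor, Ph.D. Her main research interests include computer vision, image analysis and understanding, and cultural computing. E-mail: danxu@ynu.edu.cn
  • First author: CHEN Guanhao (1995-), male, master's student. His main research interest is computer vision. E-mail: chenguanhao@stu.ynu.edu.cn
  • Funding:
    National Natural Science Foundation of China (62162068); National Natural Science Foundation of China (62202416)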

TSA-SFNet: transpose self-attention and CNN based stereoscopic fusion network for image super-resolution

CHEN Guanhao, XU Dan, HE Kangjian, SHI Hongzhen, ZHANG Hao

  1. School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650091
  • Received: 2024-07-04 Accepted: 2024-10-07 Published: 2025-02-28 Online: 2025-02-14
  • Contact: XU Dan (1968-), professor, Ph.D. Her main research interests include computer vision, image analysis and understanding, and cultural computing. E-mail: danxu@ynu.edu.cn
  • First author: CHEN Guanhao (1995-), master's student. His main research interest is computer vision. E-mail: chenguanhao@stu.ynu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62162068); National Natural Science Foundation of China (62202416)

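Abstract:

Transformer-based image super-resolution reconstruction methods have shown remarkable performance in recent years. Existing methods nevertheless still face challenges such as incomplete recovery of high-frequency information, insufficient activation of additional pixels during image reconstruction, inadequate cross-window information interaction, and training instability caused by residual connections. To address these challenges, an image super-resolution reconstruction network based on transposed attention and CNN (TSA-SFNet) is proposed. TSA-SFNet adjusts the window multi-head self-attention module to alleviate the amplitude problem caused by residual connections, and introduces channel attention to activate more pixels for image reconstruction. In addition, to strengthen the interaction between adjacent windows so as to capture more structural information, and to reconstruct high-frequency details more completely, overlapping window attention and a convolutional feed-forward network are also introduced. The network is evaluated quantitatively and qualitatively on classical super-resolution tasks and real-world super-resolution challenges. Experimental results show that TSA-SFNet achieves the best results on five commonly used benchmark datasets and generates more realistic super-resolution images.

Keywords: image super-resolution reconstruction, overlapping window attention, high-frequency information recovery, pixel activation, self-attention mechanism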

Abstract:

Transformer-based image super-resolution methods have demonstrated remarkable performance in recent years. Nonetheless, existing approaches still face challenges such as incomplete recovery of high-frequency information, insufficient activation of additional pixels for image reconstruction, inadequate cross-window information interaction, and training instability caused by residual connections. To address these challenges, a transpose self-attention and CNN based stereoscopic fusion network (TSA-SFNet) is proposed. TSA-SFNet adapts the window multi-head self-attention modules to mitigate amplitude issues caused by residual connections and incorporates channel attention to activate more pixels for image reconstruction. Additionally, overlapping window attention and a convolutional feed-forward network are introduced to strengthen the interaction between adjacent windows, which captures additional structural information and yields a more comprehensive reconstruction of high-frequency details. The network is evaluated quantitatively and qualitatively on classical super-resolution tasks and real-world super-resolution challenges. The experimental results demonstrate that TSA-SFNet achieves state-of-the-art results on five commonly used benchmark datasets and generates more realistic super-resolution images.

Keywords: image super-resolution, overlapping window attention, high-frequency information recovery, pixel activation, self-attention mechanism
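
The abstract does not spell out how the transpose self-attention in TSA-SFNet is computed. As a rough illustration only, the sketch below assumes the common channel-wise formulation, in which the attention map is built between channels (C x C) instead of between spatial positions (N x N). The function names, the identity query/key/value projections, and the `temperature` parameter are all hypothetical simplifications and are not taken from the paper.

```python
import math


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply an n x m matrix by an m x p matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def channel_attention_map(tokens, temperature=1.0):
    """Build the C x C attention map from N x C tokens.

    Assumption: queries and keys are the tokens themselves
    (identity projections), which a real network would learn.
    """
    q = k = tokens
    scores = matmul(transpose(k), q)  # (C x N) @ (N x C) -> C x C
    scores = [[s / temperature for s in row] for row in scores]
    return softmax_rows(scores)


def transposed_self_attention(tokens, temperature=1.0):
    """Mix channels of N x C tokens using the C x C attention map."""
    attn = channel_attention_map(tokens, temperature)
    return matmul(tokens, attn)  # (N x C) @ (C x C) -> N x C


if __name__ == "__main__":
    # Three spatial positions, two channels.
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(channel_attention_map(tokens))
    print(transposed_self_attention(tokens))
```

Under this assumption the attention map has size C x C regardless of how many spatial positions are processed, so its cost grows linearly with the number of pixels, and every output channel depends on all positions at once.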

CLC number: