
Journal of Graphics, 2025, Vol. 46, Issue (1): 35-46. DOI: 10.11996/JG.j.2095-302X.2025010035

• Image Processing and Computer Vision •

TSA-SFNet: transpose self-attention and CNN based stereoscopic fusion network for image super-resolution

CHEN Guanhao, XU Dan, HE Kangjian, SHI Hongzhen, ZHANG Hao

  1. School of Information Science and Engineering, Yunnan University, Kunming, Yunnan 650091
  • Received: 2024-07-04 Accepted: 2024-10-07 Online: 2025-02-28 Published: 2025-02-14
  • Contact: XU Dan
  • About the first author:

    CHEN Guanhao (1995-), master's student. His main research interest is computer vision. E-mail: chenguanhao@stu.ynu.edu.cn

  • Supported by:
    National Natural Science Foundation of China (62162068); National Natural Science Foundation of China (62202416)

Abstract:

Transformer-based image super-resolution methods have demonstrated remarkable performance in recent years. Nonetheless, existing approaches face several challenges: incomplete recovery of high-frequency information, insufficient activation of additional pixels for image reconstruction, inadequate cross-window information interaction, and training instability caused by residual connections. To address these challenges, a transpose self-attention and CNN based stereoscopic fusion network (TSA-SFNet) was proposed. TSA-SFNet adapted the window multi-head self-attention modules to mitigate amplitude issues caused by residual connections and incorporated channel attention to activate more pixels for image reconstruction. Additionally, overlapping window attention and a convolutional feedforward neural network were introduced to strengthen the interaction between adjacent windows, capturing additional structural information and achieving a more comprehensive reconstruction of high-frequency details. Quantitative and qualitative evaluations of the network model were conducted on classical super-resolution tasks and real-world super-resolution challenges. The experimental results demonstrated that the proposed TSA-SFNet achieved state-of-the-art results on five commonly used benchmark datasets and generated more realistic super-resolution images.
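
The abstract does not specify how the transpose self-attention is computed. As a rough illustration only, the sketch below shows one common reading of "transposed" attention, in which the attention map is built between channels (C x C) rather than between pixels (N x N). The function name `transposed_self_attention`, the identity projections (Q = K = V = x), and the 1/sqrt(N) scaling are all assumptions made for brevity and are not taken from the paper.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transposed_self_attention(x):
    """Hypothetical channel-wise ("transposed") self-attention.

    x: list of C channels, each a flat list of N pixel values.
    Assumption: identity projections (Q = K = V = x) and 1/sqrt(N) scaling.
    Returns (attn, out) where attn is C x C and out is C x N.
    """
    n = len(x[0])
    scale = 1.0 / math.sqrt(n)
    # C x C map: similarity between channels, not between pixels
    attn = [
        softmax([scale * sum(a * b for a, b in zip(qi, kj)) for kj in x])
        for qi in x
    ]
    # Each output channel is a weighted mix of the input channels
    out = [
        [sum(w * x[j][p] for j, w in enumerate(row)) for p in range(n)]
        for row in attn
    ]
    return attn, out

# Toy feature map: 3 channels, 4 pixels each (illustrative values)
features = [
    [1.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 1.1, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
attn, out = transposed_self_attention(features)
```

In this toy input the first two channels are nearly identical, so the first channel attends to the second far more than to the third. The attention map's size depends on the channel count rather than on the number of pixels, which is the usual motivation for channel-wise attention on large images.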

Key words: image super-resolution, overlapping window attention, high-frequency information recovery, pixel activation, self-attention mechanism

CLC Number: