
Journal of Graphics ›› 2023, Vol. 44 ›› Issue (5): 955-965. DOI: 10.11996/JG.j.2095-302X.2023050955

• Image Processing and Computer Vision •

A local optimization generation model for image inpainting

YANG Hong-ju1,2, GAO Min1, ZHANG Chang-you3, BO Wen3, WU Wen-jia3, CAO Fu-yuan1,2

  1. School of Computer and Information, Shanxi University, Taiyuan Shanxi 030006, China
    2. Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan Shanxi 030006, China
    3. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2023-02-24 Accepted: 2023-05-06 Online: 2023-10-31 Published: 2023-10-31
  • About author: YANG Hong-ju (1975-), associate professor, Ph.D. Her main research interests cover computer vision, machine learning, etc. E-mail: yhju@sxu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61976128); Shanxi Scholarship Council of China (2022-008)


Image inpainting is widely used in photo editing and object removal. Existing deep learning-based image inpainting models are limited by the receptive field of convolution operators, which leads to distorted structures or blurred textures. To address this limitation, a locally optimized generation model, LesT-GAN, was proposed. The model comprised a generator and a discriminator. The generator was built on a locally enhanced sliding window Transformer module, which combined the translation invariance and locality of deep convolution with the Transformer's ability to model global information, so that it could cover a wide receptive field while refining local details. The discriminator was a relative average discriminator based on mask guidance and patches. It simulated pixel propagation around the boundary of the missing region by estimating the average probability that a given real image is more realistic than a generated image, so that during training the generator could learn directly from real images to produce clearer local textures. In comparison experiments with other advanced image inpainting methods on the Places2, CelebA-HQ, and Paris StreetView datasets, LesT-GAN improved L1 and FID by more than 10.8% and 41.36%, respectively. The experimental results demonstrated that LesT-GAN achieved superior restoration performance across multiple scenes and generalized well to images with higher resolution than those used during training.
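
The relative average discriminator described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the code below assumes the standard relativistic average GAN formulation applied to per-patch logits, and it assumes that mask guidance acts as a per-patch weight (larger for patches that overlap the missing region). The function names, the flat-list representation of patches, and the weighting scheme are all hypothetical and are not taken from the paper.

```python
import math


def _log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def _mean(values):
    return sum(values) / len(values)


def _weighted_mean(values, weights):
    # Assumed form of mask guidance: a weighted average over patches.
    total = sum(weights)
    if total <= 0:
        raise ValueError("mask weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


def ra_discriminator_loss(real_logits, fake_logits, mask):
    """Relativistic average loss for the discriminator on patch logits.

    real_logits, fake_logits: raw discriminator outputs, one per patch.
    mask: hypothetical per-patch weights, same length as the logits.
    """
    real_avg = _mean(real_logits)
    fake_avg = _mean(fake_logits)
    # Real patches should look more realistic than the average fake patch.
    real_term = [-_log_sigmoid(r - fake_avg) for r in real_logits]
    # Fake patches should look less realistic than the average real patch;
    # log(1 - sigmoid(x)) equals log_sigmoid(-x).
    fake_term = [-_log_sigmoid(-(f - real_avg)) for f in fake_logits]
    return _weighted_mean(real_term, mask) + _weighted_mean(fake_term, mask)


def ra_generator_loss(real_logits, fake_logits, mask):
    """Generator counterpart: the roles of real and fake are swapped."""
    real_avg = _mean(real_logits)
    fake_avg = _mean(fake_logits)
    fake_term = [-_log_sigmoid(f - real_avg) for f in fake_logits]
    real_term = [-_log_sigmoid(-(r - fake_avg)) for r in real_logits]
    return _weighted_mean(real_term, mask) + _weighted_mean(fake_term, mask)


# Usage: four patches, the last two overlap the hole and get more weight.
real = [1.5, 2.0, 1.0, 1.8]
fake = [-0.5, 0.2, -1.0, -0.3]
weights = [0.5, 0.5, 1.0, 1.0]
d_loss = ra_discriminator_loss(real, fake, weights)
g_loss = ra_generator_loss(real, fake, weights)
```

Because the generator loss depends on the real-image logits as well as the generated ones, its gradient carries information from real images, which is one plausible reading of the abstract's statement that the generator learns clearer local textures directly from real images.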

Key words: deep learning, image inpainting, generation model, Transformer, local optimization

CLC Number: