
Journal of Graphics, 2025, Vol. 46, Issue (2): 322-331. DOI: 10.11996/JG.j.2095-302X.2025020322

• Computer Graphics and Virtual Reality •

BPA-SAM: box prompt augmented SAM for traditional Chinese realistic painting

ZHANG Tiansheng1, ZHU Minfeng2, REN Yiwen3, WANG Chenhan3, ZHANG Lidong3, ZHANG Wei4, CHEN Wei1

    1. State Key Laboratory of CAD & CG, Zhejiang University, Hangzhou Zhejiang 310058, China
    2. School of Software Technology, Zhejiang University, Hangzhou Zhejiang 310058, China
    3. College of Computer Science and Technology, Zhejiang University, Hangzhou Zhejiang 310058, China
    4. Hangzhou City University, Hangzhou Zhejiang 310015, China
  • Received:2024-10-28 Accepted:2024-12-13 Online:2025-04-30 Published:2025-04-24
  • Contact: ZHU Minfeng
  • About the first author:

    ZHANG Tiansheng (1998-), master's student. His main research interest is computer vision. E-mail: 22221302@zju.edu.cn

  • Supported by:
    National Natural Science Foundation of China (62132017); Key Research and Development “Pioneer” Tackling Plan Program in Zhejiang Province (2023C01119); Zhejiang Provincial Natural Science Foundation of China (LD24F020011); Zhejiang Provincial Natural Science Foundation of China (Q24F020006)

Abstract:

The lack of publicly available, meticulously annotated datasets for traditional Chinese realistic painting has severely hindered the development of image segmentation techniques in this field. Such paintings are challenging to segment because objects and background often share similar colors and textures, and gradient transitions blur object boundaries. The emergence of the segment anything model (SAM) presents new possibilities for addressing these challenges. Although SAM demonstrates remarkable segmentation capability and zero-shot generalization on natural images, it is insensitive to object details and confuses foreground with background when processing traditional Chinese realistic paintings. To address these issues, a segmentation dataset of traditional Chinese realistic paintings themed around flowers and birds was constructed, comprising 403 images with 5 classes of foreground objects. Subsequently, the LoRA (low-rank adaptation) method was employed to fine-tune SAM, enabling it to adapt to the characteristics of traditional Chinese realistic paintings. Additionally, a novel bounding box prompt augmentation method called BPA-SAM was proposed, based on the U-Net model, to address foreground-background confusion by generating point prompts within the bounding box. Finally, experiments confirmed that the approach improved SAM’s segmentation performance by 7.1% under bounding box prompting, establishing a foundation for SAM’s image segmentation applications in the traditional Chinese realistic painting domain.

Key words: deep learning, image segmentation, traditional Chinese realistic painting, prompt augmentation, computer vision
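
The abstract describes generating point prompts inside a bounding box to reduce foreground-background confusion, but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' method. It assumes a coarse foreground-probability map is already available (for example from a U-Net-style network, which is not implemented here) and simply picks the most and least confident pixels inside the box. The function name `point_prompts_from_box` and the parameters `num_fg` and `num_bg` are hypothetical.

```python
import numpy as np


def point_prompts_from_box(prob_map, box, num_fg=2, num_bg=1):
    """Pick point prompts inside a box from a coarse foreground-probability map.

    prob_map: (H, W) float array in [0, 1]; assumed to come from some
              auxiliary network (hypothetical, not implemented here).
    box:      (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    Returns (points, labels): points is (N, 2) in (x, y) order,
    labels is 1 for foreground points and 0 for background points.
    """
    x0, y0, x1, y1 = box
    crop = prob_map[y0:y1, x0:x1]

    # Sort the pixels of the crop by foreground probability (ascending).
    order = np.argsort(crop.ravel(), kind="stable")
    bg_idx = order[:num_bg]        # least foreground-like pixels
    fg_idx = order[::-1][:num_fg]  # most foreground-like pixels

    # Convert flat indices back to (row, col), then to image (x, y) coordinates.
    idx = np.concatenate([fg_idx, bg_idx])
    ys, xs = np.unravel_index(idx, crop.shape)
    points = np.stack([xs + x0, ys + y0], axis=1)

    labels = np.concatenate([
        np.ones(len(fg_idx), dtype=int),
        np.zeros(len(bg_idx), dtype=int),
    ])
    return points, labels


if __name__ == "__main__":
    # Toy probability map with two confident foreground pixels.
    prob = np.zeros((8, 8))
    prob[3, 4] = 0.9
    prob[2, 2] = 0.8
    points, labels = point_prompts_from_box(prob, (1, 1, 7, 7))
    print(points, labels)
```

The returned points use (x, y) order with labels 1 for foreground and 0 for background, which matches the common convention for point prompts passed alongside a box to a promptable segmentation model. A practical version would likely enforce spacing between points rather than taking the raw top-scoring pixels.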
