Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 322-331. DOI: 10.11996/JG.j.2095-302X.2025020322

• Computer Graphics and Virtual Reality •

BPA-SAM: box prompt augmented SAM for traditional Chinese realistic painting

ZHANG Tiansheng1, ZHU Minfeng2, REN Yiwen3, WANG Chenhan3, ZHANG Lidong3, ZHANG Wei4, CHEN Wei1

  1. State Key Laboratory of CAD & CG, Zhejiang University, Hangzhou Zhejiang 310058, China
    2. School of Software Technology, Zhejiang University, Hangzhou Zhejiang 310058, China
    3. College of Computer Science and Technology, Zhejiang University, Hangzhou Zhejiang 310058, China
    4. Hangzhou City University, Hangzhou Zhejiang 310015, China
  • Received: 2024-10-28 Accepted: 2024-12-13 Published: 2025-04-30 Online: 2025-04-24
  • Corresponding author: ZHU Minfeng (1993-), male, researcher, Ph.D. His main research interests include artificial intelligence and visual analytics. E-mail: minfeng_zhu@zju.edu.cn
  • First author: ZHANG Tiansheng (1998-), male, master's student. His main research interest is computer vision. E-mail: 22221302@zju.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62132017); Key Research and Development “Pioneer” Tackling Plan Program in Zhejiang Province (2023C01119); Zhejiang Provincial Natural Science Foundation of China (LD24F020011); Zhejiang Provincial Natural Science Foundation of China (Q24F020006)

Abstract:

The lack of publicly available datasets with pixel-level annotations has severely hindered the development of image segmentation for traditional Chinese realistic painting. Such paintings pose challenges for segmentation: objects and backgrounds share similar colors and textures, and gradient shading blurs object boundaries. The segment anything model (SAM) opens new possibilities for addressing these challenges. Although SAM demonstrates remarkable segmentation and zero-shot generalization capabilities on natural images, it is insensitive to objects and confuses foreground with background when processing traditional Chinese realistic paintings. To address these issues, we first constructed SegTCRP, a flower-and-bird-themed dataset of traditional Chinese realistic paintings comprising 403 images with 5 classes of foreground objects. We then fine-tuned SAM with the LoRA (Low-Rank Adaptation) method to adapt it to the characteristics of these paintings. In addition, we proposed BPA-SAM, a novel bounding box prompt augmentation method that uses U-Net to generate additional point prompts within the bounding box prompt according to a set strategy, mitigating SAM's foreground-background confusion. Experiments confirmed that BPA-SAM improved segmentation performance by 7.1% over the original SAM under bounding box prompts, laying a foundation for applying SAM to image segmentation of traditional Chinese realistic paintings.

Key words: deep learning, image segmentation, traditional Chinese realistic painting, prompt augmentation, computer vision
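
The abstract does not spell out the strategy by which point prompts are derived inside the bounding box, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical per-pixel foreground probability map (standing in for the output of an auxiliary model such as U-Net) and a simple made-up rule: the most confident pixel inside the box becomes a positive point and the least confident pixel becomes a negative point. The function name `points_from_coarse_mask` and the toy data are assumptions introduced for this example.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def points_from_coarse_mask(
    prob: List[List[float]],
    box: Tuple[int, int, int, int],
) -> Tuple[Point, Point]:
    """Pick one positive and one negative point prompt inside a box.

    prob is a hypothetical foreground probability map (rows of floats
    in [0, 1]); box is (x0, y0, x1, y1) with inclusive corners.
    The selection rule below is an assumption made for illustration.
    """
    x0, y0, x1, y1 = box
    # Collect (probability, point) pairs for every pixel inside the box.
    candidates = [
        (prob[y][x], (x, y))
        for y in range(y0, y1 + 1)
        for x in range(x0, x1 + 1)
    ]
    # Most confident foreground pixel: candidate positive point.
    positive = max(candidates, key=lambda c: c[0])[1]
    # Least confident pixel: candidate negative (background) point.
    negative = min(candidates, key=lambda c: c[0])[1]
    return positive, negative


# Toy 4x4 probability map with a bright object near the centre.
prob_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.7, 0.1],
    [0.1, 0.8, 0.6, 0.05],
    [0.0, 0.1, 0.1, 0.0],
]
pos, neg = points_from_coarse_mask(prob_map, box=(1, 1, 3, 2))
print("positive point:", pos, "negative point:", neg)
```

In a pipeline of this kind, the two points would be passed to the segmentation model together with the original box, so that the extra points can help separate foreground from background within the box.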

CLC number: