Journal of Graphics ›› 2026, Vol. 47 ›› Issue (1): 131-142. DOI: 10.11996/JG.j.2095-302X.2026010131
LI Shiliang1,2, FANG Qiang2, WANG Yihua1, SHI Yifei2, WANG Zhuo1, LI Zeyu1, XIE Yunfei1, WANG Jia1
Received: 2025-04-30
Accepted: 2025-07-21
Published: 2026-02-28
Online: 2026-03-16
Corresponding author: FANG Qiang, E-mail: qiangfang@nudt.edu.cn
Abstract:
Few-shot image generation has important application value in fields such as medical imaging and artistic creation. In recent years the task has seen considerable progress; mainstream methods typically transfer a generative model pretrained on a large-scale source-domain dataset to the target domain, alleviating the training difficulties caused by scarce target data. However, when there is a significant semantic gap between the source and target domains, direct transfer often introduces incompatible source-domain features, reducing the realism and weakening the style consistency of the generated images. Existing methods remove redundant features through static pruning (e.g., cutting filters at a fixed threshold), but such pruning struggles to adapt to the dynamic evolution of feature representations across the layers of a deep network, and tends to mistakenly delete general-purpose features in shallow layers while leaving redundant features in deep layers, harming both transfer effectiveness and generation quality. To address this, a dynamic pruning method based on filter importance estimation is proposed. First, the Fisher information of the filters in each layer is continuously tracked during training to measure their importance to image generation quality. Then, an adaptive pruning mechanism based on cumulative importance weights is built on this Fisher information; it dynamically determines the pruning ratio of each layer, so that filters carrying redundant or incompatible features are removed more precisely while general structural and semantic information is retained. Experiments on several representative few-shot target domains show that the method significantly outperforms existing methods on the image quality metric (FID) and the diversity metric (Intra-LPIPS). In particular, on target domains semantically distant from the source domain, its FID surpasses the best existing method, verifying the stability and superiority of the approach for cross-domain few-shot image generation.
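The pipeline outlined in the abstract — tracking per-filter Fisher information during fine-tuning and pruning by cumulative importance — is not detailed on this page. A minimal NumPy sketch of the general idea might look as follows; the squared-gradient Fisher proxy, the function names, and the `keep_fraction` parameter are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def fisher_importance(grads):
    """Per-filter Fisher-information proxy: mean squared gradient over each
    filter's parameters (grads has shape [n_filters, ...])."""
    return np.mean(grads ** 2, axis=tuple(range(1, grads.ndim)))

def cumulative_prune_mask(score_history, keep_fraction=0.7):
    """Accumulate importance scores over training steps, then keep only the
    top keep_fraction of filters in this layer (True = keep)."""
    cumulative = np.sum(np.asarray(score_history), axis=0)
    k = max(1, int(round(keep_fraction * cumulative.size)))
    keep_idx = np.argsort(cumulative)[::-1][:k]   # highest cumulative importance
    mask = np.zeros(cumulative.size, dtype=bool)
    mask[keep_idx] = True
    return mask
```

Calling `fisher_importance` on each layer's gradients at every step and feeding the history into `cumulative_prune_mask` with a layer-specific `keep_fraction` would give the kind of per-layer adaptive pruning ratio the abstract describes.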
LI Shiliang, FANG Qiang, WANG Yihua, SHI Yifei, WANG Zhuo, LI Zeyu, XIE Yunfei, WANG Jia. A dynamic pruning approach for cross-domain few-shot image generation[J]. Journal of Graphics, 2026, 47(1): 131-142.
| Method | Backbone | Babies | Sunglasses | Sketches | AFHQ-Cat |
|---|---|---|---|---|---|
| TGAN | StyleGAN | 101.58 | 55.97 | 53.41 | 64.68 |
| TGAN+ADA | StyleGAN | 97.91 | 53.64 | 66.99 | 80.16 |
| FreezeD | StyleGAN | 96.25 | 46.95 | 46.54 | 63.60 |
| CDC | StyleGAN2 | 69.13 | 41.45 | 45.67 | 176.21 |
| DCL | StyleGAN2 | 56.48 | 37.66 | 57.72 | 156.82 |
| EWC | StyleGAN2 | 79.93 | 49.41 | 71.25 | 74.61 |
| DDPM-PA | DDPM | 48.92 | 34.75 | - | - |
| AdAM | StyleGAN2 | 48.83 | 28.03 | 38.11 | 58.07 |
| RICK | StyleGAN2 | 39.39 | 25.22 | 40.52 | 53.27 |
| CRDI | DDPM | 48.52 | 24.62 | - | - |
| DAP (Ours) | StyleGAN2 | 36.97 | 24.13 | 37.96 | 44.40 |

Table 1 Comparison of FID (↓) scores of different FSIG methods on various target domains
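FID, the metric reported in Table 1, measures the Fréchet distance between two Gaussians fitted to Inception features of real and generated images: FID = ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_rΣ_g)^{1/2}). A small sketch of that final computation, assuming the feature statistics have already been extracted (SciPy's `sqrtm` handles the matrix square root):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two Gaussians given their
    feature means (mu) and covariances (sigma)."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # numerical noise can leave tiny imaginary parts
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions give an FID of 0; lower scores mean the generated-image statistics sit closer to the real ones.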
Fig. 7 Comparison of style transfer ability between RICK and DAP on the Sketches dataset ((a) Target domain; (b) Source domain; (c) RICK; (d) Our method)
Fig. 8 Layer-wise comparison of pruned filters between DAP and RICK in generator convolutional layers ((a) Shallow layers; (b) Middle layers; (c) Deep layers)
| Method | Backbone | Babies | Sunglasses | Sketches | AFHQ-Cat |
|---|---|---|---|---|---|
| AdAM | StyleGAN2 | 48.83 | 28.03 | 38.11 | 58.07 |
| AdAM+DAP (Ours) | StyleGAN2 | 46.14 (-2.69) | 25.57 (-2.46) | 38.06 (-0.05) | 50.12 (-7.95) |

Table 2 Ablation study of DAP integration into AdAM: FID comparison on multiple target domains
| [1] | GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// The 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 2672-2680. |
| [2] | KARRAS T, LAINE S, AITTALA M, et al. Analyzing and improving the image quality of StyleGAN[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 8107-8116. |
| [3] | SHAO J Q, QIAN W H, XU Q H. Landscape image generation based on conditional residual generative adversarial network[J]. Journal of Graphics, 2023, 44(4): 710-717 (in Chinese). |
| [4] | GU T J, XIONG S Y, LIN X. Diversified generation of theatrical masks based on SASGAN[J]. Journal of Graphics, 2024, 45(1): 102-111 (in Chinese). |
| [5] | LIN J, ZHANG R, GANZ F, et al. Anycost GANs for interactive image synthesis and editing[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 14981-14991. |
| [6] | WANG T F, ZHANG Y, FAN Y B, et al. High-fidelity GAN inversion for image attribute editing[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 11369-11378. |
| [7] | LIU Z M, HONG W, LONG R, et al. Research on automatic generation and application of Ruyuan Yao embroidery based on self-attention mechanism[J]. Journal of Graphics, 2024, 45(5): 1096-1105 (in Chinese). |
| [8] | CHAI L, ZHU J Y, SHECHTMAN E, et al. Ensembling with deep generative views[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 14992-15002. |
| [9] |
TRAN N T, TRAN V H, NGUYEN N B, et al. On data augmentation for GAN training[J]. IEEE Transactions on Image Processing, 2021, 30: 1882-1897.
DOI URL |
| [10] | GONG J, FOO L G, FAN Z P, et al. DiffPose: toward more reliable 3D pose estimation[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 13041-13051. |
| [11] | AFRASIYABI A, LAROCHELLE H, LALONDE J F, et al. Matching feature sets for few-shot image classification[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 9004-9014. |
| [12] | KARRAS T, AITTALA M, HELLSTEN J, et al. Training generative adversarial networks with limited data[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 1015. |
| [13] | TSENG H Y, JIANG L, LIU C, et al. Regularizing generative adversarial networks under limited data[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 7917-7927. |
| [14] | GONG J, FAN Z P, KE Q H, et al. Meta agent teaming active learning for pose estimation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 11069-11079. |
| [15] | FOO L G, GONG J, FAN Z P, et al. System-status-aware adaptive network for online streaming video understanding[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 10514-10523. |
| [16] | ZHANG B, ZHOU Y C, ZHANG M, et al. Review of research on improvement and application of generative adversarial networks[J]. Application Research of Computers, 2023, 40(3): 649-658 (in Chinese). |
| [17] | SUN L F, LIU H, YANG F C, et al. Research on cyclic generative network oriented to inter-layer interpolation of medical images[J]. Journal of Graphics, 2023, 44(3): 502-512 (in Chinese). |
| [18] | HU S X, LI D, STÜHMER J, et al. Pushing the limits of simple pipelines for few-shot learning: external data and fine-tuning make a difference[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 9058-9067. |
| [19] |
YANG M P, WANG Z. Image synthesis under limited data: a survey and taxonomy[J]. International Journal of Computer Vision, 2025, 133(6): 3689-3726.
DOI |
| [20] | LI H L, ZHU C L, ZHANG Y L, et al. Task-specific fine-tuning via variational information bottleneck for weakly- supervised pathology whole slide image classification[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7454-7463. |
| [21] |
ZHUANG F Z, QI Z Y, DUAN K Y, et al. A comprehensive survey on transfer learning[J]. Proceedings of the IEEE, 2021, 109(1): 43-76.
DOI URL |
| [22] | ZHAO Y Q, DU C, ABDOLLAHZADEH M, et al. Exploring incompatible knowledge transfer in few-shot image generation[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7380-7391. |
| [23] | OJHA U, LI Y J, LU J W, et al. Few-shot image generation via cross-domain correspondence[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10738-10747. |
| [24] | ZHU B X, LIU M D, ZHANG W T, et al. Full process generation method of high-resolution face texture map[J]. Journal of Graphics, 2024, 45(4): 814-826 (in Chinese). |
| [25] | WANG Y X, WU C S, HERRANZ L, et al. Transferring GANs: generating images from limited data[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 220-236. |
| [26] | MO S, CHO M, SHIN J. Freeze the discriminator: a simple baseline for fine-tuning GANs[EB/OL]. [2025-02-30]. https://arxiv.org/abs/2002.10964. |
| [27] | LI Y J, ZHANG R, LU J W, et al. Few-shot image generation with elastic weight consolidation[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 1332. |
| [28] | XIAO J Y, LI L, WANG C F, et al. Few shot generative model adaption via relaxed spatial structural alignment[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 11194-11203. |
| [29] | ZHAO Y Q, DING H H, HUANG H J, et al. A closer look at few-shot image generation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 9130-9140. |
| [30] | ZHAO Y Q, CHANDRASEGARAN K, ABDOLLAHZADEH M, et al. Few-shot image generation via adaptation-aware kernel modulation[C]// The 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 1412. |
| [31] |
PAN S D, ZHANG Z Q, WEI K, et al. Few-shot generative model adaptation via style-guided prompt[J]. IEEE Transactions on Multimedia, 2024, 26: 7661-7672.
DOI URL |
| [32] | MOON T, CHOI M, LEE G, et al. Fine-tuning diffusion models with limited data[EB/OL]. [2025-02-30]. https://openreview.net/pdf?id=0J6afk9DqrR. |
| [33] | ZHU J Y, MA H M, CHEN J S, et al. Few-shot image generation with diffusion models[EB/OL]. [2025-02-30]. https://arxiv.org/abs/2211.03264. |
| [34] | CAO Y, GONG S G. Few-shot image generation by conditional relaxing diffusion inversion[C]// The 18th European Conference on Computer Vision. Cham: Springer, 2025: 20-37. |