Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 102-111. DOI: 10.11996/JG.j.2095-302X.2024010102
Corresponding author:
LIN Xiao (1978-), female, professor, Ph.D. Her main research interests cover image and video editing and processing, and artificial intelligence. E-mail: lin6008@shnu.edu.cn
GU Tianjun1, XIONG Suya2, LIN Xiao1,3,4
Received:
2023-06-29
Accepted:
2023-09-27
Published:
2024-02-29
Online:
2024-02-29
First author:
GU Tianjun (2002-), undergraduate student. His main research interests cover digital image processing and image generation. E-mail:TianjunGu_Grady@outlook.com
Abstract: To address the limited resolution and realism of existing automatically generated theatrical masks (opera facial makeup), a stylized generative adversarial network based on the self-attention mechanism (SASGAN) is proposed. First, a self-attention mechanism and vector quantization are introduced on top of StyleGAN to strengthen the extraction of the geometric structure features of mask patterns. Next, the data are expanded through diversified difference augmentation (DDG), supplemented by a mask hue auxiliary algorithm, and a theatrical mask dataset of 12,599 images is built. Finally, training on this dataset produces mask images that combine diversity with realism. Experimental results show that, for theatrical mask images, DDG offers a substantial improvement over traditional data augmentation methods, while SASGAN improves the resolution and realism of the generated masks and achieves satisfactory subjective visual quality.
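As a concrete illustration of the self-attention component described above, the following is a minimal PyTorch sketch of a SAGAN-style spatial self-attention block of the kind that can be inserted into a StyleGAN-based generator. It is a sketch under assumptions, not the authors' implementation; the class name `SelfAttention2d`, the channel-reduction factor, and the learnable blend weight `gamma` are illustrative choices.

```python
# Minimal sketch of a SAGAN-style self-attention block (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttention2d(nn.Module):
    """Self-attention over the spatial positions of a feature map."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable blend weight, starts at 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C/r)
        k = self.key(x).flatten(2)                     # (B, C/r, HW)
        v = self.value(x).flatten(2)                   # (B, C, HW)
        attn = F.softmax(q @ k, dim=-1)                # (B, HW, HW) pairwise attention weights
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection to the input


if __name__ == "__main__":
    feat = torch.randn(2, 64, 16, 16)       # dummy generator feature map
    print(SelfAttention2d(64)(feat).shape)  # torch.Size([2, 64, 16, 16])
```

Initializing `gamma` at zero lets the generator start from purely convolutional behavior and gradually learn how much global attention to mix in.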
GU Tianjun, XIONG Suya, LIN Xiao. Diversified generation of theatrical masks based on SASGAN[J]. Journal of Graphics, 2024, 45(1): 102-111.
Table 1 Metric comparison after adding vector quantization

| Network | SSIM | NIMA |
|---|---|---|
| SASGAN without vector quantization | 0.8136 | 5.762±1.637 |
| SASGAN with vector quantization | 0.8207 | 5.796±1.622 |
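Table 1 ablates the vector-quantization step. For readers unfamiliar with the operation, the sketch below shows the usual codebook lookup: each spatial feature vector is replaced by its nearest codebook entry, with a straight-through estimator so gradients still reach the preceding layers. The class name, codebook size, and straight-through trick are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of a vector-quantization (codebook lookup) layer (illustrative only).
import torch
import torch.nn as nn


class VectorQuantizer(nn.Module):
    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        nn.init.uniform_(self.codebook.weight, -1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, C, H, W) -> flatten to (B*H*W, C) feature vectors
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)
        # squared Euclidean distance to every codebook entry, then nearest-code index
        dist = torch.cdist(flat, self.codebook.weight)
        idx = dist.argmin(dim=1)
        quantized = self.codebook(idx).view(b, h, w, c).permute(0, 3, 1, 2)
        # straight-through estimator: forward uses the quantized values,
        # backward passes gradients to z unchanged
        return z + (quantized - z).detach()


if __name__ == "__main__":
    z = torch.randn(2, 64, 8, 8)
    print(VectorQuantizer(dim=64)(z).shape)  # torch.Size([2, 64, 8, 8])
```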
Table 2 Comparison of results from different networks

| Network | SSIM | NIMA |
|---|---|---|
| SASGAN | 0.8207 | 5.796±1.622 |
| StyleGAN | 0.7968 | 5.568±1.714 |
| GAN | 0.6192 | 4.996±1.983 |
Table 3 Comparison of results of different data augmentation methods

| Data augmentation method | SSIM | NIMA |
|---|---|---|
| DDG | 0.8207 | 5.796±1.622 |
| Traditional | 0.4767 | 3.362±1.994 |
Table 4 Supplementary experiments

| Network | Aesthetics | Authenticity | Symbolism |
|---|---|---|---|
| Original | 4.2 | 4.7 | 0.98 |
| SASGAN | 3.8 | 4.1 | 0.96 |
| StyleGAN | 2.9 | 3.4 | 0.92 |
| GAN | 1.2 | 0.8 | 0.71 |
Table 5 Aesthetics ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 4.3 | 4.2 | 4.1 |
| SASGAN | 4.1 | 4.0 | 3.7 |
| StyleGAN | 3.0 | 2.9 | 2.7 |
| GAN | 1.3 | 1.2 | 1.0 |
Table 6 Authenticity ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 4.8 | 4.7 | 4.5 |
| SASGAN | 4.2 | 4.1 | 3.9 |
| StyleGAN | 3.6 | 3.4 | 3.3 |
| GAN | 1.1 | 0.8 | 0.7 |
Table 7 Symbolism ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 0.99 | 0.98 | 0.97 |
| SASGAN | 0.96 | 0.96 | 0.96 |
| StyleGAN | 0.93 | 0.91 | 0.91 |
| GAN | 0.72 | 0.71 | 0.69 |
Table 8 Variance of aesthetics ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 0.015 | 0.018 | 0.008 |
| SASGAN | 0.023 | 0.021 | 0.010 |
| StyleGAN | 0.016 | 0.016 | 0.009 |
| GAN | 0.013 | 0.017 | 0.012 |
Table 9 Variance of authenticity ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 0.014 | 0.016 | 0.011 |
| SASGAN | 0.018 | 0.017 | 0.009 |
| StyleGAN | 0.027 | 0.019 | 0.014 |
| GAN | 0.019 | 0.023 | 0.017 |
Table 10 Variance of symbolism ratings by group

| Network | Enthusiasts | Students | Instructors |
|---|---|---|---|
| Original | 0.0008 | 0.0009 | 0.0006 |
| SASGAN | 0.0006 | 0.0004 | 0.0003 |
| StyleGAN | 0.0007 | 0.0008 | 0.0003 |
| GAN | 0.0006 | 0.0005 | 0.0004 |