Journal of Graphics ›› 2023, Vol. 44 ›› Issue (1): 120-130. DOI: 10.11996/JG.j.2095-302X.2023010120

• Computer Graphics and Virtual Reality •

CTH-Net: CNN-Transformer hybrid network for garment image generation from sketches and color points

PAN Dong-hui, JIN Ying-han, SUN Xu, LIU Yu-sheng, ZHANG Dong-liang

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang 310000, China
  • Received: 2022-04-24 Revised: 2022-07-01 Online: 2023-10-31 Published: 2023-02-16
  • Contact: ZHANG Dong-liang
  • About the author: PAN Dong-hui (1997-), master's student. His main research interest is digital image processing. E-mail: 417969567@qq.com
  • Supported by:
    National Key R&D Program of China (2022YFB3303100); National Natural Science Foundation of China (61972340); National Natural Science Foundation of China (61732015)

Abstract:

Drawing garment images is an important step in garment design. To address the limited automation of current tools and their heavy demands on users' drawing skills and imagination, a CNN-Transformer hybrid network (CTH-Net) is proposed that generates garment images from sketches and color points. CTH-Net combines the strength of convolutional neural networks (CNNs) in extracting local information with that of the Transformer in modeling long-range dependencies, and efficiently fuses the two architectures. The ToPatch and ToFeatureMap modules are designed to reduce the amount and dimensionality of the data fed into the Transformer, thereby lowering computational cost. CTH-Net consists of three phases: a drafting phase, which predicts the color distribution of the garment and produces a watercolor-style image without gradients or shadows; a refinement phase, which refines the watercolor-style image into a garment image with light and shadow effects; and a tuning phase, which combines the outputs of the first two phases to further improve generation quality. Experimental results show that CTH-Net can generate high-quality garment images from only a sketch and a small number of color points. Compared with existing methods, the network shows clear advantages in the realism and accuracy of the generated images.

Key words: deep learning, convolutional neural network, image generation, Transformer
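
The abstract names the ToPatch and ToFeatureMap modules but does not define them. As a rough illustration only, the sketch below shows one generic way a feature map can be grouped into patch tokens before a Transformer and restored afterwards. The function names `to_patch` and `to_feature_map`, the patch size `p`, and the flattening order are all assumptions made for this example; this is not the authors' implementation, and it does not model any learned projection the real modules may contain.

```python
from typing import List

# A feature map indexed as [row][col][channel]. This layout is an assumption
# made for the example, not something specified in the abstract.
FeatureMap = List[List[List[float]]]


def to_patch(fmap: FeatureMap, p: int) -> List[List[float]]:
    """Group non-overlapping p x p blocks into one flat token each.

    An H x W map becomes (H / p) * (W / p) tokens, each of length p * p * C.
    """
    h, w = len(fmap), len(fmap[0])
    if h % p or w % p:
        raise ValueError("height and width must be divisible by the patch size")
    tokens = []
    for top in range(0, h, p):
        for left in range(0, w, p):
            token = []
            for r in range(top, top + p):
                for c in range(left, left + p):
                    token.extend(fmap[r][c])  # append this pixel's channels
            tokens.append(token)
    return tokens


def to_feature_map(tokens: List[List[float]], h: int, w: int, p: int) -> FeatureMap:
    """Inverse of to_patch: scatter each token back into its p x p block."""
    channels = len(tokens[0]) // (p * p)
    patches_per_row = w // p
    fmap = [[[] for _ in range(w)] for _ in range(h)]
    for idx, token in enumerate(tokens):
        top = (idx // patches_per_row) * p
        left = (idx % patches_per_row) * p
        for k in range(p * p):
            r, c = top + k // p, left + k % p
            fmap[r][c] = token[k * channels:(k + 1) * channels]
    return fmap


# Hypothetical 4 x 6 map with 2 channels per position.
demo = [[[r * 10 + c, -(r * 10 + c)] for c in range(6)] for r in range(4)]
demo_tokens = to_patch(demo, 2)
restored = to_feature_map(demo_tokens, 4, 6, 2)
```

In this toy case, 24 spatial positions become 6 tokens. Because self-attention compares every token with every other token, the number of pairwise scores falls from 24 × 24 = 576 to 6 × 6 = 36, which is the kind of saving that motivates feeding patches rather than individual positions into a Transformer.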
