
Journal of Graphics ›› 2022, Vol. 43 ›› Issue (2): 316-323. DOI: 10.11996/JG.j.2095-302X.2022020316

• Computer Graphics and Virtual Reality •

Two-stage adjustable perceptual distillation network for virtual try-on

  1. School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Online: 2022-04-30 Published: 2022-05-07
  • Supported by:
    National Natural Science Foundation of China (61976040)

Abstract: Image-based virtual try-on fits a target garment image onto a person image, a task that has attracted much
attention in recent years for its wide application in e-commerce and fashion image editing. In response to the
characteristics of this task and the shortcomings of existing approaches, this paper proposes a two-stage adjustable
perceptual distillation (TS-APD) method. The method consists of three steps. First, two semantic segmentation networks
are pre-trained on garment images and person images, respectively, to generate more accurate garment foreground
segmentation and upper-garment segmentation. Second, these two segmentations and other parsing information are used to
train a parser-based “tutor” network. Third, a parser-free “student” network is trained with the TS-APD scheme, taking
the fake images generated by the “tutor” network as input and the original real person images as supervision. The
distilled “student” network can then produce high-quality try-on images without human parsing. Experimental results on
the VITON dataset show that the algorithm achieves an FID score of 9.10, an L1 score of 0.0153, and a PCKh score of
0.9856, outperforming existing methods. A user study also shows that, compared with existing methods, the images
generated by the proposed method are more photo-realistic, with all preference scores above 77%.
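
The data flow of the third step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the TS-APD objective, so a plain L1 reconstruction loss stands in for it, images are flattened lists of floats, and `toy_tutor`, `toy_student`, `student_training_pair`, and `distillation_loss` are hypothetical names. Passing the originally worn garment to the student is also an assumption, borrowed from how parser-free try-on is commonly set up.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # a flattened image with pixel values in [0, 1]


def l1_score(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute pixel difference between two images (lower is better)."""
    if len(pred) != len(target):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def student_training_pair(
    tutor: Callable[[Image, Image, Image], Image],
    real_person: Image,
    other_garment: Image,
    parsing: Image,
) -> Tuple[Image, Image]:
    """Build one (input, supervision) pair for the parser-free student.

    The parser-based tutor dresses the real person in some other garment.
    Its fake output becomes the student's input, and the untouched real
    photo is the supervision target.
    """
    fake_person = tutor(real_person, other_garment, parsing)
    return fake_person, real_person


def distillation_loss(
    student: Callable[[Image, Image], Image],
    fake_person: Image,
    worn_garment: Image,
    real_person: Image,
) -> float:
    """Stand-in loss: L1 between the student's output and the real photo.

    Assumption: the student also receives the garment the person originally
    wore, so that reconstructing the real photo is a well-posed target.
    """
    return l1_score(student(fake_person, worn_garment), real_person)


def toy_tutor(person: Image, garment: Image, parsing: Image) -> Image:
    # Hypothetical stand-in: paste garment pixels where the parsing mask is on.
    return [g if m > 0.5 else p for p, g, m in zip(person, garment, parsing)]


def toy_student(fake_person: Image, garment: Image) -> Image:
    # Hypothetical stand-in: no parsing input, just an even blend.
    return [0.5 * f + 0.5 * g for f, g in zip(fake_person, garment)]


if __name__ == "__main__":
    real = [0.2, 0.4, 0.6, 0.8]
    mask = [0.0, 1.0, 1.0, 0.0]   # upper-garment region
    worn = [0.0, 0.4, 0.6, 0.0]   # garment in the real photo
    other = [0.0, 1.0, 1.0, 0.0]  # garment the tutor swaps in
    fake, target = student_training_pair(toy_tutor, real, other, mask)
    print(distillation_loss(toy_student, fake, worn, target))
```

Only the tutor consumes the parsing mask; the student sees the fake image and a garment, which is why no human parsing is needed at inference time in this setup.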

Key words: virtual try-on, knowledge distillation, image segmentation, image generation, adjustable factor

CLC Number: