Journal of Graphics

• Image Processing and Computer Vision •

Pose-guided scene-preserving person video generation algorithm

  1. (School of Electrical Engineering and Automation, Anhui University, Hefei, Anhui 230601, China)
  • Online: 2020-08-31 Published: 2020-08-22
  • Supported by:
    National Natural Science Foundation of China (61572029); Anhui Outstanding Youth Fund (1908085J25)

Abstract: Person video generation technology learns a feature representation of human body
structure and motion, and then realizes a spatial generative mapping from that representation to
person video frames. Existing person video generation algorithms do not account for changes in the
background environment and suffer from low human pose estimation accuracy. To address these
problems, a pose-guided scene-preserving person video generation algorithm (PSPVG) was proposed.
First, suitable source and target videos were selected, and video frames containing the segmented
person appearance served as the network input in place of the source video frames. Then, a
GAN-based motion transfer model was employed to replace the person in the source video with the
target person while keeping the motion consistent. Finally, Poisson image editing was used to fuse
the person appearance with the source background, with the following advantages: (a) removing
anomalous boundary pixels; (b) blending the person naturally into the source scene; and
(c) avoiding changes to the background environment and the overall image style. Because the proposed
algorithm used the segmented foreground person image in place of the person in the source video
frame, background interference was reduced and the accuracy of pose estimation improved. The source
scene was thus preserved naturally during motion transfer, and the generated person videos were
both artistic and realistic.

Key words: person video generation, pose estimation, motion transfer, generative adversarial
networks, image processing
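
The Poisson image editing step named in the abstract can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation. The function name `poisson_blend`, the grayscale nested-list image representation, the Gauss-Seidel solver, and the iteration count are all assumptions chosen for illustration. A real pipeline would work on color video frames and would more likely call a library routine.

```python
def poisson_blend(source, target, mask, iterations=500):
    """Blend `source` into `target` wherever `mask` is true.

    All arguments are equally sized 2-D lists of numbers (grayscale images).
    Masked pixels on the image border are ignored and keep the target value.
    This is a hypothetical helper for illustration only.
    """
    height, width = len(target), len(target[0])
    # Start from the target; unmasked pixels are never modified, so they act
    # as the fixed boundary condition of the Poisson equation.
    result = [[float(v) for v in row] for row in target]
    inside = [(y, x)
              for y in range(1, height - 1)
              for x in range(1, width - 1)
              if mask[y][x]]
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))

    # Gauss-Seidel sweeps on 4*f_p - sum(f_q) = sum(g_p - g_q), where f is the
    # blended image, g is the source, and q ranges over the 4 neighbours of p.
    for _ in range(iterations):
        for y, x in inside:
            total = 0.0
            for dy, dx in neighbours:
                ny, nx = y + dy, x + dx
                total += result[ny][nx] + source[y][x] - source[ny][nx]
            result[y][x] = total / 4.0
    return result


# Toy usage: paste a bright, flat 3x3 patch into a dark, flat 5x5 background.
background = [[10.0] * 5 for _ in range(5)]
person = [[200.0] * 5 for _ in range(5)]
region = [[0 < y < 4 and 0 < x < 4 for x in range(5)] for y in range(5)]
blended = poisson_blend(person, background, region)
```

In this toy case the pasted patch has zero gradient, so only the background values on the region boundary determine the result and every blended pixel stays at 10.0. The source's absolute brightness is discarded and only its gradients are kept, which is why this kind of fusion can suppress anomalous pixels along the boundary of a segmented person. For real images, OpenCV's `cv2.seamlessClone` provides a library implementation of Poisson-based cloning.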