
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 332-344.DOI: 10.11996/JG.j.2095-302X.2025020332

• Computer Graphics and Virtual Reality •

Image to 3D vase generation technology combining procedural content generation and diffusion models

SUN Heyi1, LI Yixiao2, TIAN Xi3, ZHANG Songhai2

  1. Zhili College, Tsinghua University, Beijing 100084, China
    2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
    3. Department of Computer Science, University of Bath, Somerset 133789, UK
  • Received:2024-08-19 Accepted:2024-10-28 Online:2025-04-30 Published:2025-04-24
  • Contact: ZHANG Songhai
  • About author: SUN Heyi (2003-), undergraduate student. Her main research interest covers 3D reconstruction. E-mail: sun-hy21@mails.tsinghua.edu.cn

Abstract:

In the traditional manual production of 3D content, 3D meshes and textures serve as the foundational elements in constructing 3D assets. To enhance the visual representation and rendering performance of 3D assets, the meshes are typically constructed using quadrilateral faces, requiring optimal topology and UV mapping. Moreover, 3D textures must be congruent with the geometric shape and maintain global consistency. However, current 3D content generation technologies based on latent diffusion models fail to meet these standards, limiting their potential in practical applications. At the same time, procedural content generation techniques have gained widespread application in the gaming and architectural industries due to their ability to systematically produce a vast array of 3D assets that conform to industry best practices. To improve the usability of generated assets, an integrated solution combining procedural content generation with diffusion model techniques was proposed. Using the vase, a 3D body of revolution, as an example, the image-to-3D asset generation problem was divided into two principal tasks: 3D mesh reconstruction and 3D texture generation. For 3D mesh reconstruction, a novel vase generation program was developed, and a deep neural network was trained to learn the mapping between image features and procedural parameters, thereby facilitating the reconstruction from a 2D image to a 3D model. For 3D texture generation, a novel two-stage texturing strategy was introduced, combining multi-view image synthesis and multi-view consistency sampling techniques to produce high-quality texture maps with global coherence. In summary, a scheme for the automatic construction of 3D vase assets from images was presented, which can be generalized to other 3D bodies of revolution and holds promise for application to other types of 3D content.
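The core idea of the mesh-reconstruction stage, procedurally building a quad-faced vase as a surface of revolution from a small set of parameters that a network could regress from an image, can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual generation program: the `vase_mesh` function, the `(radius, height)` profile representation, and the sample profile values are all assumptions made for demonstration.

```python
import math

def vase_mesh(profile, segments=32):
    """Revolve a 2D profile curve (a list of (radius, height) control
    points, base to lip) around the Y axis, producing an all-quad
    surface of revolution. Returns (vertices, faces), where each face
    is a 4-tuple of vertex indices."""
    verts = []
    for r, y in profile:
        # One ring of vertices per profile point.
        for s in range(segments):
            theta = 2.0 * math.pi * s / segments
            verts.append((r * math.cos(theta), y, r * math.sin(theta)))
    faces = []
    rings = len(profile)
    for i in range(rings - 1):
        # Stitch consecutive rings with quads, wrapping around the seam.
        for s in range(segments):
            a = i * segments + s
            b = i * segments + (s + 1) % segments
            faces.append((a, b, b + segments, a + segments))
    return verts, faces

# Hypothetical procedural parameters: a bulging body narrowing to a neck.
profile = [(0.5, 0.0), (0.9, 0.4), (0.6, 1.0), (0.35, 1.4), (0.5, 1.6)]
verts, faces = vase_mesh(profile)
```

Because every face is a quad and the vertex layout follows a regular ring-by-ring grid, a UV mapping (angle around the axis vs. arc length along the profile) falls out naturally, which is one reason bodies of revolution suit the quad-topology requirements the abstract describes.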

Key words: diffusion models, procedural content generation, 3D reconstruction, texture generation, deep learning
