Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 382-392. DOI: 10.11996/JG.j.2095-302X.2025020382

• Computer Graphics and Virtual Reality •

Human-in-the-loop field-specific logo generation method

LI Jiyuan, GUAN Zheyu, SONG Haichuan, TAN Xin, MA Lizhuang

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2024-07-08 Accepted: 2024-10-08 Online: 2025-04-30 Published: 2025-04-24
  • Contact: SONG Haichuan
  • About author:First author contact:

    LI Jiyuan (2004-), undergraduate student. His main research interests include computer-aided design. E-mail: leehenry1024@qq.com

  • Supported by:
    National Natural Science Foundation of China (62302167); National Natural Science Foundation of China (62222602); Shanghai Youth Science and Technology Talents Sailing Program (23YF1410500)

Abstract:

Compared with other kinds of generated images, logos are highly abstract, diverse in design, and stylistically unified, which makes it difficult to control the generated output directly. To efficiently generate logos that reflect the characteristics of different industries and support multiple composition patterns, a human-in-the-loop field-specific logo generation method is proposed. First, using DreamBooth, a method for fine-tuning text-to-image diffusion models, together with a dataset of logos collected from publicly available online sources, the text-to-image model Stable Diffusion XL was taken as the base model and trained into a “prototype model” for basic logo generation. Next, groups of lexicons were constructed for the targeted industries, and the prototype model generated logos for those industries under the guidance of the lexicons. The generated results were then filtered through human intervention into secondary datasets tailored to industry needs. Finally, the prototype model was iteratively fine-tuned with LoRA on the secondary datasets to obtain the final logo generation model. The outputs of the final model were evaluated using the cosine similarity between generated images and prompt words, as well as manual questionnaire indicators. The evaluation showed that logos generated by the final model exhibited significant improvements in industry relevance, structural integrity, and aesthetic appearance compared with those generated directly by the untrained base model.
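
The image-to-prompt cosine similarity used in the evaluation can be illustrated with a minimal sketch. The abstract does not say which encoder produces the embeddings, so the code assumes that each generated image and each prompt has already been mapped to a fixed-length vector by some joint image-text encoder (a CLIP-style model would be one option). The function names and the toy vectors are hypothetical and serve only to show the arithmetic.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_prompt_similarity(image_embeddings, prompt_embedding):
    """Average the similarity of several generated images to one prompt.

    Assumption: both inputs come from the same joint image-text encoder,
    so that their vectors live in a shared embedding space.
    """
    if not image_embeddings:
        raise ValueError("at least one image embedding is required")
    scores = [cosine_similarity(e, prompt_embedding) for e in image_embeddings]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real encoder outputs.
    prompt = [0.2, 0.9, 0.1]
    generated = [[0.1, 0.8, 0.2], [0.9, 0.1, 0.0]]
    print(mean_prompt_similarity(generated, prompt))
```

A higher average suggests that the generated logos sit closer to the prompt in the embedding space. Comparing this average between the base model and the fine-tuned model is one way such a metric could be reported.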

Key words: image synthesis, diffusion model, human-in-the-loop, training set construction, text-to-image
