Journal of Graphics ›› 2024, Vol. 45 ›› Issue (6): 1243-1255. DOI: 10.11996/JG.j.2095-302X.2024061243

• Special Topic on “Large Models and Graphics Technology and Applications” •

Research on prompt engineering for large model art image generation

WANG Changsheng

  1. Department of Performance, Film, and Animation, Sejong University, Seoul 05006, Republic of Korea
  • Received: 2024-05-06 Accepted: 2024-07-18 Online: 2024-12-31 Published: 2024-12-24
  • About the author (first author contact):

    WANG Changsheng (1995-), Ph.D. candidate. His main research interests include AI painting and artificial intelligence art. E-mail: 137834933@qq.com

Abstract:

As artificial intelligence technology advances rapidly in the field of art, prompt-driven art image generation has become highly popular, yet the rules and methods for generating artistic images from prompts remain underexplored. This study quantitatively evaluated images generated by the Midjourney model using CLIP model calculations and expert assessments, combined with participatory observation through netnography, to reveal the rules and methods of prompt-based art image generation. The results showed that the aesthetic quality of images generated by Midjourney improved significantly across versions (from V2 to V5), which highlights the need for artists and creators to keep learning in order to adapt to evolving AI models. An optimized prompt formula was therefore proposed that can quickly and efficiently generate a variety of images with high aesthetic quality. The model showed different capabilities across themes: it excelled at oil paintings, watercolor ink paintings, and anime characters, and performed well on both figurative and abstract themes, but was relatively weaker in sketch and colored pencil styles. Creators should leverage the model's strengths in the styles where it excels. It was also found that using the best prompt combinations tailored to a specific version can greatly enhance the quality of generated images. Careful prompt design is crucial, and newer versions are not necessarily superior to older ones, so creators need to explore and accumulate the prompts that best match each version. This study not only revealed the rules and methods of prompt-generated art images but also provided theoretical and practical guidance for creators working in AI art.
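
The abstract does not specify how the CLIP model calculations were carried out. One common use of CLIP is to score how well an image matches a prompt by taking the cosine similarity between the text embedding and the image embedding. The sketch below illustrates only that scoring step and a per-version average. It assumes the embeddings have already been computed elsewhere; the function names and the toy three-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def mean_score_by_version(prompt_embedding, image_embeddings_by_version):
    """Average prompt-image similarity for each model version."""
    return {
        version: mean(
            cosine_similarity(prompt_embedding, emb) for emb in embeddings
        )
        for version, embeddings in image_embeddings_by_version.items()
    }


# Toy vectors standing in for real CLIP embeddings (hypothetical data).
prompt = [1.0, 0.0, 0.0]
images = {
    "V2": [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]],
    "V5": [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
}
scores = mean_score_by_version(prompt, images)
for version, score in scores.items():
    print(f"{version}: {score:.3f}")
```

In practice the embeddings would come from a CLIP text encoder and image encoder, and a score of this kind would complement expert assessment rather than replace it.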

Key words: AI painting, prompt engineering, CLIP model, AI art, text-to-image generation, netnography
