
Journal of Graphics, 2024, Vol. 45, Issue (3): 594-600. DOI: 10.11996/JG.j.2095-302X.2024030594

• Building and City Information Modeling •

A new interaction paradigm for building design driven by large language model: proof of concept with Rhino7

JIANG Can¹,², ZHENG Zhe², LIANG Xiong¹, LIN Jiarui²,³, MA Zhiliang², LU Xinzheng²

  1. Glodon Company Limited, Beijing 100193, China
  2. Department of Civil Engineering, Tsinghua University, Beijing 100084, China
  3. Key Laboratory of Digital Construction and Digital Twin, Ministry of Housing and Urban-Rural Development, Beijing 100084, China
  • Received: 2023-09-25  Accepted: 2023-12-21  Published: 2024-06-30  Online: 2024-06-12
  • Corresponding author: LIN Jiarui (1987-), associate research fellow, Ph.D. His main research interests cover intelligent construction, digital twins, and knowledge graphs. E-mail: lin611@tsinghua.edu.cn
  • First author: JIANG Can (1993-), postdoctoral researcher, Ph.D. His main research interest is the application of artificial intelligence in intelligent construction. E-mail: jiangc-l@glodon.com
  • Supported by:
    National Natural Science Foundation of China (52378306); Research Project of Beijing Municipal Science & Technology Commission, Administrative Commission of Zhongguancun Science Park (20220468132)

Abstract:

As society places higher demands on the quality of building design, design software has become increasingly specialized and complex. Current design software has both a high learning cost and complicated interaction modes. Recent breakthroughs in large language models (LLMs) have made it feasible for computers to understand natural-language instructions clearly and to generate code accurately, which may offer new ideas for how people interact with software. This study therefore proposes a new LLM-driven paradigm for interactive building design: instead of the designer interacting with the design software through many keyboard and mouse operations, an LLM generates and executes API-calling scripts according to the designer's natural-language instructions. A technical route is proposed and its feasibility in building design scenarios is validated. The route comprises three steps: ① the LLM retrieves task-related APIs from the API library according to the user's instruction; ② the LLM writes and runs a program script based on the instruction and the summaries of the candidate APIs; ③ the LLM revises the script according to feedback from the software environment, the user, and other sources. Multiple design tasks were completed with the Rhino7 design software, GPT-4, and CodeLlaMa to test whether current LLMs can perform each key step of the route. The results demonstrate that the LLM-driven interactive design paradigm already shows initial promise for use in building design, and they also provide experience and suggestions for its implementation. Putting this paradigm into practice could lower the barrier to entry and the learning cost of design software and improve designers' work efficiency, and it is expected to play an important role in future building design software.

Key words: building design software, interaction with software, large language model, application programming interface, GPT-4, Rhino7, Ladybug
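
The three-step route described in the abstract (retrieve candidate APIs, write and run a script, revise it from feedback) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the API names and summaries are hypothetical rather than real Rhino7 calls, retrieval uses simple word overlap in place of an LLM, the "LLM" is a deterministic stand-in function, and only environment errors (not user feedback) are fed back.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical API library (name -> summary). These are NOT real Rhino7 calls.
API_LIBRARY: Dict[str, str] = {
    "add_box": "create a box from width, depth and height",
    "add_sphere": "create a sphere from a radius",
    "move_object": "move an object by an x, y, z offset",
}

def tokens(text: str) -> set:
    """Lower-case alphabetic words of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve_apis(instruction: str, library: Dict[str, str], top_k: int = 2) -> List[str]:
    """Step 1 (stand-in): rank APIs by word overlap between instruction and summary."""
    words = tokens(instruction)
    scored = [(len(words & tokens(summary)), name) for name, summary in library.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked[:top_k] if score > 0]

def run_script(script: str, env: Dict[str, Callable]) -> Optional[str]:
    """Run a generated script; return an error message, or None on success.
    A real system would sandbox this instead of calling exec directly."""
    try:
        exec(script, dict(env))
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"

LLM = Callable[[str, Dict[str, str], Optional[str]], str]

def design_loop(instruction: str, llm: LLM, library: Dict[str, str],
                env: Dict[str, Callable], max_rounds: int = 3) -> str:
    """Steps 1-3: retrieve APIs, generate and run a script, retry with feedback."""
    summaries = {name: library[name] for name in retrieve_apis(instruction, library)}
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        script = llm(instruction, summaries, feedback)  # Step 2: write the script
        feedback = run_script(script, env)              # run it in the environment
        if feedback is None:
            return script                               # success, no revision needed
        # Step 3: the error message is passed to the LLM on the next round
    raise RuntimeError(f"no working script after {max_rounds} rounds: {feedback}")

# Toy environment and a deterministic stand-in for the LLM.
scene: List[tuple] = []

def add_box(width: float, depth: float, height: float) -> None:
    scene.append(("box", width, depth, height))

def fake_llm(instruction: str, summaries: Dict[str, str], feedback: Optional[str]) -> str:
    # First attempt omits an argument; after seeing feedback it returns a fixed script.
    return "add_box(2, 3)" if feedback is None else "add_box(2, 3, 4)"

final_script = design_loop("create a box with width 2, depth 3 and height 4",
                           fake_llm, API_LIBRARY, {"add_box": add_box})
print(final_script)
print(scene)
```

In this sketch the first generated script fails with a `TypeError`, the error text is returned as feedback, and the second attempt succeeds. Replacing `fake_llm` with a call to an actual model and `env` with bindings to a design tool's scripting interface would be the natural next step, under the same loop structure.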

CLC Number: