A new interaction paradigm for building design driven by large language model: proof of concept with Rhino7

doi:10.11996/JG.j.2095-302X.2024030594

Abstract

As society places higher demands on the quality of building design, design software has become increasingly specialized and complex. Current design software not only has high learning costs but also relies on complicated interaction modes. Recent breakthroughs in large language models (LLMs) have enabled computers to comprehend natural-language instructions and generate code accurately, which is expected to offer new ideas for how humans interact with software. This study therefore designed a new LLM-driven paradigm for interactive building design: instead of designers operating the design software through many keyboard and mouse actions, an LLM writes scripts that invoke APIs according to the architect's instructions. A methodology was proposed and its feasibility in building design was validated. The methodology comprises three steps: ① the LLM retrieves task-related APIs from the API set according to the user's instructions; ② the LLM writes a program script based on the instructions and the abstracts of the candidate APIs, and runs it; ③ the LLM revises the script based on feedback from the environment, users, etc. To validate the ability of current LLMs to execute the key steps of the methodology, multiple design tasks were completed with the Rhino7 design software, GPT-4, and Code Llama. The results not only demonstrated that the LLM-driven interactive design paradigm has initial prospects for implementation in building design, but also provided experience and suggestions for that implementation. Implementing this paradigm could lower the entry threshold and learning costs and improve efficiency in many scenarios, and it is expected to play a key role in future building design software.
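
The three steps of the methodology can be pictured as a small loop. The sketch below is illustrative only and is not the authors' implementation: the API set, the `create_box` and `set_color` entries, the function names, and the keyword-overlap retrieval (standing in for LLM-based retrieval) are all assumptions, and the LLM is any callable that maps a prompt to script text.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical API set (name -> one-line abstract); not the paper's actual set.
API_SET: Dict[str, str] = {
    "create_box": "Create a box-shaped building model from width, depth, height",
    "calc_sunpath": "Calculate trajectory of sun according to location and time information",
    "set_color": "Change the display color of a model",
}

LLM = Callable[[str], str]  # prompt in, script text out


def retrieve_apis(instruction: str, api_set: Dict[str, str], top_k: int = 2) -> List[str]:
    """Step 1 stand-in: rank APIs by word overlap with the instruction.

    The paper lets the LLM do the retrieval; keyword overlap is used here
    only so the sketch runs without a model.
    """
    words = set(instruction.lower().split())
    scored = []
    for name, doc in api_set.items():
        api_words = set((name.replace("_", " ") + " " + doc).lower().split())
        scored.append((len(words & api_words), name))
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ranked[:top_k] if score > 0]


def run_script(script: str, env: Dict[str, object]) -> Tuple[bool, str]:
    """Run a generated script; turn any exception into feedback text."""
    try:
        exec(script, dict(env))  # env exposes the callable APIs to the script
        return True, "ok"
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def design_loop(instruction: str, llm: LLM, api_set: Dict[str, str],
                env: Dict[str, object], max_rounds: int = 3) -> Tuple[str, bool]:
    """Steps 1 to 3: retrieve APIs, write and run a script, revise on failure."""
    candidates = retrieve_apis(instruction, api_set)
    abstracts = "\n".join(f"{name}: {api_set[name]}" for name in candidates)
    prompt = f"Instruction: {instruction}\nCandidate APIs:\n{abstracts}"
    script, feedback = "", ""
    for _ in range(max_rounds):
        script = llm(prompt + feedback)          # step 2: write the script
        ok, message = run_script(script, env)    # step 2: run it
        if ok:
            return script, True
        # step 3: feed the environment's error back for revision
        feedback = f"\nPrevious script failed with: {message}"
    return script, False
```

In a real setting `env` would expose functions that wrap the design software's scripting interface, and user comments could be appended to `feedback` in the same way as runtime errors.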

Key words: building design software, interaction with software, large language model, application programming interface, GPT-4, Rhino7, Ladybug

CLC Number: TP391

JIANG Can, ZHENG Zhe, LIANG Xiong, LIN Jiarui, MA Zhiliang, LU Xinzheng. A new interaction paradigm for building design driven by large language model: proof of concept with Rhino7[J]. Journal of Graphics, 2024, 45(3): 594-600.

Item	Example
API name	calc_sunpath(location, hoys, ···)
API function	Calculate trajectory of sun according to location and time information
API input	Location information (latitude, longitude, etc.) of a city (location: Object); ···
API output	A list of solar altitude (altitudes: list); A list of solar azimuth (azimuths: list); ···
API call example	altitudes, azimuths, datetimes, vectors = calc_sunpath(location, hoys)
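
One way to hold such an API description in code is a small record type. The sketch below is an assumption for illustration, not the paper's data format: the `ApiDoc` class and its method names are hypothetical, and only the fields visible in the table are filled in (entries elided with `···` in the table are left out rather than guessed).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApiDoc:
    """Hypothetical record mirroring the rows of the API description table."""
    name: str            # signature as shown to the LLM
    function: str        # one-sentence description of what the API does
    inputs: List[str]    # one entry per documented input
    outputs: List[str]   # one entry per documented output
    call_example: str    # a sample invocation

    def abstract(self) -> str:
        """Short form (name plus function), e.g. for listing candidate APIs."""
        return f"{self.name}: {self.function}"

    def full_text(self) -> str:
        """Full description, one line per table row."""
        return "\n".join([
            f"API name: {self.name}",
            f"API function: {self.function}",
            "API input: " + "; ".join(self.inputs),
            "API output: " + "; ".join(self.outputs),
            f"API call example: {self.call_example}",
        ])


# Values copied from the table; elided entries are intentionally omitted.
SUNPATH = ApiDoc(
    name="calc_sunpath(location, hoys, ···)",
    function="Calculate trajectory of sun according to location and time information",
    inputs=["Location information (latitude, longitude, etc.) of a city (location: Object)"],
    outputs=["A list of solar altitude (altitudes: list)",
             "A list of solar azimuth (azimuths: list)"],
    call_example="altitudes, azimuths, datetimes, vectors = calc_sunpath(location, hoys)",
)
```

Calling `SUNPATH.abstract()` gives a one-line summary suitable for a retrieval prompt, while `SUNPATH.full_text()` reproduces all five rows for the script-writing prompt.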

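Requirement	Task	Number of APIs
Geometric modeling	Generate a building model with a rectangular cross-section	2
	Generate a building model with an irregular cross-section	5
	Generate a building model of multiple connected cubes	4
	Randomly arrange multiple building models	4
	Arrange multiple building models at specified positions and orientations	4
	Arrange multiple building models subject to spacing constraints	4
Building performance analysis	Building sunlight analysis	6
	Building irradiance analysis	6
Visualization and rendering	Sun path calculation and visualization	4
	Sky dome radiation density calculation and visualization	4
	Switch view angle and rendering mode	2
	Change model color	1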
