
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (5): 1028-1041. DOI: 10.11996/JG.j.2095-302X.2025051028

• Computer Graphics and Virtual Reality •

DRec: large language model-driven data analysis recommendation system

CHEN Zhizhang, FENG Yingchaojie, WENG Luoxuan, SHEN Jian, CHEN Wei

  1. State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
  • Received: 2024-11-19 Accepted: 2025-02-25 Published: 2025-10-30 Online: 2025-09-10
  • Corresponding author: CHEN Wei (1976-), professor, Ph.D. His main research interests cover visualization and visual analytics. E-mail: chenvis@zju.edu.cn
  • First author: CHEN Zhizhang (2000-), master's student. His main research interest is visual analytics. E-mail: chenzhiz@zju.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62132017); “Pioneer” and “Leading Goose” Research and Development Program of Zhejiang (2024C01167); Zhejiang Provincial Natural Science Foundation of China (LD24F020011)

Abstract:

Natural language interaction systems greatly simplify how users interact with data analysis tools, allowing them to analyze data and generate charts through natural language. With the rise of large language models (LLMs), LLM-driven natural language data analysis systems have become a growing trend in recent years. Thanks to their strong logical reasoning and tool invocation capabilities, LLMs can produce more complex logical inferences and charts. However, LLM-based interactive data analysis remains challenging. Analysts must define a clear direction of analysis to drive the interactive process, which usually requires a deep understanding of the data. Furthermore, when exploring data with LLMs, analysts manipulate the data less directly, which can leave them with an insufficient understanding of the data and weaken their overall control of the analysis process. To help users clarify the analysis process and deepen their understanding of the data, this paper proposes DRec, an LLM-based data analysis system built on recommendation and association. The system uses associative information to help users build an understanding of the data and to guide the analysis process. It also provides insights along two dimensions, semantic and data, and recommends queries based on those insights to help users decide the direction of analysis. Case studies and user experiments show that DRec improves the efficiency of data analysis and guides users toward reasonable analysis results.

Key words: large language models, interactive data analysis, data exploration, natural language interface, natural language recommendation
