
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (6): 1337-1345. DOI: 10.11996/JG.j.2095-302X.2025061337

• Computer Graphics and Virtual Reality •

Geometric-hypergraph-aware 3D scene graph generation

刘圆圆1,2, 房友江1,2, 孟天宇1,2, 孟政宇1,2, 罗鹏伟1,2, 杨培根1,2, 姜雨彤3, 魏小鹏1,2, 张强1,2, 杨鑫1,2

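  1 School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  2 Key Laboratory of Social Computing and Cognitive Intelligence, Ministry of Education, Dalian, Liaoning 116024, China
  3 State Key Laboratory of Advanced Off-Road System Technology, China North Vehicle Research Institute, Beijing 100072, China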
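  • Received: 2024-10-09 Accepted: 2025-04-15 Published: 2025-12-30 Online: 2025-12-27
  • Corresponding author: YANG Xin (1984-), male, professor, Ph.D. His main research interests cover computer graphics and computer vision. E-mail: xinyang@dlut.edu.cn
  • First author: LIU Yuanyuan (1999-), female, PhD candidate. Her main research interests cover computer graphics and scene semantic understanding. E-mail: Lyy990415@gmail.com
  • Supported by:
    Science and Technology Innovation 2030 - "New Generation Artificial Intelligence" Major Project (2021ZD12400)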

Geometry hypergraph aware 3D scene graph generation

LIU Yuanyuan1,2, FANG Youjiang1,2, MENG Tianyu1,2, MENG Zhengyu1,2, LUO Pengwei1,2, YANG Peigen1,2, JIANG Yutong3, WEI Xiaopeng1,2, ZHANG Qiang1,2, YANG Xin1,2

  1 School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  2 Key Laboratory of Social Computing and Cognitive Intelligence of Ministry of Education, Dalian, Liaoning 116024, China
  3 Chinese Scholartree Ridge State Key Laboratory, China North Vehicle Research Institute, Beijing 100072, China
  • Received: 2024-10-09 Accepted: 2025-04-15 Published: 2025-12-30 Online: 2025-12-27
  • First author: LIU Yuanyuan (1999-), PhD candidate. Her main research interests cover computer graphics and scene semantic understanding. E-mail: Lyy990415@gmail.com
  • Supported by:
    Science and Technology Innovation 2030 - “New Generation Artificial Intelligence” Major Project (2021ZD12400)

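Abstract:

In recent years, 3D scene graph generation (SGG) has attracted wide attention in computer graphics and vision. Although existing studies have improved accuracy on coarse-grained classification and single relation labels, performance on fine-grained classification and in multi-label settings remains insufficient for practical applications. To address this, a framework is proposed that exploits contextual information to achieve fine-grained entity classification, multi-relation labeling, and higher accuracy. The method consists of a graph feature extraction (GFE) module and a graph context inference (GCI) module. The GFE module extracts entity and interaction semantic features from the input data while retaining key information. The GCI module introduces structured features from both conventional graphs and hypergraphs: by analyzing the relationships between entities, it identifies how strongly entities within a neighborhood are associated and merges entities with similar interaction patterns, thereby learning the forms of interaction between entities. The geometric hypergraph introduced here is structured organizational information generated dynamically from the scene layout. In experiments on the 3DSSG dataset, the framework, by combining the ability of conventional graphs and hypergraphs to organize nodes and the associations between nodes, effectively improves fine-grained classification and multi-relation label recognition in 3D scene graph generation.

Keywords: geometric hypergraph, 3D scene graph generation, structured organization, node clustering, multi-relation extraction, adaptive updating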

Abstract:

In the field of computer graphics and vision, 3D scene graph generation (SGG) has gained widespread attention in recent years. While existing research has improved the accuracy of coarse-grained classification and single-relation labels, performance in fine-grained classification and multi-label scenarios remains inadequate, limiting real-world applications. To address this, an innovative framework was proposed to fully utilize contextual information and thereby achieve fine-grained entity classification, multi-relation labeling, and enhanced accuracy. The method comprised two core modules: the graph feature extraction (GFE) module and the graph context inference (GCI) module. The GFE module extracted entity and interaction semantic features from the input data while preserving key information. The GCI module introduced structural features from both traditional graphs and hypergraphs, analyzed relationships between entities, identified the degree of association between entities within a neighborhood, and merged entities with similar interaction patterns to learn how entities interact. The geometric hypergraph structure was dynamically generated based on scene layouts, providing structured organizational information. Experimental evaluations on the 3DSSG dataset showed that, by integrating the capabilities of both traditional graphs and hypergraphs to organize nodes and the associations between them, the proposed framework effectively improved fine-grained classification and multi-relation label recognition in 3D SGG tasks.

Key words: geometric hypergraph, 3D scene graph generation, structured organization, node clustering, multi-relational extraction, adaptive updating
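
The abstract states that the geometric hypergraph is generated dynamically from the scene layout but does not give the construction. As a purely illustrative sketch, and not the authors' method, the snippet below assumes one common way to derive hyperedges from geometry: each object, together with its k nearest neighbours by centroid distance, forms one hyperedge. The function names `knn_hyperedges` and `incidence_matrix`, the parameter `k`, and the toy coordinates are all hypothetical.

```python
import math


def knn_hyperedges(centroids, k):
    """Assumed construction: one hyperedge per object, made of the object
    and its k nearest neighbours; identical hyperedges are kept only once.

    centroids: list of (x, y, z) tuples, one per object instance.
    Returns a list of frozensets of object indices.
    """
    n = len(centroids)
    k = min(k, n - 1)  # cannot have more neighbours than other objects
    edges, seen = [], set()
    for i in range(n):
        # Rank every other object by Euclidean distance to object i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(centroids[i], centroids[j]),
        )
        edge = frozenset([i] + others[:k])
        if edge not in seen:
            seen.add(edge)
            edges.append(edge)
    return edges


def incidence_matrix(n_nodes, edges):
    """Return H with H[v][e] = 1 if node v belongs to hyperedge e, else 0."""
    return [[1 if v in e else 0 for e in edges] for v in range(n_nodes)]


# Hypothetical toy scene: two spatially separated groups of three objects.
centroids = [
    (0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.6, 0.0),
    (5.0, 5.0, 0.0), (5.4, 5.0, 0.0), (5.0, 5.5, 0.0),
]
edges = knn_hyperedges(centroids, k=2)
H = incidence_matrix(len(centroids), edges)
```

In this toy scene the six objects collapse into two hyperedges, one per spatial group, which illustrates how a hyperedge can tie together more than two entities at once, unlike the pairwise edges of a conventional graph.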
