Journal of Graphics, 2025, Vol. 46, Issue (6): 1267-1273. DOI: 10.11996/JG.j.2095-302X.2025061267

• Image Processing and Computer Vision •

Frequency-aware hypergraph fusion for event-based semantic segmentation

YU Nannan1,2, MENG Zhengyu1,2, FANG Youjiang1,2, SUN Chuanyu1,2, YIN Xuefeng1,2, ZHANG Qiang1,2, WEI Xiaopeng1,2, YANG Xin1,2

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  2. Key Laboratory of Social Computing and Cognitive Intelligence of Ministry of Education, Dalian, Liaoning 116024, China
  • Received: 2024-10-09  Accepted: 2025-04-16  Published: 2025-12-30  Online: 2025-12-27
  • First author: YU Nannan (1993-), PhD candidate. Her main research interest covers event-based computer vision. E-mail: 12009059@mail.dlut.edu.cn
  • Corresponding author: YANG Xin (1984-), male, professor, PhD. His main research interests cover computer graphics and computer vision. E-mail: xinyang@dlut.edu.cn
  • Supported by:
    Science and Technology Innovation 2030 - "New Generation Artificial Intelligence" Major Project (2021ZD12400)

Abstract:

Semantic segmentation, a core task in environment perception for autonomous driving, faces challenges under complex lighting and high-speed motion due to the limited imaging quality of conventional cameras. Event cameras, with their microsecond temporal resolution and high dynamic range, effectively overcome motion blur and extreme lighting conditions. However, their asynchronous, sparse event data lack texture and color information, and the uneven event distributions caused by relative motion between background and objects pose significant difficulties for semantic feature extraction. To address these issues, a multi-frequency hypergraph fusion method for event-based semantic segmentation was proposed. First, a frequency separation module decomposed event frames into multi-scale spatiotemporal features, distinguishing high-frequency motion edges from low-frequency structural information. A dynamic hypergraph construction algorithm then mapped these multi-frequency features onto hypergraph nodes and applied hypergraph convolution to capture long-range dependencies across frequencies. Finally, an attention mechanism adaptively fused the multi-frequency features to enhance inter-class discriminability. Experiments on the Carla-Semantic and DDD17-Semantic datasets demonstrated that the method achieved 88.21% MPA and 82.68% mIoU, outperforming existing event-based segmentation methods and validating the effectiveness of the multi-frequency hypergraph model for event-based semantic understanding. This research provided a novel solution for robust environment perception with event cameras, particularly suited to challenging autonomous driving scenarios involving low-light conditions and rapid motion.
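
To make the described pipeline concrete, the following is a minimal PyTorch sketch of its three stages: frequency separation, hypergraph convolution over node features, and attention-based fusion. The module structure, the pooling-based low/high-frequency split, and all layer sizes are illustrative assumptions rather than the paper's actual implementation; the hypergraph convolution follows the standard formulation X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Θ with the hyperedge weight matrix W taken as identity.

    # Illustrative sketch only -- module names, sizes, and the pooling-based
    # frequency split are assumptions, not the paper's implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FrequencySeparation(nn.Module):
        """Split features into low-frequency structure and a
        high-frequency residual carrying motion edges."""
        def __init__(self, kernel_size=4):
            super().__init__()
            self.k = kernel_size

        def forward(self, x):                        # x: (B, C, H, W)
            low = F.avg_pool2d(x, self.k)            # low-pass via pooling
            low = F.interpolate(low, size=x.shape[-2:], mode="bilinear",
                                align_corners=False)
            return low, x - low                      # (low, high) pair

    class HypergraphConv(nn.Module):
        """Standard hypergraph convolution:
        X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta  (W = I)."""
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.theta = nn.Linear(in_dim, out_dim)

        def forward(self, x, H):                     # x: (N, C), H: (N, E) incidence
            dv = H.sum(dim=1).clamp(min=1).pow(-0.5) # node degrees
            de = H.sum(dim=0).clamp(min=1).pow(-1.0) # hyperedge degrees
            x = dv.unsqueeze(1) * x
            x = de.unsqueeze(1) * (H.t() @ x)        # gather nodes into hyperedges
            x = dv.unsqueeze(1) * (H @ x)            # scatter back to nodes
            return self.theta(x)

    class AttentionFusion(nn.Module):
        """Adaptively weight the two frequency branches per node."""
        def __init__(self, dim):
            super().__init__()
            self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 2))

        def forward(self, f_low, f_high):            # both: (N, C)
            w = torch.softmax(self.gate(torch.cat([f_low, f_high], dim=-1)),
                              dim=-1)
            return w[:, :1] * f_low + w[:, 1:] * f_high

In a full model, the node features would come from flattening the event-frame feature maps, and the incidence matrix H would be rebuilt per input, e.g., by k-nearest-neighbor grouping in feature space, which is one plausible reading of the dynamic hypergraph construction step.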

Key words: semantic segmentation, hypergraph, attention mechanism, multi-frequency fusion, event camera

CLC number: