
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (2): 259-269. DOI: 10.11996/JG.j.2095-302X.2025020259

• Image Processing and Computer Vision •

DEMF-Net: large-scale point cloud semantic segmentation based on dual-branch enhancement and multi-scale fusion

LI Zhihuan1, NING Xiaojuan1,2, LV Zhiyong1,2, SHI Zhenghao1,2, JIN Haiyan1,2, WANG Yinghui3, ZHOU Wenming4

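  1. School of Computer Science and Engineering, Xi’an University of Technology, Xi’an Shaanxi 710048, China
  2. Shaanxi Provincial Key Laboratory of Network Computing and Security Technology, Xi’an Shaanxi 710048, China
  3. School of Artificial Intelligence and Computer, Jiangnan University, Wuxi Jiangsu 214122, China
  4. China Railway Design Corporation Co., Ltd, Tianjin 300308, China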
  • Received: 2024-08-22 Accepted: 2024-12-23 Published: 2025-04-30 Online: 2025-04-24
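  • Corresponding author: NING Xiaojuan (1982-), female, professor, Ph.D. Her main research interests are pattern recognition and image processing. E-mail: ningxiaojuan@xaut.edu.cn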
  • First author: LI Zhihuan (2000-), male, master's student. His main research interest is 3D point cloud segmentation. E-mail: 2221221125@stu.xaut.edu.cn
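  • Supported by:
    National Natural Science Foundation of China (62172190); National Natural Science Foundation of China (62272383); Project of the Tianjin Key Laboratory of Rail Transit Navigation Positioning and Spatiotemporal Big Data Technology (TKL2023B11); Shaanxi Provincial Key Research and Development Program (2024GX-YBXM-120)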

DEMF-Net: dual-branch feature enhancement and multi-scale fusion for semantic segmentation of large-scale point clouds

LI Zhihuan1, NING Xiaojuan1,2, LV Zhiyong1,2, SHI Zhenghao1,2, JIN Haiyan1,2, WANG Yinghui3, ZHOU Wenming4

  1. School of Computer Science and Engineering, Xi’an University of Technology, Xi’an Shaanxi 710048, China
  2. Shaanxi Provincial Key Laboratory of Network Computing and Security Technology, Xi’an Shaanxi 710048, China
  3. School of Artificial Intelligence and Computer, Jiangnan University, Wuxi Jiangsu 214122, China
  4. China Railway Design Corporation Co., Ltd, Tianjin 300308, China
  • Received: 2024-08-22 Accepted: 2024-12-23 Published: 2025-04-30 Online: 2025-04-24
  • First author: LI Zhihuan (2000-), master's student. His main research interest is 3D point cloud segmentation. E-mail: 2221221125@stu.xaut.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62172190); National Natural Science Foundation of China (62272383); Tianjin Key Laboratory of Rail Transit Navigation Positioning and Spatiotemporal Big Data Technology (TKL2023B11); Shaanxi Provincial Key Research and Development Program (2024GX-YBXM-120)

Abstract:

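Semantic segmentation of large-scale point clouds is an important task in 3D vision, widely used in autonomous driving, robot navigation, smart city construction, and virtual reality. However, the down-sampling operations used by existing methods, together with excessive differences between multi-scale features, reduce a model's ability to perceive details and local features, which substantially degrades segmentation accuracy. To address these problems, DEMF-Net, a semantic segmentation network based on dual-branch feature enhancement and multi-scale fusion, is proposed. A dual-branch enhanced aggregation (DEA) module is designed that focuses on encoding point cloud attribute information and semantic features within a neighborhood; it generates offset features from the bilateral features and embeds them into the corresponding original features, thereby improving the model's local perception. In addition, to narrow the semantic gap between features at different scales, a multi-scale feature fusion (MFF) module is designed. By fusing features at adjacent scales, it obtains a global feature that covers the outputs of all encoding layers, improves the model's global context awareness, and fuses upper- and lower-layer encoder outputs to make the features more discriminative. Extensive experiments and analysis on the SensatUrban and S3DIS scene datasets show that the method achieves a mean Intersection over Union (mIoU) of 61.6% and 66.7%, respectively.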

Key words: 3D vision, semantic segmentation, large-scale point cloud, urban scene, feature encoding
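
The abstract describes fusing features from adjacent scales but does not specify the operators used. The following Python sketch is therefore not the MFF module of DEMF-Net; it only illustrates one common, generic way to combine two adjacent scales, assuming nearest-neighbor upsampling of the coarse features followed by concatenation. The function names `nearest_index` and `fuse_adjacent_scales` and the toy data are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def nearest_index(query: Point, candidates: Sequence[Point]) -> int:
    """Return the index of the candidate closest to query (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(query, candidates[i]))


def fuse_adjacent_scales(
    fine_xyz: Sequence[Point],
    fine_feat: Sequence[Sequence[float]],
    coarse_xyz: Sequence[Point],
    coarse_feat: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Assumed fusion rule: copy each fine point's nearest coarse feature
    (nearest-neighbor upsampling) and concatenate it to the fine feature."""
    fused = []
    for xyz, feat in zip(fine_xyz, fine_feat):
        j = nearest_index(xyz, coarse_xyz)
        fused.append(list(feat) + list(coarse_feat[j]))
    return fused


# Toy data: four fine-scale points, two of which survive down-sampling.
fine_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
fine_feat = [[1.0], [2.0], [3.0], [4.0]]
coarse_xyz = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
coarse_feat = [[10.0, 0.0], [20.0, 1.0]]

for row in fuse_adjacent_scales(fine_xyz, fine_feat, coarse_xyz, coarse_feat):
    print(row)
```

Each fused row has the fine feature followed by the feature of the nearest coarse point, so every fine point carries context from the coarser scale. A learned fusion, as the abstract implies, would replace the plain concatenation with trainable layers.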

Abstract:

Large-scale point cloud semantic segmentation is a critical task in 3D vision, with broad applications in autonomous driving, robotic navigation, smart city construction, and virtual reality. However, the down-sampling used by existing methods, together with excessive disparities between multi-scale features, weakens a model’s ability to capture fine-grained details and local structures, which in turn impairs segmentation accuracy. To address these issues, a semantic segmentation framework, DEMF-Net, was proposed that integrated dual-branch feature enhancement and multi-scale fusion. The network incorporated a dual-branch enhanced aggregation (DEA) module that jointly encoded point cloud attribute information and semantic features within the local neighborhood. Offset features were generated from the bilateral features and embedded into the corresponding original features, improving the model’s ability to capture local details. Furthermore, a multi-scale feature fusion (MFF) module was introduced to reduce the semantic gap between features at different scales. This module fused features at adjacent scales into a global feature representation that combined the outputs of all encoding layers. This design enhanced the model’s global context awareness and integrated upper- and lower-layer encoder outputs, making the features more discriminative. Experiments were conducted on two widely used point cloud datasets, SensatUrban and S3DIS, to validate the approach. DEMF-Net achieved a mean Intersection over Union (mIoU) of 61.6% and 66.7%, respectively, outperforming existing state-of-the-art methods.

Key words: three-dimensional vision, semantic segmentation, large-scale point cloud, urban scene, feature encoding
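
The abstract reports results as mean Intersection over Union (mIoU). As a reminder of what that metric measures, here is a minimal Python sketch: per-class IoU is the number of points correctly assigned to a class divided by the number of points that are either predicted as or labeled with that class, and mIoU averages this over classes. The function name `mean_iou` and the toy labels are illustrative, and skipping classes absent from both prediction and ground truth is an assumption; the exact evaluation protocol of each benchmark may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that appear in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with six points and three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 3))  # 0.5
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so their mean is 0.5. Averaging over classes rather than over points keeps rare classes from being swamped by frequent ones.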
