
Journal of Graphics, 2025, Vol. 46, Issue (2): 259-269. DOI: 10.11996/JG.j.2095-302X.2025020259


DEMF-Net: dual-branch feature enhancement and multi-scale fusion for semantic segmentation of large-scale point clouds

LI Zhihuan1, NING Xiaojuan1,2, LV Zhiyong1,2, SHI Zhenghao1,2, JIN Haiyan1,2, WANG Yinghui3, ZHOU Wenming4

    1. School of Computer Science and Engineering, Xi’an University of Technology, Xi’an Shaanxi 710048, China
    2. Shaanxi Provincial Key Laboratory of Network Computing and Security Technology, Xi’an Shaanxi 710048, China
    3. School of Artificial Intelligence and Computer, Jiangnan University, Wuxi Jiangsu 214122, China
    4. China Railway Design Corporation Co., Ltd, Tianjin 300308, China
  • Received: 2024-08-22 Accepted: 2024-12-23 Online: 2025-04-30 Published: 2025-04-24
  • Contact: NING Xiaojuan
  • About author:

    LI Zhihuan (2000-), master’s student. His main research interest is 3D point cloud segmentation. E-mail: 2221221125@stu.xaut.edu.cn

  • Supported by:
    National Natural Science Foundation of China (62172190); National Natural Science Foundation of China (62272383); Tianjin Key Laboratory of Rail Transit Navigation Positioning and Spatiotemporal Big Data Technology (TKL2023B11); Shaanxi Provincial Key Research and Development Program (2024GX-YBXM-120)

Abstract:

Large-scale point cloud semantic segmentation is a critical task in 3D vision, with broad applications in autonomous driving, robotic navigation, smart city construction, and virtual reality. However, existing methods rely on down-sampling and exhibit excessive disparities between multi-scale features, so they often lose much of their ability to capture fine-grained details and local structures, which impairs segmentation accuracy. To address these issues, a novel semantic segmentation framework, DEMF-Net, was proposed, which integrated dual-branch feature enhancement and multi-scale fusion strategies. The network incorporated a dual-branch enhanced aggregation module designed to jointly encode point cloud attribute information and semantic features from the local neighborhood. The bilateral features were embedded into the corresponding original features, improving the model’s ability to capture local details. Furthermore, a multi-scale feature fusion module was introduced to reduce the semantic gap between features at different scales. This module fused adjacent multi-scale features to produce a global feature representation that synthesized information across all encoding layers. This design enhanced the model’s global context awareness and integrated upper- and lower-layer encodings, thereby strengthening its feature recognition capability. Experiments were conducted on two widely used point cloud datasets, SensatUrban and S3DIS, to validate the proposed approach. DEMF-Net achieved a mean Intersection over Union (mIoU) of 61.6% on SensatUrban and 66.7% on S3DIS, outperforming existing state-of-the-art methods.
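
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU is commonly computed from per-point labels. It is not taken from the paper: the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions, and benchmark evaluation scripts may handle absent classes differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean Intersection over Union across classes.

    Assumption (not from the paper): classes that appear in neither
    `pred` nor `target` are skipped rather than counted as IoU = 0.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # points where both labels equal class c
    pred_count = [0] * num_classes    # points predicted as class c
    target_count = [0] * num_classes  # points whose ground truth is class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-point labels for six points and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Because mIoU averages over classes rather than points, rare classes weigh as much as frequent ones, which is why the metric is sensitive to the loss of fine-grained local detail described in the abstract.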

Key words: three-dimensional vision, semantic segmentation, large-scale point cloud, urban scene, feature encoding
