Journal of Graphics ›› 2023, Vol. 44 ›› Issue (1): 112-119. DOI: 10.11996/JG.j.2095-302X.2023010112
LIANG Ao1,2,3,4, LI Zhi-han1,2,3,4, HUA Hai-yang1,2
Received: 2022-05-09
Revised: 2022-08-19
Online: 2023-10-31
Published: 2023-02-16
Contact: HUA Hai-yang
About author: LIANG Ao (1998-), master student. His main research interests cover LiDAR-based target detection and point cloud processing. E-mail: liangao@sia.cn
Abstract: Owing to objective factors such as hardware limitations, object occlusion, and background clutter, target point clouds collected by sensors exhibit strong sparsity and uneven density, which leads to low efficiency in learning point cloud features and poor generalization in classification. To address this problem, a point cloud classification model based on multi-level adaptive downsampling, PointMLP-FD, is proposed. The model designs several MLP modules as network branches, each taking the shallow features of the point cloud as input and producing, for every point, a feature expression in the class dimension. The points are then ranked by this expression, and those with stronger semantic features are selected to form the downsampled point set. By filtering out background and information weakly correlated with the target, the model adaptively retains the information that reflects the target's essential features. Finally, the losses of the branch networks are computed separately and trained in parallel with the backbone, optimizing the point cloud features while reducing model parameters. The method was tested on the ScanObjectNN dataset; the results show higher classification accuracy than PointMLP-elite, with a 1% gain in mAcc and a 0.8% gain in OA, approaching the performance of the SOTA model with far fewer parameters.
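As a reading aid, the downsampling mechanism described above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions: the module name `ClassATT`, the hidden width of 64, and the sampling ratio are placeholders, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ClassATT(nn.Module):
    """Shallow MLP branch: scores each point in the class dimension and
    keeps the highest-scoring points as the downsampled set."""
    def __init__(self, in_dim: int, num_classes: int, ratio: float = 0.5):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(in_dim, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, num_classes, 1),  # per-point response in the class dimension
        )
        self.ratio = ratio

    def forward(self, feats: torch.Tensor):
        # feats: (B, C, N) shallow point features from the backbone
        logits = self.mlp(feats)                 # (B, num_classes, N)
        scores = logits.max(dim=1).values        # strongest class response per point
        k = max(1, int(feats.shape[-1] * self.ratio))
        idx = scores.topk(k, dim=-1).indices     # keep the k most "semantic" points
        sampled = torch.gather(
            feats, 2, idx.unsqueeze(1).expand(-1, feats.shape[1], -1))
        return sampled, logits                   # logits also feed the branch loss
```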
LIANG Ao, LI Zhi-han, HUA Hai-yang. PointMLP-FD: a point cloud classification model based on multi-level adaptive downsampling[J]. Journal of Graphics, 2023, 44(1): 112-119.
Fig. 2 Model structure of PointMLP-FD. The backbone is the same as PointMLP; Class ATT is the proposed adaptive downsampling module, a shallow MLP that takes shallow point cloud features as input. Two branch networks are added to compute their losses separately, and these participate in network training together with the final classification error
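A minimal sketch of the joint objective implied by this caption, assuming each branch's per-point logits are average-pooled to an object-level prediction and weighted by a hypothetical `branch_weight` before being added to the backbone's classification loss:

```python
import torch.nn.functional as F

def joint_loss(final_logits, branch_logits_list, labels, branch_weight=0.5):
    # final_logits: (B, num_classes) from the backbone classifier
    loss = F.cross_entropy(final_logits, labels)
    for branch_logits in branch_logits_list:      # each: (B, num_classes, N)
        pooled = branch_logits.mean(dim=-1)       # pool per-point responses
        loss = loss + branch_weight * F.cross_entropy(pooled, labels)
    return loss
```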
| Method | Overall Acc (%) | Mean Acc (%) | Param (M) |
|---|---|---|---|
| PointNet | 63.0 | 58.1 | - |
| SpiderCNN | 68.2 | 63.4 | - |
| PointNet++ | 77.9 | 75.4 | 1.41 |
| DGCNN | 78.1 | 73.6 | - |
| PointCNN | 78.5 | 75.1 | - |
| GBNet | 80.5 | 77.8 | 8.39 |
| PRA-Net | 82.1 | 79.1 | - |
| Point-TnT | 83.5 | 81.0 | - |
| Point-BERT | 83.1 | - | 20.8 |
| PointMLP (SOTA) | 85.4±0.3 | 83.9±0.5 | 12.6 |
| PointMLP-elite | 83.8±0.6 | 81.8±0.8 | 0.68 |
| PointMLP-FD (Ours) | 85.15 | 83.64 | 0.77 |
Table 1 Experimental results
| Method | Bag | Bin | Box | Cabinet | Chair | Desk | Display | Door | Shelf | Table | Bed | Pillow | Sink | Sofa | Toilet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PointMLP-elite | 0.59 | 0.89 | 0.60 | 0.81 | 0.94 | 0.78 | 0.88 | 0.90 | 0.87 | 0.71 | 0.84 | 0.77 | 0.78 | 0.92 | 0.82 |
| Ours | 0.70 | 0.87 | 0.62 | 0.85 | 0.93 | 0.79 | 0.88 | 0.94 | 0.83 | 0.73 | 0.87 | 0.84 | 0.85 | 0.94 | 0.85 |
Table 2 Results for each category in the ScanObjectNN dataset
Fig. 6 Visualization of the sampling results of FPS (top) and Class ATT (bottom) downsampling. The red boxes show the points sampled by the two methods on the background of the first Chair sample ((a) Chair; (b) Table)
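For reference, the FPS baseline shown in the top row can be written naively as below; production implementations (e.g., the fused kernels used by PointNet++-style codebases) are batched, but the logic is the same.

```python
import torch

def farthest_point_sampling(xyz: torch.Tensor, k: int) -> torch.Tensor:
    # xyz: (N, 3) point coordinates; returns indices of k sampled points
    n = xyz.shape[0]
    idx = torch.zeros(k, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = int(torch.randint(n, (1,)))           # arbitrary starting point
    for i in range(k):
        idx[i] = farthest
        d = ((xyz - xyz[farthest]) ** 2).sum(dim=1)  # squared distance to newest pick
        dist = torch.minimum(dist, d)                # distance from each point to the set
        farthest = int(dist.argmax())                # next pick: farthest from the set
    return idx
```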
| Method | Overall Acc (%) | Avg Acc (%) | Param (M) |
|---|---|---|---|
| PointMLP-elite | 83.8±0.6 | 81.8±0.8 | 0.68 |
| PointMLP-eliteʹ | 82.41 | 79.57 | 0.31 |
| PointMLP-FD-se (Ours) | 84.04 | 82.27 | 0.33 |
Table 3 Experimental results of the extracted PointMLP-FD-se and PointMLP-eliteʹ
Fig. 7 Downsampling results after training the network with max pooling and average pooling, respectively ((a) average pooling; (b) max pooling)
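Assuming the comparison in Fig. 7 concerns how each point's class responses are reduced to a single ranking score, a hedged sketch of the two variants (an illustration, not the paper's code):

```python
import torch

def point_scores(logits: torch.Tensor, mode: str = "max") -> torch.Tensor:
    # logits: (B, num_classes, N) per-point class responses from the branch MLP
    if mode == "max":
        return logits.max(dim=1).values  # strongest single-class response per point
    return logits.mean(dim=1)            # average response across all classes
```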
[1] RAJ T, HASHIM F H, HUDDIN A B, et al. A survey on LiDAR scanning mechanisms[J]. Electronics, 2020, 9(5): 741.
[2] MEHENDALE N, NEOGE S. Review on lidar technology[EB/OL]. [2022-01-12]. https://www.researchgate.net/publication/342154967_Review_on_Lidar_Technology.
[3] FEI B, YANG W D, CHEN W M, et al. Comprehensive review of deep learning-based 3D point cloud completion processing and analysis[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.03311.
[4] ZAMANAKOS G, TSOCHATZIDIS L, AMANATIADIS A, et al. A comprehensive survey of LIDAR-based 3D object detection methods with deep learning for autonomous driving[J]. Computers & Graphics, 2021, 99: 153-181.
[5] QIAN R, LAI X, LI X R. 3D object detection for autonomous driving: a survey[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2106.10823.
[6] RORIZ R, CABRAL J, GOMES T. Automotive LiDAR technology: a survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(7): 6282-6297.
[7] LV D K, YING X X, CUI Y J, et al. Research on the technology of LIDAR data processing[C]//2017 First International Conference on Electronics Instrumentation & Information Systems. New York: IEEE Press, 2017: 1-5.
[8] BELLO S A, YU S S, WANG C, et al. Review: deep learning on 3D point clouds[J]. Remote Sensing, 2020, 12(11): 1729.
[9] SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]//2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[10] BAI S, BAI X, ZHOU Z C, et al. GIFT: a real-time and scalable 3D shape search engine[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 5023-5032.
[11] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//The 25th International Conference on Neural Information Processing Systems - Volume 1. New York: ACM, 2012: 1097-1105.
[12] SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]//2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[13] QI C R, SU H, NIEßNER M, et al. Volumetric and multi-view CNNs for object classification on 3D data[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 5648-5656.
[14] MATURANA D, SCHERER S. 3D convolutional neural networks for landing zone detection from LiDAR[C]//2015 IEEE International Conference on Robotics and Automation. New York: IEEE Press, 2015: 3471-3478.
[15] MATURANA D, SCHERER S. VoxNet: a 3D convolutional neural network for real-time object recognition[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York: IEEE Press, 2015: 922-928.
[16] WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1912-1920.
[17] WANG C, CHENG M, SOHEL F, et al. NormalNet: a voxel-based CNN for 3D object classification and retrieval[J]. Neurocomputing, 2019, 323: 139-147.
[18] RIEGLER G, ULUSOY A O, GEIGER A. OctNet: learning deep 3D representations at high resolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6620-6629.
[19] TATARCHENKO M, DOSOVITSKIY A, BROX T. Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs[C]//2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2107-2115.
[20] CHARLES R Q, HAO S, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 77-85.
[21] QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 5105-5114.
[22] LI J X, CHEN B M, LEE G H. SO-Net: self-organizing network for point cloud analysis[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 9397-9406.
[23] ZHAO H S, JIANG L, FU C W, et al. PointWeb: enhancing local neighborhood features for point cloud processing[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 5560-5568.
[24] ZHANG W X, XIAO C X. PCAN: 3D attention map learning using contextual information for point cloud based retrieval[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 12428-12437.
[25] KLOKOV R, LEMPITSKY V. Escape from cells: deep Kd-networks for the recognition of 3D point cloud models[C]//2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 863-872.
[26] WANG C, SAMARI B, SIDDIQI K. Local spectral graph convolution for point set feature learning[EB/OL]. [2022-01-12]. https://arxiv.org/abs/1803.05827.
[27] ZHAO H S, JIANG L, JIA J Y, et al. Point transformer[C]//2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 16239-16248.
[28] ENGEL N, BELAGIANNIS V, DIETMAYER K. Point transformer[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2011.00931.
[29] GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
[30] PANG Y T, WANG W X, TAY F E H, et al. Masked autoencoders for point cloud self-supervised learning[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.06604.
[31] ZHANG Y F, HU Q Y, XU G Q, et al. Not all points are equal: learning highly efficient point-based detectors for 3D LiDAR point clouds[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.11139.
[32] WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1912-1920.
[33] UY M A, PHAM Q H, HUA B S, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]//2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1588-1597.
[34] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[35] XU Y F, FAN T Q, XU M Y, et al. SpiderCNN: deep learning on point sets with parameterized convolutional filters[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 90-105.
[36] SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 770-779.
[37] WU W X, QI Z A, LI F X. PointConv: deep convolutional networks on 3D point clouds[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 9613-9622.
[38] QIU S, ANWAR S, BARNES N. Geometric back-projection network for point cloud classification[J]. IEEE Transactions on Multimedia, 2022, 24: 1943-1955.
[39] CHENG S L, CHEN X W, HE X W, et al. PRA-Net: point relation-aware network for 3D point cloud analysis[J]. IEEE Transactions on Image Processing, 2021, 30: 4436-4448.
[40] BERG A, OSKARSSON M, O'CONNOR M. Points to patches: enabling the use of self-attention for 3D shape recognition[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2204.03957.
[41] YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud transformers with masked point modeling[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 19313-19322.