Journal of Graphics ›› 2023, Vol. 44 ›› Issue (1): 112-119. DOI: 10.11996/JG.j.2095-302X.2023010112
LIANG Ao1,2,3,4, LI Zhi-han1,2,3,4, HUA Hai-yang1,2
Received: 2022-05-09
Revised: 2022-08-19
Online: 2023-10-31
Published: 2023-02-16
Contact: HUA Hai-yang
About author: LIANG Ao (1998-), master student. His main research interests cover LiDAR-based target detection and point cloud processing. E-mail: liangao@sia.cn
Abstract: Owing to objective factors such as hardware limitations, object occlusion, and background clutter, target point clouds collected by sensors exhibit strong sparsity and uneven density, which leads to low efficiency in learning point cloud features and poor generalization in classification. To address this problem, a point cloud classification model based on multi-level adaptive downsampling, PointMLP-FD, is proposed. The model designs several MLP modules as network branches, each taking the shallow features of the point cloud as input and producing, for every point, a feature expression in the class dimension. The points are then ranked by this expression, and those with stronger semantic features are selected to form the downsampled point set. By filtering out background and information weakly correlated with the target, the model adaptively retains the information that reflects the target's essential features. Finally, the losses of the branch networks are computed separately and trained in parallel with the backbone, optimizing the point cloud features while reducing model parameters. The method was tested on the ScanObjectNN dataset; the results show higher classification accuracy than PointMLP-elite, with a 1% gain in mAcc and a 0.8% gain in OA, approaching the performance of the SOTA model with far fewer parameters.
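As a reading aid, the downsampling mechanism described above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions: the module name `ClassATT`, the hidden width of 64, and the sampling ratio are placeholders, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ClassATT(nn.Module):
    """Shallow MLP branch: scores each point in the class dimension and
    keeps the highest-scoring points as the downsampled set."""
    def __init__(self, in_dim: int, num_classes: int, ratio: float = 0.5):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(in_dim, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, num_classes, 1),  # per-point response in the class dimension
        )
        self.ratio = ratio

    def forward(self, feats: torch.Tensor):
        # feats: (B, C, N) shallow point features from the backbone
        logits = self.mlp(feats)                 # (B, num_classes, N)
        scores = logits.max(dim=1).values        # strongest class response per point
        k = max(1, int(feats.shape[-1] * self.ratio))
        idx = scores.topk(k, dim=-1).indices     # keep the k most "semantic" points
        sampled = torch.gather(
            feats, 2, idx.unsqueeze(1).expand(-1, feats.shape[1], -1))
        return sampled, logits                   # logits also feed the branch loss
```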
LIANG Ao, LI Zhi-han, HUA Hai-yang. PointMLP-FD: a point cloud classification model based on multi-level adaptive downsampling[J]. Journal of Graphics, 2023, 44(1): 112-119.
Fig. 2 Model structure of PointMLP-FD. The backbone is the same as PointMLP; Class ATT is the proposed adaptive downsampling module, a shallow MLP that takes shallow point cloud features as input. Two branch networks are added to compute their losses separately, and these participate in network training together with the final classification error
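A minimal sketch of the joint objective implied by this caption, assuming each branch's per-point logits are average-pooled to an object-level prediction and weighted by a hypothetical `branch_weight` before being added to the backbone's classification loss:

```python
import torch.nn.functional as F

def joint_loss(final_logits, branch_logits_list, labels, branch_weight=0.5):
    # final_logits: (B, num_classes) from the backbone classifier
    loss = F.cross_entropy(final_logits, labels)
    for branch_logits in branch_logits_list:      # each: (B, num_classes, N)
        pooled = branch_logits.mean(dim=-1)       # pool per-point responses
        loss = loss + branch_weight * F.cross_entropy(pooled, labels)
    return loss
```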
| Method | Overall Acc (%) | Mean Acc (%) | Param (M) |
|---|---|---|---|
| PointNet | 63.0 | 58.1 | - |
| SpiderCNN | 68.2 | 63.4 | - |
| PointNet++ | 77.9 | 75.4 | 1.41 |
| DGCNN | 78.1 | 73.6 | - |
| PointCNN | 78.5 | 75.1 | - |
| GBNet | 80.5 | 77.8 | 8.39 |
| PRA-Net | 82.1 | 79.1 | - |
| Point-TnT | 83.5 | 81.0 | - |
| Point-BERT | 83.1 | - | 20.8 |
| PointMLP (SOTA) | 85.4±0.3 | 83.9±0.5 | 12.6 |
| PointMLP-elite | 83.8±0.6 | 81.8±0.8 | 0.68 |
| PointMLP-FD (Ours) | 85.15 | 83.64 | 0.77 |
Table 1 Experimental results
| Method | Bag | Bin | Box | Cabinet | Chair | Desk | Display | Door | Shelf | Table | Bed | Pillow | Sink | Sofa | Toilet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PointMLP-elite | 0.59 | 0.89 | 0.60 | 0.81 | 0.94 | 0.78 | 0.88 | 0.90 | 0.87 | 0.71 | 0.84 | 0.77 | 0.78 | 0.92 | 0.82 |
| Ours | 0.70 | 0.87 | 0.62 | 0.85 | 0.93 | 0.79 | 0.88 | 0.94 | 0.83 | 0.73 | 0.87 | 0.84 | 0.85 | 0.94 | 0.85 |
Table 2 Results for each category in the ScanObjectNN dataset
Fig. 6 Visualization of the sampling results of FPS (top) and Class ATT (bottom) downsampling. The red boxes show the points sampled by the two methods on the background of the first Chair sample ((a) Chair; (b) Table)
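For reference, the FPS baseline shown in the top row can be written naively as below; production implementations (e.g., the fused kernels used by PointNet++-style codebases) are batched, but the logic is the same.

```python
import torch

def farthest_point_sampling(xyz: torch.Tensor, k: int) -> torch.Tensor:
    # xyz: (N, 3) point coordinates; returns indices of k sampled points
    n = xyz.shape[0]
    idx = torch.zeros(k, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = int(torch.randint(n, (1,)))           # arbitrary starting point
    for i in range(k):
        idx[i] = farthest
        d = ((xyz - xyz[farthest]) ** 2).sum(dim=1)  # squared distance to newest pick
        dist = torch.minimum(dist, d)                # distance from each point to the set
        farthest = int(dist.argmax())                # next pick: farthest from the set
    return idx
```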
| Method | Overall Acc (%) | Avg Acc (%) | Param (M) |
|---|---|---|---|
| PointMLP-elite | 83.8±0.6 | 81.8±0.8 | 0.68 |
| PointMLP-eliteʹ | 82.41 | 79.57 | 0.31 |
| PointMLP-FD-se (Ours) | 84.04 | 82.27 | 0.33 |
Table 3 Experimental results of the extracted PointMLP-FD-se and PointMLP-eliteʹ
Fig. 7 Downsampling results after training the network with max pooling and average pooling, respectively ((a) average pooling; (b) max pooling)
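Assuming the comparison in Fig. 7 concerns how each point's class responses are reduced to a single ranking score, a hedged sketch of the two variants (an illustration, not the paper's code):

```python
import torch

def point_scores(logits: torch.Tensor, mode: str = "max") -> torch.Tensor:
    # logits: (B, num_classes, N) per-point class responses from the branch MLP
    if mode == "max":
        return logits.max(dim=1).values  # strongest single-class response per point
    return logits.mean(dim=1)            # average response across all classes
```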
[1] RAJ T, HASHIM F H, HUDDIN A B, et al. A survey on LiDAR scanning mechanisms[J]. Electronics, 2020, 9(5): 741.
[2] MEHENDALE N, NEOGE S. Review on lidar technology[EB/OL]. [2022-01-12]. https://www.researchgate.net/publication/342154967_Review_on_Lidar_Technology.
[3] FEI B, YANG W D, CHEN W M, et al. Comprehensive review of deep learning-based 3D point cloud completion processing and analysis[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.03311.
[4] ZAMANAKOS G, TSOCHATZIDIS L, AMANATIADIS A, et al. A comprehensive survey of LIDAR-based 3D object detection methods with deep learning for autonomous driving[J]. Computers & Graphics, 2021, 99: 153-181.
[5] QIAN R, LAI X, LI X R. 3D object detection for autonomous driving: a survey[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2106.10823.
[6] RORIZ R, CABRAL J, GOMES T. Automotive LiDAR technology: a survey[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(7): 6282-6297.
[7] LV D K, YING X X, CUI Y J, et al. Research on the technology of LIDAR data processing[C]//2017 First International Conference on Electronics Instrumentation & Information Systems. New York: IEEE Press, 2017: 1-5.
[8] BELLO S A, YU S S, WANG C, et al. Review: deep learning on 3D point clouds[J]. Remote Sensing, 2020, 12(11): 1729.
[9] SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]//2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[10] BAI S, BAI X, ZHOU Z C, et al. GIFT: a real-time and scalable 3D shape search engine[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 5023-5032.
[11] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//The 25th International Conference on Neural Information Processing Systems - Volume 1. New York: ACM, 2012: 1097-1105.
[12] SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]//2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[13] QI C R, SU H, NIEßNER M, et al. Volumetric and multi-view CNNs for object classification on 3D data[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 5648-5656.
[14] MATURANA D, SCHERER S. 3D convolutional neural networks for landing zone detection from LiDAR[C]//2015 IEEE International Conference on Robotics and Automation. New York: IEEE Press, 2015: 3471-3478.
[15] MATURANA D, SCHERER S. VoxNet: a 3D convolutional neural network for real-time object recognition[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York: IEEE Press, 2015: 922-928.
[16] WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1912-1920.
[17] WANG C, CHENG M, SOHEL F, et al. NormalNet: a voxel-based CNN for 3D object classification and retrieval[J]. Neurocomputing, 2019, 323: 139-147.
[18] RIEGLER G, ULUSOY A O, GEIGER A. OctNet: learning deep 3D representations at high resolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6620-6629.
[19] TATARCHENKO M, DOSOVITSKIY A, BROX T. Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs[C]//2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2107-2115.
[20] CHARLES R Q, HAO S, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 77-85.
[21] QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 5105-5114.
[22] LI J X, CHEN B M, LEE G H. SO-Net: self-organizing network for point cloud analysis[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 9397-9406.
[23] ZHAO H S, JIANG L, FU C W, et al. PointWeb: enhancing local neighborhood features for point cloud processing[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 5560-5568.
[24] ZHANG W X, XIAO C X. PCAN: 3D attention map learning using contextual information for point cloud based retrieval[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 12428-12437.
[25] KLOKOV R, LEMPITSKY V. Escape from cells: deep Kd-networks for the recognition of 3D point cloud models[C]//2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 863-872.
[26] WANG C, SAMARI B, SIDDIQI K. Local spectral graph convolution for point set feature learning[EB/OL]. [2022-01-12]. https://arxiv.org/abs/1803.05827.
[27] ZHAO H S, JIANG L, JIA J Y, et al. Point transformer[C]//2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 16239-16248.
[28] ENGEL N, BELAGIANNIS V, DIETMAYER K. Point transformer[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2011.00931.
[29] GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
[30] PANG Y T, WANG W X, TAY F E H, et al. Masked autoencoders for point cloud self-supervised learning[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.06604.
[31] ZHANG Y F, HU Q Y, XU G Q, et al. Not all points are equal: learning highly efficient point-based detectors for 3D LiDAR point clouds[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2203.11139.
[32] WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1912-1920.
[33] UY M A, PHAM Q H, HUA B S, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]//2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1588-1597.
[34] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[35] XU Y F, FAN T Q, XU M Y, et al. SpiderCNN: deep learning on point sets with parameterized convolutional filters[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 90-105.
[36] SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 770-779.
[37] WU W X, QI Z A, LI F X. PointConv: deep convolutional networks on 3D point clouds[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 9613-9622.
[38] QIU S, ANWAR S, BARNES N. Geometric back-projection network for point cloud classification[J]. IEEE Transactions on Multimedia, 2022, 24: 1943-1955.
[39] CHENG S L, CHEN X W, HE X W, et al. PRA-Net: point relation-aware network for 3D point cloud analysis[J]. IEEE Transactions on Image Processing, 2021, 30: 4436-4448.
[40] BERG A, OSKARSSON M, O'CONNOR M. Points to patches: enabling the use of self-attention for 3D shape recognition[EB/OL]. [2022-01-12]. https://arxiv.org/abs/2204.03957.
[41] YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud transformers with masked point modeling[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 19313-19322.