Point cloud classification model incorporating external attention and graph convolution

doi:10.11996/JG.j.2095-302X.202306116

Journal of Graphics ›› 2023, Vol. 44 ›› Issue (6): 1162-1172. DOI: 10.11996/JG.j.2095-302X.202306116

• Image Processing and Computer Vision •

ZHOU Rui-chuang, TIAN Jin, YAN Feng-ting, ZHU Tian-xiao, ZHANG Yu-jin

School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Received: 2023-06-15 Accepted: 2023-09-20 Online: 2023-12-31 Published: 2023-12-17
Contact: TIAN Jin (1982-), female, associate professor, Ph.D. Her main research interests cover large-scale numerical computing, computational electromagnetics, and machine learning. E-mail: jintian0120@foxmail.com
About author:
ZHOU Rui-chuang (1997-), male, master's student. His main research interests cover computer graphics and deep learning. E-mail: m18916835630@163.com
Supported by:
Key Project of the Civil Aviation Joint Fund of the National Natural Science Foundation of China (U2033218)

Abstract

The unordered and unstructured nature of point cloud data makes it difficult to extract local features adequately. To address this problem, a point cloud classification model fusing external attention and graph convolution is proposed. First, the point cloud is organized into local directed graphs. Graph convolution fused with external attention is then used for feature extraction, capturing richer and more representative local features. Next, residual structures are introduced to build a deeper network and to fuse feature information from different levels, which improves network performance. Finally, the point cloud data, which has a tree-like hierarchical structure, is mapped into a hyperbolic space with negative curvature to strengthen its representational capacity, and the embedding is computed in hyperbolic space to obtain the final classification result. Experiments on the standard public datasets ModelNet40 and ScanObjectNN show that the model reaches overall classification accuracies of 93.8% and 82.8%, respectively. Compared with current mainstream high-performance models, overall accuracy improves by 0.3% to 4.9%, and the model exhibits strong robustness.

Key words: deep learning, point cloud classification, external attention, hyperbolic space, graph convolution
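The pipeline summarized in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names, the DGCNN-style edge feature `[x_i, x_j - x_i]`, the memory size of 16, and the choice of the Poincaré ball (with its exponential map at the origin) as the hyperbolic model are all assumptions made for illustration; the abstract only states that a local directed graph, external attention, residual connections, and a negatively curved hyperbolic space are used. The external attention step follows the two-memory formulation with double normalization from reference [18].

```python
import numpy as np

def knn_directed_graph(points, k):
    """Indices (N, k) of the k nearest neighbours of every point.
    Edges i -> j are directed: j being near i does not imply the reverse."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist2, np.inf)          # a point is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]

def edge_features(points, neighbors):
    """Assumed DGCNN-style edge features [x_i, x_j - x_i], shape (N, k, 2C)."""
    center = np.repeat(points[:, None, :], neighbors.shape[1], axis=1)
    return np.concatenate([center, points[neighbors] - center], axis=-1)

def external_attention(feats, mem_k, mem_v, eps=1e-9):
    """External attention with two shared memories (see reference [18]).
    feats: (N, d), mem_k: (S, d), mem_v: (S, d_out)."""
    attn = feats @ mem_k.T                                  # (N, S)
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)           # softmax over points
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)   # L1 norm over memory slots
    return attn @ mem_v

def expmap0(v, c=1.0, eps=1e-9):
    """Exponential map at the origin of a Poincare ball with curvature -c
    (assumed hyperbolic model): Euclidean vector -> point inside the ball."""
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(np.sqrt(c) * norm) * v / (np.sqrt(c) * norm)

rng = np.random.default_rng(0)
pts = rng.normal(size=(128, 3))              # toy point cloud
nbrs = knn_directed_graph(pts, k=20)         # K = 20 neighbours per point
edges = edge_features(pts, nbrs)             # (128, 20, 6)
local = edges.max(axis=1)                    # max-pool over each neighbourhood
mem_k = rng.normal(size=(16, 6))             # hypothetical memory units
mem_v = rng.normal(size=(16, 6))
fused = local + external_attention(local, mem_k, mem_v)   # residual connection
embedding = expmap0(fused.max(axis=0))       # global feature in the Poincare ball
```

In a trained network the memories `mem_k` and `mem_v` would be learnable linear layers and the edge features would pass through shared MLPs before pooling; random values are used here only to keep the sketch runnable.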

CLC Number:

TP391.4

ZHOU Rui-chuang, TIAN Jin, YAN Feng-ting, ZHU Tian-xiao, ZHANG Yu-jin. Point cloud classification model incorporating external attention and graph convolution[J]. Journal of Graphics, 2023, 44(6): 1162-1172.

Figures and Tables (19)

Fig. 1 Graph convolution feature extraction structure

Fig. 2 Construction of the local graph structure

Fig. 3 Self-attention mechanism

Fig. 4 Spatial structure diagrams ((a) Euclidean space: mesh-like graph; (b) Hyperbolic space: tree structure)

Fig. 5 Overall architecture of the point cloud classification model

Fig. 6 Attention graph convolution module

Fig. 7 External attention

Fig. 8 Hyperbolic space regularization network module

Table 1 Experimental configuration

Component	Model / version
Operating system	Ubuntu 18.04 (Linux)
CPU	Intel Core i5-12400F
RAM	16 GB
GPU	RTX 3090
CUDA	10.1
Python	3.7
PyTorch	1.6

Fig. 9 Visualization of 3D point cloud models

Table 2 Classification accuracy of different models on the ModelNet40 dataset

Method	Input	A_oacc (%)	A_macc (%)
MVCNN	Views	90.1	-
VoxNet	Voxels	85.5	82.8
PointNet	Points	90.0	84.3
PointNet++	Points	91.7	-
PointASNL [30]	Points	92.8	-
PointASNL [30]	Points+Normal	93.2	-
PointConv [31]	Points	92.4	-
PCT	Points	93.2	-
SpiderCNN [32]	Points+Normal	92.4	-
PointCNN [33]	Points	92.2	88.1
DGCNN	Points	92.6	89.8
Point Transformer	Points	92.8	-
LFT-Net [34]	Points+Normal	93.2	89.7
DTNet [35]	Points	92.9	90.4
Ours	Points	93.8	90.7
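The two accuracy columns can differ noticeably when classes are imbalanced. The sketch below assumes that A_oacc denotes overall accuracy (fraction of all samples classified correctly) and A_macc denotes mean per-class accuracy (average of the per-class recalls), which is the usual convention on ModelNet40 and ScanObjectNN; the page itself does not define the symbols. The function name and the toy labels are hypothetical.

```python
import numpy as np

def overall_and_mean_class_accuracy(y_true, y_pred, num_classes):
    """Return (overall accuracy, mean per-class accuracy) as floats in [0, 1]."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    correct = y_true == y_pred
    overall = correct.mean()
    # Average the accuracy of each class that actually occurs in y_true,
    # so rare classes weigh as much as frequent ones.
    per_class = [correct[y_true == c].mean()
                 for c in range(num_classes) if np.any(y_true == c)]
    return float(overall), float(np.mean(per_class))

# Toy example: class 0 is frequent and easy, classes 1 and 2 are rare and harder.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 0, 2, 1]
oa, macc = overall_and_mean_class_accuracy(y_true, y_pred, num_classes=3)
print(f"overall accuracy = {oa:.3f}, mean class accuracy = {macc:.3f}")
```

In this toy case six of eight samples are correct, so the overall accuracy is 0.75, while the per-class accuracies are 1.0, 0.5, and 0.5, giving a mean class accuracy of about 0.667.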

Table 3 Classification accuracy of different models on the ScanObjectNN dataset

Method	Input	A_oacc (%)	A_macc (%)
PointNet	Points	68.2	63.4
PointNet++	Points	77.9	75.4
DGCNN	Points	78.1	73.6
PointCNN	Points	78.5	75.1
DRNet [36]	Points	80.3	78.0
Ours	Points	82.8	80.4

Table 4 Ablation study of different modules

Method	GraphConv	Residual	External Attention	HSRN	A_oacc (%)	A_macc (%)
A	√	×	×	×	92.6	88.6
B	√	√	×	×	92.8	89.8
C	√	√	√	×	93.3	90.2
D	√	√	√	√	93.8	90.7

Table 5 Results with different numbers of convolutional layers (%)

Layer	A_oacc	A_macc
1	92.8	89.8
2	93.8	90.7
3	93.6	90.2

Table 6 Results for different values of K (%)

K	A_oacc	A_macc
10	92.6	87.8
15	92.9	89.7
20	93.8	90.7
25	93.2	89.8
30	92.8	89.3

Fig. 10 Influence of the K value on classification accuracy

Fig. 11 Visualization of sparse point clouds ((a) Original point cloud; (b) 25% fewer sampling points; (c) 50% fewer sampling points; (d) 75% fewer sampling points)

Fig. 12 Influence of sampling point density on classification accuracy

Fig. 13 Influence of Gaussian noise on classification accuracy

References (36)

[1]	LI Y, IBANEZ-GUZMAN J. Lidar for autonomous driving: the principles, challenges, and trends for automotive lidar and perception systems[J]. IEEE Signal Processing Magazine, 2020, 37(4): 50-61.
[2]	XIONG J H, HSIANG E L, HE Z Q, et al. Augmented reality and virtual reality displays: emerging technologies and future perspectives[J]. Light: Science & Applications, 2021, 10(1): 216.
[3]	WANG J Y, CHEN J L, SUN Y C, et al. RobOT: robustness-oriented testing for deep learning systems[C]// 2021 IEEE/ACM 43rd International Conference on Software Engineering. New York: IEEE Press, 2021: 300-311.
[4]	GU J X, WANG Z H, KUEN J, et al. Recent advances in convolutional neural networks[J]. Pattern Recognition, 2018, 77: 354-377.
[5]	SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[6]	MATURANA D, SCHERER S. VoxNet: a 3D Convolutional Neural Network for real-time object recognition[C]// 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York: IEEE Press, 2015: 922-928.
[7]	QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 652-660.
[8]	QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 5105-5114.
[9]	WANG Y E, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): 1-12.
[10]	ZHANG K G, HAO M, WANG J, et al. Linked dynamic graph cnn: learning on point cloud via linking hierarchical features[EB/OL]. [2023-02-06]. https://arxiv.org/abs/1904.10014.
[11]	LI W G, CHEN T, TIAN Z Q. Point cloud classification and segmentation based on twin adaptive graph convolution algorithm[EB/OL]. Journal of Computer Applications. [2023-02-24]. https://kns.cnki.net/kcms/detail//51.1307.TP.20230223.1405.010.html (in Chinese).
[12]	LIU B, FAN Y C. A point cloud classification model based on improved dynamic graph convolution[J]. China Sciencepaper, 2022, 17(11): 1230-1235, 1266 (in Chinese).
[13]	LIANG A, LI Z H, HUA H Y. PointMLP-FD: a point cloud classification model based on multi-level adaptive downsampling[J]. Journal of Graphics, 2023, 44(1): 112-119 (in Chinese).
[14]	GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
[15]	ZHAO H S, JIANG L, JIA J Y, et al. Point transformer[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 16259-16268.
[16]	LIANG Z H, WANG F. A PointNet attention-weighted feature aggregation network for part segmentation[J]. Application Research of Computers, 2023, 40(05): 1571-1576, 1582 (in Chinese).
[17]	LIU Y Z, LI N, TAO Z Y. Point cloud classification and segmentation based on ring query and channel attention[J]. Journal of Graphics, 2022, 43(4): 616-623 (in Chinese).
[18]	GUO M H, LIU Z N, MU T J, et al. Beyond self-attention: external attention using two linear layers for visual tasks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(5): 5436-5447.
[19]	KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. [2023-02-13]. https://arxiv.org/abs/1609.02907.pdf.
[20]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010.
[21]	WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7794-7803.
[22]	SHEN L, YANG J Z, ZHOU G Q, et al. A point cloud classification segmentation model with integrated self-attentive and edge convolution[J/OL]. Computer Engineering and Applications. [2023-02-24]. https://kns.cnki.net/kcms/detail/11.2127.TP.20221019.1414.006.html (in Chinese).
[23]	CHAMBERLAIN B P, CLOUGH J, DEISENROTH M P. Neural embeddings of graphs in hyperbolic space[EB/OL]. [2023-02-12]. https://arxiv.org/abs/1705.10359.pdf.
[24]	TIFREA A, BÉCIGNEUL G, GANEA O E. Poincaré GloVe: hyperbolic word embeddings[EB/OL]. [2023-02-12]. https://arxiv.org/abs/1810.06546.pdf.
[25]	CHAMI I, YING R, RÉ C, et al. Hyperbolic graph convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2019, 32: 4869-4880.
[26]	KHRULKOV V, MIRVAKHABOVA L, USTINOVA E, et al. Hyperbolic image embeddings[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 6418-6428.
[27]	BOTTOU L. Stochastic gradient descent tricks[M]// Neural Networks:Tricks of the Trade. Heidelberg: Springer, 2012: 421-436.
[28]	LOSHCHILOV I, HUTTER F. SGDR: stochastic gradient descent with warm restarts[EB/OL]. [2023-02-13]. https://arxiv.org/abs/1608.03983.pdf.
[29]	UY M A, PHAM Q H, HUA B S, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1588-1597.
[30]	YAN X, ZHENG C D, LI Z, et al. PointASNL: robust point clouds processing using nonlocal neural networks with adaptive sampling[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 5589-5598.
[31]	WU W X, QI Z A, LI F X. PointConv: deep convolutional networks on 3D point clouds[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 9621-9630.
[32]	XU Y F, FAN T Q, XU M Y, et al. SpiderCNN: deep learning on point sets with parameterized convolutional filters[C]// European Conference on Computer Vision. Cham: Springer, 2018: 90-105.
[33]	LI Y Y, BU R, SUN M C, et al. PointCNN: convolution on X-transformed points[EB/OL]. [2023-02-13]. https://arxiv.org/abs/1801.07791.pdf.
[34]	GAO Y B, LIU X B, LI J, et al. LFT-net: local feature transformer network for point clouds analysis[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 24(2): 2158-2168.
[35]	HAN X F, JIN Y F, CHENG H X, et al. Dual transformer for point cloud analysis[EB/OL]. [2023-02-13]. https://ieeexplore.ieee.org/document/9855233.
[36]	QIU S, ANWAR S, BARNES N. Dense-resolution network for point cloud classification and segmentation[C]// 2021 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2021: 3813-3822.

