3D model reconstruction based on retrieval and deformation techniques

doi:10.11996/JG.j.2095-302X.2026020368

Abstract

As Virtual Reality (VR) and Augmented Reality (AR) technologies advance rapidly, the demand for high-quality 3D models has increased significantly. Traditional 3D modeling methods are slow and adapt poorly to complex shapes. To address this, a novel 3D model construction method based on 3D model retrieval and deformation was proposed. First, a 3D model retrieval framework based on semantic keypoints was constructed, in which sparse, semantically consistent geometric feature points were used to build a deformation-aware embedding space, enabling dynamic aggregation of global and local features. Adaptive Global-Channel Attention (AGCA) was also embedded into a Transformer to form a joint attention mechanism, enhancing the model’s expressiveness and retrieval accuracy. Then, for the retrieved models, a DGCNN-based keypoint-driven neural cage deformation algorithm was designed. A self-attention mechanism was used to calculate the influence weights of keypoints on vertices within local support regions. This established a deformation mapping between feature keypoints and the neural cage structure, driving neural cage deformation to achieve fine-grained, constrained shape control. Finally, the loss function was improved by incorporating Chamfer distance and Earth Mover's Distance (EMD) constraints, so that geometric details were aligned more accurately while local feature differences remained in focus, resulting in more precise 3D model reconstruction. Experiments were conducted on the PartNet and Scan2CAD datasets to compare the proposed method with existing networks such as U-RED, ShapeFlow, and KP-RED. The results demonstrated that the proposed method could effectively handle noise and occlusion. The average value of the loss function was reduced by 33.33% and 41.67% on the PartNet dataset. Moreover, on the Scan2CAD dataset, the average loss value was reduced by 3.6% compared with the baseline.
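
The abstract names Chamfer distance as one of the terms added to the loss. As a rough illustration of that term only, a symmetric Chamfer distance between two small point sets might be sketched as follows. This is not the authors' loss: the function name `chamfer_distance`, the squared-Euclidean form, and the mean reduction are assumptions made for the example, and the EMD term and any weighting are omitted.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(src: Sequence[Point], tgt: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (assumed variant: squared distances,
    averaged over each set, then summed over both directions)."""
    if not src or not tgt:
        raise ValueError("both point sets must be non-empty")
    # For every source point, distance to its nearest target point.
    src_to_tgt = sum(min(_sq_dist(p, q) for q in tgt) for p in src) / len(src)
    # For every target point, distance to its nearest source point.
    tgt_to_src = sum(min(_sq_dist(q, p) for p in src) for q in tgt) / len(tgt)
    return src_to_tgt + tgt_to_src


# Hypothetical toy data: a "deformed" shape and its target, two points each.
deformed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(deformed, target))  # 1.0
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a toy example but would normally be replaced by a batched or tree-based search on real point clouds.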

Key words: 3D model retrieval, deep learning, neural cage deformation, self-attention mechanism, loss function

PANG Min, LI Zhentang, ZHANG Yuan, CUI Xiaokang, XIONG Fengguang. 3D model reconstruction based on retrieval and deformation techniques[J]. Journal of Graphics, 2026, 47(2): 368-379.

Figures/Tables 13

References 20

[1]	ZHOU W, CANG M N, CHENG H Z. Research on the method of 3D image reconstruction for cultural relics based on AR technology[J]. Journal of Graphics, 2025, 46(2): 369-381 (in Chinese).
[2]	WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): 146.
[3]	WANG Y, ZHENG B W, ZHANG X. 3D model retrieval algorithm based on multimodal fusion[J]. Application Research of Computers, 2021, 38(3): 685-688, 695 (in Chinese).
[4]	ZHAO Y X, JIAO J C, LI N, et al. MANet: multimodal attention network based point-view fusion for 3D shape recognition[C]// The 25th International Conference on Pattern Recognition. New York: IEEE Press, 2021: 134-141.
[5]	SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 945-953.
[6]	GAO Z, LI Y M, WAN S H. Exploring deep learning for view-based 3D model retrieval[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2020, 16(1): 18.
[7]	QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]// The 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 5105-5114.
[8]	CHEN S Y, HE H. 3D face point cloud model reconstruction based on dynamic selection of feature points[J]. Application Research of Computers, 2024, 41(2): 629-634 (in Chinese).
[9]	SORKINE O, ALEXA M. As-rigid-as-possible surface modeling[C]// The 5th Eurographics Symposium on Geometry Processing. Goslar: Eurographics Association, 2007: 109-116.
[10]	PANG M, HE L G, XIONG F G, et al. Developing an image-based 3D model editing method[J]. IEEE Access, 2020, 8: 167950-167964.
[11]	DENG Y, YANG J L, TONG X. Deformed implicit field: modeling 3D shapes with learned dense correspondence[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10281-10291.
[12]	CHEN H L, LIU L, KANG X H, et al. Human body parameterized three-dimensional model deformation method based on neural network joint point estimation: CN202011046379.1[P]. 2021-01-15 (in Chinese).
[13]	DI Y, ZHANG C Y, ZHANG R D, et al. U-RED: unsupervised 3D shape retrieval and deformation for partial point clouds[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 8850-8861.
[14]	JIANG C M, HUANG J W, TAGLIASACCHI A, et al. ShapeFlow: learnable deformations among 3D shapes[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 817.
[15]	ZHANG R D, ZHANG C Y G, DI Y, et al. KP-RED: exploiting semantic keypoints for joint 3D shape retrieval and deformation[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 20540-20550.
[16]	QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 77-85.
[17]	LIU M Y, TUZEL O, VEERARAGHAVAN A, et al. Fast directional chamfer matching[C]// 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2010: 1696-1703.
[18]	MO K C, ZHU S L, CHANG A X, et al. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 909-918.
[19]	AVETISYAN A, DAHNERT M, DAI A, et al. Scan2CAD: learning CAD model alignment in RGB-D scans[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 2609-2618.
[20]	UY M A, KIM V G, SUNG M, et al. Joint learning of 3D shape retrieval and deformation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 11708-11717.

Method	Chair	Table	Cabinet	Average
Ref. [20]	0.638	0.629	0.688	0.637
U-RED^[13]	0.834	0.326	0.474	0.551
ShapeFlow^[14]	0.238	0.400	0.514	0.340
KP-RED^[15]	0.122	0.163	0.141	0.142
Ours	0.095	0.123	0.148	0.122

Method	Chair	Table	Cabinet	Average
Ref. [20]	0.158	0.190	0.676	0.210
U-RED^[13]	0.227	0.132	0.316	0.207
ShapeFlow^[14]	0.230	0.302	0.345	0.265
KP-RED^[15]	0.093	0.110	0.069	0.091
Ours	0.062	0.089	0.064	0.072

Method	Chair	Table	Cabinet	Average
U-RED^[13]	0.221	0.207	0.211	0.213
ShapeFlow^[14]	0.205	0.193	0.196	0.198
KP-RED^[15]	0.173	0.164	0.165	0.167
Ours	0.152	0.161	0.169	0.161
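
The last table above carries no caption. Assuming it reports the Scan2CAD results and that KP-RED is the baseline mentioned in the abstract (neither is stated in the table itself), the abstract's 3.6% reduction is consistent with the average column. The helper name `relative_reduction` is introduced here for illustration only.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline * 100.0


# Average-column values copied from the last table (assumed to be Scan2CAD).
kp_red_avg = 0.167  # KP-RED, assumed to be the baseline
ours_avg = 0.161    # proposed method

print(round(relative_reduction(kp_red_avg, ours_avg), 1))  # 3.6
```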