Journal of Graphics ›› 2026, Vol. 47 ›› Issue (2): 390-401. DOI: 10.11996/JG.j.2095-302X.2026020390
Received: 2025-09-02
Accepted: 2025-12-12
Published: 2026-04-30
Online: 2026-05-20
Corresponding author: DU Dong, E-mail: dongdu@njust.edu.cn
LIU Jinghao, YOU Zhenguo, DU Dong
Abstract:
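Creating three-dimensional models that are both manufacturable and editable with traditional computer-aided design (CAD) is a complex and time-consuming task. In recent years, deep learning has shown great potential for automating CAD model generation and has become an active research topic. However, most CAD generative models do not fully exploit the geometric and semantic information contained in inputs such as point clouds, images, and sketches, and therefore cannot precisely steer generation through flexible conditional inputs. To address this problem, the representational capacity of the latent space is exploited and a denoising diffusion probabilistic model (DDPM), guided by such conditional inputs, is used to generate CAD models in a directed manner. Specifically, a Transformer-based autoencoder is first built to encode CAD parametric command sequences into a latent space; a DDPM is then constructed in this space and fuses the encoded point cloud, image, or sketch condition to generate CAD feature vectors; finally, a decoder restores these vectors to three-dimensional CAD models. Experimental results show that the generated CAD models are structurally reasonable, with smooth surfaces and clear geometric features. Compared with existing methods, the approach achieves a good balance among shape diversity, distribution similarity, and fidelity, and conditioning on point clouds, images, or sketches each effectively improves the quality of the generated CAD models. The code has been released as open source.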
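The latent-space stage of the pipeline (noise a latent vector, train a denoiser to predict that noise under a condition embedding, then sample by reversing the process) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear noise schedule, the latent size, the toy `predict_noise` function, and every name below are hypothetical, and the Transformer autoencoder and condition encoders are left out.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed choice, common for DDPMs)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def q_sample(z0, t, alpha_bars, rng):
    """Forward process: z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    a = alpha_bars[t]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps

def predict_noise(zt, t, cond):
    """Hypothetical stand-in for a learned conditional denoiser eps_theta(z_t, t, c)."""
    return [0.1 * z + 0.01 * c for z, c in zip(zt, cond)]

def training_loss(z0, cond, alpha_bars, rng):
    """Simplified DDPM objective: mean squared error between true and predicted noise."""
    t = rng.randrange(len(alpha_bars))
    zt, eps = q_sample(z0, t, alpha_bars, rng)
    eps_hat = predict_noise(zt, t, cond)
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(eps)

def p_sample_loop(cond, betas, alpha_bars, rng):
    """Reverse process: start from Gaussian noise and denoise step by step."""
    z = [rng.gauss(0.0, 1.0) for _ in cond]
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(z, t, cond)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(zi - coef * e) / math.sqrt(1.0 - betas[t]) for zi, e in zip(z, eps_hat)]
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the final step
        z = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return z  # latent feature vector that a decoder would turn into a CAD sequence

rng = random.Random(0)
betas = linear_beta_schedule(50)
alpha_bars = cumulative_alphas(betas)
cond = [0.5] * 8  # hypothetical embedding of a point cloud, image, or sketch
latent = p_sample_loop(cond, betas, alpha_bars, rng)
```

In a real system `predict_noise` would be a trained network and `cond` the output of a modality-specific encoder; running diffusion on a compact latent vector rather than on the full command sequence is what keeps each sampling step small.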
LIU Jinghao, YOU Zhenguo, DU Dong. Conditional generation of CAD models based on latent diffusion models[J]. Journal of Graphics, 2026, 47(2): 390-401.
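| Command type | Parameters | Meaning |
|---|---|---|
| | | Start of a loop |
| | | Line endpoint |
| | | Arc endpoint |
| | | Sweep angle |
| | | Counterclockwise flag |
| | | Center |
| | | Radius |
| | | Sketch plane orientation |
| | | Sketch plane origin |
| | | Scale factor |
| | | Extrusion distance |
| | | Boolean type |
| | | Extrusion type |
| | | End of the whole sequence |
Table 1 Types of CAD commands and their corresponding parameters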
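Table 1 describes a loop start, line, arc, circle, and extrusion commands, plus an end-of-sequence marker. One way to hold such a sequence in code is sketched below. The token names (`SOL`, `L`, `A`, `R`, `E`, `EOS`) and the parameter keys are assumptions in the style of DeepCAD-like sketch-and-extrude sequences; they are not symbols taken from the table.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical token names; parameter keys mirror the "Meaning" column of Table 1.
PARAMS_BY_COMMAND: Dict[str, List[str]] = {
    "SOL": [],
    "L": ["end_x", "end_y"],
    "A": ["end_x", "end_y", "sweep_angle", "ccw_flag"],
    "R": ["center_x", "center_y", "radius"],
    "E": ["plane_orientation", "plane_origin", "scale",
          "extrude_distance", "boolean_type", "extrude_type"],
    "EOS": [],
}

@dataclass
class Command:
    kind: str
    params: Dict[str, object] = field(default_factory=dict)

    def __post_init__(self):
        expected = PARAMS_BY_COMMAND.get(self.kind)
        if expected is None:
            raise ValueError(f"unknown command type: {self.kind}")
        if sorted(self.params) != sorted(expected):
            raise ValueError(f"{self.kind} expects parameters {expected}")

def is_well_formed(seq: List[Command]) -> bool:
    """Check that curves sit inside loops, loops are extruded, and EOS comes last."""
    if not seq or seq[-1].kind != "EOS":
        return False
    open_loops = 0
    for cmd in seq[:-1]:
        if cmd.kind == "EOS":
            return False          # EOS may only appear at the very end
        if cmd.kind == "SOL":
            open_loops += 1
        elif cmd.kind == "E":
            if open_loops == 0:
                return False      # nothing to extrude
            open_loops = 0
        elif open_loops == 0:
            return False          # curve outside any loop
    return open_loops == 0        # every loop must have been extruded

# A circle of radius 1 extruded into a cylinder (illustrative values only).
cylinder = [
    Command("SOL"),
    Command("R", {"center_x": 0.0, "center_y": 0.0, "radius": 1.0}),
    Command("E", {"plane_orientation": (0.0, 0.0, 0.0), "plane_origin": (0.0, 0.0, 0.0),
                  "scale": 1.0, "extrude_distance": 2.0,
                  "boolean_type": "new_body", "extrude_type": "one_sided"}),
    Command("EOS"),
]
```

A fixed parameter list per command type is what lets a sequence be packed into a regular tensor for a Transformer encoder; unused slots would be padded.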
| Method | COV/%↑ | JSD↓ | MMD↓ |
|---|---|---|---|
| DeepCAD | 78.6 | 4.086 | 1.509 |
| SkexGen | 76.8 | 2.110 | 1.395 |
| BrepGen | 75.1 | 1.457 | 1.245 |
| FlexCAD | 76.5 | 2.625 | 1.532 |
| Ours | 79.1 | 3.051 | 1.348 |
Table 2 Shape generation performance of CAD models
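Table 2 reports coverage (COV), Jensen-Shannon divergence (JSD), and minimum matching distance (MMD). The page does not spell out how they are computed; the sketch below assumes the definitions commonly used for 3D generative models, with shapes represented as sampled point clouds and compared by Chamfer distance. The function names and the tiny data set are illustrative only.

```python
import math

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (lists of coordinate tuples)."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)

def coverage(generated, reference, dist=chamfer):
    """Fraction of reference shapes that are the nearest neighbour of some generated shape."""
    matched = set()
    for g in generated:
        nearest = min(range(len(reference)), key=lambda i: dist(g, reference[i]))
        matched.add(nearest)
    return len(matched) / len(reference)

def minimum_matching_distance(generated, reference, dist=chamfer):
    """Average, over reference shapes, of the distance to the closest generated shape."""
    return sum(min(dist(g, r) for g in generated) for r in reference) / len(reference)

def jensen_shannon_divergence(p, q):
    """Base-2 JSD between two discrete distributions given as equal-length lists."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two single-point "shapes" per set, just to exercise the functions.
reference = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]]
generated = [[(0.0, 0.0, 0.0)], [(0.1, 0.0, 0.0)]]
cov = coverage(generated, reference)
mmd = minimum_matching_distance(generated, reference)
```

In this toy case both generated shapes sit near the first reference shape, so coverage is low even though one match is exact. That is the trade-off Table 2 reflects: COV rewards diversity, MMD rewards fidelity, and JSD compares the overall spatial distributions.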
| Method | Inference time/ms |
|---|---|
| DeepCAD | 2.88 |
| SkexGen | 49.55 |
| BrepGen | 7164.12 |
| FlexCAD | 9485.88 |
| Ours | 6.43 |
Table 3 Comparison of inference speed for unconditional generation
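The measurement protocol behind Table 3 is not described on this page. The sketch below shows one common way to obtain a per-sample time in milliseconds, with warm-up runs excluded and the mean taken over repeated calls. `generate` is a hypothetical placeholder for a model's sampling function.

```python
import time

def mean_inference_ms(generate, warmup=3, runs=20):
    """Mean wall-clock time of generate() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        generate()
    start = time.perf_counter()
    for _ in range(runs):
        generate()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / runs

def speed_ratio(baseline_ms, ours_ms):
    """How many times longer the baseline takes than the proposed method."""
    return baseline_ms / ours_ms

# Ratios computed from the values reported in Table 3.
ratio_brepgen = speed_ratio(7164.12, 6.43)
ratio_flexcad = speed_ratio(9485.88, 6.43)
ratio_skexgen = speed_ratio(49.55, 6.43)
ratio_deepcad = speed_ratio(2.88, 6.43)
```

With the reported numbers, BrepGen and FlexCAD take roughly 1,100 and 1,500 times as long as the proposed method and SkexGen about 7.7 times as long, while DeepCAD remains faster at a little under half the time.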
| Method | Point cloud ACC_cmd | Point cloud ACC_param | Image ACC_cmd | Image ACC_param | Sketch ACC_cmd | Sketch ACC_param |
|---|---|---|---|---|---|---|
| DeepCAD | 74.91 | 61.22 | 63.15 | 52.04 | 61.96 | 47.36 |
| Ours | 86.54 | 71.80 | 76.98 | 65.86 | 68.19 | 56.02 |
Table 4 Comparison of CAD model reconstruction metrics under different conditions
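Table 4 reports command accuracy (ACC_cmd) and parameter accuracy (ACC_param). Assuming the definitions introduced with DeepCAD, ACC_cmd is the share of sequence positions whose command type is predicted correctly, and ACC_param is the share of quantized parameters that fall within a tolerance of the ground truth, counted only over correctly predicted commands. The tolerance value and the data layout below are assumptions, not details given on this page.

```python
def command_accuracy(true_cmds, pred_cmds):
    """Share of positions where the predicted command type matches the ground truth."""
    if len(true_cmds) != len(pred_cmds):
        raise ValueError("sequences must have the same length")
    hits = sum(t == p for t, p in zip(true_cmds, pred_cmds))
    return hits / len(true_cmds)

def parameter_accuracy(true_seq, pred_seq, tolerance=3):
    """Share of parameters within `tolerance` quantization levels.

    true_seq and pred_seq are lists of (command_type, [quantized parameters]).
    Positions whose command type is wrong are skipped entirely.
    """
    total = correct = 0
    for (t_cmd, t_params), (p_cmd, p_params) in zip(true_seq, pred_seq):
        if t_cmd != p_cmd:
            continue
        for t, p in zip(t_params, p_params):
            total += 1
            correct += abs(t - p) < tolerance
    return correct / total if total else 0.0

# Illustrative sequences: the second command is mispredicted as an arc.
truth = [("L", [10, 20]), ("R", [5, 5, 8]), ("E", [1])]
guess = [("L", [11, 25]), ("A", [5, 5, 8, 0]), ("E", [1])]
acc_cmd = command_accuracy([c for c, _ in truth], [c for c, _ in guess])
acc_param = parameter_accuracy(truth, guess)
```

Because ACC_param ignores mispredicted commands, it should be read together with ACC_cmd: a model can score well on parameters while getting few command types right.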
Fig. 7 CAD generation results based on different conditional data ((a) Generation results based on point clouds; (b) Generation results based on images; (c) Generation results based on sketches)