Journal of Graphics, 2025, Vol. 46, Issue (6): 1337-1345. DOI: 10.11996/JG.j.2095-302X.2025061337
• Computer Graphics and Virtual Reality •
LIU Yuanyuan1,2, FANG Youjiang1,2, MENG Tianyu1,2, MENG Zhengyu1,2, LUO Pengwei1,2, YANG Peigen1,2, JIANG Yutong3, WEI Xiaopeng1,2, ZHANG Qiang1,2, YANG Xin1,2
Received: 2024-10-09
Accepted: 2025-04-15
Online: 2025-12-30
Published: 2025-12-27
Contact: YANG Xin
About author: First author: LIU Yuanyuan (1999-), PhD candidate. Her main research interests are computer graphics and scene semantic understanding. E-mail: Lyy990415@gmail.com
LIU Yuanyuan, FANG Youjiang, MENG Tianyu, MENG Zhengyu, LUO Pengwei, YANG Peigen, JIANG Yutong, WEI Xiaopeng, ZHANG Qiang, YANG Xin. Geometry hypergraph aware 3D scene graph generation[J]. Journal of Graphics, 2025, 46(6): 1337-1345.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025061337
| Model | Object class R@5 | Object class R@10 | Predicate R@3 | Predicate R@5 | Relationship R@50 | Relationship R@100 |
|---|---|---|---|---|---|---|
| PointNet† | 63.39 | 74.54 | 89.07 | 96.03 | 50.05 | 55.73 |
| MSDN† | 61.07 | 72.41 | 85.99 | 93.60 | 46.55 | 53.20 |
| KERN† | 66.58 | 76.52 | 90.13 | 96.61 | 51.36 | 58.49 |
| 3DSSG | 66.41 | 77.26 | 82.58 | 94.34 | 51.16 | 56.48 |
| BGNN† | 71.19 | 81.98 | 86.98 | 93.80 | 55.20 | 60.85 |
| SGFormer | 70.66 | 80.98 | 83.98 | 91.82 | 56.20 | 60.75 |
| CSGG | 73.40 | 82.59 | 89.90 | 96.10 | 61.94 | 68.24 |
| Ours | 75.68 | 82.97 | 90.96 | 97.41 | 63.55 | 69.72 |
Table 1 Quantitative comparison on 3DSSG datasets
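
The margins in Table 1 can be read off with simple arithmetic. The sketch below uses only the "Ours" and CSGG rows copied from the table and computes the absolute gain in percentage points per metric. The metric keys and variable names are labels chosen here for illustration, not identifiers from the paper.

```python
# Rows copied from Table 1; keys are illustrative labels, not names from the paper.
METRICS = ["obj_R@5", "obj_R@10", "pred_R@3", "pred_R@5", "rel_R@50", "rel_R@100"]
csgg = [73.40, 82.59, 89.90, 96.10, 61.94, 68.24]
ours = [75.68, 82.97, 90.96, 97.41, 63.55, 69.72]

# Absolute gain in percentage points, rounded to two decimals to hide float noise.
gains = {m: round(o - c, 2) for m, o, c in zip(METRICS, ours, csgg)}

for metric, gain in gains.items():
    print(f"{metric}: +{gain:.2f}")
```

On these numbers the largest gain over CSGG is 2.28 points (object class R@5) and the smallest is 0.38 points (object class R@10).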
| Model | Object class mR@10 | Predicate mR@5 | Relationship mR@100 |
|---|---|---|---|
| MSDN† | 35.51 | 62.10 | 50.17 |
| KERN† | 35.89 | 61.97 | 49.14 |
| 3DSSG | 34.43 | 63.93 | 52.21 |
| BGNN† | 41.79 | 58.98 | 54.21 |
| SGFormer | 42.37 | 47.59 | 53.52 |
| CSGG | 45.18 | 64.16 | 61.50 |
| Ours | 45.81 | 65.22 | 63.17 |
Table 2 mR comparison on 3DSSG datasets
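
Tables 1 and 2 report R@K and mR@K without defining them on this page. As a rough sketch, assume the usual top-K convention for classification: a sample counts as a hit when its ground-truth label is among the K highest-scoring classes, and mR@K is the unweighted mean of per-class recall. The function names and toy data below are hypothetical, and the paper's exact evaluation protocol, particularly for relationship triplets, may differ.

```python
from collections import defaultdict

def top_k_classes(row, k):
    """Indices of the k highest-scoring classes for one sample."""
    return sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]

def recall_at_k(scores, labels, k):
    """Fraction of samples whose true label is among the top-k classes."""
    hits = sum(label in top_k_classes(row, k) for row, label in zip(scores, labels))
    return hits / len(labels)

def mean_recall_at_k(scores, labels, k):
    """Unweighted mean over classes of the per-class recall@k."""
    per_class = defaultdict(lambda: [0, 0])  # class -> [hits, count]
    for row, label in zip(scores, labels):
        per_class[label][0] += label in top_k_classes(row, k)
        per_class[label][1] += 1
    return sum(h / n for h, n in per_class.values()) / len(per_class)

# Toy data (hypothetical): class 0 is frequent, classes 1 and 2 are rare.
scores = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.15, 0.05],
    [0.5, 0.4, 0.1],  # true class 1, ranked second
    [0.6, 0.3, 0.1],  # true class 2, ranked last
]
labels = [0, 0, 0, 1, 2]

r1 = recall_at_k(scores, labels, k=1)
mr1 = mean_recall_at_k(scores, labels, k=1)
print(f"R@1 = {r1:.2f}, mR@1 = {mr1:.2f}")
```

In this toy case R@1 is 0.60 while mR@1 is about 0.33, because the frequent class dominates the plain average. That gap is the usual reason for reporting mR@K alongside R@K.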
| Model | VP | Modules checked (of MsP, GL, HL) | Object class R@5 | Object class R@10 | Predicate R@3 | Predicate R@5 | Relationship R@50 | Relationship R@100 |
|---|---|---|---|---|---|---|---|---|
| M0 | Union | √ | 61.07 | 72.41 | 85.99 | 93.60 | 46.55 | 53.20 |
| M1 | InS | √ | 61.62 | 73.02 | 83.72 | 92.99 | 46.98 | 53.92 |
| M2 | I+U | √ | 69.77 | 79.05 | 91.62 | 95.79 | 61.05 | 65.44 |
| M3 | InS | √ √ | 73.40 | 82.59 | 89.90 | 96.10 | 61.94 | 68.24 |
| M4 | Union | √ √ | 71.40 | 81.98 | 87.39 | 93.64 | 60.82 | 66.43 |
| M5 | InS | √ √ √ | 73.25 | 81.67 | 90.43 | 96.97 | 62.94 | 69.24 |
| M6 | Union | √ √ √ | 73.27 | 81.82 | 87.63 | 94.92 | 60.41 | 66.76 |
| M7 | I+U | √ √ √ | 75.68 | 82.97 | 90.96 | 97.41 | 63.55 | 69.72 |
Table 3 Ablation experiment