| [1] |
CHANG X J, REN P Z, XU P F, et al. A comprehensive survey of scene graphs: generation and application[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 1-26.
|
| [2] |
JOHNSON J, KRISHNA R, STARK M, et al. Image retrieval using scene graphs[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 3668-3678.
|
| [3] |
WU S C, WALD J, TATENO K, et al. SceneGraphFusion: incremental 3D scene graph prediction from RGB-D sequences[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 7511-7521.
|
| [4] |
WU S C, TATENO K, NAVAB N, et al. Incremental 3D semantic scene graph prediction from RGB sequences[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 5064-5074.
|
| [5] |
LU Y G, HU Y, FENG H Y, et al. Generating reconstructable collaborative virtual environments via graph matching for mixed reality remote collaboration[J]. The Visual Computer, 2025, 41(8): 5935-5947.
|
| [6] |
DAHNERT M, HOU J, NIEßNER M, et al. Panoptic 3D scene reconstruction from a single RGB image[C]// The 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2021: 633.
|
| [7] |
WALD J, DHAMO H, NAVAB N, et al. Learning 3D semantic scene graphs from 3D indoor reconstructions[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3960-3969.
|
| [8] |
WALD J, NAVAB N, TOMBARI F. Learning 3D semantic scene graphs with instance embeddings[J]. International Journal of Computer Vision, 2022, 130(3): 630-651.
|
| [9] |
KOCH S, HERMOSILLA P, VASKEVICIUS N, et al. SGRec3D: self-supervised 3D scene graph learning via object-level scene reconstruction[C]// 2024 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2024: 3392-3402.
|
| [10] |
ARMENI I, HE Z Y, ZAMIR A, et al. 3D scene graph: a structure for unified semantics, 3D space, and camera[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 5663-5672.
|
| [11] |
HUGHES N, CHANG Y, CARLONE L. Hydra: a real-time spatial perception system for 3D scene graph construction and optimization[EB/OL]. [2025-08-23]. https://dblp.org/db/conf/rss/rss2022.html#HughesCC22.
|
| [12] |
KOCH S, HERMOSILLA P, VASKEVICIUS N, et al. Lang3DSG: language-based contrastive pre-training for 3D scene graph prediction[C]// 2024 International Conference on 3D Vision. New York: IEEE Press, 2024: 1037-1047.
|
| [13] |
RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[EB/OL]. [2025-08-23]. http://proceedings.mlr.press/v139/radford21a.html.
|
| [14] |
LV C S, QI M S, LI X, et al. SGFormer: semantic graph transformer for point cloud-based 3D scene graph generation[C]// The 38th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2024: 4035-4043.
|
| [15] |
CHANG H N, KOWNDINYA B, LU S Y, et al. Context-aware entity grounding with open vocabulary 3D scene graphs[C]// The 7th Conference on Robot Learning. New York: PMLR Press, 2023: 1950-1974.
|
| [16] |
REIMERS N, GUREVYCH I. Sentence-BERT: sentence embeddings using Siamese BERT-networks[EB/OL]. [2025-08-23]. https://aclanthology.org/D19-1410/.
|
| [17] |
KOCH S, VASKEVICIUS N, COLOSI M, et al. Open3DSG: open-vocabulary 3D scene graphs from point clouds with queryable objects and open-set relationships[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 14183-14193.
|
| [18] |
GHIASI G, GU X Y, CUI Y, et al. Scaling open-vocabulary image segmentation with image-level labels[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 540-557.
|
| [19] |
LI J N, LI D X, XIONG C M, et al. BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation[EB/OL]. [2025-08-23]. https://proceedings.mlr.press/v162/li22n.html.
|
| [20] |
LI J N, LI D X, SAVARESE S, et al. BLIP-2: bootstrapping language-image pre-training with frozen image encoders and large language models[EB/OL]. [2025-08-23]. https://proceedings.mlr.press/v202/li23q.html.
|
| [21] |
DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2025-08-23]. https://aclanthology.org/N19-1423/.
|
| [22] |
CHEN L G X, WANG X J, LU J L, et al. CLIP-driven open-vocabulary 3D scene graph generation via cross-modality contrastive learning[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 27863-27873.
|
| [23] |
WANG Z Q, CHENG B W, ZHAO L C, et al. VL-SAT: visual-linguistic semantics assisted training for 3D semantic scene graph prediction in point cloud[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 21560-21569.
|
| [24] |
QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[EB/OL]. [2025-08-23]. https://proceedings.neurips.cc/paper_files/paper/2017/file/d8bf84be3800d12f74d8b05e9b89836f-Paper.pdf.
|
| [25] |
ARMENI I, SENER O, ZAMIR A R, et al. 3D semantic parsing of large-scale indoor spaces[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 1534-1543.
|
| [26] |
ZHAO L, TAO W B. JSNet: joint instance and semantic segmentation of 3D point clouds[C]// The 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 12951-12958.
|