
图学学报 ›› 2026, Vol. 47 ›› Issue (1): 111-119. DOI: 10.11996/JG.j.2095-302X.2026010111

• 图像处理与计算机视觉 •

基于特征点引导干扰物识别的神经辐射场重建

任皓, 李少波, 弓茂, 王博

  1. 内蒙古科技大学自动化与电气学院,内蒙古 包头 014010
  • 收稿日期:2025-05-30 接受日期:2025-09-08 出版日期:2026-02-28 发布日期:2026-03-16
  • 通讯作者:李少波,E-mail:12965874@qq.com
  • 基金资助:
    内蒙古自然科学基金(2022LHMS06002)

Neural radiance field reconstruction based on feature point-guided distractor identification

REN Hao, LI Shaobo, GONG Mao, WANG Bo

  1. School of Automation and Electrical Engineering, Inner Mongolia University of Science and Technology, Baotou 014010, Inner Mongolia, China
  • Received:2025-05-30 Accepted:2025-09-08 Published:2026-02-28 Online:2026-03-16
  • Supported by:
    Inner Mongolia Natural Science Foundation(2022LHMS06002)

摘要:

针对神经辐射场(NeRF)在干扰物体影响下难以实现高质量三维重建的问题,提出一种基于运动恢复结构(SfM)与分割一切模型(SAM)协同优化的方法。以SfM重建过程中的SIFT算法为基础,利用动态场景中的几何不一致性进行特征点的识别与匹配,将未匹配的特征点视为动态干扰物,并以其作为点提示引导支持点引导分割的SAM模型完成动态遮挡物分割,生成静态场景掩膜。基于分割结果,使用掩膜感知体积渲染技术预测颜色,并建立由重建损失、结构一致性损失、对抗损失和自监督修补损失组成的四重损失函数。通过联合优化目标的方式约束被修补区域的颜色输出,经多次迭代训练后,实现多视角下被遮挡区域几何结构与外观的一致性修复,在保证辐射场完整性的同时实现遮挡物的消除。在公开动态场景数据集上的验证表明,采用掩膜体积渲染与联合优化后的重建效果相较于基线模型和主流遮挡物消除方法,峰值信噪比(PSNR)平均提升5.24 dB,学习感知图像块相似度(LPIPS)降低35%,为复杂动态环境下的三维重建提供了新范式。
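The feature-point-guided distractor step described above can be sketched as follows. This is a hypothetical, simplified illustration (not the paper's code): descriptors are matched across two views with a nearest-neighbour search and Lowe's ratio test, and keypoints that find no reliable match are treated as dynamic distractors, whose coordinates would then serve as point prompts for segmentation. The toy 4-D descriptors and all function names are illustrative assumptions.

```python
# Hypothetical sketch: unmatched SIFT-style keypoints -> distractor prompts.
# Real SIFT descriptors are 128-D; tiny 4-D vectors are used here for clarity.

def match_keypoints(desc_a, desc_b, ratio=0.75):
    """Return indices in desc_a that have a reliable match in desc_b
    according to Lowe's ratio test (best distance << second-best)."""
    matched = set()
    for i, d in enumerate(desc_a):
        # squared Euclidean distances to every descriptor in the other view
        dists = sorted(sum((x - y) ** 2 for x, y in zip(d, e)) for e in desc_b)
        if len(dists) >= 2 and dists[0] < ratio ** 2 * dists[1]:
            matched.add(i)  # passes the ratio test -> static scene point
    return matched

def distractor_prompts(keypoints, desc_a, desc_b):
    """Keypoints left unmatched are candidate dynamic-distractor prompts."""
    matched = match_keypoints(desc_a, desc_b)
    return [kp for i, kp in enumerate(keypoints) if i not in matched]

# Toy data: the first two descriptors have clear counterparts in view B,
# the third does not (a point on a moving object).
desc_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [5.0, 5.0, 5.0, 5.0]]
desc_b = [[1.0, 0.1, 0.0, 0.0], [0.0, 1.0, 0.1, 0.0],
          [9.0, 0.0, 0.0, 9.0], [0.0, 9.0, 9.0, 0.0]]
keypoints = [(10, 20), (30, 40), (50, 60)]

prompts = distractor_prompts(keypoints, desc_a, desc_b)  # -> [(50, 60)]
```

In a full pipeline these prompt coordinates would be passed to a point-promptable segmenter such as SAM to grow the sparse points into dense occluder masks.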

关键词: 神经辐射场, 三维重建, 动态场景, 遮挡物消除, 计算机视觉

Abstract:

To address the challenge of achieving high-quality 3D reconstruction with Neural Radiance Fields (NeRF) under the influence of occluding objects, a method based on the collaborative optimization of Structure-from-Motion (SfM) and the Segment Anything Model (SAM) was proposed. Building upon the Scale-Invariant Feature Transform (SIFT) algorithm within the SfM reconstruction process, geometric inconsistencies in dynamic scenes were leveraged for feature point identification and matching. Unmatched feature points were treated as dynamic occluders and served as point prompts guiding the SAM model, which supports point-guided segmentation, to segment dynamic occluders and generate a static scene mask. Based on the segmentation results, mask-aware volumetric rendering was used to predict colors, and a quadruple loss function was established, comprising reconstruction loss, structural consistency loss, adversarial loss, and self-supervised inpainting loss. These objectives were jointly optimized to constrain the color output in inpainted regions. After iterative training, consistent restoration of the geometric structure and appearance of occluded areas across multiple viewpoints was achieved, removing occluders while preserving the integrity of the radiance field. Validation on public dynamic scene datasets demonstrated that mask-based volumetric rendering combined with joint optimization produced an average Peak Signal-to-Noise Ratio (PSNR) improvement of 5.24 dB over baseline models and mainstream occlusion removal methods, alongside a 35% reduction in Learned Perceptual Image Patch Similarity (LPIPS). This approach provides a new paradigm for 3D reconstruction in complex dynamic environments.
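The mask-aware rendering and joint objective described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: one ray is alpha-composited in the standard NeRF manner, the reconstruction loss is computed only over pixels inside the static-scene mask, and the four losses are combined with placeholder weights `lam` (the paper's actual weighting scheme is not given in the abstract).

```python
import math

def render_ray(sigmas, colors, deltas):
    """NeRF-style compositing along one ray:
    C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    transmittance, out = 1.0, 0.0
    for sigma, c, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        out += transmittance * alpha * c
        transmittance *= 1.0 - alpha            # light surviving past it
    return out

def masked_reconstruction_loss(pred, target, static_mask):
    """MSE over static pixels only; masked-out (occluded) pixels are ignored,
    so dynamic distractors do not corrupt the radiance field."""
    terms = [(p - t) ** 2 for p, t, m in zip(pred, target, static_mask) if m]
    return sum(terms) / max(len(terms), 1)

def total_loss(l_rec, l_struct, l_adv, l_inpaint, lam=(1.0, 0.1, 0.01, 0.1)):
    """Joint objective over the four losses named in the abstract;
    the weights lam are illustrative assumptions."""
    return (lam[0] * l_rec + lam[1] * l_struct
            + lam[2] * l_adv + lam[3] * l_inpaint)

# One ray with two samples, then a masked loss over three pixels
# (the third pixel is masked out as a dynamic distractor).
c = render_ray(sigmas=[0.5, 2.0], colors=[0.2, 0.9], deltas=[1.0, 1.0])
l = masked_reconstruction_loss([0.2, 0.4, 0.9], [0.2, 0.5, 0.1], [1, 1, 0])
```

In training, `l` would play the role of the reconstruction term fed into `total_loss` together with the structural-consistency, adversarial, and inpainting terms.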

Key words: neural radiance field, 3D reconstruction, dynamic scene, occlusion removal, computer vision
