Journal of Graphics ›› 2026, Vol. 47 ›› Issue (1): 90-98. DOI: 10.11996/JG.j.2095-302X.2026010090
Received: 2025-06-24
Accepted: 2025-08-27
Published: 2026-02-28
Online: 2026-03-16
Corresponding author: HUANG Zhiyong, E-mail: hzy@hzy.org.cn
XIANG Mengli, HUANG Zhiyong, SHE Yali, DING Tuojun
Abstract:
To address the sharp drop in matching accuracy and in the number of matches that existing image matching methods suffer under large viewpoint changes, an improved E-LoFTR image matching method is proposed. First, following a rectify-then-match strategy, a novel two-stage SIFT viewpoint rectification module is introduced; it combines the viewpoint invariance of the scale-invariant feature transform (SIFT) with the geometric alignment capability of homography warping, improving the model's adaptability to large viewpoint changes. Second, a direction-aware gated attention mechanism is designed, in which a cascade of multi-directional convolutions and dynamic gating extracts the query (Q), key (K), and value (V); the injected geometric priors significantly improve the model's robustness. Finally, to avoid information loss during feature fusion, a Fusion-DySample upsampling module is used to boost matching performance. Experiments on the public MegaDepth dataset show that the proposed method achieves areas under the cumulative curve (AUC) of relative pose estimation of 57.1%, 72.7%, and 83.9% at rotation-error thresholds of 5°, 10°, and 20°, improvements of 0.7%, 0.5%, and 0.4% over E-LoFTR, respectively. On NewMega, a new dataset built from MegaDepth, and on a private industrial dataset, both the number of matched point pairs and the matching accuracy improve significantly.
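The direction-aware gated attention module injects geometric priors by extracting Q, K, and V through a cascade of multi-directional convolutions and dynamic gating. The paper's exact layer configuration is not reproduced on this page; the sketch below is only a minimal NumPy illustration of the underlying idea, where the four directional kernels and the mean-pooled gating signal are assumptions:

```python
import numpy as np

def directional_kernels():
    # Four 3x3 kernels picking out horizontal, vertical, and the two
    # diagonal directions (illustrative choices, not the paper's weights).
    h = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], float) / 3
    d = np.eye(3) / 3
    return [h, h.T, d, np.fliplr(d)]

def conv2d(x, k):
    # 'same' zero-padded convolution, single channel, stride 1
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def direction_aware_gated(x, gate_w=1.0):
    # Directional responses fused by a dynamic sigmoid gate that is
    # conditioned on each response's global mean activation.
    feats = [conv2d(x, k) for k in directional_kernels()]
    gates = sigmoid(gate_w * np.array([f.mean() for f in feats]))
    return sum(g * f for g, f in zip(gates, feats))
```

In the full model, gated responses of this kind would feed the Q/K/V projections of a transformer attention layer rather than being summed directly as in this toy version.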
XIANG Mengli, HUANG Zhiyong, SHE Yali, DING Tuojun. An image matching method for large viewpoint variation scenarios[J]. Journal of Graphics, 2026, 47(1): 90-98.
Fig. 1 Overall framework of the method ((a) Preprocessing; (b) Feature extraction; (c) Feature reconstruction; (d) Coarse matching; (e) Feature fusion; (f) Fine matching; (g) Inverse transformation)
Fig. 2 Two-stage SIFT-based viewpoint rectification process ((a) First stage; (b) RANSAC filtering; (c) Second stage; (d) Homography warping)
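The rectify-then-match idea in Fig. 2 rests on two ingredients: estimating a homography from SIFT correspondences with RANSAC outlier filtering, then warping one image onto the other's viewpoint. In practice a single call to OpenCV's `cv2.findHomography(..., cv2.RANSAC)` covers both; the self-contained NumPy sketch below (sample sizes, iteration count, and the 2 px inlier threshold are illustrative assumptions) makes the two steps explicit:

```python
import numpy as np

def fit_homography(src, dst):
    # Direct linear transform (DLT): solve A h = 0 for the 3x3 H
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_h(H, pts):
    # Apply H to Nx2 points in homogeneous coordinates
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, iters=200, thresh=2.0, seed=0):
    # RANSAC: repeatedly fit H from 4 random correspondences and
    # keep the model with the largest inlier set.
    rng = np.random.default_rng(seed)
    best_in = np.zeros(len(src), bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(apply_h(H, src) - dst, axis=1)
        inl = err < thresh
        if inl.sum() > best_in.sum():
            best_in = inl
    # Refit on all inliers for the final estimate
    return fit_homography(src[best_in], dst[best_in]), best_in
```

The rectified image is then obtained by resampling the source image through the estimated H (stage (d) in Fig. 2), and matches found on the warped image are mapped back through the inverse homography.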
| Method | @3px | @5px | @10px |
|---|---|---|---|
| DISK+NN | 52.3 | 64.9 | 78.9 |
| SP+SuperGlue | 53.9 | 68.3 | 81.7 |
| SP+LightGlue | 54.5 | 68.7 | 82.1 |
| LoFTR | 65.9 | 75.6 | 84.6 |
| MatchFormer | 66.2 | 76.1 | 85.6 |
| E-LoFTR | 66.5 | 76.4 | 85.5 |
| Ours | 67.0 | 77.1 | 86.2 |
Table 1 Comparison of homography estimation results (%)
Fig. 4 Example images from the NewMega dataset ((a) Original image; (b) Image rotated by 60°; (c) Image rotated by 90°; (d) Image with weak homography transformation; (e) Image with strong homography transformation)
| Method | m (r=60) | p (r=60) | MSR/% (r=60) | m (r=90) | p (r=90) | MSR/% (r=90) | m (H-easy) | p (H-easy) | MSR/% (H-easy) | m (H-hard) | p (H-hard) | MSR/% (H-hard) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SIFT | 5 995 | 6 340 | 94.6 | 7 446 | 7 643 | 97.4 | 2 558 | 2 877 | 88.9 | 571 | 759 | 75.2 |
| LightGlue | 0 | 15 | 0 | 0 | 14 | 0 | 790 | 1 449 | 54.5 | 649 | 1 207 | 53.7 |
| E-LoFTR | 0 | 0 | 0 | 0 | 2 | 0 | 5 602 | 6 542 | 85.6 | 1 755 | 3 063 | 57.2 |
| Ours | 13 310 | 13 366 | 99.6 | 12 404 | 12 515 | 99.1 | 16 330 | 16 341 | 99.9 | 14 531 | 15 879 | 91.5 |
Table 2 Image matching results on the NewMega dataset (m / p / MSR)
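In Tables 2 and 3, m, p, and MSR appear to denote the number of correct matches, the number of predicted matches, and their ratio in percent (this interpretation is an assumption; the precise column definitions are given in the paper body, not on this page). Under that reading the metric is just:

```python
def msr(m: int, p: int) -> float:
    """Match success rate in percent; 0 when no matches are predicted."""
    return 100.0 * m / p if p else 0.0
```

For example, the Ours row for r=60 gives `msr(13310, 13366)` ≈ 99.6, matching the reported value.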
Fig. 6 Example images from the OurData dataset ((a) Original image; (b) Image rotated by 60°; (c) Image rotated by 90°; (d) Image with weak homography transformation; (e) Image with strong homography transformation)
| Method | m (r=60) | p (r=60) | MSR/% (r=60) | m (r=90) | p (r=90) | MSR/% (r=90) | m (H-easy) | p (H-easy) | MSR/% (H-easy) | m (H-hard) | p (H-hard) | MSR/% (H-hard) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SIFT | 856 | 959 | 89.3 | 1 305 | 1 349 | 96.7 | 329 | 503 | 65.4 | 117 | 251 | 46.6 |
| LightGlue | 1 | 20 | 5.0 | 0 | 35 | 0 | 485 | 789 | 61.4 | 373 | 691 | 53.9 |
| E-LoFTR | 6 | 71 | 8.5 | 9 | 647 | 1.4 | 3 352 | 4 107 | 81.6 | 1 432 | 2 847 | 50.3 |
| Ours | 7 843 | 8 640 | 90.8 | 7 920 | 8 682 | 91.2 | 9 575 | 10 093 | 94.9 | 9 286 | 9 623 | 96.5 |
Table 3 Image matching results on the OurData dataset (m / p / MSR)
| Method | @5° | @10° | @20° |
|---|---|---|---|
| Baseline (E-LoFTR) | 56.4 | 72.2 | 83.5 |
| A | 56.9 | 72.5 | 83.8 |
| B | 56.7 | 72.4 | 83.7 |
| C | 56.6 | 72.4 | 83.6 |
| D | 57.1 | 72.7 | 83.9 |
Table 4 Ablation study
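The @5°, @10°, and @20° columns report the area under the cumulative relative-pose-error curve at each angular threshold. A common way to compute this metric is sketched below (it follows the widely used recipe from the SuperGlue evaluation code; it is an illustration, not the authors' exact script). `errors` is assumed to hold one pose error in degrees per image pair:

```python
import numpy as np

def pose_auc(errors, thresholds=(5.0, 10.0, 20.0)):
    # AUC of the cumulative pose-error (recall) curve at each threshold,
    # normalized so a method with zero error everywhere scores 1.0.
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    e = np.r_[0.0, errors]          # start the curve at the origin
    r = np.r_[0.0, recall]
    aucs = []
    for t in thresholds:
        last = np.searchsorted(e, t)
        x = np.r_[e[:last], t]      # clip the curve at the threshold
        y = np.r_[r[:last], r[last - 1]]
        area = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0  # trapezoid rule
        aucs.append(float(area) / t)
    return aucs
```

Multiplying the returned values by 100 yields percentages on the scale reported in Table 4.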