
Journal of Graphics ›› 2026, Vol. 47 ›› Issue (1): 90-98.DOI: 10.11996/JG.j.2095-302X.2026010090

• Image Processing and Computer Vision •

An image matching method for large viewpoint variation scenarios

XIANG Mengli, HUANG Zhiyong, SHE Yali, DING Tuojun

  1. College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443000, China
  • Received:2025-06-24 Accepted:2025-08-27 Online:2026-02-28 Published:2026-03-16
  • Contact: HUANG Zhiyong
  • Supported by:
    National Natural Science Foundation of China(62371271)

Abstract:

To address the significant decline in matching accuracy and number of correspondences that existing image-matching methods exhibit under large viewpoint variations, an improved image-matching approach based on E-LoFTR was proposed. First, following a strategy of viewpoint rectification before fine-grained matching, a two-stage SIFT-based viewpoint-rectification module was designed, which leveraged the viewpoint invariance of the Scale-Invariant Feature Transform (SIFT) algorithm and the geometric alignment capability of the homography to improve matching accuracy under large viewpoint variations. Second, a directional-gated attention mechanism was designed that employed a cascaded structure of multi-directional convolutions and dynamic gating to extract the queries (Q), keys (K), and values (V); the injected geometric priors significantly enhanced the model's robustness. Finally, to mitigate information loss during the upsampling of fused features, the Fusion-DySample module was incorporated to further improve performance. Experimental results on the public MegaDepth dataset showed that the proposed method achieved relative pose estimation AUCs of 57.1%, 72.7%, and 83.9% under rotation error thresholds of 5°, 10°, and 20°, respectively, outperforming E-LoFTR by 0.7%, 0.5%, and 0.4%. On the newly constructed NewMega dataset derived from MegaDepth and on a private industrial dataset, the proposed method also demonstrated substantial improvements in both the number of matches and matching accuracy.
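The viewpoint-rectification stage described above relies on estimating a homography from SIFT correspondences and using it to align one view with the other before fine-grained matching. As a minimal sketch of the homography-estimation step only (not the authors' implementation; function names are illustrative), the standard Direct Linear Transform (DLT) can be written in NumPy as:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src via the DLT algorithm.

    src, dst: (N, 2) arrays of matched point coordinates, N >= 4.
    Each correspondence (x, y) -> (u, v) contributes two linear
    constraints on the 9 entries of H; the solution is the right
    singular vector of the constraint matrix with smallest singular value.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1,  0,  0,  0, u * x, u * y, u])
        A.append([ 0,  0,  0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale ambiguity so H[2, 2] == 1

def warp_points(H, pts):
    """Apply homography H to (N, 2) points (homogenize, transform, dehomogenize)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]
```

In a full rectify-then-match pipeline, the estimated H would typically be wrapped in RANSAC to reject SIFT outliers and then used to warp one image toward the other's viewpoint, after which the fine-grained matcher operates on the roughly aligned pair.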

Key words: image matching, E-LoFTR, large perspective variation, SIFT, attention mechanism
