
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (5): 941-956. DOI: 10.11996/JG.j.2095-302X.2024050941

• Image Processing and Computer Vision •

Research on multi-scale remote sensing image change detection using Swin Transformer

LIU Li1,2, ZHANG Qifan1,2,3, BAI Yuang1,2, HUANG Kaiye1,2

  1. Department of Computer Science, North China Electric Power University, Baoding, Hebei 071051, China
    2. Hebei Key Laboratory of Knowledge Computing for Energy & Power, Baoding, Hebei 071051, China
    3. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100080, China
  • Received: 2024-05-28  Revised: 2024-08-06  Online: 2024-10-31  Published: 2024-10-31
  • About author:

    LIU Li (1978-), associate professor, Ph.D. Her main research interests include artificial intelligence and computer vision. E-mail: liuli@ncepu.edu.cn

  • Supported by:
    Hebei Province Graduate Student Innovation Ability Training Funding Project (CXZZSS2024163); Key Research and Development Projects in Hebei Province (20310103D)

Abstract:

Due to the complexity of terrain information and the diversity of change detection data, it is difficult to ensure adequate and effective feature extraction from remote sensing images, which lowers the reliability of change detection results. Although convolutional neural networks (CNNs) are widely applied to remote sensing change detection because they extract semantic features effectively, the inherent locality of the convolution operation limits the receptive field, making it difficult to capture global spatiotemporal information and thus to model long-range dependencies in the feature space. To capture long-distance semantic dependencies and extract deep global semantic features, a multi-scale feature fusion network based on the Swin Transformer, SwinChangeNet, was designed. Firstly, SwinChangeNet employed a siamese multi-stage Swin Transformer feature encoder for long-range context modeling. Secondly, a feature difference extraction module was introduced into the encoder to compute multi-level feature differences between the pre-change and post-change images at different scales, and the resulting multi-scale feature maps were fused through an adaptive fusion layer. Finally, residual connections and a channel attention mechanism were introduced to decode the fused features into a complete and accurate change map. Compared with seven classic and state-of-the-art change detection methods on two publicly available datasets, CDD and CD-Data_GZ, the proposed model achieved the best performance on both. On the CDD dataset, the F1 score increased by 1.11% and the accuracy by 2.38% over the second-best model; on the CD-Data_GZ dataset, the F1 score, accuracy, and recall increased by 4.78%, 4.32%, and 4.09%, respectively, a significant improvement. These comparative results demonstrated the superior detection performance of the proposed model, and ablation experiments further validated the stability and effectiveness of each improved module. In conclusion, the proposed model addressed the task of remote sensing image change detection by introducing the Swin Transformer structure, enabling the network to encode local and global features of remote sensing images more effectively and to produce more accurate detection results, while converging efficiently on datasets with a wide variety of land features.
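For readers who want a concrete picture of the pipeline summarized above, the following is a minimal PyTorch sketch of a SwinChangeNet-style forward pass: siamese encoding, per-scale feature differencing, adaptive fusion, and a decoder with a residual connection and channel attention. It is an illustration only, not the authors' implementation: the class names, channel widths, the absolute-difference operator in the difference module, and the SE-style channel attention are all assumptions, since this page gives no implementation details, and the stub encoder merely stands in for the four-stage Swin Transformer backbone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (an assumed variant)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # global average pooling
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)                            # re-weight channels


class StubSwinEncoder(nn.Module):
    """Stand-in for the four-stage Swin Transformer backbone (shapes only)."""
    def __init__(self, chans=(96, 192, 384, 768)):
        super().__init__()
        stages, in_c = [], 3
        for c in chans:
            stages.append(nn.Sequential(
                nn.Conv2d(in_c, c, 3, stride=2, padding=1), nn.ReLU(inplace=True)))
            in_c = c
        self.stages = nn.ModuleList(stages)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats                                       # one feature map per scale


class SwinChangeNetSketch(nn.Module):
    def __init__(self, encoder, stage_chans=(96, 192, 384, 768), width=256):
        super().__init__()
        # Siamese encoding: the SAME (weight-shared) encoder processes both
        # the pre-change and post-change images.
        self.encoder = encoder
        # Feature difference extraction at every scale (assumed: conv on |f1 - f2|).
        self.diff = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, width, 3, padding=1), nn.ReLU(inplace=True))
            for c in stage_chans)
        # Adaptive fusion: align scales, then learn a 1x1 mixing convolution.
        self.fuse = nn.Conv2d(width * len(stage_chans), width, 1)
        # Decoder: residual block + channel attention + 2-class change head.
        self.res = nn.Sequential(
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1))
        self.attn = ChannelAttention(width)
        self.head = nn.Conv2d(width, 2, 1)

    def forward(self, img_t1, img_t2):
        f1, f2 = self.encoder(img_t1), self.encoder(img_t2)
        diffs = [d(torch.abs(a - b)) for d, a, b in zip(self.diff, f1, f2)]
        size = diffs[0].shape[-2:]                         # finest feature scale
        fused = self.fuse(torch.cat(
            [F.interpolate(d, size=size, mode="bilinear", align_corners=False)
             for d in diffs], dim=1))
        out = self.attn(fused + self.res(fused))           # residual + attention
        # Upsample logits back to the input resolution for the change map.
        return F.interpolate(self.head(out), size=img_t1.shape[-2:],
                             mode="bilinear", align_corners=False)


if __name__ == "__main__":
    net = SwinChangeNetSketch(StubSwinEncoder())
    t1, t2 = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
    print(net(t1, t2).shape)                               # torch.Size([1, 2, 256, 256])
```

One design note on this sketch: computing differences at every encoder stage and fusing them at the finest resolution is one plausible reading of "multi-level feature differences" plus "adaptive fusion layer"; the paper itself should be consulted for the exact fusion rule and channel configuration.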

Key words: change detection, siamese network, Swin Transformer, multi-scale feature fusion, attention mechanism, feature difference extraction
