Journal of Graphics ›› 2025, Vol. 46 ›› Issue (6): 1292-1303. DOI: 10.11996/JG.j.2095-302X.2025061292
• Image Processing and Computer Vision •
LI Xingchen1, LI Zongmin1,2, YANG Chaozhi1
Received: 2025-02-17
Accepted: 2025-04-23
Online: 2025-12-30
Published: 2025-12-27
Contact: LI Zongmin
About author: LI Xingchen (2003-), undergraduate student. His main research interest is computer vision. E-mail: 17852021063@163.com
LI Xingchen, LI Zongmin, YANG Chaozhi. Test-time adaptation algorithm based on trusted pseudo-label fine-tuning[J]. Journal of Graphics, 2025, 46(6): 1292-1303.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025061292
| Methods | Acc-Top1 (MNIST) | Acc-Top1 (Fashion) | F1-Score (MNIST) | F1-Score (Fashion) |
|---|---|---|---|---|
| VGG | 98.64 | 90.23 | 98.62 | 90.15 |
| ResNet | 97.98 | 90.26 | 97.97 | 90.16 |
| DenseNet | 97.79 | 87.21 | 97.75 | 86.72 |
| ViT | 71.18 | 73.38 | 70.24 | 70.96 |
| Swin-Transformer | 81.74 | 71.72 | 80.13 | 68.75 |

Table 1 Results before enhancement (values in %)
| Methods | Acc-Top1 (MNIST) | Acc-Top1 (Fashion) | F1-Score (MNIST) | F1-Score (Fashion) |
|---|---|---|---|---|
| VGG | 98.90 | 88.56 | 98.88 | 88.09 |
| ResNet | 98.24 | 85.97 | 98.22 | 84.49 |
| DenseNet | 98.06 | 86.47 | 98.02 | 85.71 |
| ViT | 26.49 | 26.78 | 21.16 | 18.15 |
| Swin-Transformer | 64.80 | 20.69 | 56.69 | 13.78 |

Table 2 Combined sampling (entropy threshold: 40%)
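Since the paper's code is not reproduced on this page, the following is a minimal PyTorch sketch of what "combined sampling" with an entropy criterion could look like: predictions from the whole test set are pooled, and the lowest-entropy fraction is kept as trusted pseudo-labels. Interpreting the "40%" threshold as a keep-ratio (rather than a cut on the entropy value itself) is an assumption, as are the function and variable names.

```python
import torch
import torch.nn.functional as F

def combined_entropy_sampling(logits: torch.Tensor, keep_ratio: float = 0.4):
    """Pool all test predictions and keep the keep_ratio fraction with the
    lowest predictive entropy as trusted pseudo-labels ("combined" sampling).
    """
    probs = F.softmax(logits, dim=1)
    # Shannon entropy of each prediction; lower entropy = more confident.
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=1)
    k = max(1, int(keep_ratio * logits.size(0)))
    trusted_idx = torch.argsort(entropy)[:k]          # k most confident samples
    pseudo_labels = probs[trusted_idx].argmax(dim=1)  # their hard pseudo-labels
    return trusted_idx, pseudo_labels
```

Because this pool is class-agnostic, easy classes can dominate the trusted set, which is one plausible explanation for the sharp ViT and Swin-Transformer drops in Table 2.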
| Methods | Acc-Top1 (MNIST) | Acc-Top1 (Fashion) | F1-Score (MNIST) | F1-Score (Fashion) |
|---|---|---|---|---|
| VGG | 98.89 | 90.12 | 98.88 | 90.08 |
| ResNet | 98.32 | 90.15 | 98.30 | 90.13 |
| DenseNet | 98.39 | 87.79 | 98.39 | 87.40 |
| ViT | 68.72 | 70.10 | 68.55 | 70.10 |
| Swin-Transformer | 74.95 | 63.76 | 71.28 | 58.64 |

Table 3 Sampling by category (entropy threshold: 40%)
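For comparison, a sketch of the "by category" variant under the same assumptions: the entropy ranking is applied within each predicted class, so every class contributes trusted pseudo-labels rather than letting the easiest classes crowd out the rest.

```python
import torch
import torch.nn.functional as F

def per_class_entropy_sampling(logits: torch.Tensor, keep_ratio: float = 0.4):
    """Keep the keep_ratio most confident samples within each predicted
    class, so no class is starved of trusted pseudo-labels."""
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=1)
    preds = probs.argmax(dim=1)
    trusted = []
    for c in preds.unique():
        idx = (preds == c).nonzero(as_tuple=True)[0]
        k = max(1, int(keep_ratio * idx.numel()))
        # most confident samples of this predicted class
        trusted.append(idx[torch.argsort(entropy[idx])[:k]])
    trusted_idx = torch.cat(trusted)
    return trusted_idx, preds[trusted_idx]
```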
| Methods | Acc-Top1 (MNIST) | Acc-Top1 (Fashion) | F1-Score (MNIST) | F1-Score (Fashion) |
|---|---|---|---|---|
| VGG | 99.30 | 91.99 | 99.29 | 91.99 |
| ResNet | 98.75 | 91.06 | 98.73 | 91.11 |
| DenseNet | 98.40 | 90.35 | 98.39 | 90.26 |
| ViT | 69.87 | 73.20 | 69.56 | 71.67 |
| Swin-Transformer | 11.65 | 39.41 | 7.28 | 34.38 |

Table 4 Sampling by category (entropy threshold: 40%) + original training set
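Table 4 adds the original training set to the fine-tuning mix. A sketch of that step under the same assumptions, with illustrative (not the paper's) hyperparameters; replaying clean training data alongside the pseudo-labelled test samples is a standard way to curb error accumulation from noisy pseudo-labels.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

def finetune_with_replay(model, pseudo_x, pseudo_y, train_set,
                         epochs=1, lr=1e-4, batch_size=128):
    """Fine-tune on trusted pseudo-labelled test samples mixed with the
    original training set (a simple replay scheme)."""
    mixed = ConcatDataset([TensorDataset(pseudo_x, pseudo_y), train_set])
    loader = DataLoader(mixed, batch_size=batch_size, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model
```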
Fig. 8 Relationship between the selected entropy threshold and accuracy (Acc) for the ResNet model (blue and red denote the results on MNIST and Fashion, respectively)
Fig. 9 Relationship between the selected entropy threshold and accuracy (Acc) for the ViT model (blue and red denote the results on MNIST and Fashion, respectively)
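Figs. 8-9 sweep the entropy threshold. One plausible reading of the plotted quantity is the pseudo-label accuracy of the retained subset at each keep-ratio; a sketch of such a sweep, reusing `combined_entropy_sampling` from the earlier snippet:

```python
import numpy as np
import torch

def pseudo_label_accuracy_curve(logits, labels,
                                ratios=np.linspace(0.1, 0.9, 9)):
    """Accuracy of the trusted subset's pseudo-labels at each keep-ratio;
    one way to produce curves like those in Figs. 8-9."""
    curve = []
    for r in ratios:
        idx, pl = combined_entropy_sampling(logits, keep_ratio=float(r))
        acc = (pl == labels[idx]).float().mean().item()
        curve.append((float(r), acc))
    return curve
```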
Fig. 16 t-SNE comparison of DenseNet features on the Fashion dataset before and after FTP processing (blue: before FTP; green: after FTP)
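A sketch of how a Fig. 16-style comparison could be produced with scikit-learn's t-SNE on model features (how the features are extracted is omitted; the joint embedding and the colours follow the caption):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne_before_after(feats_before: np.ndarray, feats_after: np.ndarray):
    """Embed both feature sets jointly so the two clouds share one t-SNE space."""
    emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(
        np.concatenate([feats_before, feats_after]))
    n = len(feats_before)
    plt.scatter(emb[:n, 0], emb[:n, 1], s=4, c="tab:blue", label="before FTP")
    plt.scatter(emb[n:, 0], emb[n:, 1], s=4, c="tab:green", label="after FTP")
    plt.legend()
    plt.show()
```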