Journal of Graphics ›› 2023, Vol. 44 ›› Issue (2): 260-270.DOI: 10.11996/JG.j.2095-302X.2023020260
ZENG Wu1, ZHU Heng-liang1, XING Shu-li1, LIN Jiang-hong1, MAO Guo-jun1,2
Received: 2022-06-02
Accepted: 2022-08-21
Online: 2023-04-30
Published: 2023-05-01
Contact: MAO Guo-jun (1966-), professor, Ph.D. His main research interests cover data mining, big data and distributed computing.
About author: ZENG Wu (1997-), master student. His main research interests cover image data augmentation and few-shot learning. E-mail: 2201905122@smail.fjut.edu.cn
ZENG Wu, ZHU Heng-liang, XING Shu-li, LIN Jiang-hong, MAO Guo-jun. Saliency detection-guided for image data augmentation[J]. Journal of Graphics, 2023, 44(2): 260-270.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023020260
Fig. 1 Schematic diagram of new samples generated by different data augmentation methods ((a) Original image; (b) Patch image; (c) Cutout; (d) SaliencyOut; (e) CutMix; (f) SaliencyCutMix)
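Fig. 1 contrasts Cutout/CutMix with their saliency-guided counterparts: instead of cutting a random patch, the patch is centred on the most salient region of the source image, and labels are mixed by the pasted-area ratio. The sketch below illustrates the general idea only; it is not the paper's implementation — the saliency map is a simple gradient-magnitude stand-in for a real saliency detector, and all function names are ours.

```python
import numpy as np

def saliency_proxy(img):
    # Stand-in saliency map: gradient magnitude of the grayscale image.
    # The paper uses a dedicated saliency detector; this is only illustrative.
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

def saliency_cutmix(src, dst, y_src, y_dst, lam):
    """Cut a patch centred on the most salient pixel of `src`, paste it into
    `dst` at the same location, and mix labels by the pasted-area ratio."""
    h, w = src.shape[:2]
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    sal = saliency_proxy(src)
    cy, cx = np.unravel_index(sal.argmax(), sal.shape)  # saliency peak
    y1, y2 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, h)
    x1, x2 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, w)
    out = dst.copy()
    out[y1:y2, x1:x2] = src[y1:y2, x1:x2]
    area = (y2 - y1) * (x2 - x1) / (h * w)  # actual pasted fraction
    y_mix = area * y_src + (1.0 - area) * y_dst
    return out, y_mix
```

SaliencyOut follows the same patch-selection step but erases the salient patch instead of pasting it into another image.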
| Dataset | Classes | Training samples | Test samples |
|---|---|---|---|
| CIFAR-10 | 10 | 50000 | 10000 |
| CIFAR-100 | 100 | 50000 | 10000 |
| Mini-ImageNet | 100 | 50000 | 10000 |
| ImageNet | 1000 | 1300000 | 50000 |

Table 1 Main statistics of the datasets
| Model + method | CIFAR-10 Top-1 | CIFAR-100 Top-1 |
|---|---|---|
| ResNet34 | 93.18 | 71.85 |
| ResNet34+Cutout | 93.65 | 72.15 |
| ResNet34+SaliencyOut (ours) | 93.86 | 72.87 |
| ResNet34+Mixup | 93.88 | 73.13 |
| ResNet34+StyleMix | 93.34 | 71.91 |
| ResNet34+CutMix | 94.21 | 73.81 |
| ResNet34+StyleCutMix | 94.25 | 73.73 |
| ResNet34+SaliencyMix | 94.33 | 74.21 |
| ResNet34+SaliencyCutMix (ours) | 94.63 | 74.70 |
| ResNet110 | 94.09 | 75.94 |
| ResNet110+Cutout | 94.39 | 76.43 |
| ResNet110+SaliencyOut (ours) | 94.76 | 77.87 |
| ResNet110+Mixup | 94.67 | 77.58 |
| ResNet110+StyleMix | 94.45 | 76.66 |
| ResNet110+CutMix | 95.34 | 78.29 |
| ResNet110+StyleCutMix | 95.39 | 78.02 |
| ResNet110+SaliencyMix | 95.23 | 78.46 |
| ResNet110+SaliencyCutMix (ours) | 95.86 | 78.96 |
| PyramidNet110 | 95.68 | 80.24 |
| PyramidNet110+Cutout | 96.13 | 80.58 |
| PyramidNet110+SaliencyOut (ours) | 96.46 | 81.21 |
| PyramidNet110+Mixup | 96.11 | 81.34 |
| PyramidNet110+StyleMix | 95.50 | 80.43 |
| PyramidNet110+CutMix | 96.54 | 81.84 |
| PyramidNet110+StyleCutMix | 96.41 | 81.99 |
| PyramidNet110+SaliencyMix | 96.51 | 82.15 |
| PyramidNet110+SaliencyCutMix (ours) | 96.69 | 82.27 |

Table 2 Comparison of experimental results on CIFAR-10 and CIFAR-100 (%)
| Model + method | Top-1 accuracy |
|---|---|
| ResNet34 | 79.20 |
| ResNet34+Cutout | 78.79 |
| ResNet34+SaliencyOut (ours) | 79.34 |
| ResNet34+Mixup | 79.70 |
| ResNet34+StyleMix | 79.01 |
| ResNet34+CutMix | 80.38 |
| ResNet34+StyleCutMix | 80.10 |
| ResNet34+SaliencyMix | 80.72 |
| ResNet34+SaliencyCutMix (ours) | 81.56 |

Table 3 Comparison of experimental results on Mini-ImageNet (%)
| Model + method | Top-1 accuracy |
|---|---|
| ResNet50 | 74.92 |
| ResNet50+Cutout | 75.50 |
| ResNet50+SaliencyOut (ours) | 75.67 |
| ResNet50+Mixup | 75.79 |
| ResNet50+CutMix | 76.64 |
| ResNet50+StyleCutMix | 76.30 |
| ResNet50+SaliencyMix | 76.77 |
| ResNet50+SaliencyCutMix (ours) | 76.89 |

Table 4 Comparison of experimental results on ImageNet (%)
| Method | FGSM(1) Top-1 | FGSM(2) Top-1 | FGSM(4) Top-1 |
|---|---|---|---|
| Baseline | 24.33 | 15.12 | 9.45 |
| Cutout | 23.46 | 13.57 | 9.98 |
| SaliencyOut (ours) | 24.57 | 16.28 | 10.98 |
| Mixup | 24.85 | 16.01 | 11.93 |
| CutMix | 25.73 | 16.01 | 10.48 |
| StyleCutMix | 25.96 | 17.22 | 11.62 |
| SaliencyMix | 26.30 | 16.50 | 10.48 |
| SaliencyCutMix (ours) | 26.52 | 17.41 | 13.30 |

Table 5 Experimental results under FGSM attacks (%)
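Table 5 evaluates robustness under FGSM (fast gradient sign method) attacks at the three perturbation budgets in parentheses. As a reminder of the attack itself, the sketch below applies one FGSM step to a toy logistic-regression "model" where the input gradient has a closed form; the model, function names, and parameters are ours for illustration, not the paper's setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against logistic regression: perturb x by eps along the
    sign of the input gradient of the cross-entropy loss, then clip to [0, 1]."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w            # d(loss)/dx for binary cross-entropy
    x_adv = x + eps * np.sign(grad_x)
    return np.clip(x_adv, 0.0, 1.0)
```

For a deep network the gradient is obtained by backpropagation instead of the closed form, but the sign-step-and-clip structure is identical.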
| Method | Baseline | Cutout | SaliencyOut (ours) | Mixup | CutMix | StyleCutMix | SaliencyMix | SaliencyCutMix (ours) |
|---|---|---|---|---|---|---|---|---|
| Time | 1.40 | 1.41 | 1.45 | 1.44 | 1.41 | 4.51 | 1.45 | 1.45 |

Table 6 Average computation time per epoch for different methods (min/epoch)
| Method | Top-1 accuracy |
|---|---|
| SaliencyOut (Min) | 74.21 |
| SaliencyOut (ours) | 74.96 |
| SaliencyCutMix (Min) | 75.41 |
| SaliencyCutMix (ours) | 76.61 |

Table 7 Experiments on saliency region cropping (%)
| Model + method | CIFAR-10 Top-1 | CIFAR-100 Top-1 |
|---|---|---|
| ResNet110+SaliencyOut (without ρ) | 94.72 | 77.82 |
| ResNet110+SaliencyOut (with ρ) | 94.76 | 77.87 |
| ResNet110+SaliencyCutMix (without ρ) | 95.79 | 78.88 |
| ResNet110+SaliencyCutMix (with ρ) | 95.86 | 78.96 |

Table 8 Comparison of experimental results on CIFAR-10 and CIFAR-100 with and without ρ (%)
| Method | Top-1 accuracy |
|---|---|
| SPRE | 75.29 |
| MONSO | 76.61 |

Table 9 Experimental comparison of different saliency detection methods (%)
Fig. 6 Class activation maps of new samples generated by various methods ((a) Original image; (b) Cutout; (c) SaliencyOut; (d) CutMix; (e) SaliencyCutMix)
[1] | 常东良, 尹军辉, 谢吉洋, 等. 面向图像分类的基于注意力引导的Dropout[J]. 图学学报, 2021, 42(1): 32-36. |
CHANG D L, YIN J H, XIE J Y, et al. Attention-guided Dropout for image classification[J]. Journal of Graphics, 2021, 42(1): 32-36. (in Chinese) | |
[2] | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]// The 25th International Conference on Neural Information Processing Systems-Volume 1. New York:ACM, 2012: 1097-1105. |
[3] | DEVRIES T, TAYLOR G W. Improved regularization of convolutional neural networks with cutout[EB/OL]. [2022-03- 05]. https://arxiv.org/abs/1708.04552. |
[4] | YUN S, HAN D, CHUN S, et al. CutMix: regularization strategy to train strong classifiers with localizable features[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 6022-6031. |
[5] | GONG C Y, WANG D L, LI M, et al. KeepAugment: a simple information-preserving data augmentation approach[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 1055-1064. |
[6] | DABOUEI A, SOLEYMANI S, TAHERKHANI F, et al. SuperMix: supervising the mixing data augmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13789-13798. |
[7] | UDDIN A F M S, MONIRA M S, SHIN W, et al. SaliencyMix: a saliency guided data augmentation strategy for better regularization[EB/OL]. [2022-03-05]. https://arxiv.org/abs/2006.01791. |
[8] | ZHONG Z, ZHENG L, KANG G L, et al. Random erasing data augmentation[C]// The AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2017: 13001-13008. |
[9] | ZHANG H Y, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization[EB/OL]. [2022-03-05]. https://arxiv.org/abs/1710.09412. |
[10] | HENDRYCKS D, MU N, CUBUK E D, et al. AugMix: a simple data processing method to improve robustness and uncertainty[EB/OL]. [2022-03-02]. https://arxiv.org/abs/1912.02781. |
[11] | HONG M, CHOI J, KIM G. StyleMix: separating content and style for enhanced data augmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 14857-14865. |
[12] | MONTABONE S, SOTO A. Human detection using a mobile platform and novel features derived from a visual saliency mechanism[J]. Image and Vision Computing, 2010, 28(3): 391-402. |
[13] | CHENG M M, MITRA N J, HUANG X L, et al. Global contrast based salient region detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 569-582. |
[14] | LI C Y, YUAN Y C, CAI W D, et al. Robust saliency detection via regularized random walks ranking[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 2710-2717. |
[15] | DENG Z J, HU X W, ZHU L, et al. R³Net: recurrent residual refinement network for saliency detection[C]// The 27th International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2018: 684-690. |
[16] | ZHANG L, DAI J, LU H C, et al. A Bi-directional message passing model for salient object detection[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 1741-1750. |
[17] | KRIZHEVSKY A. Learning multiple layers of features from tiny images[D]. Toronto: University of Toronto, 2009. |
[18] | VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning[EB/OL]. [2022-03-02]. https://arxiv.org/abs/1606.04080. |
[19] | RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. |
[20] | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778. |
[21] | HAN D, KIM J, KIM J. Deep pyramidal residual networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6307-6315. |
[22] | GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[EB/OL]. [2022-04-06]. https://arxiv.53yu.com/pdf/1412.6572.pdf. |
[23] | HOU X D, ZHANG L Q. Saliency detection: a spectral residual approach[C]// 2007 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2007: 1-8. |
[24] | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 618-626. |