Journal of Graphics ›› 2023, Vol. 44 ›› Issue (2): 260-270. DOI: 10.11996/JG.j.2095-302X.2023020260
ZENG Wu1, ZHU Heng-liang1, XING Shu-li1, LIN Jiang-hong1, MAO Guo-jun1,2
Received: 2022-06-02
Accepted: 2022-08-21
Online: 2023-04-30
Published: 2023-05-01
Contact: MAO Guo-jun (1966-), professor, Ph.D. His main research interests cover data mining, big data and distributed computing.
About the author: ZENG Wu (1997-), master student. His main research interests cover image data augmentation and few-shot learning. E-mail: 2201905122@smail.fjut.edu.cn
Supported by:
Abstract: Most data augmentation methods choose the cropped region largely at random, and most over-emphasize the salient feature regions of an image while neglecting to reinforce learning on its less discriminative regions. To address these issues, two methods, SaliencyOut and SaliencyCutMix, are proposed to strengthen the learning of features in less discriminative regions. Specifically, SaliencyOut first employs a saliency detection technique to generate a saliency map of the original image, then locates a salient feature region in the map and removes the pixels within that region. SaliencyCutMix instead removes the cropped region of the original image and replaces it with the patch occupying the same region in a patch image. By occluding or replacing part of the salient feature regions, the model is guided to learn other features of the target object. Furthermore, to address the problem that too much of the salient region may be lost when the cropped region is large, an adaptive scaling factor is introduced into the selection of the cropping boundary, dynamically adjusting the boundary according to its initial size. Experiments on four datasets demonstrate that the proposed methods can significantly improve the classification performance and anti-interference ability of the model, outperforming most state-of-the-art methods. In particular, on Mini-ImageNet with the ResNet-34 network, SaliencyCutMix improves Top-1 accuracy by 1.18% over CutMix.
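The pipeline described in the abstract can be summarized in a short sketch. The Python code below is a minimal illustration, not the authors' released implementation: it assumes OpenCV's spectral-residual saliency detector (from opencv-contrib), uses a hypothetical CutMix-style area ratio `lam`, and omits the paper's adaptive scaling factor ρ and the label-mixing details.

```python
import cv2
import numpy as np

def salient_bbox(img, lam):
    """Locate a crop box centred on the most salient pixel.

    The box covers roughly a (1 - lam) fraction of the image area
    (CutMix-style ratio, our assumption). Saliency comes from OpenCV's
    spectral-residual detector in opencv-contrib.
    """
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    _, sal_map = sal.computeSaliency(img)
    y, x = np.unravel_index(np.argmax(sal_map), sal_map.shape)
    h, w = img.shape[:2]
    cut_rat = np.sqrt(1.0 - lam)            # side-length ratio of the crop
    cut_w, cut_h = int(w * cut_rat), int(h * cut_rat)
    x1, y1 = np.clip(x - cut_w // 2, 0, w), np.clip(y - cut_h // 2, 0, h)
    x2, y2 = np.clip(x + cut_w // 2, 0, w), np.clip(y + cut_h // 2, 0, h)
    return x1, y1, x2, y2

def saliency_out(img, lam=0.7):
    """SaliencyOut sketch: zero out the pixels inside the salient box."""
    x1, y1, x2, y2 = salient_bbox(img, lam)
    out = img.copy()
    out[y1:y2, x1:x2] = 0
    return out

def saliency_cutmix(img, patch, lam=0.7):
    """SaliencyCutMix sketch: replace the salient box of img with the
    same region of a patch image (patch must have the same shape)."""
    x1, y1, x2, y2 = salient_bbox(img, lam)
    out = img.copy()
    out[y1:y2, x1:x2] = patch[y1:y2, x1:x2]
    return out
```

Centring the crop on the saliency peak is what distinguishes both methods from Cutout and CutMix, whose crop positions are sampled uniformly at random.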
CLC number:
ZENG Wu, ZHU Heng-liang, XING Shu-li, LIN Jiang-hong, MAO Guo-jun. Saliency detection-guided for image data augmentation[J]. Journal of Graphics, 2023, 44(2): 260-270.
Fig. 1 New samples generated by different data augmentation methods ((a) Original image; (b) Patch image; (c) Cutout; (d) SaliencyOut; (e) CutMix; (f) SaliencyCutMix)
Table 1 Main statistics of the datasets
| Dataset | Classes | Training samples | Test samples |
|---|---|---|---|
| CIFAR-10 | 10 | 50000 | 10000 |
| CIFAR-100 | 100 | 50000 | 10000 |
| Mini-ImageNet | 100 | 50000 | 10000 |
| ImageNet | 1000 | 1300000 | 50000 |
Table 2 Comparison of experimental results on CIFAR-10 and CIFAR-100 (%)
| Model + method | CIFAR-10 Top-1 | CIFAR-100 Top-1 |
|---|---|---|
| ResNet34 | 93.18 | 71.85 |
| ResNet34+Cutout | 93.65 | 72.15 |
| ResNet34+SaliencyOut (ours) | 93.86 | 72.87 |
| ResNet34+Mixup | 93.88 | 73.13 |
| ResNet34+StyleMix | 93.34 | 71.91 |
| ResNet34+CutMix | 94.21 | 73.81 |
| ResNet34+StyleCutMix | 94.25 | 73.73 |
| ResNet34+SaliencyMix | 94.33 | 74.21 |
| ResNet34+SaliencyCutMix (ours) | 94.63 | 74.70 |
| ResNet110 | 94.09 | 75.94 |
| ResNet110+Cutout | 94.39 | 76.43 |
| ResNet110+SaliencyOut (ours) | 94.76 | 77.87 |
| ResNet110+Mixup | 94.67 | 77.58 |
| ResNet110+StyleMix | 94.45 | 76.66 |
| ResNet110+CutMix | 95.34 | 78.29 |
| ResNet110+StyleCutMix | 95.39 | 78.02 |
| ResNet110+SaliencyMix | 95.23 | 78.46 |
| ResNet110+SaliencyCutMix (ours) | 95.86 | 78.96 |
| PyramidNet110 | 95.68 | 80.24 |
| PyramidNet110+Cutout | 96.13 | 80.58 |
| PyramidNet110+SaliencyOut (ours) | 96.46 | 81.21 |
| PyramidNet110+Mixup | 96.11 | 81.34 |
| PyramidNet110+StyleMix | 95.50 | 80.43 |
| PyramidNet110+CutMix | 96.54 | 81.84 |
| PyramidNet110+StyleCutMix | 96.41 | 81.99 |
| PyramidNet110+SaliencyMix | 96.51 | 82.15 |
| PyramidNet110+SaliencyCutMix (ours) | 96.69 | 82.27 |
Table 3 Comparison of experimental results on Mini-ImageNet (%)
| Model + method | Top-1 accuracy |
|---|---|
| ResNet34 | 79.20 |
| ResNet34 + Cutout | 78.79 |
| ResNet34 + SaliencyOut (ours) | 79.34 |
| ResNet34 + Mixup | 79.70 |
| ResNet34 + StyleMix | 79.01 |
| ResNet34 + CutMix | 80.38 |
| ResNet34 + StyleCutMix | 80.10 |
| ResNet34 + SaliencyMix | 80.72 |
| ResNet34 + SaliencyCutMix (ours) | 81.56 |
Table 4 Comparison of experimental results on ImageNet (%)
| Model + method | Top-1 accuracy |
|---|---|
| ResNet50 | 74.92 |
| ResNet50 + Cutout | 75.50 |
| ResNet50 + SaliencyOut (ours) | 75.67 |
| ResNet50 + Mixup | 75.79 |
| ResNet50 + CutMix | 76.64 |
| ResNet50 + StyleCutMix | 76.30 |
| ResNet50 + SaliencyMix | 76.77 |
| ResNet50 + SaliencyCutMix (ours) | 76.89 |
Fig. 5 Convergence performance of different methods ((a) SaliencyOut compared with Cutout; (b) SaliencyCutMix compared with CutMix)
Table 5 Experimental results under FGSM attacks (%)
| Method | FGSM(1) Top-1 | FGSM(2) Top-1 | FGSM(4) Top-1 |
|---|---|---|---|
| Baseline | 24.33 | 15.12 | 9.45 |
| Cutout | 23.46 | 13.57 | 9.98 |
| SaliencyOut (ours) | 24.57 | 16.28 | 10.98 |
| Mixup | 24.85 | 16.01 | 11.93 |
| CutMix | 25.73 | 16.01 | 10.48 |
| StyleCutMix | 25.96 | 17.22 | 11.62 |
| SaliencyMix | 26.30 | 16.50 | 10.48 |
| SaliencyCutMix (ours) | 26.52 | 17.41 | 13.30 |
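For context, FGSM perturbs each input one step along the sign of the loss gradient. Below is a minimal PyTorch sketch; reading FGSM(k) as ε = k/255 on inputs scaled to [0, 1] is our assumption, as the table does not state the scaling.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, eps):
    """One-step FGSM: move inputs along the sign of the loss gradient.

    eps is the attack strength; FGSM(k) in Table 5 is read here as
    eps = k / 255 (assumption). Inputs are expected in [0, 1].
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + eps * images.grad.sign()   # single gradient-sign step
    return adv.clamp(0.0, 1.0).detach()

# Usage sketch: adv = fgsm_attack(model, x, y, eps=2 / 255)  # FGSM(2)
```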
Table 6 Average computation time per training epoch for different methods (min/epoch)
| Method | Baseline | Cutout | SaliencyOut (ours) | Mixup | CutMix | StyleCutMix | SaliencyMix | SaliencyCutMix (ours) |
|---|---|---|---|---|---|---|---|---|
| Time (min/epoch) | 1.40 | 1.41 | 1.45 | 1.44 | 1.41 | 4.51 | 1.45 | 1.45 |
Table 7 Experiments on saliency region cropping (%)
| Method | Top-1 accuracy |
|---|---|
| SaliencyOut (Min) | 74.21 |
| SaliencyOut (ours) | 74.96 |
| SaliencyCutMix (Min) | 75.41 |
| SaliencyCutMix (ours) | 76.61 |
Table 8 Comparison of experimental results on CIFAR-10 and CIFAR-100 (%)
| Model + method | CIFAR-10 Top-1 | CIFAR-100 Top-1 |
|---|---|---|
| ResNet110+SaliencyOut (without ρ) | 94.72 | 77.82 |
| ResNet110+SaliencyOut (with ρ) | 94.76 | 77.87 |
| ResNet110+SaliencyCutMix (without ρ) | 95.79 | 78.88 |
| ResNet110+SaliencyCutMix (with ρ) | 95.86 | 78.96 |
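The exact definition of ρ is given in the paper's method section, which this page does not reproduce. Purely as an illustration of the idea, a size-dependent factor might shrink an oversized crop box as sketched below; the function and its `max_rat` parameter are hypothetical.

```python
def adaptive_cut_ratio(cut_rat, max_rat=0.5):
    """Hypothetical adaptive scaling factor rho: when the initial
    crop-side ratio cut_rat is large, scale it down so the crop cannot
    swallow the whole salient region. Illustrative only; the paper
    defines the actual rule for rho."""
    rho = min(1.0, max_rat / cut_rat) if cut_rat > 0 else 1.0
    return cut_rat * rho
```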
Table 9 Experimental comparison of multiple saliency detection methods (%)
| Method | Top-1 accuracy |
|---|---|
| SPRE | 75.29 |
| MONSO | 76.61 |
Fig. 6 Class activation maps of new samples generated by various methods ((a) Original image; (b) Cutout; (c) SaliencyOut; (d) CutMix; (e) SaliencyCutMix)
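The class activation maps in Fig. 6 are of the Grad-CAM kind. A compact PyTorch sketch is shown below; the hook-based implementation and the `target_layer` argument (typically the last convolutional block) are our assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, x, target_layer, class_idx=None):
    """Grad-CAM sketch: weight the target layer's activations by the
    spatially averaged gradients of the class score, then ReLU and
    upsample to the input resolution."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: acts.update(v=o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(v=go[0]))
    logits = model(x)
    if class_idx is None:
        class_idx = logits.argmax(dim=1)      # explain the predicted class
    score = logits.gather(1, class_idx.view(-1, 1)).sum()
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    w = grads['v'].mean(dim=(2, 3), keepdim=True)          # GAP of gradients
    cam = F.relu((w * acts['v']).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[2:], mode='bilinear',
                        align_corners=False)
    return cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)  # (N,1,H,W)
```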
|||||