
图学学报 (Journal of Graphics) ›› 2021, Vol. 42 ›› Issue (1): 32-36. DOI: 10.11996/JG.j.2095-302X.2021010032

• Image Processing and Computer Vision •

Attention-guided Dropout for image classification

  (1. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. South-to-North Water Diversion Middle Route Information Technology Co., Ltd., Beijing 100176, China)
  • Online: 2021-02-28  Published: 2021-01-29
  • Supported by:
    National Key Research and Development Program of China (2019YFF0303300, 2019YFF0303302); National Natural Science Foundation of China (61773071, 61922015, U19B2036); Beijing Academy of Artificial Intelligence (BAAI2020ZJ0204); Beijing Nova Program Interdisciplinary Cooperation Project (Z191100001119140); Scholarship from China Scholarship Council (202006470036); BUPT Excellent Ph.D. Students Foundation (CX2020105, CX2019109) 

Abstract: When a large neural network is trained on a small training set, it typically overfits: the model performs poorly on held-out test data. Various Dropout-based techniques have therefore been proposed to alleviate this problem. However, these methods cannot directly encourage the model to learn the less discriminative parts of the input, which is also important for reducing overfitting. To address this problem, we propose attention-guided Dropout (AD), which utilizes the self-attention mechanism to alleviate the co-adaptation of feature detectors more effectively. AD comprises two distinctive components: (1) an importance measurement mechanism, which scores each feature map as a whole with a Squeeze-and-Excitation block; and (2) a Dropout with a learnable drop probability, which forces the "bad" neurons to learn a better representation by dropping the "good" ones. This diminishes co-adaptation and encourages the model to learn the less discriminative parts. Experimental results show that the proposed method can be easily applied to various convolutional neural network (CNN) architectures, yielding better performance.
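
The abstract names two mechanisms: an SE block that scores each feature map's importance, and a Dropout whose per-channel drop probability grows with that score, so the strongest detectors are suppressed during training. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the class name AttentionGuidedDropout, the drop_fraction hyperparameter, and the fixed rule mapping SE scores to drop probabilities are illustrative assumptions (the paper learns the drop probability rather than deriving it from such a rule).

# A minimal sketch of attention-guided Dropout (AD); names and the
# score-to-probability rule are assumptions, not the paper's released code.
import torch
import torch.nn as nn

class AttentionGuidedDropout(nn.Module):
    """Drops the most important feature maps, as ranked by an SE-style gate,
    forcing the remaining ("worse") detectors to learn useful features."""

    def __init__(self, channels: int, reduction: int = 16, drop_fraction: float = 0.25):
        super().__init__()
        # Squeeze-and-Excitation block: global average pool + two FC layers.
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excite = nn.Sequential(
            nn.Linear(channels, max(channels // reduction, 1)),
            nn.ReLU(inplace=True),
            nn.Linear(max(channels // reduction, 1), channels),
            nn.Sigmoid(),
        )
        self.drop_fraction = drop_fraction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x  # like standard Dropout, the identity at test time
        b, c, _, _ = x.shape
        # Per-channel importance scores in [0, 1] from the SE gate.
        w = self.excite(self.squeeze(x).view(b, c))
        # Drop probability grows with importance, so the "good" detectors
        # are suppressed more often than the "bad" ones.
        p_drop = self.drop_fraction * w / w.mean(dim=1, keepdim=True).clamp_min(1e-6)
        p_drop = p_drop.clamp(0.0, 0.9)
        keep = torch.bernoulli(1.0 - p_drop)  # (b, c) 0/1 mask per channel
        # Inverted-dropout rescaling keeps the expected activation unchanged.
        mask = keep / (1.0 - p_drop)
        return x * mask.view(b, c, 1, 1)

In use, the module would sit after a convolutional block, e.g. x = ad(conv_block(x)), consistent with the claim that AD can be applied to various CNN architectures without other changes.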

Key words: deep neural network, overfitting, Dropout, self-attention mechanism, image classification

CLC number: