Image Recognition Algorithm of Convolutional Neural Networks Based on Matrix 2-Norm Pooling

doi:10.11996/JG.j.2095-302X.2016050694

Abstract

Abstract: The pooling operation in convolutional neural networks provides invariance to image
scaling and is robust to noise and clutter. However, when it extracts local features for image
recognition, the pooling operation ignores the energy information hidden in the image. Based on the
relationship between the energy of an image and the singular values of its matrix, and considering
that most of the image energy is concentrated in the few largest singular values, a pooling method
based on the matrix 2-norm is proposed. The feature map of the preceding convolutional layer is first
divided into several non-overlapping sub-blocks; the singular values of each sub-block matrix are
then computed, and the largest singular value is taken as the statistic of that pooling region.
Extensive experiments were carried out with five different pooling methods on the Cohn-Kanade,
Caltech-101, MNIST and CIFAR-10 datasets. The results show that the proposed method achieves better
recognition performance and robustness than the other methods.
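The pooling step described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `matrix_2norm_pool`, the `pool_size` parameter, and the restriction to a single-channel 2-D feature map whose sides are divisible by the pool size are all choices made here for illustration. Only the forward pass is shown; how gradients are propagated through the pooling layer is not covered.

```python
import numpy as np


def matrix_2norm_pool(feature_map, pool_size=2):
    """Pool a 2-D feature map by the matrix 2-norm of each sub-block.

    Hypothetical helper for illustration: the map is split into
    non-overlapping pool_size x pool_size sub-blocks, and each sub-block
    is replaced by its largest singular value (its matrix 2-norm).
    """
    feature_map = np.asarray(feature_map, dtype=float)
    h, w = feature_map.shape
    if h % pool_size or w % pool_size:
        raise ValueError("feature map sides must be divisible by pool_size")

    # Rearrange into shape (h/p, w/p, p, p): one p x p matrix per pooling region.
    blocks = feature_map.reshape(
        h // pool_size, pool_size, w // pool_size, pool_size
    ).transpose(0, 2, 1, 3)

    # np.linalg.svd works on stacks of matrices; with compute_uv=False it
    # returns only the singular values, sorted in descending order.
    singular_values = np.linalg.svd(blocks, compute_uv=False)

    # The largest singular value of each block is its matrix 2-norm.
    return singular_values[..., 0]


if __name__ == "__main__":
    fm = np.array(
        [
            [3.0, 0.0, 1.0, 1.0],
            [0.0, 4.0, 1.0, 1.0],
            [2.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 5.0],
        ]
    )
    print(matrix_2norm_pool(fm, pool_size=2))  # approximately [[4, 2], [2, 5]]
```

In this toy input the top-left block is diag(3, 4), so its largest singular value is 4, whereas max pooling would also give 4 and average pooling 1.75; the top-right block of ones gives 2, which differs from both its maximum (1) and its mean (1). The same per-block value can be obtained with `np.linalg.norm(block, 2)`.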

Key words: deep learning, convolutional neural networks, matrix 2-norm, pooling, singular value

Yu Ping, Zhao Jisheng. Image Recognition Algorithm of Convolutional Neural Networks Based on Matrix 2-Norm Pooling[J]. Journal of Graphics, DOI: 10.11996/JG.j.2095-302X.2016050694.

