Object detection for nameplate based on neural architecture search

Journal of Graphics, 2023, Vol. 44, Issue (4): 718-727. DOI: 10.11996/JG.j.2095-302X.2023040718

• Image Processing and Computer Vision •

Object detection for nameplate based on neural architecture search

DENG Wei-ming¹, YANG Tie-jun², LI Chun-chun¹, HUANG Lin¹

1. Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin, Guangxi 541004, China
2. College of Intelligent Medicine and Biotechnology, Guilin Medical University, Guilin, Guangxi 541199, China

Received: 2022-12-20 Accepted: 2023-03-11 Online: 2023-08-31 Published: 2023-08-16
Contact: HUANG Lin (1980-), associate professor, Ph.D. Her main research interest is computer vision. E-mail: hlcucu@qq.com
About the author:
DENG Wei-ming (1997-), master's student. His main research interest is computer vision. E-mail: 1270445316@qq.com
Supported by:
National Natural Science Foundation of China (62166012); National Natural Science Foundation of China (62266015); Guangxi Natural Science Foundation (2022GXNSFAA035644); Guangxi Key Laboratory Fund of Embedded Technology and Intelligent System (2020-1-8)

Abstract:

To increase the degree of automation in building deep convolutional neural networks (CNNs) for object detection and to further improve detection accuracy, an improved DenseNAS-based neural architecture search (NAS) method is proposed that automatically builds a CNN for nameplate detection. First, searchable subnet modules (CSP-Block1 and CSP-Block2) that fuse deep and shallow features are designed by improving the Head layer of DenseNAS. Then, the search space built from CSP-Block1 and CSP-Block2 is searched for the Backbone and Head of the nameplate detection CNN. Experimental results show that, on a five-class nameplate dataset, the method finds the best network in about 9.35 GPU hours, and that this network reaches a detection accuracy of mAP ≈ 97.3% on the test set, higher than state-of-the-art methods such as YOLOv5.

Key words: neural architecture search, convolutional neural network, CSP structure, nameplate, object detection

CLC number: TP391

DENG Wei-ming, YANG Tie-jun, LI Chun-chun, HUANG Lin. Object detection for nameplate based on neural architecture search[J]. Journal of Graphics, 2023, 44(4): 718-727.

Figures and tables (22)

Fig. 1 Flow chart of the method

Fig. 2 Structure of the subnet module of DenseNAS

Fig. 3 CSP-Block1 module (the red box shows the CSP structure)

Table 1 Candidate operations for the alignment and Neck layers

Operations	Code
CBS_K1	0
CBS_K3	1

Table 2 Candidate operations for the Basic layers of CSP-Block1

Operations	Code
Resblock1	0
Skip_connect	1

Fig. 4 Structures of the candidate operations CBS_Kn, Resblock1 and Resblock2

Fig. 5 CSP-Block2 module (the red box shows the CSP structure)

Table 3 Candidate operations for the Basic layers of CSP-Block2

Operations	Code
Resblock2	0
Skip_connect	1
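
The code columns of Tables 1 to 3 amount to a lookup from an integer code to a candidate operation. The following Python sketch shows how such codes could be decoded. The layer-type keys, the `decode` helper, and the example encoding are hypothetical names chosen for illustration; this is not the authors' implementation.

```python
# Candidate operations per searchable layer type, copied from Tables 1-3.
# The dictionary keys and the decode() helper are illustrative names only.
CANDIDATE_OPS = {
    "align_or_neck": {0: "CBS_K1", 1: "CBS_K3"},               # Table 1
    "csp_block1_basic": {0: "Resblock1", 1: "Skip_connect"},   # Table 2
    "csp_block2_basic": {0: "Resblock2", 1: "Skip_connect"},   # Table 3
}


def decode(layer_type, codes):
    """Map a sequence of integer codes to candidate operation names."""
    table = CANDIDATE_OPS[layer_type]
    return [table[code] for code in codes]


# Hypothetical encoding of three Basic layers inside one CSP-Block1.
example = decode("csp_block1_basic", [0, 1, 0])
print(example)  # ['Resblock1', 'Skip_connect', 'Resblock1']
```

A `Skip_connect` choice effectively removes a Basic layer, so a binary code per layer is enough to describe both the operation type and the effective depth of a block.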

Fig. 6 Object detection network based on the CSP-Block1 and CSP-Block2 modules

Table 4 Number of channels, spatial resolution, and number of stacked CSP-Block1 modules for each subnet in the Backbone

Parameter	Subnet 1	Subnet 2	Subnet 3	Subnet 4	Subnet 5
Spatial resolution	208	104	52	26	13
Number of stacked CSP-Block1	1	3	4	2	1
Number of channels	32	64, 72, 80	120, 128, 136, 144	240, 256	512
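
In Table 4, the number of channel widths listed for each subnet equals its number of stacked CSP-Block1 modules. The sketch below expands the table into one entry per stacked block and checks that the spatial resolution halves from one subnet to the next. Reading the i-th width as belonging to the i-th stacked block is an interpretation of the table, and the names `BACKBONE` and `expand_blocks` are hypothetical.

```python
# Table 4 values: spatial resolution, stack count, and channel widths per subnet.
BACKBONE = [
    {"subnet": 1, "size": 208, "stacks": 1, "channels": [32]},
    {"subnet": 2, "size": 104, "stacks": 3, "channels": [64, 72, 80]},
    {"subnet": 3, "size": 52, "stacks": 4, "channels": [120, 128, 136, 144]},
    {"subnet": 4, "size": 26, "stacks": 2, "channels": [240, 256]},
    {"subnet": 5, "size": 13, "stacks": 1, "channels": [512]},
]


def expand_blocks(backbone):
    """Return one (subnet, size, channels) tuple per stacked CSP-Block1."""
    blocks = []
    for stage in backbone:
        # Interpretation: the i-th listed width belongs to the i-th stacked block.
        assert len(stage["channels"]) == stage["stacks"]
        for width in stage["channels"]:
            blocks.append((stage["subnet"], stage["size"], width))
    return blocks


blocks = expand_blocks(BACKBONE)
print(len(blocks))  # 11

# Each subnet has half the spatial resolution of the previous one.
halving = all(a["size"] == 2 * b["size"] for a, b in zip(BACKBONE, BACKBONE[1:]))
print(halving)  # True
```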

Fig. 7 Schematic diagram of sample category labeling ((a) P; (b) P-T; (c) P-B; (d) P-L; (e) P-N)

Table 5 Number of samples per class after data augmentation

No.	Name	Training set	Validation set	Test set
0	Nameplate	4620	700	1400
1	Nameplate top	4060	700	1120
2	Nameplate bottom	4060	700	1120
3	Nameplate logo	3920	700	1050
4	Nameplate name	3920	700	1050
Total		20580	3500	5740

Fig. 8 The PCB dataset ((a) M-H; (b) M-B; (c) O-C; (d) S-T; (e) S-P; (f) S-C)

Table 6 Comparison of NAS-NOCSP, NAS-NOCSP1, NAS-NOCSP2 and NAS-DET on the nameplate test set

Method	AP (P)	AP (P-T)	AP (P-B)	AP (P-L)	AP (P-N)	mAP	Params (M)	FPS
NAS-NOCSP	0.995	0.995	0.995	0.932	0.858	0.955	7.73	74
NAS-NOCSP1	0.995	0.995	0.995	0.943	0.862	0.958	8.70	71
NAS-NOCSP2	0.995	0.995	0.995	0.952	0.871	0.961	9.34	70
NAS-DET	0.995	0.995	0.995	0.955	0.924	0.973	5.28	73
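
The mAP column can be cross-checked against the per-class AP columns. The sketch below assumes that mAP is the unweighted mean of the five per-class APs; the page does not state the definition, but the Table 6 values agree with it to within rounding. The names `TABLE6` and `mean_ap` are hypothetical.

```python
from statistics import mean

# Per-class AP (P, P-T, P-B, P-L, P-N) and reported mAP, copied from Table 6.
TABLE6 = {
    "NAS-NOCSP": ([0.995, 0.995, 0.995, 0.932, 0.858], 0.955),
    "NAS-NOCSP1": ([0.995, 0.995, 0.995, 0.943, 0.862], 0.958),
    "NAS-NOCSP2": ([0.995, 0.995, 0.995, 0.952, 0.871], 0.961),
    "NAS-DET": ([0.995, 0.995, 0.995, 0.955, 0.924], 0.973),
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values (assumed mAP definition)."""
    return mean(per_class_ap)


for name, (aps, reported) in TABLE6.items():
    print(f"{name}: recomputed {mean_ap(aps):.4f}, reported {reported:.3f}")
```

The gain of NAS-DET over the three ablated variants comes almost entirely from the P-N class, since the P, P-T and P-B columns are identical across all four rows.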

Fig. 9 Comparison of NAS-NOCSP and NAS-DET on the nameplate validation set

Table 7 Comparison of search performance for different network depths and widths

Method	Search space params (M)	Search cost (GPU hours)	Width_multiple	Depth_multiple
NAS-DET-L	125.34	14.05	1.00	1.00
NAS-DET-M	68.31	11.74	0.75	0.67
NAS-DET-S	27.71	9.35	0.50	0.33
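
The multiplier values in Table 7 coincide with the width and depth multipliers of the YOLOv5 L, M and S model configurations. The sketch below assumes that Width_multiple and Depth_multiple act the same way here, which the page does not confirm: channels are scaled and rounded up to a multiple of 8, and repeat counts are scaled and rounded with a minimum of one block. The base values (1024 channels, 9 repeats) and the function names are hypothetical.

```python
import math


def scale_channels(base_channels, width_multiple, divisor=8):
    """Scale a channel count and round up to a multiple of `divisor`."""
    return math.ceil(base_channels * width_multiple / divisor) * divisor


def scale_depth(base_repeats, depth_multiple):
    """Scale a block repeat count, keeping at least one block."""
    return max(round(base_repeats * depth_multiple), 1)


# (Width_multiple, Depth_multiple) copied from Table 7.
VARIANTS = {
    "NAS-DET-L": (1.00, 1.00),
    "NAS-DET-M": (0.75, 0.67),
    "NAS-DET-S": (0.50, 0.33),
}

# Hypothetical base layer: 1024 channels, 9 repeated blocks.
for name, (width, depth) in VARIANTS.items():
    print(name, scale_channels(1024, width), scale_depth(9, depth))
```

Under this assumption, smaller multipliers shrink every searchable layer, which is consistent with the search space parameters and search cost both decreasing from NAS-DET-L to NAS-DET-S in Table 7.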

Table 8 Comparison of test results for different network depths and widths

Method	AP (P)	AP (P-T)	AP (P-B)	AP (P-L)	AP (P-N)	mAP	Params (M)	FPS
NAS-DET-L	0.995	0.995	0.995	0.871	0.893	0.950	44.98	35
NAS-DET-M	0.995	0.995	0.995	0.925	0.862	0.954	20.70	63
NAS-DET-S	0.995	0.995	0.995	0.955	0.924	0.973	5.28	73

Table 9 Comparison of search performance of different NAS methods

Method	Search space params (M)	Search cost (GPU hours)
DARTS+YOLOv5	2.27	35.02
DenseNAS+YOLOv5	141.90	15.05
NAS-DET-S	27.71	9.35
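
The search costs in Table 9 can be turned into relative savings with a one-line calculation. The sketch below uses only the GPU-hour values from the table; the names `SEARCH_COST` and `relative_saving` are hypothetical.

```python
# Search cost in GPU hours, copied from Table 9.
SEARCH_COST = {
    "DARTS+YOLOv5": 35.02,
    "DenseNAS+YOLOv5": 15.05,
    "NAS-DET-S": 9.35,
}


def relative_saving(baseline_hours, method_hours):
    """Fraction of search time saved relative to a baseline."""
    return 1.0 - method_hours / baseline_hours


for baseline in ("DARTS+YOLOv5", "DenseNAS+YOLOv5"):
    saving = relative_saving(SEARCH_COST[baseline], SEARCH_COST["NAS-DET-S"])
    print(f"NAS-DET-S vs {baseline}: {saving:.1%} less search time")
# NAS-DET-S vs DARTS+YOLOv5: 73.3% less search time
# NAS-DET-S vs DenseNAS+YOLOv5: 37.9% less search time
```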

Table 10 Comparison of test results of different methods on the nameplate test set

Method	AP (P)	AP (P-T)	AP (P-B)	AP (P-L)	AP (P-N)	mAP	Params (M)	FPS
DenseNAS+YOLOv5	0.995	0.994	0.995	0.936	0.842	0.952	11.17	54
DARTS+YOLOv5	0.995	0.995	0.995	0.925	0.862	0.954	36.03	33
YOLOv5-Shufflenetv2	0.995	0.983	0.995	0.924	0.886	0.957	5.85	57
YOLOv5-Mobilenetv2	0.995	0.995	0.995	0.955	0.866	0.961	6.62	54
YOLOv5-ResNet50	0.995	0.995	0.995	0.921	0.819	0.945	28.31	49
YOLOv5-S	0.995	0.995	0.995	0.932	0.884	0.958	7.02	75
YOLOX	0.996	0.980	0.976	0.892	0.843	0.937	8.94	76
YOLOv7	0.995	0.995	0.995	0.952	0.872	0.962	6.21	78
Sparse RCNN	0.998	0.997	0.995	0.967	0.906	0.972	106.12	13
NAS-DET-S	0.995	0.995	0.995	0.955	0.924	0.973	5.28	73
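
Table 10 reports three criteria per method: accuracy, model size, and speed. One way to read it is to list the methods that no other method beats on all three at once. The sketch below computes that non-dominated set, treating higher mAP, fewer parameters, and higher FPS as better. This analysis is an illustration added here, not part of the paper, and the names `TABLE10`, `dominates`, and `pareto_front` are hypothetical.

```python
# (mAP, Params in M, FPS) per method, copied from Table 10.
TABLE10 = {
    "DenseNAS+YOLOv5": (0.952, 11.17, 54),
    "DARTS+YOLOv5": (0.954, 36.03, 33),
    "YOLOv5-Shufflenetv2": (0.957, 5.85, 57),
    "YOLOv5-Mobilenetv2": (0.961, 6.62, 54),
    "YOLOv5-ResNet50": (0.945, 28.31, 49),
    "YOLOv5-S": (0.958, 7.02, 75),
    "YOLOX": (0.937, 8.94, 76),
    "YOLOv7": (0.962, 6.21, 78),
    "Sparse RCNN": (0.972, 106.12, 13),
    "NAS-DET-S": (0.973, 5.28, 73),
}


def dominates(a, b):
    """True if a is no worse than b on every criterion and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] >= b[2]
    better = a[0] > b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and better


def pareto_front(results):
    """Names of methods that no other method dominates."""
    return [
        name
        for name, score in results.items()
        if not any(dominates(other, score) for other in results.values())
    ]


print(pareto_front(TABLE10))  # ['YOLOv7', 'NAS-DET-S']
```

NAS-DET-S has the highest mAP and the fewest parameters in the table; only YOLOv7 remains alongside it because of its higher FPS (78 versus 73).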

Fig. 10 CNN model for nameplate detection searched by NAS-DET-S

Fig. 11 Partial detection results of NAS-DET-S on the nameplate dataset

Table 11 Comparison of test results of different methods on the PCB test set

Method	AP (M-H)	AP (M-B)	AP (O-C)	AP (S-T)	AP (S-P)	AP (S-C)	mAP	Params (M)	FPS
YOLOv5-S	0.998	0.903	0.906	0.977	0.996	0.906	0.948	7.02	73
YOLOX	0.999	0.909	0.909	0.995	0.910	0.904	0.937	8.94	74
YOLOv7	1.000	0.926	0.909	0.998	0.993	0.903	0.954	6.21	76
NAS-DET-S	1.000	0.963	0.943	0.999	0.998	0.907	0.968	5.84	72
