Journal of Graphics ›› 2023, Vol. 44 ›› Issue (4): 718-727. DOI: 10.11996/JG.j.2095-302X.2023040718
DENG Wei-ming1, YANG Tie-jun2, LI Chun-chun1, HUANG Lin1
Received:
2022-12-20
Accepted:
2023-03-11
Online:
2023-08-31
Published:
2023-08-16
Contact:
HUANG Lin (1980-), associate professor, Ph.D. Her main research interest covers computer vision. E-mail:
About author:
DENG Wei-ming (1997-), master student. His main research interest covers computer vision. E-mail: 1270445316@qq.com
Supported by:
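Abstract:
To increase the degree of automation in constructing deep convolutional neural networks (CNNs) and to further improve object detection accuracy, an improved DenseNAS-based neural architecture search method is proposed to automatically construct a CNN for nameplate detection. First, building on an improved version of the DenseNAS Head layer, searchable subnet modules that fuse deep and shallow features (CSP-Block1 and CSP-Block2) are designed. Then, within the search space built from CSP-Block1 and CSP-Block2, the Backbone and Head of the nameplate detection CNN are searched. Experimental results show that, on a five-class nameplate dataset, the method found the best network in about 9.35 GPU hours and reached a detection accuracy of mAP ≈ 97.3% on the test set, higher than SOTA methods such as YOLOv5.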
DENG Wei-ming, YANG Tie-jun, LI Chun-chun, HUANG Lin. Object detection for nameplate based on neural architecture search[J]. Journal of Graphics, 2023, 44(4): 718-727.
Operations | Code |
---|---|
CBS_K1 | 0 |
CBS_K3 | 1 |
Table 1 Candidate operations for the alignment and Neck layers
Operations | Code |
---|---|
Resblock1 | 0 |
Skip_connect | 1 |
Table 2 Candidate operations for the Basic layers of CSP-Block1
Operations | Code |
---|---|
Resblock2 | 0 |
Skip_connect | 1 |
Table 3 Candidate operations for the Basic layers of CSP-Block2
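Tables 1 to 3 assign an integer code to each candidate operation. As a minimal sketch of how such codes could be turned back into a readable architecture description, the snippet below maps a list of codes to operation names. The `decode` helper, the `CANDIDATES` layout, and the example code sequence are illustrative assumptions and are not taken from the authors' implementation.

```python
# Candidate operations and their codes, copied from Tables 1-3.
CANDIDATES = {
    "align_neck": {0: "CBS_K1", 1: "CBS_K3"},
    "csp_block1": {0: "Resblock1", 1: "Skip_connect"},
    "csp_block2": {0: "Resblock2", 1: "Skip_connect"},
}


def decode(layer_type, codes):
    """Map a sequence of integer codes to operation names (hypothetical helper)."""
    table = CANDIDATES[layer_type]
    return [table[code] for code in codes]


# Hypothetical code sequence for the Basic layers of one CSP-Block1.
print(decode("csp_block1", [0, 1, 0]))
```

A `Skip_connect` choice effectively removes a Basic layer, so a binary code per layer is enough to let the search vary the depth of each block.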
Parameter | Subnet 1 | Subnet 2 | Subnet 3 | Subnet 4 | Subnet 5 |
---|---|---|---|---|---|
Size (spatial resolution) | 208 | 104 | 52 | 26 | 13 |
Number of stacks | 1 | 3 | 4 | 2 | 1 |
Channels | 32 | 64, 72, 80 | 120, 128, 136, 144 | 240, 256 | 512 |
Table 4 Number of channels, spatial resolution, and number of stacked CSP-Block1 modules for the different subnets in the Backbone
No. | Name | Training set | Validation set | Test set |
---|---|---|---|---|
0 | Nameplate | 4620 | 700 | 1400 |
1 | Nameplate top | 4060 | 700 | 1120 |
2 | Nameplate bottom | 4060 | 700 | 1120 |
3 | Nameplate logo | 3920 | 700 | 1050 |
4 | Nameplate name | 3920 | 700 | 1050 |
Total | | 20580 | 3500 | 5740 |
Table 5 Number of samples per class after augmentation
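The totals row of Table 5 can be checked by summing each split over the five classes. The sketch below does that and also prints the share of each split; the class keys are the row numbers from the table, and the `split_totals` helper is a hypothetical name used only for illustration.

```python
# Per-class sample counts (training, validation, test), copied from Table 5.
# Keys are the class numbers in the first column of the table.
SAMPLES = {
    0: (4620, 700, 1400),
    1: (4060, 700, 1120),
    2: (4060, 700, 1120),
    3: (3920, 700, 1050),
    4: (3920, 700, 1050),
}


def split_totals(samples):
    """Sum each split (training, validation, test) over all classes."""
    train, val, test = (sum(column) for column in zip(*samples.values()))
    return train, val, test


train, val, test = split_totals(SAMPLES)
total = train + val + test
print(train, val, test)
for name, count in (("training", train), ("validation", val), ("test", test)):
    print(f"{name}: {count / total:.1%}")
```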
Method | AP (P) | AP (P-T) | AP (P-B) | AP (P-L) | AP (P-N) | mAP | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|
NAS-NOCSP | 0.995 | 0.995 | 0.995 | 0.932 | 0.858 | 0.955 | 7.73 | 74 |
NAS-NOCSP1 | 0.995 | 0.995 | 0.995 | 0.943 | 0.862 | 0.958 | 8.70 | 71 |
NAS-NOCSP2 | 0.995 | 0.995 | 0.995 | 0.952 | 0.871 | 0.961 | 9.34 | 70 |
NAS-DET | 0.995 | 0.995 | 0.995 | 0.955 | 0.924 | 0.973 | 5.28 | 73 |
Table 6 Comparison of NAS-NOCSP, NAS-NOCSP1, NAS-NOCSP2, and NAS-DET on the nameplate test set
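The mAP column of Table 6 can be reproduced from the per-class AP values. The sketch below assumes mAP is the unweighted mean of the five per-class APs (P, P-T, P-B, P-L, P-N), which is the usual definition but is not stated explicitly in this excerpt. Because the tabulated APs are already rounded to three decimals, a recomputed mean may differ from the reported value by up to about 0.001.

```python
# Per-class AP on the nameplate test set (P, P-T, P-B, P-L, P-N), from Table 6.
AP = {
    "NAS-NOCSP": [0.995, 0.995, 0.995, 0.932, 0.858],
    "NAS-NOCSP1": [0.995, 0.995, 0.995, 0.943, 0.862],
    "NAS-NOCSP2": [0.995, 0.995, 0.995, 0.952, 0.871],
    "NAS-DET": [0.995, 0.995, 0.995, 0.955, 0.924],
}

# mAP values as reported in Table 6.
REPORTED_MAP = {
    "NAS-NOCSP": 0.955,
    "NAS-NOCSP1": 0.958,
    "NAS-NOCSP2": 0.961,
    "NAS-DET": 0.973,
}


def mean_ap(ap_values):
    """Unweighted mean of per-class AP (assumed definition of mAP)."""
    return sum(ap_values) / len(ap_values)


for method, values in AP.items():
    print(f"{method}: computed {mean_ap(values):.4f}, reported {REPORTED_MAP[method]:.3f}")
```

The first three classes score 0.995 for every variant, so the differences in mAP come entirely from the P-L and P-N columns.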
Method | Search space params (M) | Search cost (GPU hours) | Width_multiple | Depth_multiple |
---|---|---|---|---|
NAS-DET-L | 125.34 | 14.05 | 1.00 | 1.00 |
NAS-DET-M | 68.31 | 11.74 | 0.75 | 0.67 |
NAS-DET-S | 27.71 | 9.35 | 0.50 | 0.33 |
Table 7 Comparison of search performance for different network depths and widths
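Table 7 lists a Width_multiple and a Depth_multiple for each variant, but this excerpt does not say how they are applied. As a sketch only, assuming they follow the YOLOv5 convention (channel counts are multiplied by the width factor and rounded up to a multiple of 8; repeat counts greater than 1 are multiplied by the depth factor, rounded, and kept at 1 or more), the scaling could look like the following. The function names and the base values 64 and 3 are hypothetical.

```python
import math


def scale_depth(n, depth_multiple):
    """Scale a repeat count; modules with a single repeat are left unchanged."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scale_width(channels, width_multiple, divisor=8):
    """Scale a channel count and round up to the nearest multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor


# (Width_multiple, Depth_multiple) per variant, copied from Table 7.
VARIANTS = {
    "NAS-DET-L": (1.00, 1.00),
    "NAS-DET-M": (0.75, 0.67),
    "NAS-DET-S": (0.50, 0.33),
}

# Hypothetical base layer: 64 channels, repeated 3 times.
for name, (width, depth) in VARIANTS.items():
    print(name, scale_width(64, width), scale_depth(3, depth))
```

Under this assumption, smaller multipliers shrink every searchable layer, which is consistent with the smaller search space sizes reported for NAS-DET-M and NAS-DET-S.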
Method | AP (P) | AP (P-T) | AP (P-B) | AP (P-L) | AP (P-N) | mAP | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|
NAS-DET-L | 0.995 | 0.995 | 0.995 | 0.871 | 0.893 | 0.950 | 44.98 | 35 |
NAS-DET-M | 0.995 | 0.995 | 0.995 | 0.925 | 0.862 | 0.954 | 20.70 | 63 |
NAS-DET-S | 0.995 | 0.995 | 0.995 | 0.955 | 0.924 | 0.973 | 5.28 | 73 |
Table 8 Comparison of test results for different network depths and widths
Method | Search space params (M) | Search cost (GPU hours) |
---|---|---|
DARTS+YOLOv5 | 2.27 | 35.02 |
DenseNAS+YOLOv5 | 141.90 | 15.05 |
NAS-DET-S | 27.71 | 9.35 |
Table 9 Comparison of search performance of different NAS methods
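To put the search costs in Table 9 on a common scale, each can be expressed relative to NAS-DET-S. The short sketch below does this arithmetic; `relative_cost` is a hypothetical helper name.

```python
# Search cost in GPU hours, copied from Table 9.
SEARCH_COST = {
    "DARTS+YOLOv5": 35.02,
    "DenseNAS+YOLOv5": 15.05,
    "NAS-DET-S": 9.35,
}


def relative_cost(costs, reference):
    """Express every search cost as a multiple of the reference method's cost."""
    ref = costs[reference]
    return {name: cost / ref for name, cost in costs.items()}


for name, ratio in relative_cost(SEARCH_COST, "NAS-DET-S").items():
    print(f"{name}: {ratio:.2f}x the search cost of NAS-DET-S")
```

Note that search cost does not track search space size here: DARTS+YOLOv5 has the smallest search space in Table 9 but the highest search cost.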
Method | AP (P) | AP (P-T) | AP (P-B) | AP (P-L) | AP (P-N) | mAP | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|
DenseNAS+YOLOv5 | 0.995 | 0.994 | 0.995 | 0.936 | 0.842 | 0.952 | 11.17 | 54 |
DARTS+YOLOv5 | 0.995 | 0.995 | 0.995 | 0.925 | 0.862 | 0.954 | 36.03 | 33 |
YOLOv5-Shufflenetv2 | 0.995 | 0.983 | 0.995 | 0.924 | 0.886 | 0.957 | 5.85 | 57 |
YOLOv5-Mobilenetv2 | 0.995 | 0.995 | 0.995 | 0.955 | 0.866 | 0.961 | 6.62 | 54 |
YOLOv5-ResNet50 | 0.995 | 0.995 | 0.995 | 0.921 | 0.819 | 0.945 | 28.31 | 49 |
YOLOv5-S | 0.995 | 0.995 | 0.995 | 0.932 | 0.884 | 0.958 | 7.02 | 75 |
YOLOX | 0.996 | 0.980 | 0.976 | 0.892 | 0.843 | 0.937 | 8.94 | 76 |
YOLOv7 | 0.995 | 0.995 | 0.995 | 0.952 | 0.872 | 0.962 | 6.21 | 78 |
Sparse RCNN | 0.998 | 0.997 | 0.995 | 0.967 | 0.906 | 0.972 | 106.12 | 13 |
NAS-DET-S | 0.995 | 0.995 | 0.995 | 0.955 | 0.924 | 0.973 | 5.28 | 73 |
Table 10 Comparison of test results of different methods on the nameplate test set
Method | AP (M-H) | AP (M-B) | AP (O-C) | AP (S-T) | AP (S-P) | AP (S-C) | mAP | Params (M) | FPS |
---|---|---|---|---|---|---|---|---|---|
YOLOv5-S | 0.998 | 0.903 | 0.906 | 0.977 | 0.996 | 0.906 | 0.948 | 7.02 | 73 |
YOLOX | 0.999 | 0.909 | 0.909 | 0.995 | 0.910 | 0.904 | 0.937 | 8.94 | 74 |
YOLOv7 | 1.000 | 0.926 | 0.909 | 0.998 | 0.993 | 0.903 | 0.954 | 6.21 | 76 |
NAS-DET-S | 1.000 | 0.963 | 0.943 | 0.999 | 0.998 | 0.907 | 0.968 | 5.84 | 72 |
Table 11 Comparison of test results of different methods on the PCB test set