Journal of Graphics ›› 2023, Vol. 44 ›› Issue (2): 324-334. DOI: 10.11996/JG.j.2095-302X.2023020324
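Received: 2022-09-28
Accepted: 2022-11-08
Online: 2023-04-30
Published: 2023-05-01
Contact: JING Chao (1983-), male, associate professor, Ph.D. His main research interests cover machine learning and image processing.
About author: CHENG Lang (1995-), male, master's student. His main research interests cover computer vision and image processing. E-mail: 862409782@qq.com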
CHENG Lang1, JING Chao1,2,3
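Abstract:
To address the difficulty of recognizing and localizing prohibited items in X-ray images, and the fact that item orientation is usually ignored, this paper proposes a rotated object detection algorithm for X-ray images based on an improved YOLOv7. First, efficient attention modules are integrated into the original network to strengthen the extraction of important deep features. Second, the feature fusion path of the extended efficient long-range attention network (E-ELAN) is improved by adding skip connections and a 1×1 convolution structure between modules, so that the network extracts richer item features. Finally, because prohibited items are placed at arbitrary orientations in X-ray images, the dense coded label (DCL) representation is used to discretize the angle, which improves localization accuracy. Experimental results show that the improved algorithm achieves detection accuracies (mAP@0.5) of 91.2%, 92.6%, and 66.4% on the HiXray, OPIXray, and PIDray datasets, improvements of 20.2%, 10.6%, and 15.5% relative to the original YOLOv7 model. By improving the detection accuracy of prohibited items in X-ray images, the method provides technical support for public security.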
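The angle discretization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gray-code variant of dense label encoding, an angle range of [-90°, 90°), and 8 code bits. None of these settings is stated in this section, and the function names `gray_encode_angle` and `gray_decode_angle` are hypothetical.

```python
def gray_encode_angle(theta_deg, num_bits=8, angle_range=180.0):
    """Encode an angle in [-90, 90) degrees as a list of Gray-code bits.

    Assumed setup: the range is split into 2**num_bits equal classes.
    """
    num_classes = 2 ** num_bits
    omega = angle_range / num_classes            # angular width of one class
    cls = int((theta_deg + angle_range / 2) // omega) % num_classes
    gray = cls ^ (cls >> 1)                      # binary index -> Gray code
    return [(gray >> i) & 1 for i in reversed(range(num_bits))]


def gray_decode_angle(bits, angle_range=180.0):
    """Decode Gray-code bits back to the centre angle of their class."""
    gray = 0
    for b in bits:
        gray = (gray << 1) | b
    cls = 0
    while gray:                                  # Gray code -> binary index
        cls ^= gray
        gray >>= 1
    omega = angle_range / (2 ** len(bits))
    return -angle_range / 2 + (cls + 0.5) * omega


bits = gray_encode_angle(30.0)
restored = gray_decode_angle(bits)
```

Under these assumptions the decoded angle differs from the input by at most half a class width (180/256/2 ≈ 0.35°), and neighbouring angle classes differ in exactly one bit. In a detector, the network would typically predict one value per bit instead of one score per angle class.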
CHENG Lang, JING Chao. X-ray image rotating object detection based on improved YOLOv7[J]. Journal of Graphics, 2023, 44(2): 324-334.
Group | EPSA | SA | E-ELAN | DCL | Params (M) | FLOPs (G) | mAP@0.5 (%) |
---|---|---|---|---|---|---|---|
1 | × | × | × | × | 97.2 | 515.2 | 75.8 |
2 | √ | × | × | × | 98.6 | 535.4 | 77.2 |
3 | × | √ | × | × | 97.8 | 521.5 | 78.4 |
4 | √ | √ | × | × | 99.2 | 541.7 | 79.9 |
5 | √ | √ | √ | × | 105.6 | 553.6 | 85.4 |
6 | × | × | × | √ | 104.5 | 527.4 | 89.5 |
7 | √ | √ | √ | √ | 112.6 | 565.6 | 91.2 |
Table 1 Comparison of ablation results for each improvement and module
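To read the ablation more easily, the cost and gain of each configuration can be expressed relative to the first group, which enables none of the modules. The sketch below only re-uses the numbers printed in Table 1. The dictionary layout and the helper name `delta_vs_baseline` are illustrative choices.

```python
# (params in M, FLOPs in G, mAP@0.5 in %) copied from Table 1, keyed by group
ablation = {
    1: (97.2, 515.2, 75.8),
    2: (98.6, 535.4, 77.2),
    3: (97.8, 521.5, 78.4),
    4: (99.2, 541.7, 79.9),
    5: (105.6, 553.6, 85.4),
    6: (104.5, 527.4, 89.5),
    7: (112.6, 565.6, 91.2),
}


def delta_vs_baseline(group, baseline=1):
    """Return (extra params, extra FLOPs, mAP gain) of a group over the baseline."""
    base = ablation[baseline]
    return tuple(round(v - b, 1) for v, b in zip(ablation[group], base))


for g in sorted(ablation):
    print(g, delta_vs_baseline(g))
```

For example, group 6 (DCL only) adds 7.3 M parameters and 12.2 GFLOPs for a gain of 13.7 mAP points, while the full model (group 7) adds 15.4 M parameters and 50.4 GFLOPs for a gain of 15.4 points.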
Type | Methods | PO1 | PO2 | WA | LA | MP | TA | CO | NL | mAP@0.5 (%) |
---|---|---|---|---|---|---|---|---|---|---|
H | RetinaNet | 73.5 | 76.7 | 76.2 | 82.3 | 79.8 | 81.5 | 50.6 | 12.7 | 66.7 |
H | FCOS | 77.3 | 79.3 | 84.6 | 85.6 | 81.5 | 86.5 | 53.3 | 13.9 | 70.3 |
H | YOLOv7 | 87.6 | 87.4 | 87.8 | 88.9 | 89.9 | 87.9 | 63.6 | 14.5 | 75.9 |
R | ReDet | 92.6 | 93.5 | 88.8 | 90.9 | 90.7 | 89.8 | 67.5 | 19.7 | 79.2 |
R | SCRDet | 94.9 | 93.2 | 89.3 | 91.7 | 90.8 | 90.2 | 74.8 | 22.4 | 80.9 |
R | R3Det | 97.5 | 95.5 | 94.8 | 94.9 | 97.0 | 95.9 | 79.7 | 28.3 | 85.5 |
R | YOLOv7-E6R (Ours) | 98.6 | 96.8 | 97.9 | 99.2 | 98.3 | 97.4 | 86.8 | 54.3 | 91.2 |
Table 2 Comparison of the detection accuracy of rotated and horizontal object detection algorithms on the HiXray dataset (columns PO1 to NL give per-class AP@0.5 in %; H = horizontal detector, R = rotated detector)
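The mAP@0.5 column of Table 2 is consistent with the unweighted mean of the eight per-class AP values. The following sketch checks this using only the numbers printed in the table. The 0.1-point tolerance is an assumption that absorbs rounding of the published per-class values.

```python
# Per-class AP@0.5 (PO1, PO2, WA, LA, MP, TA, CO, NL) and reported mAP from Table 2
table2 = {
    "RetinaNet":  ([73.5, 76.7, 76.2, 82.3, 79.8, 81.5, 50.6, 12.7], 66.7),
    "FCOS":       ([77.3, 79.3, 84.6, 85.6, 81.5, 86.5, 53.3, 13.9], 70.3),
    "YOLOv7":     ([87.6, 87.4, 87.8, 88.9, 89.9, 87.9, 63.6, 14.5], 75.9),
    "ReDet":      ([92.6, 93.5, 88.8, 90.9, 90.7, 89.8, 67.5, 19.7], 79.2),
    "SCRDet":     ([94.9, 93.2, 89.3, 91.7, 90.8, 90.2, 74.8, 22.4], 80.9),
    "R3Det":      ([97.5, 95.5, 94.8, 94.9, 97.0, 95.9, 79.7, 28.3], 85.5),
    "YOLOv7-E6R": ([98.6, 96.8, 97.9, 99.2, 98.3, 97.4, 86.8, 54.3], 91.2),
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


for method, (aps, reported) in table2.items():
    computed = mean_ap(aps)
    print(f"{method}: computed {computed:.2f}, reported {reported}")
```

The same data also show where the gap is largest: on the NL class the proposed model reaches 54.3% AP, compared with 14.5% for the original YOLOv7.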
Type | Methods | OPIXray mAP@0.5 (%) | PIDray Easy | PIDray Hard | PIDray Hidden | PIDray Average |
---|---|---|---|---|---|---|
H | RetinaNet | 75.6 | 61.3 | 51.3 | 37.6 | 50.1 |
H | FCOS | 82.1 | 64.6 | 52.9 | 40.3 | 52.6 |
H | YOLOv7 | 83.7 | 72.3 | 57.4 | 42.8 | 57.5 |
R | ReDet | 85.4 | 75.9 | 58.3 | 43.7 | 59.3 |
R | SCRDet | 87.8 | 78.2 | 59.1 | 45.2 | 60.8 |
R | R3Det | 88.0 | 82.5 | 64.8 | 48.9 | 65.4 |
R | YOLOv7-E6R (Ours) | 92.6 | 84.5 | 65.4 | 49.2 | 66.4 |
Table 3 Comparison of the detection accuracy of rotated and horizontal object detection algorithms on the OPIXray and PIDray datasets (PIDray columns give mAP@0.5 in %; H = horizontal detector, R = rotated detector)
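The gains of 20.2%, 10.6%, and 15.5% over YOLOv7 quoted in the abstract match relative improvements in mAP@0.5, not absolute percentage-point differences. A short check follows, using the YOLOv7 and YOLOv7-E6R values from Tables 2 and 3; the helper name `relative_gain` is illustrative.

```python
# (YOLOv7 mAP@0.5, YOLOv7-E6R mAP@0.5) in %, taken from Tables 2 and 3
results = {
    "HiXray":  (75.9, 91.2),
    "OPIXray": (83.7, 92.6),
    "PIDray":  (57.5, 66.4),
}


def relative_gain(baseline, improved):
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


for name, (base, ours) in results.items():
    print(name, round(ours - base, 1), round(relative_gain(base, ours), 1))
# HiXray 15.3 20.2
# OPIXray 8.9 10.6
# PIDray 8.9 15.5
```

In absolute terms the improvements are 15.3, 8.9, and 8.9 mAP points respectively.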
Fig. 10 Comparison of the detection results of several detection methods and the improved YOLOv7 method on the HiXray dataset ((a) Portable charger; (b) Mobile phone; (c) Cosmetic; (d) Water)
[1] | CHANG Q Q, CHEN J M, LI W J. Dangerous goods detection technology based on X-ray images in urban rail transit security inspection[J]. Urban Mass Transit, 2022, 25(4): 205-209. (in Chinese) |
[2] | LI K Q, CHEN Y, LIU J C, et al. Survey of deep learning-based object detection algorithms[J]. Computer Engineering, 2022, 48(7): 1-12. (in Chinese) |
[3] | ZHANG Y K, SU Z G, ZHANG H G, et al. Multi-scale prohibited item detection in X-ray security image[J]. Journal of Signal Processing, 2020, 36(7): 1096-1106. (in Chinese) |
[4] | ZHANG Y, MA J, CUI J, et al. Remote sensing image rotation target detection algorithm based on attention mechanism[J]. Laser & Optoelectronics Progress, 2022, 59(24): 2415005. (in Chinese) |
[5] | LIANG T F, ZHANG N F, ZHANG Y X, et al. Summary of research progress on application of prohibited item detection in X-ray images[J]. Computer Engineering and Applications, 2021, 57(16): 74-82. (in Chinese) |
[6] | BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2004.10934. |
[7] | GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2107.08430. |
[8] | REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-08-23]. https://arxiv.org/abs/1804.02767. |
[9] | ZHU C, LI B Y, LIU X Q, et al. A deep convolutional neural network based on YOLO for contraband detection[J]. Journal of Hefei University of Technology: Natural Science, 2021, 44(9): 1198-1203. (in Chinese) |
[10] | DONG Y S, LI Z X, GUO J Y, et al. An improved YOLOv5 model for X-ray prohibited items detection[J]. Laser & Optoelectronics Progress, 2023, 60(4): 0415005. (in Chinese) |
[11] | WU H B, WEI X Y, LIU M H, et al. Improved YOLOv4 for dangerous goods detection in X-ray inspection combined with atrous convolution and transfer learning[J]. Chinese Optics, 2021, 14(6): 1417-1425. (in Chinese) |
[12] | LIAO Y R, WANG H N, LIN C B, et al. Research progress of deep learning-based object detection of optical remote sensing image[J]. Journal on Communications, 2022, 43(5): 190-203. (in Chinese) |
[13] | LI W T, CHEN Y J, HU K X, et al. Oriented RepPoints for aerial object detection[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 1819-1828. |
[14] | MING Q, MIAO L J, ZHOU Z Q, et al. Optimization for arbitrary-oriented object detection via representation invariance loss[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5. |
[15] | YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2101.11952. |
[16] | WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2207.02696. |
[17] | ZHANG H, ZU K K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2105.14447. |
[18] | ZHANG Q L, YANG Y B. SA-net: shuffle attention for deep convolutional neural networks[C]// 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2021: 2235-2239. |
[19] | ZHANG X D, ZENG H, GUO S, et al. Efficient long-range attention network for image super-resolution[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2203.06697. |
[20] | YANG X, HOU L P, ZHOU Y, et al. Dense label encoding for boundary discontinuity free rotation detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 15814-15824. |
[21] | WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: scaling cross stage partial network[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13024-13033. |
[22] | DING X H, ZHANG X Y, MA N N, et al. RepVGG: making VGG-style ConvNets great again[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13728-13737. |
[23] | JIANG T T, CHENG J Y. Target recognition based on CNN with LeakyReLU and PReLU activation functions[C]// 2019 IEEE Conference on Sensing, Diagnostics, Prognostics, and Control. New York: IEEE Press, 2019: 718-722. |
[24] | DE SANTANA CORREIA A, COLOMBINI E L. Neural attention models in deep learning: survey and taxonomy[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2112.05909. |
[25] | ZHANG C J, ZHU L, YU L. Review of attention mechanism in convolutional neural networks[J]. Computer Engineering and Applications, 2021, 57(20): 64-72. (in Chinese) |
[26] | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF International Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7132-7141. |
[27] | LI X, WANG X, HU X L, et al. Selective kernel networks[C]// 2019 IEEE/CVF International Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 510-519. |
[28] | HE K M, ZHANG X, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778. |
[29] | MA N N, ZHANG X Y, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[EB/OL]. [2022-08-23]. https://arxiv.org/abs/1807.11164. |
[30] | YANG X, YANG J R, YAN J C, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 8231-8240. |
[31] | TAO R S, WEI Y L, JIANG X J, et al. Towards real-world X-ray security inspection: a high-quality benchmark and lateral inhibition module for prohibited items detection[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 10903-10912. |
[32] | WEI Y L, TAO R S, WU Z J, et al. Occluded prohibited items detection: an X-ray security inspection benchmark and de-occlusion attention module[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 138-146. |
[33] | WANG B Y, ZHANG L B, WEN L Y, et al. Towards real-world prohibited item detection: a large-scale X-ray benchmark[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5392-5401. |
[34] | LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2999-3007. |
[35] | TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 9626-9635. |
[36] | HAN J M, DING J, XUE N, et al. ReDet: a rotation-equivariant detector for aerial object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2785-2794. |
[37] | YANG X, LIU Q Q, YAN J C, et al. R3Det: refined single-stage detector with feature refinement for rotating object[C]// The 35th Conference on Association for the Advancement of Artificial Intelligence. Palo Alto: AAAI, 2021: 3163-3171. |