X-ray image rotating object detection based on improved YOLOv7

doi:10.11996/JG.j.2095-302X.2023020324

Journal of Graphics, 2023, Vol. 44, Issue (2): 324-334.

Image Processing and Computer Vision

CHENG Lang¹, JING Chao¹,²,³

1. School of Information Science and Engineering, Guilin University of Technology, Guilin Guangxi 541004, China
2. Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin Guangxi 541004, China
3. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin Guangxi 541004, China

Received: 2022-09-28 Accepted: 2022-11-08 Online: 2023-04-30 Published: 2023-05-01
Contact: JING Chao (1983-), male, associate professor, Ph.D. His main research interests cover machine learning and image processing. E-mail: jingchao@glut.edu.cn
About author: CHENG Lang (1995-), male, master's student. His main research interests cover computer vision and image processing. E-mail: 862409782@qq.com
Supported by:
National Natural Science Foundation of China (61802085); National Natural Science Foundation of China (61862019); Guangxi Natural Science Foundation (2020GXNSFAA159038); Guangxi Trusted Software Key Laboratory Fund (kx202011); Guangxi Middle-Aged and Young Teachers' Basic Ability Improvement Project (2022KY0252)

Abstract

To address the difficulty of identifying and localizing prohibited items in X-ray images, and the neglect of item orientation, a rotating object detection algorithm based on an improved YOLOv7 is proposed. First, an efficient attention module is integrated into the original network to strengthen the extraction of important deep features. Second, the feature fusion path of the extended efficient long-range attention network (E-ELAN) is improved by adding skip connections and 1×1 convolutions between modules, so that the network extracts richer item features. Finally, because prohibited items in X-ray images can be placed in arbitrary orientations, the angles are discretized with the dense coded label (DCL) representation, which improves the localization accuracy of prohibited items. Experimental results show that the improved algorithm achieves detection accuracies of 91.2%, 92.6%, and 66.4% on the HiXray, OPIXray, and PIDray datasets, respectively, which are relative improvements of 20.2%, 10.6%, and 15.5% over the original YOLOv7 model. By effectively improving the accuracy of prohibited item detection in X-ray images, the algorithm provides technical support for public security.

Key words: rotating target detection, attention mechanism, X-ray images, YOLOv7, prohibited item
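
The angle discretization mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes angles in [-90°, 90°), a 7-bit code (128 angle bins), and decoding to the bin center, and every function name below is hypothetical. The idea shown is that a dense code needs only a handful of output bits per box, where a one-hot angle classifier would need one output per bin.

```python
# Illustrative sketch of dense angle coding (binary and Gray codes).
# Assumptions, not taken from the paper: angle range [-90, 90),
# n_bits = 7 (128 bins), and decoding to the center of each bin.

def angle_to_index(theta_deg, n_bits=7, angle_range=180.0, offset=90.0):
    """Map an angle in degrees to a discrete bin index in [0, 2**n_bits)."""
    omega = angle_range / (1 << n_bits)          # width of one angle bin
    wrapped = (theta_deg + offset) % angle_range  # periodic: 90 maps like -90
    return int(wrapped // omega)

def index_to_binary_code(index, n_bits=7):
    """Return the index as a list of n_bits bits, most significant first."""
    return [(index >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]

def index_to_gray_code(index, n_bits=7):
    """Gray code: neighbouring bins differ in exactly one bit."""
    return index_to_binary_code(index ^ (index >> 1), n_bits)

def gray_code_to_index(bits):
    """Invert index_to_gray_code."""
    gray = 0
    for bit in bits:
        gray = (gray << 1) | bit
    index = 0
    while gray:
        index ^= gray
        gray >>= 1
    return index

def index_to_angle(index, n_bits=7, angle_range=180.0, offset=90.0):
    """Decode a bin index to the angle at the center of that bin."""
    omega = angle_range / (1 << n_bits)
    return (index + 0.5) * omega - offset

if __name__ == "__main__":
    idx = angle_to_index(0.0)
    print(idx, index_to_binary_code(idx), index_to_gray_code(idx))
    print(index_to_angle(idx))
```

With these assumptions the bin width is 180/128 = 1.40625°, so decoding is off by at most about 0.7° from the encoded angle.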

CLC Number: TP391

Cite this article: CHENG Lang, JING Chao. X-ray image rotating object detection based on improved YOLOv7[J]. Journal of Graphics, 2023, 44(2): 324-334.

Figures/Tables 13

Fig. 1 Comparison of annotations of the same image using a horizontal box (left) and a rotated box (right)

Fig. 2 YOLOv7 network structure

Fig. 3 EPSA module structure

Fig. 4 PSA module structure

Fig. 5 SA module structure

Fig. 6 Improved E-ELAN structure ((a) ELAB; (b) ELAN; (c) E-ELAN; (d) RepVGG Block; (e) Improved E-ELAN)

Fig. 7 Long edge definition method

Fig. 8 Dense coded label representation

Fig. 9 The improved YOLOv7 network structure

Table 1 Comparison of ablation experiment results for each improvement and module

Group	EPSA	SA	E-ELAN	DCL	Params (M)	FLOPs (G)	mAP@0.5 (%)
1	×	×	×	×	97.2	515.2	75.8
2	√	×	×	×	98.6	535.4	77.2
3	×	√	×	×	97.8	521.5	78.4
4	√	√	×	×	99.2	541.7	79.9
5	√	√	√	×	105.6	553.6	85.4
6	×	×	×	√	104.5	527.4	89.5
7	√	√	√	√	112.6	565.6	91.2
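
To make the cost of each configuration easier to read, the ablation rows can be expressed as differences from Group 1 (no added modules). The snippet below is only a convenience sketch over the numbers in Table 1; the names `ABLATION_ROWS` and `deltas_vs_baseline` are hypothetical.

```python
# Rows copied from Table 1:
# (group, enabled modules, Params in M, FLOPs in G, mAP@0.5 in %)
ABLATION_ROWS = [
    (1, (), 97.2, 515.2, 75.8),
    (2, ("EPSA",), 98.6, 535.4, 77.2),
    (3, ("SA",), 97.8, 521.5, 78.4),
    (4, ("EPSA", "SA"), 99.2, 541.7, 79.9),
    (5, ("EPSA", "SA", "E-ELAN"), 105.6, 553.6, 85.4),
    (6, ("DCL",), 104.5, 527.4, 89.5),
    (7, ("EPSA", "SA", "E-ELAN", "DCL"), 112.6, 565.6, 91.2),
]

def deltas_vs_baseline(rows):
    """Return {group: (d_params, d_flops, d_map)} relative to the first row."""
    _, _, base_params, base_flops, base_map = rows[0]
    return {
        group: (
            round(params - base_params, 1),
            round(flops - base_flops, 1),
            round(map50 - base_map, 1),
        )
        for group, _, params, flops, map50 in rows
    }

if __name__ == "__main__":
    for group, (d_params, d_flops, d_map) in deltas_vs_baseline(ABLATION_ROWS).items():
        print(f"Group {group}: +{d_params} M params, +{d_flops} G FLOPs, +{d_map} mAP@0.5")
```

Read this way, DCL alone (Group 6) adds 13.7 points, the three structural changes together (Group 5) add 9.6 points, and the full model (Group 7) adds 15.4 points, so the individual gains are not additive.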

Table 2 Comparison of detection accuracy of different rotating target detection algorithms (R) and horizontal target detection algorithms (H) on the HiXray dataset. Per-class columns report AP@0.5 (%).

Type	Methods	PO1	PO2	WA	LA	MP	TA	CO	NL	mAP@0.5 (%)
H	RetinaNet [34]	73.5	76.7	76.2	82.3	79.8	81.5	50.6	12.7	66.7
H	FCOS [35]	77.3	79.3	84.6	85.6	81.5	86.5	53.3	13.9	70.3
H	YOLOv7 [16]	87.6	87.4	87.8	88.9	89.9	87.9	63.6	14.5	75.9
R	ReDet [36]	92.6	93.5	88.8	90.9	90.7	89.8	67.5	19.7	79.2
R	SCRDet [30]	94.9	93.2	89.3	91.7	90.8	90.2	74.8	22.4	80.9
R	R3Det [37]	97.5	95.5	94.8	94.9	97.0	95.9	79.7	28.3	85.5
R	YOLOv7-E6R (Ours)	98.6	96.8	97.9	99.2	98.3	97.4	86.8	54.3	91.2
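
The mAP@0.5 column of Table 2 is consistent with the unweighted mean of the eight per-class AP values. The check below is a sketch over the table data only; the names `HIXRAY_AP` and `mean_ap` are hypothetical, and the 0.1-point tolerance is an assumption to absorb rounding in the published figures.

```python
# Per-class AP@0.5 (%) copied from Table 2, in the column order
# PO1, PO2, WA, LA, MP, TA, CO, NL, followed by the reported mAP@0.5 (%).
HIXRAY_AP = {
    "RetinaNet": ([73.5, 76.7, 76.2, 82.3, 79.8, 81.5, 50.6, 12.7], 66.7),
    "FCOS": ([77.3, 79.3, 84.6, 85.6, 81.5, 86.5, 53.3, 13.9], 70.3),
    "YOLOv7": ([87.6, 87.4, 87.8, 88.9, 89.9, 87.9, 63.6, 14.5], 75.9),
    "ReDet": ([92.6, 93.5, 88.8, 90.9, 90.7, 89.8, 67.5, 19.7], 79.2),
    "SCRDet": ([94.9, 93.2, 89.3, 91.7, 90.8, 90.2, 74.8, 22.4], 80.9),
    "R3Det": ([97.5, 95.5, 94.8, 94.9, 97.0, 95.9, 79.7, 28.3], 85.5),
    "YOLOv7-E6R": ([98.6, 96.8, 97.9, 99.2, 98.3, 97.4, 86.8, 54.3], 91.2),
}

def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)

if __name__ == "__main__":
    for method, (per_class, reported) in HIXRAY_AP.items():
        print(f"{method}: computed {mean_ap(per_class):.2f}, reported {reported}")
```

The NL column is the weakest class for every method in the table, and it is also where the proposed model gains the most over R3Det (54.3 versus 28.3).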

Table 3 Comparison of detection accuracy of different rotating target detection algorithms (R) and horizontal target detection algorithms (H) on the OPIXray and PIDray datasets. All values are mAP@0.5 (%).

Type	Methods	OPIXray	PIDray Easy	PIDray Hard	PIDray Hidden	PIDray Average
H	RetinaNet [34]	75.6	61.3	51.3	37.6	50.1
H	FCOS [35]	82.1	64.6	52.9	40.3	52.6
H	YOLOv7 [16]	83.7	72.3	57.4	42.8	57.5
R	ReDet [36]	85.4	75.9	58.3	43.7	59.3
R	SCRDet [30]	87.8	78.2	59.1	45.2	60.8
R	R3Det [37]	88.0	82.5	64.8	48.9	65.4
R	YOLOv7-E6R (Ours)	92.6	84.5	65.4	49.2	66.4
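
The improvements of 20.2%, 10.6%, and 15.5% quoted in the abstract match relative gains over the YOLOv7 rows of Tables 2 and 3, not absolute percentage-point differences. The short sketch below reproduces the arithmetic; `RESULTS` and `relative_gain` are hypothetical names introduced for illustration.

```python
# (YOLOv7 baseline mAP@0.5, YOLOv7-E6R mAP@0.5), taken from Table 2 (HiXray)
# and Table 3 (OPIXray, PIDray Average).
RESULTS = {
    "HiXray": (75.9, 91.2),
    "OPIXray": (83.7, 92.6),
    "PIDray": (57.5, 66.4),
}

def relative_gain(baseline, improved):
    """Relative improvement over the baseline, in percent."""
    return (improved - baseline) / baseline * 100.0

if __name__ == "__main__":
    for dataset, (baseline, improved) in RESULTS.items():
        absolute = improved - baseline
        print(f"{dataset}: +{absolute:.1f} points, +{relative_gain(baseline, improved):.1f}% relative")
```

In absolute terms the gains are 15.3, 8.9, and 8.9 percentage points of mAP@0.5.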

Fig. 10 Comparison of detection results of several detection methods and the improved YOLOv7 method on the HiXray dataset ((a) Portable Charger; (b) Mobile Phone; (c) Cosmetic; (d) Water)

References 37

[1]	CHANG Q Q, CHEN J M, LI W J. Dangerous goods detection technology based on X-ray images in urban rail transit security inspection[J]. Urban Mass Transit, 2022, 25(4): 205-209. (in Chinese)
[2]	LI K Q, CHEN Y, LIU J C, et al. Survey of deep learning-based object detection algorithms[J]. Computer Engineering, 2022, 48(7): 1-12. (in Chinese)
[3]	ZHANG Y K, SU Z G, ZHANG H G, et al. Multi-scale prohibited item detection in X-ray security image[J]. Journal of Signal Processing, 2020, 36(7): 1096-1106. (in Chinese)
[4]	ZHANG Y, MA J, CUI J, et al. Remote sensing image rotation target detection algorithm based on attention mechanism[J]. Laser & Optoelectronics Progress, 2022, 59(24): 2415005. (in Chinese)
[5]	LIANG T F, ZHANG N F, ZHANG Y X, et al. Summary of research progress on application of prohibited item detection in X-ray images[J]. Computer Engineering and Applications, 2021, 57(16): 74-82. (in Chinese)
[6]	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2004.10934.
[7]	GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2107.08430.
[8]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-08-23]. https://arxiv.org/abs/1804.02767.
[9]	ZHU C, LI B Y, LIU X Q, et al. A deep convolutional neural network based on YOLO for contraband detection[J]. Journal of Hefei University of Technology: Natural Science, 2021, 44(9): 1198-1203. (in Chinese)
[10]	DONG Y S, LI Z X, GUO J Y, et al. An improved YOLOv5 model for X-ray prohibited items detection[J]. Laser & Optoelectronics Progress, 2023, 60(4): 0415005. (in Chinese)
[11]	WU H B, WEI X Y, LIU M H, et al. Improved YOLOv4 for dangerous goods detection in X-ray inspection combined with atrous convolution and transfer learning[J]. Chinese Optics, 2021, 14(6): 1417-1425. (in Chinese)
[12]	LIAO Y R, WANG H N, LIN C B, et al. Research progress of deep learning-based object detection of optical remote sensing image[J]. Journal on Communications, 2022, 43(5): 190-203. (in Chinese)
[13]	LI W T, CHEN Y J, HU K X, et al. Oriented RepPoints for aerial object detection[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 1819-1828.
[14]	MING Q, MIAO L J, ZHOU Z Q, et al. Optimization for arbitrary-oriented object detection via representation invariance loss[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[15]	YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2101.11952.
[16]	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2207.02696.
[17]	ZHANG H, ZU K K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2105.14447.
[18]	ZHANG Q L, YANG Y B. SA-net: shuffle attention for deep convolutional neural networks[C]// 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE Press, 2021: 2235-2239.
[19]	ZHANG X D, ZENG H, GUO S, et al. Efficient long-range attention network for image super-resolution[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2203.06697.
[20]	YANG X, HOU L P, ZHOU Y, et al. Dense label encoding for boundary discontinuity free rotation detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 15814-15824.
[21]	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: scaling cross stage partial network[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13024-13033.
[22]	DING X H, ZHANG X Y, MA N N, et al. RepVGG: making VGG-style ConvNets great again[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13728-13737.
[23]	JIANG T T, CHENG J Y. Target recognition based on CNN with LeakyReLU and PReLU activation functions[C]// 2019 IEEE Conference on Sensing, Diagnostics, Prognostics, and Control. New York: IEEE Press, 2019: 718-722.
[24]	CORREIA A S, COLOMBINI E L. Neural attention models in deep learning: survey and taxonomy[EB/OL]. [2022-08-23]. https://arxiv.org/abs/2112.05909.
[25]	ZHANG C J, ZHU L, YU L. Review of attention mechanism in convolutional neural networks[J]. Computer Engineering and Applications, 2021, 57(20): 64-72. (in Chinese)
[26]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF International Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7132-7141.
[27]	LI X, WANG X, HU X L, et al. Selective kernel networks[C]// 2019 IEEE/CVF International Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 510-519.
[28]	HE K M, ZHANG X, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[29]	MA N N, ZHANG X Y, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[EB/OL]. [2022-08-23]. https://arxiv.org/abs/1807.11164.
[30]	YANG X, YANG J R, YAN J C, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 8231-8240.
[31]	TAO R S, WEI Y L, JIANG X J, et al. Towards real-world X-ray security inspection: a high-quality benchmark and lateral inhibition module for prohibited items detection[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 10903-10912.
[32]	WEI Y L, TAO R S, WU Z J, et al. Occluded prohibited items detection: an X-ray security inspection benchmark and de-occlusion attention module[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 138-146.
[33]	WANG B Y, ZHANG L B, WEN L Y, et al. Towards real-world prohibited item detection: a large-scale X-ray benchmark[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 5392-5401.
[34]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2999-3007.
[35]	TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 9626-9635.
[36]	HAN J M, DING J, XUE N, et al. ReDet: a rotation-equivariant detector for aerial object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2785-2794.
[37]	YANG X, LIU Q Q, YAN J C, et al. R3Det: refined single-stage detector with feature refinement for rotating object[C]// The 35th Conference on Association for the Advancement of Artificial Intelligence. Palo Alto: AAAI, 2021: 3163-3171.

