Lightweight UAV image target detection algorithm based on YOLOv8

doi:10.11996/JG.j.2095-302X.2024061328

Abstract:

To address the small pixel footprint of targets, complex backgrounds, and difficult model deployment in unmanned aerial vehicle (UAV) images, a lightweight small-target detection algorithm with multi-scale feature fusion, based on YOLOv8, is proposed. To reduce the number of network parameters and increase detection speed, the FasterNet block replaces the bottleneck of C2f, yielding a lightweight feature extraction module, FasterC2f. To strengthen the model's multi-scale feature fusion, a new focus-diffusion feature pyramid structure is designed so that each feature map in the neck network focuses on feature information from three layers. A shared-convolution detection head is designed so that each detection head contains feature information at different scales while the parameter count is reduced. The small-target detection network is reconstructed to use a larger-scale three-layer detection head, improving the model's ability to learn features of small targets. Experimental results on the VisDrone dataset show that, compared with YOLOv8s, the precision, recall, and mAP of the model increase by 5.1%, 5.4%, and 6.6%, respectively; the number of parameters decreases by 68%, the model file size decreases by 15.3 MB, and FPS increases by 16%. These results indicate that the model offers high detection accuracy, fast detection speed, and ease of deployment.

Key words: YOLOv8, UAV, small target detection, lightweight, feature fusion
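
As a rough illustration of why swapping the C2f bottleneck for a FasterNet block reduces parameters, the sketch below counts convolution weights for both. It rests on assumptions that the abstract does not state: the bottleneck is taken to be two 3×3 convolutions with equal input and output channels, and the FasterNet-style block is taken to be a 3×3 partial convolution over one quarter of the channels followed by two 1×1 convolutions with an expansion ratio of 2. Biases and normalization layers are ignored, and all function names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k convolution without bias."""
    return k * k * c_in * c_out


def bottleneck_params(c: int) -> int:
    """Assumed bottleneck: two 3x3 convolutions, c -> c -> c."""
    return conv_params(c, c, 3) + conv_params(c, c, 3)


def fasternet_block_params(c: int, partial_ratio: int = 4, expansion: int = 2) -> int:
    """Assumed FasterNet-style block.

    A 3x3 partial convolution touches only c / partial_ratio channels,
    then a 1x1 convolution expands to c * expansion channels and a second
    1x1 convolution projects back to c channels.
    """
    c_p = c // partial_ratio
    pconv = conv_params(c_p, c_p, 3)
    expand = conv_params(c, c * expansion, 1)
    project = conv_params(c * expansion, c, 1)
    return pconv + expand + project


if __name__ == "__main__":
    for c in (64, 128, 256):
        b = bottleneck_params(c)
        f = fasternet_block_params(c)
        print(f"c={c}: bottleneck={b}, fasternet_block={f}, ratio={f / b:.3f}")
```

Under these assumptions the FasterNet-style block needs roughly a quarter of the bottleneck's convolution weights. The 68% reduction reported in the abstract is for the whole model and also reflects the other changes, so the two figures are not directly comparable.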

YAN Jianhong, RAN Tongxiao. Lightweight UAV image target detection algorithm based on YOLOv8[J]. Journal of Graphics, 2024, 45(6): 1328-1337.

Parameter	Value
Epochs	300
Batch size	8
Initial learning rate	0.01
Optimizer	SGD
Momentum	0.937
Image resolution	640×640
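
The settings in this table can be captured in a small configuration mapping. The sketch below is one hypothetical way to do that in Python. The key names (`epochs`, `batch`, `lr0`, `optimizer`, `momentum`, `imgsz`) mirror Ultralytics-style training arguments, but the source does not say which training framework or script was used, so the naming is an assumption; the helper `to_overrides` is also hypothetical.

```python
# Hyperparameters copied from the table above.
# Key names are an assumption (Ultralytics-style training arguments).
TRAIN_CONFIG = {
    "epochs": 300,       # number of training epochs
    "batch": 8,          # batch size
    "lr0": 0.01,         # initial learning rate
    "optimizer": "SGD",  # optimizer
    "momentum": 0.937,   # SGD momentum
    "imgsz": 640,        # square input resolution (640x640)
}


def to_overrides(config: dict) -> str:
    """Render the mapping as space-separated key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in config.items())


if __name__ == "__main__":
    print(to_overrides(TRAIN_CONFIG))
```

Keeping the values in one mapping makes it easy to log them with each run or pass them to whichever training entry point is actually used.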

Experiment	P/%	R/%	mAP/%	Weight/MB	FPS
YOLOv8s	50.2	38.0	38.8	22.5	110
Replace backbone	49.9	38.6	39.4	18.7	238
Replace neck	50.3	37.9	39.3	18.8	161
Replace all	49.0	37.6	38.8	16.0	251
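
To make the trade-offs in this table easier to read, the sketch below expresses each variant relative to the YOLOv8s baseline: the mAP difference in percentage points, the relative change in weight file size, and the FPS ratio. The numbers are copied from the table; the dictionary keys and function names are hypothetical, and the choice of these three summary metrics is ours rather than the paper's.

```python
BASELINE = {"P": 50.2, "R": 38.0, "mAP": 38.8, "weight_mb": 22.5, "fps": 110}

VARIANTS = {
    "replace backbone": {"P": 49.9, "R": 38.6, "mAP": 39.4, "weight_mb": 18.7, "fps": 238},
    "replace neck": {"P": 50.3, "R": 37.9, "mAP": 39.3, "weight_mb": 18.8, "fps": 161},
    "replace all": {"P": 49.0, "R": 37.6, "mAP": 38.8, "weight_mb": 16.0, "fps": 251},
}


def relative_change_pct(new: float, old: float) -> float:
    """Relative change of new with respect to old, in percent."""
    return (new - old) / old * 100.0


def summarize(variant: dict, baseline: dict = BASELINE) -> dict:
    """Compare one table row with the baseline row."""
    return {
        "mAP_delta_points": round(variant["mAP"] - baseline["mAP"], 1),
        "weight_change_pct": round(
            relative_change_pct(variant["weight_mb"], baseline["weight_mb"]), 1
        ),
        "fps_speedup": round(variant["fps"] / baseline["fps"], 2),
    }


if __name__ == "__main__":
    for name, row in VARIANTS.items():
        print(name, summarize(row))
```

Read this way, replacing both parts shrinks the weight file by about 29% and raises FPS to roughly 2.3 times the baseline while mAP stays at 38.8.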

Model	P/%	R/%	mAP/%	Weight/MB	FPS
YOLOv8s	50.2	38.0	38.8	22.5	110
YOLOv8s+BiFPN	50.0	39.9	40.7	56.7	149
YOLOv8s+AFPN	49.6	36.1	37.7	17.2	207
YOLOv8s+FD-FPN	51.5	38.2	39.9	21.1	176