BSD-YOLO: a small target vehicle detection method based on dynamic sparse attention and adaptive detection head

doi:10.11996/JG.j.2095-302X.2026010099

Abstract

In intelligent traffic monitoring systems, detecting small-target vehicles in complex scenes is hampered by low feature resolution, severe occlusion, redundant model computation, and insufficient bounding-box regression accuracy. To balance detection accuracy against deployment efficiency on edge devices, an improved YOLOv8 detection framework based on dynamic sparse attention and a lightweight dual-branch structure is proposed. First, a bidirectional routing sparse attention mechanism (ReBiAttention) is designed; it filters key features through two-level dynamic routing and thereby better preserves the shallow features of small targets. Second, GSConv and VoV-GSCSP modules are combined to reduce computation while dynamically adjusting multi-scale feature weights. Third, an improved DynamicHead structure is introduced in the detection head for multi-task adaptive optimization. Finally, the ShapeIoU loss function is modified with a shape- and scale-aware mechanism to improve localization accuracy. Experiments on the UA-DETRAC dataset show that, relative to the baseline YOLOv8n, the improved model raises Precision, Recall, and mAP@0.5 by 8.739, 1.685, and 7.225 percentage points, respectively, while reducing the parameter count by 4.3%. The method offers an efficient solution for accurate small-target vehicle detection in complex traffic scenarios.

Key words: YOLOv8, sparse attention, lightweight, deep learning, small target detection
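
The shape- and scale-aware localization term mentioned in the abstract can be illustrated with the commonly used baseline Shape-IoU loss. The sketch below is not the modified loss proposed in the paper, whose details are not given here; it only shows how ground-truth shape weights enter the centre-distance and shape penalties. The function name, the `(x1, y1, x2, y2)` box layout, and the default `scale=0.0` are assumptions made for illustration.

```python
import math


def shape_iou_loss(pred, gt, scale=0.0, eps=1e-7):
    """Baseline Shape-IoU loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: `scale` is a dataset-dependent factor and
    its default here is an assumption, not a value from the paper.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    w, h = px2 - px1, py2 - py1
    w_gt, h_gt = gx2 - gx1, gy2 - gy1

    # Plain IoU.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = w * h + w_gt * h_gt - inter + eps
    iou = inter / union

    # Shape weights derived from the ground-truth width and height.
    denom = w_gt ** scale + h_gt ** scale
    ww = 2 * w_gt ** scale / denom
    hh = 2 * h_gt ** scale / denom

    # Centre distance, normalised by the enclosing-box diagonal.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    dx2 = (gx1 + gx2 - px1 - px2) ** 2 / 4
    dy2 = (gy1 + gy2 - py1 - py2) ** 2 / 4
    distance = (hh * dx2 + ww * dy2) / c2

    # Shape penalty on width and height mismatch.
    omega_w = hh * abs(w - w_gt) / max(w, w_gt)
    omega_h = ww * abs(h - h_gt) / max(h, h_gt)
    shape_cost = (1 - math.exp(-omega_w)) ** 4 + (1 - math.exp(-omega_h)) ** 4

    return 1 - iou + distance + 0.5 * shape_cost


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 10.0, 20.0)
    print(shape_iou_loss(gt_box, gt_box))                    # close to zero
    print(shape_iou_loss((2.0, 1.0, 12.0, 21.0), gt_box))    # shifted box
```

A perfectly matching box gives a loss near zero, and the loss grows as the predicted box drifts or changes shape.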


YANG Biao, WANG Xue, GUAN Zheng, LONG Ping. BSD-YOLO: a small target vehicle detection method based on dynamic sparse attention and adaptive detection head[J]. Journal of Graphics, 2026, 47(1): 99-110.

Figures/Tables: 13

References: 31


Top-k	Precision	Recall	mAP@0.50	mAP@0.50:0.95	GFLOPs
1	0.56324	0.61981	0.59937	0.41607	11.1
2	0.62338	0.59359	0.60203	0.44574	11.2
3	0.66923	0.57161	0.62359	0.44892	11.3
4	0.67339	0.61342	0.64258	0.44412	11.3
5	0.65476	0.62412	0.63184	0.47034	11.4
6	0.63235	0.62591	0.63618	0.45768	11.5
7	0.61529	0.62839	0.63324	0.45765	11.5
8	0.61595	0.65682	0.64806	0.46826	11.6
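
The Top-k column is not defined on this page. Assuming it is the number of key regions that each query region attends to in a bi-level routing attention layer (the BiFormer-style design that ReBiAttention appears to build on), the region-selection step can be sketched as follows. The function name and the toy two-dimensional region descriptors are hypothetical. A larger k lets every region attend to more regions, which would be consistent with GFLOPs rising from 11.1 to 11.6 in the table.

```python
def region_topk_routing(region_queries, region_keys, k):
    """Return, for each query region, the indices of its k most similar key regions.

    region_queries / region_keys: lists of region-level descriptors
    (for example, the mean query/key vector of each region).
    Illustrative sketch of coarse routing, not the paper's implementation.
    """
    routes = []
    for q in region_queries:
        # Region-to-region affinity via dot product.
        scores = [sum(a * b for a, b in zip(q, key)) for key in region_keys]
        # Rank key regions by affinity, highest first, and keep the top k.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        routes.append(ranked[:k])
    return routes


if __name__ == "__main__":
    # Four toy regions: two dominated by one feature, two by another.
    descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(region_topk_routing(descriptors, descriptors, k=2))
    # prints [[0, 1], [0, 1], [2, 3], [2, 3]]
```

Fine-grained token-to-token attention is then computed only inside the routed regions, which is where the sparsity comes from.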

No.	Parameter	Value
1	epochs	300
2	batch	8
3	imgsz	640
4	workers	4
5	optimizer	SGD
6	close_mosaic	0
7	patience	50
8	warmup_epochs	3.0
9	warmup_momentum	0.8
10	lr0	0.01
11	lrf	0.01
12	mosaic	1.0
13	weight_decay	0.0005
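
The parameter names in this table match training arguments of the Ultralytics YOLOv8 package, so the setup could plausibly be reproduced along the following lines. This is a sketch under stated assumptions: the dataset file `ua_detrac.yaml` and the model file `yolov8n.yaml` are placeholders, and the modified BSD-YOLO architecture would need its own model definition, which is not provided here.

```python
# Training settings copied from the table above.
TRAIN_ARGS = {
    "epochs": 300,
    "batch": 8,
    "imgsz": 640,
    "workers": 4,
    "optimizer": "SGD",
    "close_mosaic": 0,      # mosaic augmentation is never switched off
    "patience": 50,         # early-stopping patience in epochs
    "warmup_epochs": 3.0,
    "warmup_momentum": 0.8,
    "lr0": 0.01,            # initial learning rate
    "lrf": 0.01,            # final learning rate as a fraction of lr0
    "mosaic": 1.0,
    "weight_decay": 0.0005,
}


def train(model_cfg="yolov8n.yaml", data_cfg="ua_detrac.yaml"):
    """Launch training; both file names are hypothetical placeholders."""
    # Imported lazily so the settings can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(model_cfg)
    return model.train(data=data_cfg, **TRAIN_ARGS)


if __name__ == "__main__":
    final_lr = TRAIN_ARGS["lr0"] * TRAIN_ARGS["lrf"]
    print(f"final learning rate: {final_lr:.6f}")
```

With `lr0` and `lrf` both at 0.01, the schedule decays the learning rate to roughly 1e-4 by the last epoch.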

Method	Precision	Recall	mAP@0.50	mAP@0.50:0.95	GFLOPs	Params/M
YOLOv8n	0.60881	0.59817	0.58904	0.43127	8.1	3.00
+C2f-ReBiAttention	0.63609	0.57685	0.59203	0.39998	8.1	2.95
+C3-ReBiAttention	0.63966	0.48680	0.57148	0.41749	8.0	2.93
+CPN-ReBiAttention	0.64637	0.56350	0.59816	0.42437	8.0	2.93
+CSC-ReBiAttention	0.64202	0.56664	0.58440	0.40942	8.0	2.94
+ReBiAttention	0.61595	0.65682	0.64806	0.46826	11.6	3.45
Ours	0.69620	0.61502	0.66159	0.46831	7.9	2.87
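
The gains quoted in the abstract can be checked against the first and last rows of this table. The short script below is only an arithmetic check; the dictionary keys and function names are illustrative. It treats the accuracy gains as absolute differences in percentage points and the cost changes as relative percentages. The mAP@0.50 rows differ by 7.255 points, whereas the abstract quotes 7.225, so one of the two values may contain a typo.

```python
# Values copied from the YOLOv8n and Ours rows of the table above.
BASELINE = {"precision": 0.60881, "recall": 0.59817, "map50": 0.58904,
            "map50_95": 0.43127, "gflops": 8.1, "params_m": 3.00}
OURS = {"precision": 0.69620, "recall": 0.61502, "map50": 0.66159,
        "map50_95": 0.46831, "gflops": 7.9, "params_m": 2.87}


def point_gain(metric):
    """Absolute gain in percentage points for a metric stored in [0, 1]."""
    return round((OURS[metric] - BASELINE[metric]) * 100, 3)


def relative_change(metric):
    """Relative change in percent with respect to the baseline."""
    return round((OURS[metric] - BASELINE[metric]) / BASELINE[metric] * 100, 2)


if __name__ == "__main__":
    for name in ("precision", "recall", "map50", "map50_95"):
        print(f"{name}: +{point_gain(name)} points")
    for name in ("gflops", "params_m"):
        print(f"{name}: {relative_change(name)} %")
```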