Rotating target detection algorithm for ship remote sensing images based on improved YOLOv8

doi:10.11996/JG.j.2095-302X.2024040726

Abstract

Ship detection in remote sensing images faces three problems: small targets are hard to detect, ship shapes vary widely, and traditional horizontal bounding boxes enclose a large amount of redundant information for targets with high aspect ratios. To address these problems, a rotating target detection algorithm for ship remote sensing images based on an improved YOLOv8 is proposed. First, the convolution structure in the backbone network is improved to alleviate the loss of fine-grained information caused by strided convolution, which raises the accuracy of small target detection. Second, some of the convolution modules in C2f are replaced with DCNv3 deformable convolution, so that the model extracts the features of irregularly shaped objects more effectively and gains nonlinear modeling capability. Third, shallow feature maps from the backbone network are fused into the neck network to alleviate the loss of detail caused by repeated convolution operations, which further improves the detection of small targets. Experimental results show that the detection accuracy (mAP50) of the improved algorithm reaches 84.317% on the ShipRSImageNet dataset, 4.054 percentage points higher than the baseline model, and 93.235% on the HRSC2016 dataset, 1.555 percentage points higher than the baseline model. The improved algorithm achieves this detection performance with a small increase in the number of model parameters (3.07 M to 3.25 M, Table 3), balancing model efficiency and performance.

Keywords: YOLOv8, rotating target detection, deformable convolution, feature fusion, deep learning
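
The abstract attributes the loss of fine-grained information to strided convolution. The sketch below illustrates the general idea with a space-to-depth rearrangement, a common alternative to downsampling by stride. It is an assumption made for illustration only: this page does not specify how the ConvFocus module is built, and the function names `space_to_depth` and `strided_subsample` are hypothetical.

```python
def space_to_depth(image, block=2):
    """Split an H x W grid into block*block sub-grids of size H/block x W/block.

    Every input value ends up in exactly one sub-grid, so resolution is
    halved (for block=2) without discarding any pixel.
    """
    h, w = len(image), len(image[0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    # One output channel per (row offset, column offset) inside each block.
    return [
        [row[dx::block] for row in image[dy::block]]
        for dy in range(block)
        for dx in range(block)
    ]


def strided_subsample(image, stride=2):
    """Keep every stride-th pixel: the extreme case of a 1x1 strided convolution."""
    return [row[::stride] for row in image[::stride]]


# A 4 x 4 toy image whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
channels = space_to_depth(image)
kept_by_stride = strided_subsample(image)

print("space-to-depth channels:", channels)
print("stride-2 subsample:", kept_by_stride)
```

In this toy case the stride-2 subsample equals only the first of the four space-to-depth channels, so three quarters of the pixels never reach later layers, whereas the rearrangement keeps all sixteen values at half the spatial resolution.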

CLC number:

NIU Weihua, GUO Xun. Rotating target detection algorithm in ship remote sensing images based on YOLOv8[J]. Journal of Graphics, 2024, 45(4): 726-735 (in Chinese).

Figures/Tables 17

Fig. 1 YOLOv8 network architecture

Fig. 2 Comparison of horizontal bounding boxes (HBB) and oriented bounding boxes (OBB) ((a) HBB; (b) OBB)
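
The redundancy that Fig. 2 contrasts can be quantified with basic geometry. The following sketch is an illustration, not part of the paper's method: it computes the share of an axis-aligned box that is background when the box must enclose a rotated rectangle. The function names and the example sizes are hypothetical.

```python
import math


def enclosing_hbb_area(width, height, angle_deg):
    """Area of the smallest axis-aligned box around a rotated width x height rectangle."""
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    hbb_w = width * c + height * s
    hbb_h = width * s + height * c
    return hbb_w * hbb_h


def background_fraction(width, height, angle_deg):
    """Share of the enclosing horizontal box that lies outside the rotated rectangle."""
    obb_area = width * height
    return 1.0 - obb_area / enclosing_hbb_area(width, height, angle_deg)


# A long, thin ship-like rectangle versus a square, both rotated by 45 degrees.
ship_like = background_fraction(100, 10, 45)
square = background_fraction(10, 10, 45)
print(f"aspect ratio 10:1 -> {ship_like:.1%} background")
print(f"aspect ratio 1:1  -> {square:.1%} background")
```

Under these assumed sizes, the 10:1 rectangle leaves most of its horizontal box as background, while the square leaves half, which is why the redundancy grows with aspect ratio.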

Fig. 3 ConvFocus module

Fig. 4 Illustration of ConvFocus

Fig. 5 Comparison between ordinary convolution and deformable convolution ((a) Ordinary convolution; (b) Deformable convolution)
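
The difference shown in Fig. 5 comes down to where each kernel tap reads the feature map. The sketch below is a minimal pure-Python illustration of the basic deformable sampling idea: each of the nine taps of a 3x3 kernel is shifted by a fractional offset and read by bilinear interpolation. It is not the DCNv3 operator; DCNv3 as described in InternImage [12] adds further components, such as modulation and grouping, that are omitted here. In a real network the offsets are predicted by a learned layer; here they are fixed by hand, and all names are hypothetical.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2-D feature map at a fractional location; outside the map is zero."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * feature[yy][xx]
    return value


def deformable_conv_at(feature, weights, offsets, cy, cx):
    """3x3 response at (cy, cx); tap k is shifted by offsets[k] = (dy, dx)."""
    taps = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    out = 0.0
    for k, (ky, kx) in enumerate(taps):
        dy, dx = offsets[k]
        out += weights[k] * bilinear(feature, cy + ky + dy, cx + kx + dx)
    return out


feature = [[r * 5 + c for c in range(5)] for r in range(5)]
weights = [1.0] * 9

# Zero offsets reproduce an ordinary 3x3 convolution on the regular grid.
regular = deformable_conv_at(feature, weights, [(0.0, 0.0)] * 9, 2, 2)
# Shifting every tap half a pixel to the right samples between grid points.
shifted = deformable_conv_at(feature, weights, [(0.0, 0.5)] * 9, 2, 2)
print(regular, shifted)
```

With all offsets at zero the function reduces to an ordinary convolution, which makes the regular grid of Fig. 5(a) a special case of the deformable sampling in Fig. 5(b).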

Fig. 6 C2f_DCNv3 module

Fig. 7 Network structure of the improved YOLOv8 model

Table 1 Comparison of results with and without the ConvFocus module for different algorithms

Model	ConvFocus	Params/M	FLOPs/G	Ship/%	Dock/%	mAP50/%
YOLOv3	×	7.2	21.1	80.7	68.0	74.4
YOLOv5	×	7.0	15.8	78.8	71.9	75.4
YOLOv8	×	3.0	8.1	74.1	78.2	76.1
YOLOv3	√	7.7	37.0	82.5	70.0	76.2 (+1.8)
YOLOv5	√	8.0	29.7	79.5	74.3	76.9 (+1.5)
YOLOv8	√	3.2	11.6	77.2	78.3	77.8 (+1.7)

Table 2 Experimental results of replacing different numbers of convolution modules in the C2f module with DCNv3

Group	Number of DCNv3	Params/M	FLOPs/G	Ship/%	Dock/%	mAP50/%
G1	0	3.07	8.3	77.699	82.826	80.263
G2	n	2.94	8.1	78.549	85.959	82.254
G3	2n	2.72	7.4	74.258	75.875	75.066

Fig. 8 Actual training time of the first 80 epochs for each experimental group

Table 3 Comparison of experimental results before and after improvement

Model	Params/M	FLOPs/G	Ship/%	Dock/%	mAP50/%
YOLOv8n	3.00	8.1	76.093	82.999	79.546
YOLOv8n-obb	3.07	8.3	77.699	82.826	80.263
Ours	3.25	13.3	82.261	86.373	84.317
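
The figures in Table 3 can be cross-checked with simple arithmetic. The sketch below assumes that mAP50 is the unweighted mean of the two per-class APs (Ship and Dock), which the reported numbers are consistent with, and computes the gain and the relative cost of the improved model over YOLOv8n-obb. Variable and function names are hypothetical; all values are copied from Table 3.

```python
# Ship AP, Dock AP, and reported mAP50 (all in %), copied from Table 3.
TABLE_3 = {
    "YOLOv8n": (76.093, 82.999, 79.546),
    "YOLOv8n-obb": (77.699, 82.826, 80.263),
    "Ours": (82.261, 86.373, 84.317),
}
PARAMS_M = {"YOLOv8n-obb": 3.07, "Ours": 3.25}
FLOPS_G = {"YOLOv8n-obb": 8.3, "Ours": 13.3}


def mean_ap(class_aps):
    """Unweighted mean of per-class average precision values."""
    return sum(class_aps) / len(class_aps)


def relative_increase(base, new):
    """Increase of new over base, in percent of base."""
    return (new - base) / base * 100


recomputed = {name: mean_ap(row[:2]) for name, row in TABLE_3.items()}
gain = TABLE_3["Ours"][2] - TABLE_3["YOLOv8n-obb"][2]
param_growth = relative_increase(PARAMS_M["YOLOv8n-obb"], PARAMS_M["Ours"])
flops_growth = relative_increase(FLOPS_G["YOLOv8n-obb"], FLOPS_G["Ours"])

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.4f}, reported {TABLE_3[name][2]}")
print(f"mAP50 gain: {gain:.3f} points")
print(f"params: +{param_growth:.1f}%, FLOPs: +{flops_growth:.1f}%")
```

The recomputed means match the reported mAP50 values to within rounding, and the gain of 4.054 points matches the abstract. The parameter count grows by about 5.9%, while FLOPs grow by about 60%, so the efficiency claim holds for model size more clearly than for computation.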

Fig. 9 PR curve of the original YOLOv8n-obb algorithm

Fig. 10 PR curve of the improved algorithm
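
The AP values reported in the tables summarize PR curves such as those in Figs. 9 and 10; in mAP50, a detection counts as correct when its IoU with a ground-truth box is at least 0.5. The sketch below shows one common way to turn a PR curve into an AP value, using all-point interpolation over the precision envelope. Which interpolation scheme the authors' evaluation used is not stated on this page, so this is an assumed variant, and the sample points are made up for illustration.

```python
def average_precision(recalls, precisions):
    """Area under the precision envelope; recalls must be sorted ascending."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing from right to left (the envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Hypothetical two-point PR curve: perfect precision up to recall 0.5,
# then precision 0.5 at full recall.
ap = average_precision([0.5, 1.0], [1.0, 0.5])
print(f"AP = {ap:.2f}")
```

Averaging such per-class AP values over all classes gives the mAP figure used in the tables.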

Table 4 Comparison of the ablation experiments (CF: ConvFocus; C2fD: C2f_DCNv3; FF: feature fusion)

Model	Params/M	FLOPs/G	Ship/%	Dock/%	mAP50/%	AP-Small/%	AR-Small/%
YOLOv8n-obb	3.07	8.3	77.699	82.826	80.263	56.2	84.3
YOLOv8n-obb + CF	3.33	11.8	79.299	83.616	81.458	63.2	84.5
YOLOv8n-obb + C2fD	2.94	8.1	78.549	85.959	82.254	59.5	84.9
YOLOv8n-obb + FF	3.17	10.2	80.608	82.115	81.361	64.4	88.8
Ours	3.25	13.3	82.261	86.373	84.317	58.3	90.2

Table 5 Comparative experiments of different models on the ShipRSImageNet dataset

Method	Venue	Backbone	Params/M	FLOPs/G	Ship/%	Dock/%	mAP50/%
GWD^[17]	ICML’21	ResNet-50	36.42	215.92	61.90	10.40	36.20
KLD^[25]	NeurIPS’21	ResNet-50	36.13	209.58	60.30	16.70	38.50
R³Det^[19]	AAAI’21	ResNet-50	41.58	200.92	68.60	29.70	45.80
Gliding vertex^[24]	TPAMI’20	ResNet-50	41.13	121.50	68.80	40.10	54.50
S²A-Net^[23]	TGRS’21	ResNet-50	35.02	198.03	70.00	45.20	57.60
KFIoU^[26]	ICLR’23	CSPDarkNet-53	11.42	29.60	75.71	71.07	73.39
PETDet^[27]	TGRS’23	ResNet-50	47.67	204.07	78.70	78.50	78.60
YOLOv8n-obb	-	CSPDarkNet-53	3.07	8.30	77.69	82.82	80.26
Ours	-	CSPDarkNet-53	3.25	13.30	82.26	86.37	84.31

Table 6 Comparative experiments of different models on the HRSC2016 dataset

Method	Params/M	FLOPs/G	mAP50/%
Gliding Vertex^[24]	41.13	121.50	88.20
R³Det^[19]	41.58	200.92	89.26
Oriented R-CNN^[28]	41.13	121.58	90.33
RoI-Transformer^[18]	55.13	122.61	90.21
H2RBox-v2^[29]	51.41	242.69	89.66
PSC^[30]	36.19	210.94	90.06
YOLOv8n-obb	3.07	8.30	91.68
Ours	3.25	13.30	93.23

Fig. 11 Visual comparison of detection results ((a) Original image; (b) Ground truth; (c) YOLOv8n-obb; (d) Ours)

References 30

[1]	ZHU C R, ZHOU H, WANG R S, et al. A novel hierarchical method of ship detection from spaceborne optical image based on shape and texture features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2010, 48(9): 3446-3456.
[2]	SHI Z W, YU X R, JIANG Z G, et al. Ship detection in high-resolution optical imagery based on anomaly detector and local shape feature[J]. IEEE Transactions on Geoscience and Remote Sensing, 2014, 52(8): 4511-4523.
[3]	YANG F, XU Q Z, LI B. Ship detection from optical satellite images based on saliency segmentation and structure-LBP feature[J]. IEEE Geoscience and Remote Sensing Letters, 2017, 14(5): 602-606.
[4]	ZHANG D D, WANG C P, FU Q. OFCOS: an oriented anchor-free detector for ship detection in remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2023, 20: 6004005.
[5]	REN Z D, TANG Y Q, HE Z W, et al. Ship detection in high-resolution optical remote sensing images aided by saliency information[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5623616.
[6]	HAN W X, KUERBAN A, YANG Y C, et al. Multi-vision network for accurate and real-time small object detection in optical remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 6001205.
[7]	ZHANG L, CHEN W, WANG Y H. Key sub-region feature fusion network for fine-grained ship detection and recognition in remote sensing images[J]. Journal of Image and Graphics, 2023, 28(9): 2940-2955 (in Chinese).
[8]	JIAO S A, LUO L, YANG M, et al. Ship rotating target detection in optical remote sensing images based on improved YOLOv7[EB/OL]. (2023-07-05)[2024-04-16]. http://kns.cnki.net/kcms/detail/42.1824.U.20230704.1821.111.html (in Chinese).
[9]	ZOU J H, REN Y G, LENG F L, et al. LW-YOLOv7SAR: lightweight SAR image target detection method[EB/OL]. (2023-11-06)[2024-04-16]. http://kns.cnki.net/kcms/detail/21.1106.TP.20231106.0909.002.html (in Chinese).
[10]	LI D X, JI Z, LIU Y, et al. Improved YOLOv7 remote sensing image target detection algorithm[EB/OL]. [2024-04-15]. https://kns.cnki.net/kcms/detail/10.1034.T.20240412.2210.004.html (in Chinese).
[11]	ZENG L J, CHU J, CHEN Z J. Object detection in remote sensing image based on two-stage anchor and class balanced loss[J]. Journal of Graphics, 2023, 44(2): 249-259 (in Chinese).
[12]	WANG W H, DAI J F, CHEN Z, et al. InternImage: exploring large-scale vision foundation models with deformable convolutions[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 14408-14419.
[13]	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. [2024-04-15]. http://arxiv.org/abs/2004.10934.
[14]	ZHANG H Y, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization[EB/OL]. [2024-04-15]. http://arxiv.org/abs/1710.09412.
[15]	ZHENG Z H, WANG P, REN D W, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574-8586.
[16]	LI X, WANG W H, WU L J, et al. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection[EB/OL]. [2024-04-15]. http://arxiv.org/abs/2006.04388.
[17]	YANG X, ZHANG G F, YANG X J, et al. Detecting rotated objects as Gaussian distributions and its 3-D generalization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(4): 4335-4354.
[18]	DING J, XUE N, LONG Y, et al. Learning RoI transformer for oriented object detection in aerial images[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 2844-2853.
[19]	YANG X, YAN J C, FENG Z M, et al. R3Det: refined single-stage detector with feature refinement for rotating object[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(4): 3163-3171.
[20]	LLERENA J M, ZENI L F, KRISTEN L N, et al. Gaussian bounding boxes and probabilistic intersection-over-union for object detection[EB/OL]. [2024-04-15]. http://arxiv.org/abs/2106.06072.
[21]	WANG R X, SHIVANNA R, CHENG D, et al. DCN V2: improved deep & cross network and practical lessons for web-scale learning to rank systems[C]// Proceedings of the Web Conference 2021. New York: ACM, 2021: 1785-1797.
[22]	ZHANG Z N, ZHANG L, WANG Y, et al. ShipRSImageNet: a large-scale fine-grained dataset for ship detection in high-resolution optical remote sensing images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 8458-8472.
[23]	HAN J M, DING J, LI J, et al. Align deep features for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5602511.
[24]	XU Y C, FU M T, WANG Q M, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(4): 1452-1459.
[25]	YANG X, ZHANG G F, YANG X J, et al. Detecting rotated objects as Gaussian distributions and its 3-D generalization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(4): 4335-4354.
[26]	YANG X, ZHOU Y, ZHANG G F, et al. The KFIoU loss for rotated object detection[EB/OL]. [2024-04-15]. http://arxiv.org/abs/2201.12558.
[27]	LI W T, ZHAO D P, YUAN B, et al. PETDet: proposal enhancement for two-stage fine-grained object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 62: 5602214.
[28]	XIE X X, CHENG G, WANG J B, et al. Oriented R-CNN for object detection[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 3500-3509.
[29]	YU Y, YANG X, LI Q Y, et al. H2RBox-v2: incorporating symmetry for boosting horizontal box supervised oriented object detection[EB/OL]. [2024-04-15]. http://arxiv.org/abs/2304.04403.
[30]	YU Y, DA F P. Phase-shifting coder: predicting accurate orientation in oriented object detection[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 13354-13363.