图学学报 ›› 2024, Vol. 45 ›› Issue (4): 650-658.DOI: 10.11996/JG.j.2095-302X.2024040650
LI Daxiang, JI Zhan, LIU Ying, TANG Yao
Received: 2023-07-17
Accepted: 2024-04-09
Published: 2024-08-31
Online: 2024-09-02
First author: LI Daxiang (1974-), associate professor, Ph.D. His main research interests cover remote sensing image classification, object detection and tracking, medical image segmentation, etc. E-mail: www_ldx@163.com
Abstract:
To address the low detection accuracy caused by large variations in target scale and complex backgrounds in remote sensing images, an improved YOLOv7 object detection algorithm is proposed. First, to mitigate the interference of complex backgrounds on the detector, an attention-guided efficient layer aggregation network (ALAN) is designed, which optimizes the multi-path network so that it focuses on foreground targets and suppresses background influence. Second, to reduce the impact of large scale variations on detection accuracy, an attention multi-scale feature enhancement (AMSFE) module is designed, which enlarges the receptive field of the backbone's output features and strengthens the network's ability to represent targets across scales. Finally, a rotated bounding box loss function is introduced to obtain accurate location information for arbitrarily oriented objects. Experimental results on the DIOR-R dataset show that the algorithm achieves an mAP of 64.51%, 3.43% higher than the original YOLOv7 baseline and better than other comparable algorithms, making it well suited to object detection in remote sensing images with multiple scales and complex backgrounds.
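The two modules named in the abstract follow a common design pattern: a channel-attention gate applied to the concatenated branches of an ELAN-style aggregation block (the ALAN idea), and parallel dilated convolutions that enlarge the receptive field of a backbone feature map before attention-weighted fusion (the AMSFE idea). The PyTorch sketch below illustrates that general pattern only; the class names `ChannelAttention`, `ALANBlock`, `AMSFEBlock`, the branch counts, and the dilation rates are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of attention-guided layer aggregation and multi-scale feature
# enhancement. Assumption: a squeeze-and-excitation style channel gate stands in
# for the attention mechanism; the paper's exact design may differ.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel gate: global pooling -> bottleneck MLP -> sigmoid re-weighting."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)


class ALANBlock(nn.Module):
    """ELAN-style block: sequential conv paths are tapped, concatenated, and
    re-weighted channel-wise by attention before the 1x1 fusion conv."""
    def __init__(self, in_ch: int, out_ch: int, num_paths: int = 4):
        super().__init__()
        mid = out_ch // 2
        self.stem = nn.Conv2d(in_ch, mid, 1)
        self.paths = nn.ModuleList(
            [nn.Conv2d(mid, mid, 3, padding=1) for _ in range(num_paths)]
        )
        self.attn = ChannelAttention(mid * (num_paths + 1))
        self.fuse = nn.Conv2d(mid * (num_paths + 1), out_ch, 1)

    def forward(self, x):
        y = self.stem(x)
        feats = [y]
        for conv in self.paths:           # tap every intermediate path output
            y = conv(y)
            feats.append(y)
        cat = torch.cat(feats, dim=1)     # layer aggregation
        return self.fuse(self.attn(cat))  # attention guides the fusion


class AMSFEBlock(nn.Module):
    """Multi-scale enhancement: parallel dilated convs enlarge the receptive
    field; attention-weighted fusion is added back to the input feature."""
    def __init__(self, channels: int, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in dilations]
        )
        self.attn = ChannelAttention(channels * len(dilations))
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        cat = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.fuse(self.attn(cat))  # residual enhancement


if __name__ == "__main__":
    x = torch.randn(1, 256, 40, 40)           # a backbone feature map
    print(ALANBlock(256, 256)(x).shape)       # torch.Size([1, 256, 40, 40])
    print(AMSFEBlock(256)(x).shape)           # torch.Size([1, 256, 40, 40])
```

The key design choice shown here is that attention is applied to the concatenated multi-path (or multi-dilation) features before fusion, so the network can down-weight background-dominated channels and emphasize scales that match the target.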
李大湘, 吉展, 刘颖, 唐垚. 改进YOLOv7遥感图像目标检测算法[J]. 图学学报, 2024, 45(4): 650-658.
LI Daxiang, JI Zhan, LIU Ying, TANG Yao. Improving YOLOv7 remote sensing image target detection algorithm[J]. Journal of Graphics, 2024, 45(4): 650-658.
Fig. 4 Feature visualization ((a) YOLOv7 heat map; (b) YOLOv7+WSA heat map; (c) YOLOv7 detection results; (d) YOLOv7+WSA detection results)
Table 1 Ablation experiment results (%)

| Method | APL | APO | BF | BC | BR | CH | DAM | ETS | ESA | GF | GTF |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 72.48 | 30.00 | 81.03 | 81.56 | 33.78 | 72.66 | 20.11 | 72.23 | 79.96 | 49.88 | 74.93 |
| +AM | 81.45 | 37.05 | 81.16 | 81.60 | 40.31 | 72.71 | 25.20 | 72.28 | 80.82 | 67.32 | 76.07 |
| +AL | 81.57 | 30.07 | 81.03 | 81.48 | 34.07 | 72.60 | 18.93 | 72.04 | 80.46 | 60.84 | 75.28 |
| Ours | 80.48 | 29.44 | 80.54 | 89.06 | 34.75 | 75.13 | 28.48 | 74.47 | 81.13 | 67.06 | 76.59 |

| Method | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 40.57 | 51.79 | 81.17 | 62.25 | 62.77 | 81.56 | 48.25 | 51.91 | 72.78 | 61.08 |
| +AM | 41.95 | 52.68 | 81.22 | 62.90 | 62.05 | 90.19 | 50.66 | 44.48 | 73.83 | 63.80 |
| +AL | 41.42 | 52.47 | 81.10 | 62.66 | 62.09 | 81.54 | 49.19 | 51.89 | 73.47 | 62.21 |
| Ours | 40.61 | 52.77 | 88.24 | 67.67 | 66.82 | 88.83 | 46.95 | 48.79 | 72.44 | 64.51 |
Table 2 Impact of each module on algorithm complexity

| Method | AMSFE | ALAN | Params/M | FLOPs/G |
|---|---|---|---|---|
| Baseline | - | - | 37.30 | 164.78 |
| +AMSFE | √ | - | 40.95 | 183.32 |
| +ALAN | - | √ | 37.53 | 171.32 |
| Ours | √ | √ | 41.18 | 189.85 |
Table 3 Comparison of experimental results on the DIOR-R dataset (%)

| Method | APL | APO | BF | BC | BR | CH | DAM | ETS | ESA | GF | GTF |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN-O | 62.79 | 26.80 | 71.72 | 80.91 | 34.20 | 72.57 | 18.95 | 66.45 | 65.75 | 66.63 | 79.24 |
| RetinaNet-O | 61.49 | 28.52 | 73.57 | 81.17 | 23.98 | 72.54 | 19.94 | 72.39 | 58.20 | 69.25 | 79.54 |
| Gliding Vertex | 65.35 | 28.87 | 74.96 | 81.33 | 33.88 | 74.31 | 19.58 | 70.72 | 64.70 | 72.30 | 78.68 |
| RoI Transformer | 63.34 | 37.88 | 71.78 | 87.53 | 40.68 | 72.60 | 26.86 | 78.71 | 68.09 | 68.96 | 82.74 |
| AOPG | 62.39 | 37.79 | 71.62 | 87.63 | 40.90 | 72.47 | 31.08 | 65.42 | 77.99 | 73.20 | 81.94 |
| QPDet | 63.22 | 41.39 | 71.97 | 88.55 | 41.23 | 72.63 | 28.82 | 78.90 | 69.00 | 70.07 | 83.01 |
| Ours | 80.48 | 29.44 | 80.54 | 89.06 | 34.75 | 75.13 | 28.48 | 74.47 | 81.13 | 67.06 | 76.59 |

| Method | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP |
|---|---|---|---|---|---|---|---|---|---|---|
| Faster R-CNN-O | 34.95 | 48.79 | 81.14 | 64.34 | 71.21 | 81.44 | 47.31 | 50.46 | 65.21 | 59.54 |
| RetinaNet-O | 32.14 | 44.87 | 77.71 | 67.57 | 61.09 | 81.46 | 47.33 | 38.01 | 60.24 | 57.55 |
| Gliding Vertex | 37.22 | 49.64 | 80.22 | 69.26 | 61.13 | 81.49 | 44.76 | 47.71 | 65.04 | 60.06 |
| RoI Transformer | 47.71 | 55.61 | 81.21 | 78.23 | 70.26 | 81.61 | 54.86 | 43.27 | 65.52 | 63.87 |
| AOPG | 42.32 | 54.45 | 81.17 | 72.69 | 71.31 | 81.49 | 60.04 | 52.38 | 69.99 | 64.41 |
| QPDet | 47.83 | 55.54 | 81.23 | 72.15 | 62.66 | 89.05 | 58.09 | 43.38 | 65.36 | 64.20 |
| Ours | 40.61 | 52.77 | 88.24 | 67.67 | 66.82 | 88.83 | 46.95 | 48.79 | 72.44 | 64.51 |
Table 4 Comparison of experimental results on the HRSC2016 dataset (%)

| Method | Backbone | AP |
|---|---|---|
| RoI Transformer | R-101-FPN | 86.20 |
| Gliding Vertex | R-101-FPN | 88.20 |
| RetinaNet-O | R-101-FPN | 89.18 |
| Oriented RepPoints | R-50-FPN | 90.38 |
| R3Det | R-101-FPN | 89.26 |
| DAL | R-101-FPN | 89.77 |
| Ours | CSPDarknet53 | 90.87 |
Fig. 8 Detection results of different algorithms on the DIOR-R dataset ((a1, a2) Ours; (b1, b2) AOPG; (c1, c2) Faster R-CNN-O)
[1] | 李德仁, 王密, 沈欣, 等. 从对地观测卫星到对地观测脑[J]. 武汉大学学报: 信息科学版, 2017, 42(2): 143-149. |
LI D R, WANG M, SHEN X, et al. From earth observation satellite to earth observation brain[J]. Geomatics and Information Science of Wuhan University, 2017, 42(2): 143-149 (in Chinese). | |
[2] | YUAN Q Q, SHEN H F, LI T W, et al. Deep learning in environmental remote sensing: achievements and challenges[J]. Remote Sensing of Environment, 2020, 241: 111716. |
[3] | CHENG G, ZHOU P C, YAO X W, et al. Object detection in VHR optical remote sensing images via learning rotation-invariant HOG feature[C]// 2016 4th International Workshop on Earth Observation and Remote Sensing Applications. New York: IEEE Press, 2016: 433-436. |
[4] | QI S X, MA J, LIN J, et al. Unsupervised ship detection based on saliency and S-HOG descriptor from optical satellite images[J]. IEEE Geoscience and Remote Sensing Letters, 2015, 12(7): 1451-1455. |
[5] | WU X, HONG D F, TIAN J J, et al. ORSIm detector: a novel object detection framework in optical remote sensing imagery using spatial-frequency channel features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(7): 5146-5158. |
[6] | REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. |
[7] | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 779-788. |
[8] | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[M]//Computer Vision - ECCV 2016. Cham: Springer International Publishing, 2016: 21-37. |
[9] | LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2999-3007. |
[10] | MA J Q, SHAO W Y, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122. |
[11] | YANG X, YAN J C, FENG Z M, et al. R3Det: refined single-stage detector with feature refinement for rotating object[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(4): 3163-3171. |
[12] | HAN J M, DING J, XUE N, et al. ReDet: a rotation- equivariant detector for aerial object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2785-2794. |
[13] | YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[EB/OL]. [2023-04-08]. http://arxiv.org/abs/2101.11952. |
[14] | YANG X, YANG X J, YANG J R, et al. Learning high-precision bounding box for rotated object detection via kullback-leibler divergence[EB/OL]. [2023-04-08]. http://arxiv.org/abs/2106.01883. |
[15] | YANG X, YAN J C. Arbitrary-oriented object detection with circular smooth label[M]//Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 677-694. |
[16] | YANG X, HOU L P, ZHOU Y, et al. Dense label encoding for boundary discontinuity free rotation detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 15814-15824. |
[17] | 毛爱坤, 刘昕明, 陈文壮, 等. 改进YOLOv5算法的变电站仪表目标检测方法[J]. 图学学报, 2023, 44(3): 448-455. |
MAO A K, LIU X M, CHEN W Z, et al. Improved substation instrument target detection method for YOLOv5 algorithm[J]. Journal of Graphics, 2023, 44(3): 448-455 (in Chinese). |
[18] | 东辉, 陈鑫凯, 孙浩, 等. 基于改进YOLOv4和图像处理的蔬菜田杂草检测[J]. 图学学报, 2022, 43(4): 559-569. |
DONG H, CHEN X K, SUN H, et al. Weed detection in vegetable field based on improved YOLOv4 and image processing[J]. Journal of Graphics, 2022, 43(4): 559-569 (in Chinese). | |
[19] | LIU L K, LIU Y X, YAN J N, et al. Object detection in large-scale remote sensing images with a distributed deep learning framework[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 8142-8154. |
[20] | WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7464-7475. |
[21] | DING X H, ZHANG X Y, MA N N, et al. RepVGG: making VGG-style ConvNets great again[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13728-13737. |
[22] | LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 8759-8768. |
[23] | YANG Y T, JIAO L C, LIU X, et al. Dual wavelet attention networks for image classification[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(4): 1899-1910. |
[24] | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 3-19. |
[25] | QIN Z Q, ZHANG P Y, WU F, et al. FcaNet: frequency channel attention networks[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 763-772. |
[26] | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 2020, 128(2): 336-359. |
[27] | ZHANG H, ZU K K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network[C]// Computer Vision - ACCV 2022: 16th Asian Conference on Computer Vision. New York: ACM, 2022: 541-557. |
[28] | LI G Q, FANG Q, ZHA L L, et al. HAM: hybrid attention module in deep convolutional neural networks for image classification[J]. Pattern Recognition, 2022, 129: 108785. |
[29] | XU Y C, FU M T, WANG Q M, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(4): 1452-1459. |
[30] | CHENG G, WANG J B, LI K, et al. Anchor-free oriented proposal generator for object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5625411. |
[31] | YAO Y Q, CHENG G, WANG G X, et al. On improving bounding box representations for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 61: 5600111. |
[32] | LI W T, CHEN Y J, HU K X, et al. Oriented RepPoints for aerial object detection[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 1819-1828. |
[33] | MA Y C, LIU S T, LI Z M, et al. IQDet: instance-wise quality distribution sampling for object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 1717-1725. |