Journal of Graphics, 2024, Vol. 45, Issue (4): 650-658. DOI: 10.11996/JG.j.2095-302X.2024040650
• Image Processing and Computer Vision •
LI Daxiang, JI Zhan, LIU Ying, TANG Yao
Received: 2023-07-17
Accepted: 2024-04-09
Online: 2024-08-31
Published: 2024-09-02
About the first author: LI Daxiang (1974-), associate professor, Ph.D. His main research interests include remote sensing image classification, target detection and tracking, and medical image segmentation. E-mail: www_ldx@163.com
LI Daxiang, JI Zhan, LIU Ying, TANG Yao. Improving YOLOv7 remote sensing image target detection algorithm[J]. Journal of Graphics, 2024, 45(4): 650-658.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2024040650
Fig. 4 Visualization of feature maps ((a) YOLOv7 heat map; (b) YOLOv7+WSA heat map; (c) YOLOv7 detection results; (d) YOLOv7+WSA detection results)
Method | APL | APO | BF | BC | BR | CH | DAM | ETS | ESA | GF | GTF |
---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | 72.48 | 30.00 | 81.03 | 81.56 | 33.78 | 72.66 | 20.11 | 72.23 | 79.96 | 49.88 | 74.93 |
+AM | 81.45 | 37.05 | 81.16 | 81.60 | 40.31 | 72.71 | 25.20 | 72.28 | 80.82 | 67.32 | 76.07 |
+AL | 81.57 | 30.07 | 81.03 | 81.48 | 34.07 | 72.60 | 18.93 | 72.04 | 80.46 | 60.84 | 75.28 |
Ours | 80.48 | 29.44 | 80.54 | 89.06 | 34.75 | 75.13 | 28.48 | 74.47 | 81.13 | 67.06 | 76.59 |

Method | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP |
---|---|---|---|---|---|---|---|---|---|---|
Baseline | 40.57 | 51.79 | 81.17 | 62.25 | 62.77 | 81.56 | 48.25 | 51.91 | 72.78 | 61.08 |
+AM | 41.95 | 52.68 | 81.22 | 62.90 | 62.05 | 90.19 | 50.66 | 44.48 | 73.83 | 63.80 |
+AL | 41.42 | 52.47 | 81.10 | 62.66 | 62.09 | 81.54 | 49.19 | 51.89 | 73.47 | 62.21 |
Ours | 40.61 | 52.77 | 88.24 | 67.67 | 66.82 | 88.83 | 46.95 | 48.79 | 72.44 | 64.51 |
Table 1 Ablation experiment results (%)
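
The mAP column of Table 1 is consistent with an unweighted mean of the 20 per-class AP values. The Python sketch below is not taken from the paper; it recomputes mAP for the baseline and the full model from the numbers in the table, using a hypothetical helper named `mean_ap`, and it assumes that the reported values are rounded to two decimals.

```python
from statistics import mean

# Per-class AP (%) copied from Table 1, in column order APL ... WM (20 classes).
baseline = [72.48, 30.00, 81.03, 81.56, 33.78, 72.66, 20.11, 72.23, 79.96, 49.88,
            74.93, 40.57, 51.79, 81.17, 62.25, 62.77, 81.56, 48.25, 51.91, 72.78]
ours = [80.48, 29.44, 80.54, 89.06, 34.75, 75.13, 28.48, 74.47, 81.13, 67.06,
        76.59, 40.61, 52.77, 88.24, 67.67, 66.82, 88.83, 46.95, 48.79, 72.44]


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP, rounded to two decimals (assumed convention)."""
    return round(mean(per_class_ap), 2)


map_baseline = mean_ap(baseline)
map_ours = mean_ap(ours)
gain = round(map_ours - map_baseline, 2)

print(f"baseline mAP = {map_baseline:.2f}")
print(f"ours mAP     = {map_ours:.2f}")
print(f"gain         = {gain:.2f} points")
```

Under these assumptions the recomputed values agree with the reported 61.08 and 64.51, a gain of 3.43 points.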
| Method | AMSFE | ALAN | Params/M | FLOPs/G |
|---|---|---|---|---|
| Baseline | - | - | 37.30 | 164.78 |
| | √ | - | 40.95 | 183.32 |
| | - | √ | 37.53 | 171.32 |
| | √ | √ | 41.18 | 189.85 |

Table 2 Impact of each module on algorithm complexity
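
To put the complexity figures of Table 2 in relative terms, the following Python sketch computes the percentage increase in parameters and FLOPs over the baseline. It is an illustration only: the row labels and the helper `relative_increase` are hypothetical names chosen here, derived from which module columns are checked.

```python
# (Params in millions, FLOPs in G) copied from Table 2.
baseline = (37.30, 164.78)
variants = {
    "AMSFE only": (40.95, 183.32),
    "ALAN only": (37.53, 171.32),
    "AMSFE + ALAN": (41.18, 189.85),
}


def relative_increase(value, reference):
    """Percentage increase of `value` over `reference`."""
    return 100.0 * (value - reference) / reference


# Map each variant to (parameter overhead %, FLOPs overhead %).
overhead = {
    name: (relative_increase(params, baseline[0]),
           relative_increase(flops, baseline[1]))
    for name, (params, flops) in variants.items()
}

for name, (d_params, d_flops) in overhead.items():
    print(f"{name}: params +{d_params:.2f}%, FLOPs +{d_flops:.2f}%")
```

With both modules enabled, this works out to roughly 10.4% more parameters and 15.2% more FLOPs than the baseline.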
Methods | APL | APO | BF | BC | BR | CH | DAM | ETS | ESA | GF | GTF |
---|---|---|---|---|---|---|---|---|---|---|---|
Faster R-CNN-O | 62.79 | 26.80 | 71.72 | 80.91 | 34.20 | 72.57 | 18.95 | 66.45 | 65.75 | 66.63 | 79.24 |
RetinaNet-O | 61.49 | 28.52 | 73.57 | 81.17 | 23.98 | 72.54 | 19.94 | 72.39 | 58.20 | 69.25 | 79.54 |
Gliding V. | 65.35 | 28.87 | 74.96 | 81.33 | 33.88 | 74.31 | 19.58 | 70.72 | 64.70 | 72.30 | 78.68 |
RoI Trans. | 63.34 | 37.88 | 71.78 | 87.53 | 40.68 | 72.60 | 26.86 | 78.71 | 68.09 | 68.96 | 82.74 |
AOPG | 62.39 | 37.79 | 71.62 | 87.63 | 40.90 | 72.47 | 31.08 | 65.42 | 77.99 | 73.20 | 81.94 |
QPDet | 63.22 | 41.39 | 71.97 | 88.55 | 41.23 | 72.63 | 28.82 | 78.90 | 69.00 | 70.07 | 83.01 |
Ours | 80.48 | 29.44 | 80.54 | 89.06 | 34.75 | 75.13 | 28.48 | 74.47 | 81.13 | 67.06 | 76.59 |

Methods | HA | OP | SH | STA | STO | TC | TS | VE | WM | mAP |
---|---|---|---|---|---|---|---|---|---|---|
Faster R-CNN-O | 34.95 | 48.79 | 81.14 | 64.34 | 71.21 | 81.44 | 47.31 | 50.46 | 65.21 | 59.54 |
RetinaNet-O | 32.14 | 44.87 | 77.71 | 67.57 | 61.09 | 81.46 | 47.33 | 38.01 | 60.24 | 57.55 |
Gliding V. | 37.22 | 49.64 | 80.22 | 69.26 | 61.13 | 81.49 | 44.76 | 47.71 | 65.04 | 60.06 |
RoI Trans. | 47.71 | 55.61 | 81.21 | 78.23 | 70.26 | 81.61 | 54.86 | 43.27 | 65.52 | 63.87 |
AOPG | 42.32 | 54.45 | 81.17 | 72.69 | 71.31 | 81.49 | 60.04 | 52.38 | 69.99 | 64.41 |
QPDet | 47.83 | 55.54 | 81.23 | 72.15 | 62.66 | 89.05 | 58.09 | 43.38 | 65.36 | 64.20 |
Ours | 40.61 | 52.77 | 88.24 | 67.67 | 66.82 | 88.83 | 46.95 | 48.79 | 72.44 | 64.51 |
Table 3 Comparison of experimental results on the DIOR-R dataset (%)
Method | Backbone | AP |
---|---|---|
RoI Transformer | R-101-FPN | 86.20 |
Gliding Ver | R-101-FPN | 88.20 |
RetinaNet-O | R-101-FPN | 89.18 |
Oriented RepP | R-50-FPN | 90.38 |
R3Det | R-101-FPN | 89.26 |
DAL | R-101-FPN | 89.77 |
Ours | CSPDarknet53 | 90.87 |
Table 4 Comparison of experimental results on the HRSC2016 dataset (%)
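
As a quick reading aid for Table 4, the Python sketch below ranks the listed detectors by AP and computes the margin between the proposed method and the strongest competing entry. The dictionary simply transcribes the table; the variable names are illustrative and not from the paper.

```python
# AP (%) on HRSC2016, copied from Table 4 (method names as printed there).
ap = {
    "RoI Transformer": 86.20,
    "Gliding Ver": 88.20,
    "RetinaNet-O": 89.18,
    "Oriented RepP": 90.38,
    "R3Det": 89.26,
    "DAL": 89.77,
    "Ours": 90.87,
}

# Sort methods from highest to lowest AP.
ranked = sorted(ap.items(), key=lambda item: item[1], reverse=True)

# Margin of the proposed method over the best competing method.
best_other = max(value for name, value in ap.items() if name != "Ours")
margin = ap["Ours"] - best_other

for name, value in ranked:
    print(f"{name:16s} {value:.2f}")
print(f"margin over best competitor: {margin:.2f} points")
```

On these numbers the proposed method ranks first, about 0.49 points ahead of the next entry.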
[1] LI D R, WANG M, SHEN X, et al. From earth observation satellite to earth observation brain[J]. Geomatics and Information Science of Wuhan University, 2017, 42(2): 143-149 (in Chinese).
[2] YUAN Q Q, SHEN H F, LI T W, et al. Deep learning in environmental remote sensing: achievements and challenges[J]. Remote Sensing of Environment, 2020, 241: 111716.
[3] CHENG G, ZHOU P C, YAO X W, et al. Object detection in VHR optical remote sensing images via learning rotation-invariant HOG feature[C]// 2016 4th International Workshop on Earth Observation and Remote Sensing Applications. New York: IEEE Press, 2016: 433-436.
[4] QI S X, MA J, LIN J, et al. Unsupervised ship detection based on saliency and S-HOG descriptor from optical satellite images[J]. IEEE Geoscience and Remote Sensing Letters, 2015, 12(7): 1451-1455.
[5] WU X, HONG D F, TIAN J J, et al. ORSIm detector: a novel object detection framework in optical remote sensing imagery using spatial-frequency channel features[J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(7): 5146-5158.
[6] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[7] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 779-788.
[8] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[M]//Computer Vision - ECCV 2016. Cham: Springer International Publishing, 2016: 21-37.
[9] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2999-3007.
[10] MA J Q, SHAO W Y, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122.
[11] YANG X, YAN J C, FENG Z M, et al. R3Det: refined single-stage detector with feature refinement for rotating object[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(4): 3163-3171.
[12] HAN J M, DING J, XUE N, et al. ReDet: a rotation-equivariant detector for aerial object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2785-2794.
[13] YANG X, YAN J C, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[EB/OL]. [2023-04-08]. http://arxiv.org/abs/2101.11952.
[14] YANG X, YANG X J, YANG J R, et al. Learning high-precision bounding box for rotated object detection via Kullback-Leibler divergence[EB/OL]. [2023-04-08]. http://arxiv.org/abs/2106.01883.
[15] YANG X, YAN J C. Arbitrary-oriented object detection with circular smooth label[M]//Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 677-694.
[16] YANG X, HOU L P, ZHOU Y, et al. Dense label encoding for boundary discontinuity free rotation detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 15814-15824.
[17] MAO A K, LIU X M, CHEN W Z, et al. Improved substation instrument target detection method for YOLOv5 algorithm[J]. Journal of Graphics, 2023, 44(3): 448-455 (in Chinese).
[18] DONG H, CHEN X K, SUN H, et al. Weed detection in vegetable field based on improved YOLOv4 and image processing[J]. Journal of Graphics, 2022, 43(4): 559-569 (in Chinese).
[19] LIU L K, LIU Y X, YAN J N, et al. Object detection in large-scale remote sensing images with a distributed deep learning framework[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 8142-8154.
[20] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7464-7475.
[21] DING X H, ZHANG X Y, MA N N, et al. RepVGG: making VGG-style ConvNets great again[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 13728-13737.
[22] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 8759-8768.
[23] YANG Y T, JIAO L C, LIU X, et al. Dual wavelet attention networks for image classification[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(4): 1899-1910.
[24] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 3-19.
[25] QIN Z Q, ZHANG P Y, WU F, et al. FcaNet: frequency channel attention networks[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 763-772.
[26] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 2020, 128(2): 336-359.
[27] ZHANG H, ZU K K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network[C]// Computer Vision - ACCV 2022: 16th Asian Conference on Computer Vision. New York: ACM, 2022: 541-557.
[28] LI G Q, FANG Q, ZHA L L, et al. HAM: hybrid attention module in deep convolutional neural networks for image classification[J]. Pattern Recognition, 2022, 129: 108785.
[29] XU Y C, FU M T, WANG Q M, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(4): 1452-1459.
[30] CHENG G, WANG J B, LI K, et al. Anchor-free oriented proposal generator for object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5625411.
[31] YAO Y Q, CHENG G, WANG G X, et al. On improving bounding box representations for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 61: 5600111.
[32] LI W T, CHEN Y J, HU K X, et al. Oriented RepPoints for aerial object detection[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 1819-1828.
[33] MA Y C, LIU S T, LI Z M, et al. IQDet: instance-wise quality distribution sampling for object detection[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 1717-1725.