Journal of Graphics, 2025, Vol. 46, Issue (1): 70-80. DOI: 10.11996/JG.j.2095-302X.2025010070
WANG Yang1, MA Chang1, HU Ming1, SUN Tao2, RAO Yuan3, YUAN Zhenyu1
Received: 2024-07-25
Accepted: 2024-10-14
Published: 2025-02-28
Online: 2025-02-14
First author: WANG Yang (1971-), male, professor, Ph.D. His main research interests include artificial intelligence, augmented reality, and ecological informatics. E-mail: wycap@126.com
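Abstract:
Detecting bats in the wild is important for ecological conservation and scientific research. To address the challenges posed by limited computing resources and complex field environments, a lightweight bat detection model (LiteDETR-Bat) is proposed for efficient real-time detection. First, to reduce redundancy in the feature maps, a re-parameterized convolution efficient layer aggregation network (RCELAN) replaces the conventional ResNet backbone; its multi-branch feature aggregation mechanism lowers both computational complexity and parameter count. Second, a dynamic-sampling multi-scale feature fusion (DS-MFF) structure is designed. It integrates dilated convolution with a dynamic sampling operator, enlarging the receptive field and adaptively adjusting sampling positions, which optimizes multi-scale feature fusion and makes the model more flexible and robust when handling diverse features. Finally, a bat dataset covering varied lighting conditions, viewpoints, and bat postures was collected in the wild in Anhui Province, and experiments on model performance were carried out on it. The results show that LiteDETR-Bat reduces the parameter count by 46.5% and reaches an mAP of 97.2%, while achieving some improvement over the YOLO-series algorithms in both accuracy and real-time performance. LiteDETR-Bat provides strong technical support for monitoring and protecting wild bats and shows potential for ecological monitoring and biodiversity conservation.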
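The abstract states that dilated convolution enlarges the receptive field. A minimal sketch of that effect follows, assuming stride-1 layers; the kernel sizes and dilation rates are hypothetical values chosen for illustration and are not taken from the paper.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a kernel with the given dilation."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers) -> int:
    """Receptive field of a stack of stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += dilation * (kernel_size - 1)
    return rf


# Hypothetical stacks, not the DS-MFF configuration from the paper.
plain = [(3, 1), (3, 1), (3, 1)]
dilated = [(3, 1), (3, 2), (3, 4)]

print(effective_kernel(3, 2))    # span of a 3x3 kernel with dilation 2
print(receptive_field(plain))    # three ordinary 3x3 layers
print(receptive_field(dilated))  # same depth, growing dilation
```

With the same number of layers and the same number of weights, the dilated stack covers a noticeably wider input region than the plain one.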
WANG Yang, MA Chang, HU Ming, SUN Tao, RAO Yuan, YUAN Zhenyu. Lightweight wild bat detection method based on multi-scale feature fusion[J]. Journal of Graphics, 2025, 46(1): 70-80.
Fig. 5 Examples of bat images in the dataset ((a) A colony of bats roosting in a cave; (b) A bat in flight; (c) A bat viewed from the side; (d) A bat viewed from the front; (e) A bat roosting on a rock wall; (f) A bat emitting sound waves; (g) A bat viewed from the back; (h) A group of bats at night)
Table 1 Information on the bat dataset
Class | Number of images |
---|---|
Great roundleaf bat (Hipposideros armiger) | 614 |
Pratt's roundleaf bat (Hipposideros pratti) | 496 |
Brown bat (Myoti) | 662 |
Horseshoe bat (Rousettus leschenaultia) | 656 |
Total | 2428 |
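As a quick consistency check on Table 1, the sketch below recomputes the total and each class's share of the images. The dictionary keys reuse the scientific names exactly as they appear in the table; the imbalance ratio is an added convenience metric, not something reported in the paper.

```python
# Image counts per class, copied from Table 1.
counts = {
    "Hipposideros armiger": 614,
    "Hipposideros pratti": 496,
    "Myoti": 662,
    "Rousettus leschenaultia": 656,
}

total = sum(counts.values())

# Share of each class in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Ratio between the largest and smallest class (1.0 means perfectly balanced).
imbalance = max(counts.values()) / min(counts.values())

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"largest/smallest class ratio: {imbalance:.2f}")
```

The four classes are of similar size, so the per-class AP values are not dominated by a single heavily represented species.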
Table 2 Experimental parameter settings
Parameter | Value |
---|---|
Epochs | 100 |
Batch size | 4 |
Workers | 4 |
Optimizer | AdamW |
Initial learning rate (lr0) | 0.0001 |
Momentum | 0.9 |
Weight decay | 0.0001 |
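The settings in Table 2 can be captured in a small configuration object. The sketch below is one possible representation, not the authors' training script: the class name `TrainConfig`, the helper `adamw_kwargs`, the mapping of the momentum factor to AdamW's first beta, and the second beta of 0.999 are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from Table 2 (field names are hypothetical)."""
    epochs: int = 100
    batch_size: int = 4
    workers: int = 4
    optimizer: str = "AdamW"
    lr0: float = 1e-4
    momentum: float = 0.9
    weight_decay: float = 1e-4

    def adamw_kwargs(self, beta2: float = 0.999) -> dict:
        """Keyword arguments shaped for an AdamW optimizer.

        Assumption: `momentum` plays the role of beta1, and beta2 is left
        at a common default because the table does not state it.
        """
        return {
            "lr": self.lr0,
            "betas": (self.momentum, beta2),
            "weight_decay": self.weight_decay,
        }


cfg = TrainConfig()
print(cfg)
print(cfg.adamw_kwargs())
```

In a PyTorch setup, a dictionary of this shape could be passed as `torch.optim.AdamW(model.parameters(), **cfg.adamw_kwargs())`, where `model` stands for whatever network is being trained.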
Table 3 Comparison of detection metrics between RT-DETR and LiteDETR-Bat on the bat dataset (P, R, and mAP are per-model values, listed on the first row of each method)
Method | Class | AP/% | P/% | R/% | mAP/% |
---|---|---|---|---|---|
RT-DETR | Great roundleaf bat (Hipposideros armiger) | 93.8 | 90.5 | 92.4 | 95.1 |
RT-DETR | Pratt's roundleaf bat (Hipposideros pratti) | 94.3 | | | |
RT-DETR | Brown bat (Myoti) | 93.1 | | | |
RT-DETR | Horseshoe bat (Rousettus leschenaultia) | 99.3 | | | |
LiteDETR-Bat | Great roundleaf bat (Hipposideros armiger) | 96.4 | 96.0 | 93.0 | 97.2 |
LiteDETR-Bat | Pratt's roundleaf bat (Hipposideros pratti) | 97.9 | | | |
LiteDETR-Bat | Brown bat (Myoti) | 94.9 | | | |
LiteDETR-Bat | Horseshoe bat (Rousettus leschenaultia) | 99.4 | | | |
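The mAP column in Table 3 should equal the mean of the four per-class AP values. The following sketch checks this using the figures copied from the table; the 0.06 tolerance is an assumption that allows for the one-decimal rounding of the published numbers.

```python
from statistics import mean

# Per-class AP (%) in table order: armiger, pratti, Myoti, leschenaultia.
per_class_ap = {
    "RT-DETR": [93.8, 94.3, 93.1, 99.3],
    "LiteDETR-Bat": [96.4, 97.9, 94.9, 99.4],
}

# mAP (%) as reported in Table 3.
reported_map = {"RT-DETR": 95.1, "LiteDETR-Bat": 97.2}

computed_map = {model: mean(aps) for model, aps in per_class_ap.items()}

for model, value in computed_map.items():
    gap = abs(value - reported_map[model])
    status = "consistent" if gap < 0.06 else "check"
    print(f"{model}: computed {value:.3f}, reported {reported_map[model]} ({status})")
```

Both reported mAP values agree with the per-class averages to within rounding.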
Fig. 6 Comparison of detection heatmaps between RT-DETR and LiteDETR-Bat ((a) Input image; (b) RT-DETR; (c) LiteDETR-Bat)
Table 4 Impact of the improved modules on model performance
Method | RCELAN | DS-MFF | Params/M | GFLOPs | FPS/(frames/s) | mAP/% |
---|---|---|---|---|---|---|
RT-DETR | - | - | 18.95 | 57.0 | 99.4 | 95.1 |
RT-DETR | √ | - | 8.63 | 26.4 | 117.0 | 95.4 |
RT-DETR | - | √ | 19.26 | 61.5 | 63.4 | 95.5 |
Ours | √ | √ | 10.12 | 35.6 | 191.3 | 97.2 |
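The relative changes implied by Table 4 can be recomputed from the first row (the RT-DETR baseline) and the last row (the full model). This sketch uses only the rounded table values, so it may differ in the last digit from percentages quoted elsewhere; the dictionary keys are names chosen here for illustration.

```python
# Baseline RT-DETR and the full model, copied from Table 4.
baseline = {"params_m": 18.95, "gflops": 57.0, "fps": 99.4, "map": 95.1}
ours = {"params_m": 10.12, "gflops": 35.6, "fps": 191.3, "map": 97.2}


def pct_change(new: float, old: float) -> float:
    """Relative change from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


param_change = pct_change(ours["params_m"], baseline["params_m"])
gflops_change = pct_change(ours["gflops"], baseline["gflops"])
fps_change = pct_change(ours["fps"], baseline["fps"])
map_gain = ours["map"] - baseline["map"]  # percentage points, not percent

print(f"parameters: {param_change:+.1f}%")
print(f"GFLOPs:     {gflops_change:+.1f}%")
print(f"FPS:        {fps_change:+.1f}%")
print(f"mAP:        {map_gain:+.1f} points")
```

From the rounded values, the parameter count drops by roughly 46.6%, in line with the reduction of about 46.5% stated in the abstract, while GFLOPs fall by about 37.5% and FPS nearly doubles.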
Table 5 Performance comparison of different models on the bat dataset
Method | Backbone | Weight/MB | mAP/% | Params/M | GFLOPs | FPS/(frames/s) |
---|---|---|---|---|---|---|
YOLOv5 | CSPDarknet53 | 40.3 | 97.2 | 19.91 | 48.3 | 158.1 |
YOLOv6 | EfficientRep | 31.3 | 95.1 | 15.54 | 44.0 | 164.2 |
YOLOv7 | CSPDarknet53 | 71.4 | 94.2 | 35.48 | 105.2 | 108.7 |
YOLOv8 | CSPDarknet53 | 21.5 | 95.3 | 10.61 | 28.4 | 176.8 |
YOLOv9 | - | 540.1 | 98.4 | 67.03 | 313.4 | 47.9 |
YOLOv10 | - | 39.5 | 97.0 | 19.47 | 98.0 | 129.0 |
RT-DETR | ResNet18 | 38.6 | 95.1 | 18.95 | 57.0 | 99.4 |
RT-DETR_RCELAN | RCELAN | 17.7 | 95.4 | 8.62 | 26.4 | 117.0 |
RT-DETR_DS-MFF | ResNet18 | 39.2 | 95.5 | 19.26 | 61.5 | 63.4 |
LiteDETR-Bat | RCELAN | 20.8 | 97.2 | 10.12 | 35.6 | 191.3 |
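One way to read the accuracy and speed trade-off in Table 5 is to find the models that no other model matches or beats on both mAP and FPS at once (a Pareto front). The sketch below is an illustrative analysis added here, not one performed in the paper; the function names are hypothetical.

```python
# (mAP %, FPS) per model, copied from Table 5.
models = {
    "YOLOv5": (97.2, 158.1),
    "YOLOv6": (95.1, 164.2),
    "YOLOv7": (94.2, 108.7),
    "YOLOv8": (95.3, 176.8),
    "YOLOv9": (98.4, 47.9),
    "YOLOv10": (97.0, 129.0),
    "RT-DETR": (95.1, 99.4),
    "RT-DETR_RCELAN": (95.4, 117.0),
    "RT-DETR_DS-MFF": (95.5, 63.4),
    "LiteDETR-Bat": (97.2, 191.3),
}


def dominates(a, b) -> bool:
    """True if `a` is at least as good as `b` on both axes and not identical."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def pareto_front(entries) -> list:
    """Names of entries that no other entry dominates, sorted alphabetically."""
    return sorted(
        name
        for name, score in entries.items()
        if not any(dominates(other, score) for other in entries.values())
    )


print(pareto_front(models))
```

On these two axes only YOLOv9 and LiteDETR-Bat remain: YOLOv9 has the highest mAP at the lowest frame rate, while LiteDETR-Bat has the highest frame rate and ties YOLOv5 for the second-highest mAP.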
Fig. 10 Visual comparison of detection results from different models ((a) Input image; (b) YOLOv5; (c) YOLOv7; (d) YOLOv9; (e) YOLOv10; (f) RT-DETR; (g) LiteDETR-Bat)
Table 6 Comparison of generalization performance on the CUB_200_2011 dataset
Method | P/% | R/% | mAP/% | Params/M | FPS/(frames/s) |
---|---|---|---|---|---|
RT-DETR | 71.7 | 65.3 | 66.4 | 19.20 | 78.8 |
Ours | 74.8 | 67.0 | 70.0 | 11.14 | 104.9 |
[1] FAO. Global Forest Resources Assessment 2020-key findings[R]. Rome: FAO, 2020.
[2] HU D, LUO Z H, YE F Q, et al. Advances in bat carrier of important viruses[J]. Journal of Pathogen Biology, 2023, 18(1): 111-116 (in Chinese).
[3] PETSO T, JAMISOLA JR R S, MPOELENG D. Review on methods used for wildlife species and individual identification[J]. European Journal of Wildlife Research, 2022, 68(1): 3.
[4] ZHU Q J, HU B, WANG H L, et al. Detection of traffic signs based on lightweight YOLOv8s[J]. Journal of Graphics, 2024, 45(3): 422-432 (in Chinese).
[5] CHEN X, ZHAO J, CHEN Y H, et al. Automatic standardized processing and identification of tropical bat calls using deep learning approaches[J]. Biological Conservation, 2020, 241: 108269.
[6] KRIVEK G, GILLERT A, HARDER M, et al. BatNet: a deep learning-based tool for automated bat species identification from camera trap images[J]. Remote Sensing in Ecology and Conservation, 2023, 9(6): 759-774.
[7] XIE J J, ZHONG Y J, ZHANG J G, et al. A review of automatic recognition technology for bird vocalizations in the deep learning era[J]. Ecological Informatics, 2023, 73: 101927.
[8] PENG J B, WANG D L, LIAO X H, et al. Wild animal survey using UAS imagery and deep learning: modified Faster R-CNN for kiang detection in Tibetan Plateau[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 169: 364-376.
[9] CHEN T H, ZHU J X, YIN J. Bird recognition algorithm based on attention mechanism[J]. Journal of Computer Applications, 2024, 44(4): 1114-1120 (in Chinese).
[10] YUAN C, ZHAO Y D, ZHANG Y, et al. Lightweight multi-modal pedestrian detection algorithm based on YOLO[J]. Journal of Graphics, 2024, 45(1): 35-46 (in Chinese).
[11] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 213-229.
[12] HUANG Z H, LI H, XIONG X R, et al. End-to-end individual pig detection based on transfer learning[C]// The 6th International Conference on Pattern Recognition and Artificial Intelligence. New York: IEEE Press, 2023: 236-241.
[13] LUO H T, LUO X N, LI F, et al. Identification and detection of marine mammal dorsal fin based on deformable-DETR[C]// The 13th International Conference on Information Science and Technology. New York: IEEE Press, 2023: 349-357.
[14] LI G, ZHANG Y T, WANG W K, et al. Defect detection method of transmission line bolts based on DETR and prior knowledge fusion[J]. Journal of Graphics, 2023, 44(3): 438-447 (in Chinese).
[15] ZHAO Y, LV W Y, XU S L, et al. DETRs beat YOLOs on real-time object detection[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 16965-16974.
[16] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[17] LIU W Z, LU H, FU H T, et al. Learning to upsample by learning to sample[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 6027-6037.
[18] SHI W Z, CABALLERO J, HUSZÁR F, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 1874-1883.
[19] Ultralytics. YOLOv5 in PyTorch[EB/OL]. [2024-06-01]. https://github.com/ultralytics/yolov5.
[20] LI C Y, LI L L, JIANG H L, et al. YOLOv6: a single-stage object detection framework for industrial applications[EB/OL]. (2022-09-07) [2024-05-26]. https://arxiv.org/abs/2209.02976.
[21] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7464-7475.
[22] WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[EB/OL]. (2024-02-29) [2024-05-26]. https://arxiv.org/abs/2402.13616.
[23] WANG A, CHEN H, LIU L H, et al. YOLOv10: real-time end-to-end object detection[EB/OL]. (2024-05-23) [2024-06-01]. https://arxiv.org/abs/2405.14458.
[24] WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset[J]. California Institute of Technology, 2011, 32(1): 1-6.