Lightweight wild bat detection method based on multi-scale feature fusion

doi:10.11996/JG.j.2095-302X.2025010070

Abstract

Bat detection in the wild is crucial for ecological protection and scientific research. To address the challenges posed by limited computing resources and complex wild environments, a lightweight bat detection model (LiteDETR-Bat) was proposed to achieve efficient real-time detection. First, to reduce feature-map redundancy, a reparameterized convolutional efficient layer aggregation network (RCELAN) was introduced to replace the traditional ResNet backbone; its multi-branch feature aggregation mechanism reduced both computational complexity and parameter count. Second, a dynamic sampling multi-scale feature fusion (DS-MFF) structure was designed. It integrated dilated convolution and dynamic sampling operators, optimizing multi-scale feature fusion by expanding the receptive field and adaptively adjusting sampling positions, which made the model more flexible and robust when processing diverse features. Finally, a bat dataset covering varied lighting conditions, viewing angles, and bat morphology was collected in the wild in Anhui Province, and the model's performance was evaluated on this dataset. Experimental results showed that LiteDETR-Bat reduced the number of parameters by 46.5% while achieving an mAP of 97.2%, and that it improved on the YOLO series algorithms in both accuracy and real-time performance. LiteDETR-Bat thus provides technical support for monitoring and protecting wild bats and shows application potential in ecological monitoring and biodiversity conservation.
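
The abstract states that dilated convolution expands the receptive field. As a minimal sketch of that general property (not the DS-MFF implementation), the following computes the spatial extent of a dilated kernel using the standard relation k_eff = d(k - 1) + 1. The 3x3 kernel size and the dilation rates 1, 2, 3 are hypothetical values chosen for illustration; the abstract does not give the rates used in the model.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


if __name__ == "__main__":
    # Hypothetical dilation rates, not taken from the paper.
    for rate in (1, 2, 3):
        print(f"3x3 kernel, dilation {rate}: covers {effective_kernel_size(3, rate)} pixels per side")
    print("stacked receptive field:", stacked_receptive_field([(3, 1), (3, 2), (3, 3)]))
```

The parameter count of a 3x3 kernel stays at nine weights per channel pair regardless of the dilation rate, which is why dilation suits a lightweight design: the covered area grows without adding parameters.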

Key words: wild bats, RT-DETR, multi-scale feature, lightweight, object detection

CLC Number:

TP391.41
Q95

WANG Yang, MA Chang, HU Ming, SUN Tao, RAO Yuan, YUAN Zhenyu. Lightweight wild bat detection method based on multi-scale feature fusion[J]. Journal of Graphics, 2025, 46(1): 70-80.

References

[1]	FAO. Global Forest Resources Assessment 2020-key findings[R]. Rome: FAO, 2020.
[2]	HU D, LUO Z H, YE F Q, et al. Advances in bat carrier of important viruses[J]. Journal of Pathogen Biology, 2023, 18(1): 111-116 (in Chinese).
[3]	PETSO T, JAMISOLA JR R S, MPOELENG D. Review on methods used for wildlife species and individual identification[J]. European Journal of Wildlife Research, 2022, 68(1): 3.
[4]	ZHU Q J, HU B, WANG H L, et al. Detection of traffic signs based on lightweight YOLOv8s[J]. Journal of Graphics, 2024, 45(3): 422-432 (in Chinese).
[5]	CHEN X, ZHAO J, CHEN Y H, et al. Automatic standardized processing and identification of tropical bat calls using deep learning approaches[J]. Biological Conservation, 2020, 241: 108269.
[6]	KRIVEK G, GILLERT A, HARDER M, et al. BatNet: a deep learning‐based tool for automated bat species identification from camera trap images[J]. Remote Sensing in Ecology and Conservation, 2023, 9(6): 759-774.
[7]	XIE J J, ZHONG Y J, ZHANG J G, et al. A review of automatic recognition technology for bird vocalizations in the deep learning era[J]. Ecological Informatics, 2023, 73: 101927.
[8]	PENG J B, WANG D L, LIAO X H, et al. Wild animal survey using UAS imagery and deep learning: modified Faster R-CNN for kiang detection in Tibetan Plateau[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 169: 364-376.
[9]	CHEN T H, ZHU J X, YIN J. Bird recognition algorithm based on attention mechanism[J]. Journal of Computer Applications, 2024, 44(4): 1114-1120 (in Chinese).
[10]	YUAN C, ZHAO Y D, ZHANG Y, et al. Lightweight multi-modal pedestrian detection algorithm based on YOLO[J]. Journal of Graphics, 2024, 45(1): 35-46 (in Chinese).
[11]	CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 213-229.
[12]	HUANG Z H, LI H, XIONG X R, et al. End-to-end individual pig detection based on transfer learning[C]// The 6th International Conference on Pattern Recognition and Artificial Intelligence. New York: IEEE Press, 2023: 236-241.
[13]	LUO H T, LUO X N, LI F, et al. Identification and detection of marine mammal dorsal fin based on deformable-DETR[C]// The 13th International Conference on Information Science and Technology. New York: IEEE Press, 2023: 349-357.
[14]	LI G, ZHANG Y T, WANG W K, et al. Defect detection method of transmission line bolts based on DETR and prior knowledge fusion[J]. Journal of Graphics, 2023, 44(3): 438-447 (in Chinese).
[15]	ZHAO Y, LV W Y, XU S L, et al. DETRs beat YOLOs on real-time object detection[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 16965-16974.
[16]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[17]	LIU W Z, LU H, FU H T, et al. Learning to upsample by learning to sample[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 6027-6037.
[18]	SHI W Z, CABALLERO J, HUSZÁR F, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 1874-1883.
[19]	Ultralytics. YOLOv5 in PyTorch[EB/OL]. [2024-06-01]. https://github.com/ultralytics/yolov5.
[20]	LI C Y, LI L L, JIANG H L, et al. YOLOv6: a single-stage object detection framework for industrial applications[EB/OL]. (2022-09-07) [2024-05-26]. https://arxiv.org/abs/2209.02976.
[21]	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7464-7475.
[22]	WANG C Y, YEH I H, LIAO H Y M. YOLOv9: learning what you want to learn using programmable gradient information[EB/OL]. (2024-02-29) [2024-05-26]. https://arxiv.org/abs/2402.13616.
[23]	WANG A, CHEN H, LIU L H, et al. YOLOv10: real-time end-to-end object detection[EB/OL]. (2024-05-23) [2024-06-01]. https://arxiv.org/abs/2405.14458.
[24]	WAH C, BRANSON S, WELINDER P, et al. The Caltech-UCSD Birds-200-2011 dataset[J]. California Institute of Technology, 2011, 32(1): 1-6.

Class	Number of images
Great roundleaf bat (Hipposideros armiger)	614
Pratt's roundleaf bat (Hipposideros pratti)	496
Brown bat (Myoti)	662
Horseshoe bat (Rousettus leschenaultia)	656
Total	2428

Experimental parameter	Value
Training epochs (Epoch)	100
Batch size (Batchsize)	4
Worker threads (Workers)	4
Optimizer	AdamW
Initial learning rate (lr0)	0.0001
Momentum factor (Momentum)	0.9
Weight decay coefficient (Weight decay)	0.0001
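
For reproducibility, the training settings in the table above can be captured in a small configuration object. This is a minimal sketch under stated assumptions: the names `TrainConfig` and `adamw_kwargs` are hypothetical, the source does not name its training framework, and mapping the listed momentum of 0.9 onto AdamW's first beta coefficient (with a hypothetical default of 0.999 for the second) is an assumption, since AdamW has no separate momentum parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters as listed in the table (hypothetical container)."""
    epochs: int = 100
    batch_size: int = 4
    workers: int = 4
    optimizer: str = "AdamW"
    lr0: float = 1e-4           # initial learning rate
    momentum: float = 0.9
    weight_decay: float = 1e-4


def adamw_kwargs(cfg: TrainConfig, beta2: float = 0.999) -> dict:
    """Build keyword arguments for an AdamW-style optimizer.

    Assumption: the table's momentum is used as beta1; beta2 is not given
    in the source, so 0.999 is only a placeholder default.
    """
    return {
        "lr": cfg.lr0,
        "betas": (cfg.momentum, beta2),
        "weight_decay": cfg.weight_decay,
    }


if __name__ == "__main__":
    cfg = TrainConfig()
    print(cfg)
    print(adamw_kwargs(cfg))
```

The returned keys follow the argument names of `torch.optim.AdamW` (`lr`, `betas`, `weight_decay`), so with PyTorch installed the dictionary could be passed as `torch.optim.AdamW(model.parameters(), **adamw_kwargs(cfg))`.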

Method	Class	AP/%	P/%	R/%	mAP/%
RT-DETR[15]	Great roundleaf bat (Hipposideros armiger)	93.8	90.5	92.4	95.1
	Pratt's roundleaf bat (Hipposideros pratti)	94.3
	Brown bat (Myoti)	93.1
	Horseshoe bat (Rousettus leschenaultia)	99.3
LiteDETR-Bat	Great roundleaf bat (Hipposideros armiger)	96.4	96.0	93.0	97.2
	Pratt's roundleaf bat (Hipposideros pratti)	97.9
	Brown bat (Myoti)	94.9
	Horseshoe bat (Rousettus leschenaultia)	99.4
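
The mAP column in the table above is consistent with an unweighted mean of the four per-class AP values. The sketch below recomputes it from the tabulated numbers as a sanity check; the variable and function names are hypothetical, and the IoU threshold behind the AP values is not stated in this excerpt.

```python
# Per-class AP (%) copied from the results table, keyed by the Latin names given there.
per_class_ap = {
    "RT-DETR": {
        "Hipposideros armiger": 93.8,
        "Hipposideros pratti": 94.3,
        "Myoti": 93.1,
        "Rousettus leschenaultia": 99.3,
    },
    "LiteDETR-Bat": {
        "Hipposideros armiger": 96.4,
        "Hipposideros pratti": 97.9,
        "Myoti": 94.9,
        "Rousettus leschenaultia": 99.4,
    },
}

# mAP (%) as reported in the table.
reported_map = {"RT-DETR": 95.1, "LiteDETR-Bat": 97.2}


def mean_ap(ap_by_class: dict) -> float:
    """Unweighted mean of the per-class AP values."""
    values = list(ap_by_class.values())
    return sum(values) / len(values)


if __name__ == "__main__":
    for method, aps in per_class_ap.items():
        computed = mean_ap(aps)
        print(f"{method}: computed mean AP = {computed:.3f}, reported mAP = {reported_map[method]}")
    # Per-class AP difference between the two models.
    for name, ap in per_class_ap["LiteDETR-Bat"].items():
        delta = ap - per_class_ap["RT-DETR"][name]
        print(f"{name}: {delta:+.1f} points")
```

Both recomputed means agree with the reported mAP to within rounding at one decimal place.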