
Journal of Graphics ›› 2023, Vol. 44 ›› Issue (4): 658-666. DOI: 10.11996/JG.j.2095-302X.2023040658

• Image Processing and Computer Vision •

Small object detection algorithm for UAV images based on feature fusion and attention mechanism

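LI Li-xia1, WANG Xin2,1,3, WANG Jun3, ZHANG You-yuan4

  1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541010
  2. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000
  3. School of Marine Engineering, Guilin University of Electronic Technology, Beihai, Guangxi 536000
  4. School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070
  • Received: 2022-11-18 Accepted: 2023-01-18 Publication date: 2023-08-31 Release date: 2023-08-16
  • Corresponding author: WANG Xin (1976-), male, professor, Ph.D. His main research interests include image processing, network information security, the Internet of Things, and data mining. E-mail: 304379506@qq.com
  • About the author:

    LI Li-xia (1995-), female, master's student. Her main research interests include image processing and object recognition. E-mail: 20032202019@mails.guet.edu.cn

  • Funding:
    Guangxi Science and Technology Major Project (AA19254016); Guangxi Graduate Student Innovation Project (YCSW2021174); Beihai City Science and Technology Planning Project (202082033); Beihai City Science and Technology Planning Project (202082023)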

Small object detection algorithm in UAV image based on feature fusion and attention mechanism

LI Li-xia1, WANG Xin2,1,3, WANG Jun3, ZHANG You-yuan4

  1. School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin Guangxi 541010, China
  2. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 610000, China
  3. School of Marine Engineering, Guilin University of Electronic Technology, Beihai Guangxi 536000, China
  4. School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou Gansu 730070, China
  • Received: 2022-11-18 Accepted: 2023-01-18 Online: 2023-08-31 Published: 2023-08-16
  • Contact: WANG Xin (1976-), professor, Ph.D. His main research interests include image processing, network information security, the Internet of Things, and data mining. E-mail: 304379506@qq.com
  • About author:

    LI Li-xia (1995-), master's student. Her main research interests include image processing and object recognition. E-mail: 20032202019@mails.guet.edu.cn

  • Supported by:
    Guangxi Science and Technology Major Project (AA19254016); Guangxi Graduate Student Innovation Project (YCSW2021174); Beihai City Science and Technology Planning Project (202082033); Beihai City Science and Technology Planning Project (202082023)

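Abstract:

Objects in UAV aerial images are very small and carry little feature information, so existing detection algorithms perform poorly on small objects. To address this, a multi-head attention mechanism is integrated into the YOLOv5 backbone network to aggregate global feature information effectively. As network depth increases, the model focuses more on high-level semantic information and neglects the low-level detail and texture features that are critical for small object detection, which degrades the detection of small objects. A shallow feature enhancement module is therefore proposed to learn low-level feature information and strengthen the features of small objects. In addition, to improve feature fusion, a multi-level feature fusion module is designed that aggregates feature information from different levels and lets the network dynamically adjust the weight of each output detection layer. Experimental results show that the algorithm reaches a mean average precision of 45.7% on the public VisDrone2021 dataset, 3.1% higher than the original YOLOv5 algorithm, and a detection speed of 41 frames per second on high-resolution images, which meets real-time requirements. Compared with other mainstream algorithms, its detection accuracy is clearly improved.

Keywords: feature fusion, attention mechanism, UAV aerial images, small object detection, YOLOv5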
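
The multi-head attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the feature map has been flattened into one vector per spatial position, uses identity query/key/value projections in place of the learned projections a real layer would have, and the function name `multi_head_self_attention` is hypothetical. It only shows how every position can draw on all other positions, which is what gives attention a global view of the image.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(tokens, num_heads):
    """Scaled dot-product self-attention, computed separately per head.

    tokens: list of N feature vectors (each of length C), one per spatial
            position of a flattened feature map.
    Assumption: queries, keys, and values are the raw channel slices
    (identity projections); a trained layer would learn these projections.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    head_dim = dim // num_heads
    scale = 1.0 / math.sqrt(head_dim)

    outputs = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head = [t[lo:hi] for t in tokens]  # this head's slice of every token
        for i, query in enumerate(head):
            # Similarity of position i to every position, including itself.
            scores = [scale * sum(a * b for a, b in zip(query, key)) for key in head]
            weights = softmax(scores)
            # Weighted average of all positions' values for this head.
            mixed = [
                sum(w * value[c] for w, value in zip(weights, head))
                for c in range(head_dim)
            ]
            outputs[i].extend(mixed)  # concatenate head outputs channel-wise
    return outputs


# A 2x2 feature map with 4 channels, flattened to 4 tokens.
feature_tokens = [
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 3.0],
    [1.0, 1.0, 0.0, 0.0],
    [2.0, 0.0, 1.0, 1.0],
]
attended = multi_head_self_attention(feature_tokens, num_heads=2)
```

Each output vector is a weighted average over all positions, so a small object's features can be reinforced by context anywhere in the image rather than only by a local neighborhood.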

Abstract:

Detecting small objects in UAV aerial images is difficult because the objects are very small and carry little feature information. To address this problem, a multi-head attention mechanism was incorporated into the YOLOv5 backbone network to integrate global feature information. As the network depth increased, the model tended to emphasize high-level semantic information at the expense of the low-level detail and texture features that are vital for detecting small objects. To address this issue, a shallow feature enhancement module was devised to learn low-level feature information and strengthen the features of small objects. Furthermore, a multi-level feature fusion module was developed to aggregate feature information from different layers, enabling the network to dynamically adjust the weights of each output detection layer. Experimental results on the publicly available VisDrone2021 dataset showed that the proposed algorithm attained a mean average precision of 45.7%, an improvement of 3.1% over the baseline YOLOv5 algorithm. The proposed algorithm also achieved a detection speed of 41 frames per second on high-resolution images, which satisfies the requirement for real-time performance, and its detection accuracy was notably higher than that of other mainstream methods.

Key words: feature fusion, attention mechanism, UAV aerial imagery, small object detection, YOLOv5
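
The abstract's description of fusing features from several levels with dynamically adjusted weights can be sketched as a softmax-weighted sum. This is an illustrative assumption rather than the paper's module, whose details the abstract does not give: the function name `fuse_levels` is hypothetical, the feature vectors are assumed to have already been resized to a common shape, and the `logits` stand in for parameters that a network would learn during training.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(features, logits):
    """Fuse same-shaped feature vectors from several levels.

    features: list of L feature vectors of equal length (assumed to be
              already resized to a common resolution and channel count).
    logits:   L unnormalized scores, one per level; in a real network
              these would be learned, here they are plain inputs.
    Returns the fused vector and the normalized per-level weights.
    """
    if len(features) != len(logits):
        raise ValueError("need exactly one logit per feature level")
    weights = softmax(logits)  # non-negative and summing to 1
    length = len(features[0])
    fused = [
        sum(w * level[i] for w, level in zip(weights, features))
        for i in range(length)
    ]
    return fused, weights


shallow = [4.0, 0.0, 2.0]  # stand-in for fine detail and texture features
middle = [1.0, 1.0, 1.0]
deep = [0.0, 3.0, 1.0]     # stand-in for high-level semantic features

# A larger logit for the shallow level shifts the fused output toward it.
fused, weights = fuse_levels([shallow, middle, deep], logits=[2.0, 0.5, 0.5])
```

Normalizing with softmax keeps the weights comparable across levels, so raising one level's score necessarily lowers the others' share of the fused output.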
