Journal of Graphics, 2025, Vol. 46, Issue (5): 1010-1017. DOI: 10.11996/JG.j.2095-302X.2025051010

• Image Processing and Computer Vision •

基于深度强化学习的无人机三维场景导航方法研究

刘伯凯1, 殷雪峰1, 孙传昱1, 葛慧林2, 魏子麒3, 姜雨彤4, 朴海音5, 周东生6, 杨鑫1

    1 大连理工大学计算机学院社会计算与认知智能教育部重点实验室辽宁 大连 116024
    2 江苏科技大学自动化学院江苏 镇江 212100
    3 中国科学院自动化研究所北京 100190
    4 中国北方车辆研究所先进越野系统技术全国重点实验室北京 100072
    5 中国航空工业集团公司沈阳飞机设计研究所辽宁 沈阳 110035
    6 大连大学软件工程学院辽宁 大连 116024
  • 收稿日期:2024-12-17 接受日期:2025-04-21 出版日期:2025-10-30 发布日期:2025-09-10
  • 通讯作者:葛慧林(1989-),男,副研究员,博士。主要研究方向为水下目标探测、计算机视觉等。E-mail:ghl1989@just.edu.cn
  • 第一作者:刘伯凯(1999-),男,硕士研究生。主要研究方向为图形图像处理等。E-mail:lbk2593469678@163.com
  • 基金资助:
    国家自然科学基金(62441216);科技部“脑科学与类脑研究”重大项目(2022ZD0210500)

Research on UAV three-dimensional scene navigation based on deep reinforcement learning

LIU Bokai1, YIN Xuefeng1, SUN Chuanyu1, GE Huilin2, WEI Ziqi3, JIANG Yutong4, PIAO Haiyin5, ZHOU Dongsheng6, YANG Xin1

    1 Key Laboratory of Social Computing and Cognitive Intelligence, School of Computer Science, Dalian University of Technology, Dalian Liaoning 116024, China
    2 School of Automation, Jiangsu University of Science and Technology, Zhenjiang Jiangsu 212100, China
    3 Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    4 National Key Laboratory of Advanced Off-road System Technology, China North Vehicle Research Institute, Beijing 100072, China
    5 Shenyang Aircraft Design and Research Institute, Aviation Industry Corporation of China, Shenyang Liaoning 110035, China
    6 School of Software Engineering, Dalian University, Dalian Liaoning 116024, China
  • Received:2024-12-17 Accepted:2025-04-21 Published:2025-10-30 Online:2025-09-10
  • Corresponding author:GE Huilin (1989-), associate researcher, Ph.D. His main research interests include underwater target detection and computer vision. E-mail:ghl1989@just.edu.cn
  • First author:LIU Bokai (1999-), master's student. His main research interests include graphics and image processing. E-mail:lbk2593469678@163.com
  • Supported by:
    National Natural Science Foundation of China(62441216);Major Project of the Ministry of Science and Technology on “Brain Science and Brain-like Research”(2022ZD0210500)

摘要:

近年来,无人机产业规模与应用需求不断扩大,实现无人机的自主化和智能化成为了行业内亟待解决的核心问题。无人机导航作为无人机自主控制领域的基础技术,已然成为无人机应用研究的重中之重。目前大多数无人机导航方法依赖于环境信息的重建,消耗过多的计算和内存,无法满足日益复杂的场景与实时性要求。因此,基于深度学习卓越的表征学习能力与强化学习的自主学习决策能力,提出无人机自主导航方法,通过不断自主学习优化决策策略,更好地完成导航任务。首先构造连续性动作空间以及非稀疏性奖励函数,用来引导无人机的学习过程;并设计特征提取模块与决策模块来提高无人机感知能力和决策能力。实验结果表明,在仿真三维场景下,该算法表现出最优的导航避障性能,在所设计的三维场景下导航成功率可达到87%,平均累计奖励收敛值较同期方法提高33%,同时缩短训练时长,提高训练稳定性。

关键词: 深度强化学习, 注意力机制, 无人机, 导航避障, 三维场景

Abstract:

In recent years, as the UAV industry and its application demands have expanded, achieving UAV autonomy and intelligence has become a critical challenge. As a foundational technology for autonomous UAV control, UAV navigation and exploration have become a top priority in UAV application research. Most current UAV navigation and exploration methods rely on reconstructing environmental information, which consumes excessive computation and memory and cannot meet the demands of increasingly complex scenarios and real-time operation. Therefore, building on the representation learning ability of deep learning and the autonomous decision-making ability of reinforcement learning, an autonomous navigation method for UAVs was proposed. By continuously optimizing its decision-making policy through self-learning, the UAV could complete navigation tasks more effectively. The method first constructed a continuous action space and a non-sparse reward function to guide the UAV's learning process, and then designed feature-extraction and decision-making modules to enhance the UAV's perception and decision-making capabilities. Experimental results showed that the algorithm achieved the best navigation and obstacle avoidance performance in simulated 3D scenes. In the designed 3D scene, the navigation success rate reached 87%, and the converged average cumulative reward was 33% higher than that of contemporaneous methods; the method also shortened training time and improved training stability.

Key words: deep reinforcement learning, attention mechanism, unmanned aerial vehicle, navigation and obstacle avoidance, 3D scene
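
The abstract mentions a continuous action space and a non-sparse reward function but does not give their form. The following is only a minimal illustrative sketch of what such components commonly look like in navigation tasks, not the authors' formulation: the names `RewardConfig`, `clip_action`, and `dense_reward`, and every coefficient, are hypothetical assumptions. The reward gives feedback at every step (progress toward the goal minus a small time penalty) instead of only at the end of an episode.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are hypothetical placeholders, not taken from the paper.
    progress_scale: float = 1.0      # weight on distance reduction per step
    step_penalty: float = 0.01       # small cost for every step taken
    collision_penalty: float = 10.0  # penalty when the UAV hits an obstacle
    success_bonus: float = 10.0      # bonus when the goal is reached
    goal_radius: float = 0.5         # distance at which the goal counts as reached


def clip_action(action, max_speed=2.0):
    """Map a raw 3D policy output to a bounded continuous velocity command."""
    return tuple(max(-1.0, min(1.0, a)) * max_speed for a in action)


def dense_reward(prev_pos, pos, goal, collided, cfg=RewardConfig()):
    """Return (reward, done) for one step of a hypothetical navigation task."""
    if collided:
        return -cfg.collision_penalty, True
    d_prev = math.dist(prev_pos, goal)
    d_now = math.dist(pos, goal)
    if d_now <= cfg.goal_radius:
        return cfg.success_bonus, True
    # Non-sparse term: positive when the UAV moved closer to the goal.
    reward = cfg.progress_scale * (d_prev - d_now) - cfg.step_penalty
    return reward, False
```

For example, moving one unit closer to a goal ten units away yields a small positive reward under these assumed coefficients, whereas a sparse scheme would return zero until the episode ends.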
