
Journal of Graphics ›› 2023, Vol. 44 ›› Issue (6): 1104-1111. DOI: 10.11996/JG.j.2095-302X.2023061104


YOLOv8 with bi-level routing attention for road scene object detection

WEI Chen-hao1, YANG Rui1, LIU Zhen-bing1, LAN Ru-shi1, SUN Xi-yan2, LUO Xiao-nan2

    1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China
    2. National Local Joint Engineering Research Center of Satellite Navigation and Location Service (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China
  • Received: 2023-06-29 Accepted: 2023-09-16 Online: 2023-12-31 Published: 2023-12-17
  • Contact: LAN Ru-shi (1986-), professor, Ph.D. His main research interests cover artificial intelligence, image processing, and medical information processing. E-mail: rslan2016@163.com
  • About author:

    WEI Chen-hao (1999-), master's student. His main research interests cover object detection and deep learning. E-mail: chwei529@163.com

Abstract:

As the number of motor vehicles continues to grow, road traffic environments have become increasingly complex. Changing light conditions and cluttered backgrounds in particular interfere with the accuracy and precision of object detection algorithms, and the diverse shapes of targets in road scenes make detection harder still. To address these challenges, a method named YOLOv8n_T was proposed. Building on the YOLOv8 backbone network, it introduced a D_C2f block based on deformable convolution to strengthen feature learning for targets against complex backgrounds, making the model more adaptable to the diverse and complex targets found in road scenes. The model also incorporated a bi-level routing attention module, which adaptively filters out irrelevant regions for each query and retains only the most relevant ones. A small-target detection layer was added for small targets such as pedestrians and traffic lights. Experimental results on the BDD100K dataset demonstrated that YOLOv8n_T significantly improved detection precision in road scenes, raising average precision by 6.8 percentage points over the original YOLOv8n and by 11.2 percentage points over YOLOv5n.
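
The region-filtering idea behind the attention module can be illustrated with a minimal sketch. It is not the authors' implementation, and it covers only the coarse routing step, not the fine-grained token attention that would follow. It assumes each region is summarized by the mean of its token vectors and that relevance is a plain dot product with no learned projections. The names `mean_vector`, `route_regions`, and `top_k` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(tokens: Sequence[Vector]) -> Vector:
    """Average the token vectors of one region into a single descriptor."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as an assumed region-to-region relevance score."""
    return sum(x * y for x, y in zip(a, b))


def route_regions(regions: Sequence[Sequence[Vector]], top_k: int) -> List[List[int]]:
    """For each region, return the indices of its top_k most relevant regions.

    Assumption: queries and keys are the same mean descriptors; a trained
    model would apply learned linear projections first.
    """
    descriptors = [mean_vector(r) for r in regions]
    routes = []
    for q in descriptors:
        scores = [dot(q, k) for k in descriptors]
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        routes.append(ranked[:top_k])  # all other regions are ignored
    return routes


# Toy feature map: four regions with two 2-D tokens each (made-up values).
regions = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.8, 0.2], [1.0, 0.0]],
    [[0.0, 1.0], [0.1, 0.9]],
    [[-1.0, 0.0], [-0.9, -0.1]],
]
routes = route_regions(regions, top_k=2)
print(routes)
```

In this toy input, regions 0 and 1 have similar descriptors and route to each other, while the dissimilar region 3 is dropped from their candidate sets. Token-level attention would then be computed only within the routed regions.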

Key words: deformable convolution, road scene, object detection, YOLO, attention mechanism

CLC Number: