
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (4): 714-725.DOI: 10.11996/JG.j.2095-302X.2024040714

• Image Processing and Computer Vision •

Binocular ranging method based on improved YOLOv8 and GMM image point set matching

HU Xin1, CHANG Yashu1, QIN Hao2, XIAO Jian3, CHENG Hongliang3

    1. School of Energy and Electrical Engineering, Chang’an University, Xi’an Shaanxi 710018, China
    2. BYD Auto Co., Ltd., Xi’an Shaanxi 710119, China
    3. School of Electronic and Control Engineering, Chang’an University, Xi’an Shaanxi 710061, China
  • Received: 2024-02-10 Accepted: 2024-04-15 Online: 2024-08-31 Published: 2024-09-03
  • Contact: XIAO Jian
  • About author:

    HU Xin (1975-), professor, postdoc. Her main research interests include power grid big data processing, machine learning, and deep learning. E-mail: huxin@chd.edu.cn

  • Supported by:
    Shaanxi Province Qin Chuang Yuan “Scientist + Engineer” Team Construction Project (2024QCY-KXJ-161); Key Industrial Chain in Xi’an (23ZDCYJSGG0013-2023)

Abstract:

To meet the research needs of unmanned tower crane systems, a binocular ranging method based on an improved YOLOv8 and GMM image point set matching was proposed to detect, recognize, and range tower crane hooks in the outdoor environment of the driver’s cab. Images were acquired with binocular cameras, and the FasterNet backbone network and Slim-neck connection layer were introduced to improve the YOLOv8 object detection algorithm, so that tower crane hooks could be detected effectively and the two-dimensional coordinates of the detection box obtained. Locality-sensitive hashing was combined with a phased matching strategy to improve the matching efficiency of the GMM image point set matching model, which was then used to match feature points of the tower crane hook inside the detection box. Finally, the depth of the tower crane hook was calculated using the principle of binocular triangulation. The experimental results demonstrated that, compared with the original algorithm, the improved YOLOv8 algorithm increased precision P by 2.9% and average precision AP50 by 2.2%, while reducing model complexity by 10.01 GFLOPs and the parameter count by 3.37 M, making the model lighter while improving detection accuracy. Compared with the original algorithm, the improved image point set matching algorithm exhibited better robustness across various indicators. At an engineering site, tower crane hooks were recognized and ranged within an acceptable margin of error, verifying the feasibility of the method.
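
The final ranging step rests on the standard triangulation relation for a rectified stereo pair, Z = f · B / d, where d is the horizontal disparity between a matched point in the left and right images. The following is a minimal sketch of that relation only, not the authors' implementation; the function name, the focal length, the baseline, and the pixel coordinates are all hypothetical values chosen for illustration, and the abstract does not state the calibration actually used.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth of a matched point in a rectified stereo pair.

    Uses Z = f * B / d, with d = x_left - x_right in pixels,
    f the focal length in pixels and B the baseline in metres.
    All inputs here are illustrative assumptions.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # A point in front of both cameras must have positive disparity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


# Hypothetical matched feature point on the hook (pixel columns),
# with an assumed focal length of 1000 px and baseline of 0.12 m.
depth_m = depth_from_disparity(660.0, 640.0, focal_px=1000.0, baseline_m=0.12)
print(f"estimated depth: {depth_m:.2f} m")
```

With these assumed values the disparity is 20 px, giving a depth of about 6 m. Because depth is inversely proportional to disparity, a one-pixel matching error matters more as the target moves farther away, which is one reason the quality of feature point matching inside the detection box affects ranging accuracy.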

Key words: YOLOv8 object detection, Gaussian mixture model, point set matching, deep learning, binocular vision, smart construction site visualization
