
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (4): 714-725. DOI: 10.11996/JG.j.2095-302X.2024040714

• Image Processing and Computer Vision •

Binocular ranging method based on improved YOLOv8 and GMM image point set matching

HU Xin1, CHANG Yashu1, QIN Hao2, XIAO Jian3, CHENG Hongliang3

  1. School of Energy and Electrical Engineering, Chang’an University, Xi’an, Shaanxi 710018, China
  2. BYD Auto Co., Ltd., Xi’an, Shaanxi 710119, China
  3. School of Electronic and Control Engineering, Chang’an University, Xi’an, Shaanxi 710061, China
  • Received: 2024-02-10 Accepted: 2024-04-15 Published: 2024-08-31 Online: 2024-09-03
  • Contact: XIAO Jian (1975-), associate professor, Ph.D. His main research interests are intelligent perception and computing, machine vision, and image processing. E-mail: xiaojian@chd.edu.cn
  • First author: HU Xin (1975-), professor, postdoc. Her main research interests are power grid big data processing, machine learning, and deep learning. E-mail: huxin@chd.edu.cn
  • Supported by:
    Shaanxi Province Qin Chuang Yuan “Scientist + Engineer” Team Construction Project (2024QCY-KXJ-161); Xi’an Key Industrial Chain Project (23ZDCYJSGG0013-2023)

Abstract:

To meet the research needs of unmanned tower crane systems, a binocular ranging method based on an improved YOLOv8 and GMM image point set matching was proposed to detect, recognize, and range tower crane hooks in the environment outside the driver’s cab. Images were acquired with a binocular camera, and the YOLOv8 object detection algorithm was improved by introducing the FasterNet backbone network and the Slim-neck connection layer, so that tower crane hooks in the image were detected effectively and the two-dimensional coordinates of the detection box were obtained. Locality-sensitive hashing was combined with a staged matching strategy to improve the matching efficiency of the GMM image point set matching model, and feature point matching was performed on the tower crane hook inside the detection box. The depth of the tower crane hook was then calculated by binocular triangulation. Experimental results showed that, compared with the original algorithm, the improved YOLOv8 increased precision P by 2.9% and average precision AP50 by 2.2%, while reducing model complexity by 10.01 GFLOPs and the number of parameters by 3.37 M, making the model more lightweight while improving detection accuracy. Compared with the original algorithm, the improved image point set matching algorithm was more robust across all indicators. Finally, tower crane hooks were recognized and ranged at an engineering site, and the detection, recognition, and ranging tasks were completed effectively within an acceptable margin of error, verifying the feasibility of the method.
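
The final triangulation step can be illustrated with a minimal sketch. It assumes a rectified stereo pair, a focal length expressed in pixels, and a baseline in metres, so that depth follows Z = f·B/d for disparity d. The function name `depth_from_matches`, the use of the median disparity over the matched points, and the sample numbers are illustrative assumptions, not the authors' implementation.

```python
from statistics import median


def depth_from_matches(left_pts, right_pts, focal_px, baseline_m):
    """Estimate depth in metres from matched points of a rectified stereo pair.

    left_pts / right_pts: lists of (x, y) pixel coordinates, where the i-th
    entries are the same physical point seen in the left and right images.
    """
    # Disparity is the horizontal shift of each matched point between views.
    disparities = [xl - xr for (xl, _), (xr, _) in zip(left_pts, right_pts)]
    # Discard non-positive disparities, which cannot come from a valid match.
    disparities = [d for d in disparities if d > 0]
    if not disparities:
        raise ValueError("no valid disparities among the matched points")
    # The median (an assumption here) limits the effect of outlier matches.
    d = median(disparities)
    # Pinhole stereo relation: Z = f * B / d.
    return focal_px * baseline_m / d


if __name__ == "__main__":
    # Hypothetical matched points inside a hook detection box.
    left = [(410.0, 200.0), (512.0, 220.0), (455.0, 240.0)]
    right = [(400.0, 200.0), (502.0, 220.0), (443.0, 240.0)]
    print(depth_from_matches(left, right, focal_px=1000.0, baseline_m=0.12))
```

With these hypothetical values the median disparity is 10 px, so the estimated depth is about 12 m. In practice the focal length and baseline would come from stereo calibration.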

Key words: YOLOv8 object detection, Gaussian mixture model, point set matching, deep learning, binocular vision, smart construction site visualization

CLC Number: