
Journal of Graphics ›› 2023, Vol. 44 ›› Issue (3): 456-464. DOI: 10.11996/JG.j.2095-302X.2023030456

• Image Processing and Computer Vision •


YOLO-RD-Apple orchard heterogeneous image occluded fruit detection model

HAO Peng-fei, LIU Li-qun, GU Ren-yuan

  1. College of Information Science and Technology, Gansu Agricultural University, Lanzhou, Gansu 730070, China
  • Received: 2022-10-25 Accepted: 2023-01-15 Online: 2023-06-30 Published: 2023-06-30
  • Corresponding author: LIU Li-qun (1982-), associate professor, master's degree. Her main research interests cover intelligent computing and deep learning. E-mail: llqhjy@126.com
  • About the author: HAO Peng-fei (1998-), master's student. His main research interests cover deep learning and image processing. E-mail: hola_Cc@163.com

  • Supported by:
    Gansu Provincial University Teacher Innovation Fund Project (2023A-051); Young Tutor Fund of Gansu Agricultural University (GAU-QDFC-2020-08); Science and Technology Program of Gansu Province (20JR5RA032)


Abstract:

To enable robotic automated picking of highly occluded fruits in natural apple orchard environments, YOLO-RD-Apple, an orchard heterogeneous image occluded fruit detection model with dual RGB and depth image inputs, was proposed. The lightweight MobileNetV2 and the even lighter MobileNetV2-Lite designed on its basis were employed as the feature extractors for the RGB and depth images, respectively, reducing the computational cost of the network while preserving its feature extraction capability. By combining CSPNet and depthwise separable convolution with the SE attention module, a new SE-DWCSP3 module was proposed to improve the PANet structure and strengthen the network's ability to extract features of partially visible apple targets. Furthermore, the Soft-NMS algorithm was introduced in place of the standard NMS algorithm to reduce the erroneous suppression of dense targets and lower the missed detection rate of occluded apples. Experimental results demonstrated that YOLO-RD-Apple performed well on a naturally occluded apple dataset, achieving an AP of 93.1% on the test set, 1.4 percentage points higher than YOLOv4, with 70% fewer parameters, and a detection speed of 40.5 FPS on a V100 GPU, 12.5% faster than YOLOv4. The proposed model improved both detection accuracy and speed relative to YOLOv4 while reducing the number of network parameters, making it better suited to real orchard apple-picking scenarios.
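
The abstract names three mechanisms (dual lightweight backbones, the SE-DWCSP3 neck module, and Soft-NMS) without giving implementation detail. The two sketches below are illustrative only. First, a minimal PyTorch sketch of the building blocks that the SE-DWCSP3 module is said to combine: a squeeze-and-excitation (SE) channel attention block and a depthwise separable 3×3 convolution. This is not the authors' module; the class names, reduction ratio, and channel handling are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention: re-weight channels by global context."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # squeeze: global average pooling
        self.fc = nn.Sequential(                            # excitation: per-channel gate in (0, 1)
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))                    # scale each channel by its gate

class SEDepthwiseConv(nn.Module):
    """Depthwise separable 3x3 convolution followed by SE attention (illustrative block)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)
        self.se = SEBlock(out_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.se(self.act(self.bn(self.pointwise(self.depthwise(x)))))
```

Second, a sketch of Soft-NMS in its Gaussian re-scoring form (Bodla et al.): overlapping detections have their scores decayed rather than being discarded outright, which is what lowers the missed-detection rate for clustered, mutually occluding apples. The abstract does not state which Soft-NMS variant or parameter values were used, so the sigma and score threshold below are assumptions.

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes: np.ndarray, scores: np.ndarray,
             sigma: float = 0.5, score_thresh: float = 0.001) -> list:
    """Gaussian Soft-NMS: returns indices of kept boxes in selection order."""
    scores = scores.astype(float)        # astype copies, so the caller's scores stay untouched
    remaining = list(range(len(scores)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])   # highest (possibly decayed) score
        if scores[best] < score_thresh:
            break
        keep.append(best)
        remaining.remove(best)
        for i in remaining:                              # decay neighbours instead of deleting them
            scores[i] *= np.exp(-(iou(boxes[best], boxes[i]) ** 2) / sigma)
    return keep
```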

Key words: target detection, heterogeneous images, YOLOv4, apple picking, attention mechanism

CLC number: