Journal of Graphics ›› 2023, Vol. 44 ›› Issue (4): 691-698. DOI: 10.11996/JG.j.2095-302X.2023040691

• Image Processing and Computer Vision •

Text detection method for electrical equipment nameplates based on deep learning

WANG Dao-lei, KANG Bo, ZHU Rui

  1. College of Energy and Mechanical Engineering, Shanghai University of Electric Power, Shanghai 200090, China
  • Received: 2022-11-08 Accepted: 2023-01-12 Published: 2023-08-31 Released online: 2023-08-16
  • Contact: ZHU Rui (1981-), female, associate professor, Ph.D. Her main research interests cover insulator detection, fault detection for power transmission and transformation equipment, and deep learning. E-mail: zhuruish@163.com
  • About author:

    WANG Dao-lei (1981-), male, professor, Ph.D. His main research interests cover computer vision, image processing, and CAD/CAM. E-mail: alfredwdl@shiep.edu.cn

  • Supported by:
    National Natural Science Foundation of China (61502297)

Abstract:

Rapid detection of power equipment nameplates helps substations and power plants obtain equipment information and carry out regular inspection and maintenance, ensuring that the equipment operates normally. Current text detection networks struggle to improve precision without sacrificing detection efficiency. To address this problem, a convolutional block attention module (CBAM) was introduced into the DBNet model and the detection head was improved. A multi-scale feature pyramid network (FPN) structure was introduced into the backbone network, with improvements over the original FPN. In addition, because no public dataset of power equipment nameplates exists and such data are difficult to collect, a data augmentation method was proposed in which several nameplate images are cropped into rectangles and then spliced in certain proportions into new images, effectively expanding the dataset. The experimental results showed that both the data augmentation method and the improved DBNet network structure improved detection performance, outperforming most current text detection networks. The improved DBNet, combined with the data augmentation method, achieved a precision of 90.3%, a recall of 79.7%, and an F-measure of 84.7%, which is 3.3 percentage points higher than the original model. Detection performance was thus greatly improved with only a minimal loss in detection speed.
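The crop-and-splice augmentation described in the abstract can be pictured with a small sketch. The abstract does not give the crop sizes, the splicing ratio, or the layout, so everything below is an assumption made for illustration: two crops are joined side by side, the `ratio` parameter sets how much of the output width comes from the first image, and the names `crop`, `splice`, `augment`, and `shift_box` are hypothetical. Images are plain nested lists so the example runs without any image library.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image stored as rows of pixel values
Box = Tuple[int, int, int, int]  # text box as (x_min, y_min, x_max, y_max)


def crop(image: Image, x: int, y: int, width: int, height: int) -> Image:
    """Cut a width x height rectangle whose top-left corner is (x, y)."""
    if x < 0 or y < 0 or y + height > len(image) or x + width > len(image[0]):
        raise ValueError("crop rectangle falls outside the image")
    return [row[x:x + width] for row in image[y:y + height]]


def splice(left: Image, right: Image) -> Image:
    """Join two crops of equal height side by side."""
    if len(left) != len(right):
        raise ValueError("crops must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def augment(img_a: Image, img_b: Image, out_w: int, out_h: int, ratio: float) -> Image:
    """Build one new image: the left `ratio` of the width comes from img_a,
    the remainder from img_b (both cropped from their top-left corners).
    This layout is an assumption; the abstract does not specify it."""
    left_w = round(out_w * ratio)
    return splice(crop(img_a, 0, 0, left_w, out_h),
                  crop(img_b, 0, 0, out_w - left_w, out_h))


def shift_box(box: Box, dx: int) -> Box:
    """Move a text box from the right crop into the spliced image's coordinates."""
    x0, y0, x1, y1 = box
    return (x0 + dx, y0, x1 + dx, y1)


# Toy 4x6 "images": all 1s and all 2s stand in for two nameplate photos.
plate_a = [[1] * 6 for _ in range(4)]
plate_b = [[2] * 6 for _ in range(4)]
mixed = augment(plate_a, plate_b, out_w=6, out_h=4, ratio=0.5)
print(mixed[0])
```

For a detection dataset, the text-box labels have to follow the pixels: boxes from the right-hand crop are shifted by the width of the left-hand crop, which is what `shift_box` illustrates.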

Key words: text detection, DBNet, CBAM, data augmentation, electrical equipment nameplates
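As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the F-measure is the usual harmonic mean of precision and recall (F1); the abstract does not define it, although the numbers agree with that reading. The function name `f_measure` is chosen here for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
precision, recall = 90.3, 79.7
f_improved = f_measure(precision, recall)

# The abstract reports a gain of 3.3 percentage points over the original
# model, which implies a baseline F-measure of about 84.7 - 3.3 = 81.4.
f_baseline_implied = round(f_improved, 1) - 3.3

print(f"F-measure: {f_improved:.1f}%")
print(f"Implied baseline: {f_baseline_implied:.1f}%")
```

The recomputed value rounds to the 84.7% stated in the abstract. The gain is an absolute difference in percentage points, not a relative 3.3% increase.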

CLC Number: