
Journal of Graphics (图学学报) ›› 2022, Vol. 43 ›› Issue (3): 387-395. DOI: 10.11996/JG.j.2095-302X.2022030387

• Image Processing and Computer Vision •

Offline handwritten mathematical symbol recognition based on improved YOLOv5s

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou 550025, China;
    2. Semiconductor Power Device Reliability Engineering Center of Ministry of Education, Guiyang, Guizhou 550025, China;
    3. Western Modernization Research Center, Guizhou University of Finance and Economics, Guiyang, Guizhou 550025, China;
    4. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  • Online: 2022-06-30 Published: 2022-06-28
  • Supported by:
    National Natural Science Foundation of China (61562009); National Key Research and Development Program of China
    (2016YFD0201305-07); Guizhou University Introduced Talent Research Project (2015-29); Open Fund Project in Semiconductor
    Power Device Reliability Engineering Center of Ministry of Education (ERCMEKFJJ2019-(06))

Abstract:

Offline mathematical symbol recognition is a prerequisite for offline mathematical expression recognition. Existing offline symbol recognition methods only recognize symbols: they contribute nothing to the other stages of offline expression recognition and can even constrain it. To address this, an offline symbol recognition method based on an improved YOLOv5s is proposed. First, because symbol images are small, a generative adversarial network (GAN) is used for data augmentation. Second, from the perspective of symbol categories, a spatial attention mechanism is introduced into the YOLOv5s model, using global max pooling and global average pooling to amplify the feature differences between categories. Finally, from the perspective of the symbols themselves, a bidirectional long short-term memory network (BiLSTM) processes the symbol feature matrix so that the symbol features carry contextual information. Experimental results show that the improved YOLOv5s performs well on offline symbol recognition, reaching a recognition rate of 92.47%, and comparison with other methods demonstrates its effectiveness and robustness. The method also effectively avoids error accumulation in offline mathematical expression recognition and provides a useful basis for the structural analysis of expressions.

Key words: offline handwritten mathematical symbol recognition, data augmentation, generative adversarial network, spatial attention mechanism, bidirectional long short-term memory network
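
The spatial attention step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common CBAM-style formulation, in which max pooling and average pooling are taken across the channel axis, the two resulting maps are convolved together, and a sigmoid yields a per-pixel weight. The abstract does not confirm that the authors use exactly this form, and the function name `spatial_attention`, the kernel weights, and the toy feature map below are hypothetical.

```python
import math


def spatial_attention(features, kernel, bias=0.0):
    """CBAM-style spatial attention (an assumed formulation, not the paper's code).

    features: nested list of shape [C][H][W]
    kernel:   nested list of shape [2][k][k], k odd; hypothetical weights for
              the max-pooled map (index 0) and the mean-pooled map (index 1)
    Returns (attended_features, attention_map).
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    k = len(kernel[0])
    pad = k // 2

    # Pool across the channel axis: one max map and one mean map, each H x W.
    max_map = [[max(features[c][y][x] for c in range(channels))
                for x in range(width)] for y in range(height)]
    mean_map = [[sum(features[c][y][x] for c in range(channels)) / channels
                 for x in range(width)] for y in range(height)]
    pooled = [max_map, mean_map]

    # Convolve the two stacked maps (zero padding), then apply a sigmoid.
    attention = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = bias
            for m in range(2):
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        if 0 <= yy < height and 0 <= xx < width:
                            total += kernel[m][dy][dx] * pooled[m][yy][xx]
            attention[y][x] = 1.0 / (1.0 + math.exp(-total))

    # Reweight every channel by the shared spatial attention map.
    attended = [[[features[c][y][x] * attention[y][x]
                  for x in range(width)] for y in range(height)]
                for c in range(channels)]
    return attended, attention


# Toy input: 2 channels on a 2 x 2 grid, with a 1 x 1 kernel (hypothetical values).
features = [[[1.0, 0.0], [0.0, 0.0]],
            [[3.0, 0.0], [0.0, 0.0]]]
kernel = [[[1.0]], [[1.0]]]
attended, attention = spatial_attention(features, kernel)
```

In this toy input only the top-left position has non-zero activations, so it receives a weight close to 1 while the empty positions receive the neutral sigmoid value of 0.5. In a trained network the kernel weights would be learned rather than fixed.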

CLC Number: