
Journal of Graphics (图学学报) ›› 2022, Vol. 43 ›› Issue (4): 729-735. DOI: 10.11996/JG.j.2095-302X.2022040729

• Building and City Information Modeling •

Research on recognition method of overlapped characters in construction drawings based on adaptive scale edge feature

  1. School of Naval Architecture, Ocean & Civil Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
  2. Shanghai Key Laboratory for Digital Maintenance of Buildings and Infrastructure, Shanghai 200240, China
  • Online: 2022-08-31 Published: 2022-08-15
  • Contact: DENG Xue-yuan (邓雪原, b. 1973), associate professor, Ph.D. His main research interests include collaborative architectural CAD design and integration, and BIM-based building collaboration platforms.
  • Supported by:
    National Key R&D Program of China during the 13th Five-Year Plan period (2016YFC0702001)

Abstract:

Recognition of non-overlapped characters is now largely mature, but overlapped characters, which are common in settings such as annotations on architectural engineering drawings, remain difficult to recognize, and this holds back automatic modeling from 2D scanned drawings. Because traditional character recognition methods cannot handle overlapped characters, this paper proposes a new method for recognizing overlapped characters in construction drawings based on adaptive scale edge features. Overlapped character regions are first located approximately from the spatial distribution of pixels, and the adaptive scale edge features of the characters are then defined and extracted. A bivariate matching probability function screens candidate “position + content” combinations, and a global optimum principle replaces an absolute threshold as the recognition criterion, yielding the final recognition result. Unlike the conventional approach of repairing characters before recognizing them, the method combines feature matching with interference filtering and couples character positioning with character recognition. It handles overlapped characters that mature commercial OCR services such as Baidu's fail to recognize, and experiments indicate a high recognition accuracy.

Key words: overlapped characters, character recognition, adaptive scale, distribution probability, projection segmentation
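
The abstract's two central ideas, locating candidate regions from the pixel distribution and then choosing the globally best “position + content” pair instead of applying a fixed threshold, can be illustrated with a toy sketch. It is not the paper's algorithm: plain pixel overlap stands in for the adaptive scale edge feature and the bivariate matching probability function, and the 3×3 templates, the image `IMG`, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]  # binary image, 1 = ink, 0 = background

def column_projection(img: Grid) -> List[int]:
    """Count ink pixels in each column (a vertical projection profile)."""
    return [sum(col) for col in zip(*img)]

def ink_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open [start, end) column ranges whose projection is non-zero."""
    runs, start = [], None
    for x, value in enumerate(profile):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def match_score(img: Grid, tmpl: Grid, x0: int) -> float:
    """Fraction of template ink that is also ink in img at column offset x0.
    Extra ink in img (for example an overlapping line) is not penalized."""
    hits = total = 0
    for y, row in enumerate(tmpl):
        for dx, t in enumerate(row):
            if t:
                total += 1
                hits += img[y][x0 + dx]
    return hits / total if total else 0.0

def best_match(img: Grid, templates: Dict[str, Grid],
               run: Tuple[int, int]) -> Tuple[float, int, str]:
    """Pick the single best (score, position, label) over every template and
    every offset inside the run, rather than accepting anything above a threshold."""
    start, end = run
    best = (-1.0, -1, "")
    for label, tmpl in templates.items():
        width = len(tmpl[0])
        for x0 in range(start, end - width + 1):
            score = match_score(img, tmpl, x0)
            if score > best[0]:
                best = (score, x0, label)
    return best

# Hypothetical 3x3 templates for two characters.
TEMPLATES: Dict[str, Grid] = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# An "L" drawn at column 2, crossed by a horizontal line on the middle row.
IMG: Grid = [
    [0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 0, 0],
]

RUNS = ink_runs(column_projection(IMG))
RESULT = best_match(IMG, TEMPLATES, RUNS[0])
```

Because the crossing line fills every column, the projection profile yields a single run covering the whole image, so projection alone cannot isolate the character. Scoring every template at every offset and keeping the overall maximum still recovers the "L" at column 2 in this toy case.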

CLC Number: