Journal of Graphics ›› 2021, Vol. 42 ›› Issue (2): 316-324. DOI: 10.11996/JG.j.2095-302X.2021020316

• BIM/CIM •

A model adaptive method for Chinese word segmentation using IFC-based building information model 

  

  1. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; 2. Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 102616, China
  • Online:2021-04-30 Published:2021-04-30
  • Supported by:
    National Natural Science Foundation of China (71601013); Beijing Municipal Natural Science Foundation (4202017); Beijing Youth Talent Training Project (CIT&TCD201904050); Young Elite of Beijing University of Civil Engineering and Architecture; The Fundamental Research Funds for Beijing University of Civil Engineering and Architecture (X20039) 

Abstract: The building information model (BIM) has become an effective vehicle for applying information technology in the construction industry. As the volume of BIM data keeps growing, many studies have introduced natural language processing (NLP) into BIM applications to make effective use of that data. In Chinese-language settings, however, general-purpose Chinese word segmentation lacks the terminology features of the building domain and therefore adapts poorly to BIM applications. By analyzing Industry Foundation Classes (IFC) files, currently the prevailing BIM data format, this study extracted BIM model features from IFC files and added them, together with architectural terminology features, to a statistical word segmentation model, thereby improving the adaptability of Chinese word segmentation in the building domain. Experimental results show that, compared with the original conditional random field (CRF)-based word segmentation model, the F-measure on the domain test set increased by 1.26%, and still increased by 0.10% when BIM model features were added alone, indicating that adding BIM model features to the segmentation model improves Chinese word segmentation in the building domain. On the model test set, compared with adding architectural terminology features alone, additionally adding BIM model features raised precision from 46.97% to 87.74%, recall from 67.60% to 94.77%, and the F-measure from 55.43% to 91.12% (a gain of 35.69 percentage points), substantially improving the adaptability of Chinese word segmentation to BIM models in the building domain.
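
The F-measure figures quoted in the abstract can be related to the precision and recall figures with a short calculation. The sketch below assumes the F-measure is the balanced F1 score, i.e. the harmonic mean of precision and recall; the abstract does not state the weighting, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall; the abstract
    does not say which F-measure variant was used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Model test set figures quoted in the abstract, in percent.
terminology_only = f_measure(46.97, 67.60)    # terminology features alone
with_bim_features = f_measure(87.74, 94.77)   # BIM model features also added

print(f"terminology only:  {terminology_only:.2f}%")
print(f"with BIM features: {with_bim_features:.2f}%")
print(f"gain:              {with_bim_features - terminology_only:.2f} points")
```

Under this assumption the computed values agree, to two decimal places, with the 55.43% and 91.12% reported in the abstract, and their difference agrees with the stated 35.69.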

Key words: building information model, industry foundation classes, Chinese word segmentation, model adaptation, building information extraction 

CLC Number: