Journal of Graphics, 2021, Vol. 42, Issue (2): 307-315. DOI: 10.11996/JG.j.2095-302X.2021020307

• BIM/CIM

Research on named entity recognition of construction safety accident text based on pre-trained language model 

  

  1. School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong 510640, China;  2. State Key Laboratory of Subtropical Building Science, Guangzhou, Guangdong 510640, China;  3. Sino-Singapore International Joint Research Institute, Guangzhou, Guangdong 510555, China
  • Online: 2021-04-30  Published: 2021-04-30
  • Supported by:
    Natural Science Foundation of Guangdong Province (2018A030310363, 2017A030313393); Key Project of Guangzhou Science and Technology Plan (20181003SF0059)

Abstract: Construction safety accident analysis plays an important role in construction safety management, but the safety knowledge scattered across accident reports cannot be readily reused and therefore sheds too little light on safety management. A knowledge graph serves as a tool for structured storage and reuse of knowledge, supporting retrieval of accident cases, analysis of accident-related paths, and statistical analysis. Named Entity Recognition (NER) is a key task in automatic knowledge graph construction, and current NER research concentrates mainly on the medical, financial, and military fields; relevant research in the field of construction safety has been lacking. In this paper, five concepts in this field were defined, and the entity labeling specifications were clarified. An improved Bidirectional Encoder Representations from Transformers (BERT) pre-trained language model was employed to obtain dynamic word vectors, and a Bidirectional Long Short-Term Memory-Conditional Random Field (BiLSTM-CRF) model was used to obtain the optimal entity tag sequence; together these form the proposed NER model for the field of construction safety. To train and verify the proposed model, 1,000 construction safety accident reports were collected, sorted, and annotated as an experimental corpus. Experiments show that, compared with traditional models, the proposed model achieves better recognition performance on construction safety accident texts.

Key words: knowledge graph, named entity recognition, construction safety, pre-trained language model, accident report
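
The abstract states that the BiLSTM-CRF model yields the optimal entity tag sequence. In a CRF layer this step is commonly done with Viterbi decoding over per-token emission scores and tag-to-tag transition scores. The following is a minimal sketch of that decoding step only, not the authors' implementation: the tag set (`O`, `B-ENT`, `I-ENT`), the function name `viterbi_decode`, and all scores are hypothetical, since the abstract does not list the five entity concepts or any model parameters.

```python
from typing import Dict, List

# Hypothetical BIO tag set for illustration only; the paper's five
# entity concepts are not named in the abstract.
TAGS = ["O", "B-ENT", "I-ENT"]


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]: score of `tag` at token t (e.g. from a BiLSTM).
    transitions[prev][cur]: score of moving from tag `prev` to tag `cur`.
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at token t
    best = [dict(emissions[0])]
    back = []  # back[t-1][tag] = best previous tag for `tag` at token t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    tag = max(TAGS, key=lambda k: best[-1][k])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Made-up scores: all transitions are neutral except O -> I-ENT,
# which is heavily penalized because an entity cannot start with "I-".
transitions = {p: {c: 0.0 for c in TAGS} for p in TAGS}
transitions["O"]["I-ENT"] = -100.0

emissions = [
    {"O": 2.0, "B-ENT": 0.5, "I-ENT": 0.1},
    {"O": 0.2, "B-ENT": 1.0, "I-ENT": 1.5},
    {"O": 0.3, "B-ENT": 0.2, "I-ENT": 2.0},
]

decoded = viterbi_decode(emissions, transitions)
print(decoded)  # ['O', 'B-ENT', 'I-ENT']
```

Taking the best tag per token independently would give `O, I-ENT, I-ENT`, an invalid sequence. The transition penalty steers the decoder to `O, B-ENT, I-ENT` instead, which illustrates why a CRF layer is placed on top of the BiLSTM outputs.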
