Journal of Graphics ›› 2021, Vol. 42 ›› Issue (3): 398-405. DOI: 10.11996/JG.j.2095-302X.2021030398

• Image Processing and Computer Vision •

A method of automatic image annotation for image-text mixed domain books

  

  1. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. School of Digital Media and Design Arts, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2021-06-30 Published: 2021-06-29
  • Supported by:
    Basic Scientific Research Funds of Beijing University of Posts and Telecommunications (2020RC26)

Abstract: Efficient interpretation and intelligent processing of massive image and text data is both challenging and of practical value, yet the accuracy of automatic annotation depends heavily on the quality and quantity of training samples. This paper proposes an image annotation method for data that mixes images and text. The method consists of three parts: adaptive image-text separation preprocessing, construction of domain image semantic labels, and a text-based image annotation algorithm. First, the proposed hybrid layout recognition algorithm extracts the image, title, and description text from pages with a mixed image-text layout. Then, the Traditional Cultural Domain Lexicon (PatternNet) is built from the semantic tags of digital clothing images. Finally, based on the characteristics of the domain lexicon's tag space, a text-based image annotation algorithm is proposed to address the large tag space. Experiments are carried out on ethnic costume books with a mixed image-text layout, and the results are compared with those on other data sets. The experimental results verify the effectiveness of the proposed algorithm.
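
The third step, text-based annotation against a domain lexicon, can be illustrated with a minimal sketch. This is not the algorithm from the paper: it assumes simple substring matching, a hypothetical three-entry lexicon standing in for PatternNet, and an arbitrary rule that a match in the title counts twice as much as a match in the description. The names `LEXICON`, `annotate`, `title_weight`, and `top_k` are all illustrative.

```python
from collections import Counter

# Hypothetical stand-in for a domain lexicon such as PatternNet:
# each tag maps to surface forms that may appear in the extracted text.
LEXICON = {
    "dragon pattern": ["dragon"],
    "embroidery": ["embroidery", "embroidered"],
    "silver ornament": ["silver"],
}


def annotate(title, description, lexicon, title_weight=2, top_k=3):
    """Return up to top_k lexicon tags ranked by weighted match count.

    Assumption: title matches are weighted more heavily than
    description matches; the weight value is arbitrary.
    """
    title_l = title.lower()
    desc_l = description.lower()
    scores = Counter()
    for tag, forms in lexicon.items():
        for form in forms:
            scores[tag] += title_weight * title_l.count(form) + desc_l.count(form)
    # Keep only tags that matched; break score ties alphabetically.
    ranked = sorted(
        ((tag, score) for tag, score in scores.items() if score > 0),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return [tag for tag, _ in ranked[:top_k]]


tags = annotate(
    "Embroidered dragon robe",
    "A robe with dragon motifs and fine embroidery.",
    LEXICON,
)
print(tags)
```

Restricting candidate tags to lexicon entries that actually occur in the text next to the image is one simple way to keep a large tag space manageable; the paper's own approach may differ.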

Key words: annotation image with text, PatternNet, digital image-text processing, domain keyword extraction
