
Journal of Graphics, 2023, Vol. 44, Issue 2: 271-279. DOI: 10.11996/JG.j.2095-302X.2023020271


Flower recognition based on a lightweight visual Transformer

XIONG Ju-ju1, XU Yang1,2, FAN Run-ze1, SUN Shao-cong1

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang Guizhou 550025, China
  2. Guiyang Aluminum-Magnesium Design and Research Institute Co., Ltd., Guiyang Guizhou 550025, China
  • Received: 2022-09-02   Accepted: 2022-11-24   Online: 2023-04-30   Published: 2023-05-01
  • Contact: XU Yang (1980-), associate professor, Ph.D. His main research interests include data collection and machine learning. E-mail: xuy@gzu.edu.cn
  • About author: XIONG Ju-ju (2000-), master student. His main research interest is image processing. E-mail: juxiong0416@163.com
  • Supported by:
    Science and Technology Plan Project of Guizhou Province (Qian Kehe [2021] General 176)

Abstract:

Because different kinds of flowers can look similar while flowers of the same kind can vary widely (high inter-class similarity and intra-class variation), convolutional neural networks (CNN), which mainly extract local feature information, yield unsatisfactory results in flower image recognition. Based on the Swin Transformer (Swin-T) network, this paper proposed a lightweight Transformer network, LWFormer. First, the network introduced the shifted-window-based PoolFormer module into the first and second stages of the Swin-T network to make the network lightweight. Second, a dual-channel attention mechanism was introduced, in which two independent channels focused on the "location" and the "content" of the feature map, respectively, to improve the network's ability to extract global feature information. Finally, a contrastive loss function was employed to further optimize the performance of the network. The improved model was evaluated on two public datasets, Oxford 102 Flower Dataset and 104 Flowers Garden of Eden, and compared with other methods. On these two datasets, the accuracy reached 88.1% and 87.3%, respectively. Compared with the Swin-T network, the number of parameters was reduced by 33.45%, FLOPs were reduced by 28.89%, throughput was increased by 91.45%, and accuracy was increased by 1.8%. The experimental results showed that the proposed network could improve accuracy while reducing the number of parameters, achieving gains in both speed and accuracy.
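To make the lightweighting idea concrete, the sketch below shows a minimal PyTorch PoolFormer-style block of the kind the abstract describes substituting into the early Swin-T stages: self-attention is replaced by a cheap average-pooling token mixer followed by a channel MLP. The layer sizes, MLP ratio, and class names are illustrative assumptions, not the authors' LWFormer implementation, and the dual-channel attention and contrastive loss are not shown.

```python
# Minimal sketch (PyTorch) of a PoolFormer-style block; dimensions and names
# are assumptions for illustration, not the paper's exact LWFormer code.
import torch
import torch.nn as nn


class PoolTokenMixer(nn.Module):
    """Replaces self-attention with average pooling (the PoolFormer idea)."""
    def __init__(self, pool_size: int = 3):
        super().__init__()
        self.pool = nn.AvgPool2d(pool_size, stride=1,
                                 padding=pool_size // 2,
                                 count_include_pad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Subtracting the input keeps only the "mixed" residual signal.
        return self.pool(x) - x


class PoolFormerBlock(nn.Module):
    """Norm -> pooling mixer -> norm -> channel MLP, each with a residual."""
    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)   # channel-wise LayerNorm analogue
        self.mixer = PoolTokenMixer()
        self.norm2 = nn.GroupNorm(1, dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, dim * mlp_ratio, 1),
            nn.GELU(),
            nn.Conv2d(dim * mlp_ratio, dim, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.mixer(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x


if __name__ == "__main__":
    # Example early-stage feature map: batch 2, 96 channels, 56x56 tokens.
    feats = torch.randn(2, 96, 56, 56)
    block = PoolFormerBlock(dim=96)
    print(block(feats).shape)   # torch.Size([2, 96, 56, 56])
```

Because the pooling mixer has no attention weights to learn, swapping it into the first two stages reduces parameters and FLOPs, which is consistent with the reductions reported in the abstract.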

Key words: flower recognition, lightweight, attention mechanism, dual-channel attention, contrastive loss function
