
Journal of Graphics, 2022, Vol. 43, Issue 6: 1159-1169. DOI: 10.11996/JG.j.2095-302X.2022061159

• Image Processing and Computer Vision •

Multimodal emotion recognition with action features

  

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Online: 2022-12-30 Published: 2023-01-11
  • Supported by:
    Tsinghua University Initiative Scientific Research Program (20211080093); China Postdoctoral Science Foundation (2021M701891); National Natural Science Foundation of China (62202257, 61725204) 

Abstract: In recent years, emotion recognition based on multimodal data has become an important research direction in natural human-computer interaction and artificial intelligence. Emotion recognition research that uses visual modality information usually focuses on facial features and rarely considers action features or multimodal features fused with action features. Although action is closely related to emotion, it is difficult to extract valid action information from the visual modality. In this paper, we start from the relationship between action and emotion and introduce action data extracted from the visual modality into the classic multimodal emotion recognition dataset MELD. Body action features are extracted with the ST-GCN model and applied to an LSTM-based single-modal emotion recognition task. In addition, body action features are introduced into bi-modal emotion recognition on the MELD dataset, improving the performance of the LSTM-based fusion model. Combining body action features with text features improves the recognition accuracy of the context model with pre-trained memory compared with using text features alone. The experimental results show that, although body action features alone do not achieve higher emotion recognition accuracy than traditional text and audio features, they play an important role in multimodal emotion recognition. The single-modal and multimodal experiments validate that people use actions to convey their emotions and that body action features have great potential for emotion recognition.

Key words: action features, emotion recognition, multimodality, action and emotion, visual modality
