
Journal of Graphics ›› 2026, Vol. 47 ›› Issue (1): 223-233. DOI: 10.11996/JG.j.2095-302X.2026010223

• Industrial Design •

How do robots attract children? The role of appearance, motion, and voice as multisensory features in early-stage interactions

LI Yi1,2, CAO Chengcai3, SONG Zhangtong4, LI Zuoqi5, LI Xiao1,2, LI Hesen1,2

    1 School of Industrial Design, Hubei Institute of Fine Arts, Wuhan, Hubei 430205, China
    2 Research Center for Modern Public Visual Arts and Design, Hubei Institute of Fine Arts, Wuhan, Hubei 430205, China
    3 School of Art and Design, Wuhan Institute of Technology, Wuhan, Hubei 430205, China
    4 School of Art, Wuhan Business University, Wuhan, Hubei 430056, China
    5 School of Computer Science, China University of Geosciences, Wuhan, Hubei 430078, China
  • Received: 2025-03-24 Accepted: 2025-09-02 Online: 2026-02-28 Published: 2026-03-16
  • Contact: LI Yi
  • Supported by:
    Major Projects of Key Research Institute of Humanities and Social Sciences at Universities in Hubei Province-Research Center for Modern Public Visual Arts and Design of Hubei Institute of Fine Arts (JD-2025-01)

Abstract:

With the rapid development of artificial intelligence technology, multimodal robots are playing an increasingly important role in preschool children’s education, entertainment, and daily life. Existing studies have focused primarily on how a robot’s individual sensory cues affect children’s perception, while systematic research on multisensory integration effects remains limited. To explore how robots’ multimodal features jointly influence children’s emotional preferences and visual attention, 318 children aged 4-6 years were recruited to participate in an eye-tracking experiment. The experiment adopted a 2 (appearance features: humanoid vs. animal-like) × 3 (voice guidance: male voice, female voice, none) × 2 (gesture guidance: present vs. absent) mixed factorial design. Robot appearance features and behavioral features (voice and gesture guidance) served as independent variables, and children’s emotional preferences and eye-tracking indicators served as dependent variables, allowing the effects of multimodal features on child users to be examined systematically. For appearance features, subjective preference ratings did not differ significantly between humanoid and animal-like robots. However, humanoid robots showed longer total fixation duration, higher fixation counts, and shorter first-fixation latency than animal-like robots, indicating superior attention-related performance. Children were more readily attracted to humanoid robots during the initial stage of visual contact, and anthropomorphic design showed greater advantages in sustaining children’s attention. For behavioral features, robots with gesture guidance received significantly higher subjective preference ratings than those without gestures, and also elicited longer total fixation duration and higher fixation counts.
Robots with female voices received slightly higher subjective preference ratings than those with male voices, and both were significantly preferred over robots without voices. Robots with male voices had slightly longer total fixation duration than those with female voices, and both significantly outperformed robots without voices. The difference in fixation counts between male- and female-voice robots was not significant, but both attracted significantly more fixations than robots without voices. Robots with gesture guidance and voice (especially female voice) performed better in subjective ratings and visual attention allocation, suggesting that behavioral features substantially enhanced children’s emotional preferences and interactive experiences. Furthermore, the effects of appearance and behavioral features on children’s emotional preferences and visual attention were relatively independent, and no significant interaction effects were observed. This study revealed the mechanisms through which robot appearance and behavioral features influenced preschool children’s emotional preferences and visual attention, thereby providing scientific evidence for designing child-oriented robots that align with users’ emotional needs.

Key words: preschool children, robot, appearance features, behavioral features, emotional preference, visual attention, multisensory integration
