Visual gesture recognition technology based on long short term memory and deep neural network

doi:10.11996/JG.j.2095-302X.2020030372

Journal of Graphics

• Image Processing and Computer Vision •

(1. Beijing Engineering Research Center for IoT Software and Systems, Beijing 100124, China;
2. Faculty of Information, Beijing University of Technology, Beijing 100124, China)

Online: 2020-06-30 Published: 2020-08-18
Funding:
National Natural Science Foundation of China (61602016); Beijing Municipal Science and Technology Project (D171100004017003)

Abstract:

To address the susceptibility of vision-based dynamic gesture recognition to changes in
illumination, background, and gesture shape, this paper analyzes the spatial context features of
human gestures. First, a dynamic gesture model is established based on human skeleton and
body-part contour features, and a deep neural network built from the convolutional pose machine
(CPM) and the single shot multibox detector (SSD) is used to extract these skeleton and part
contour features. Next, a long short-term memory (LSTM) network is introduced to extract the
temporal features of the skeleton, the left and right hands, and the head contour in dynamic
gestures, which are then used to classify the gestures. On this basis, a dynamic gesture recognizer
based on spatial context and temporal feature fusion (GRSCTFF) is designed, then trained and
evaluated on a video sample database of traffic police command gestures. The experimental results
show that GRSCTFF recognizes dynamic traffic police command gestures quickly and accurately, with
an accuracy of 94.12%, and is robust to changes in lighting, background, and gesture shape.

Keywords: gesture recognition, spatial context, long short-term memory, feature extraction
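
The pipeline described in the abstract (per-frame spatial features, then an LSTM over time, then classification) can be illustrated with a minimal sketch. This is not the authors' GRSCTFF implementation: the CPM and SSD stages are replaced by random placeholder vectors, the weights are untrained, biases are omitted, and every name and dimension below (`fuse_frame`, `TinyLSTMClassifier`, 28 skeleton values, 4 values per body part, 8 classes, 16 frames) is a hypothetical choice made only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    total = sum(e)
    return [x / total for x in e]


def fuse_frame(skeleton, left_hand, right_hand, head):
    """Concatenate per-frame skeleton and part descriptors into one vector.
    In a real system these would come from CPM and SSD; here they are placeholders."""
    return list(skeleton) + list(left_hand) + list(right_hand) + list(head)


class TinyLSTMClassifier:
    """Single-layer LSTM (no biases, untrained random weights) with a softmax head."""

    def __init__(self, input_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x_t ; h_{t-1}].
        self.W = {gate: mat(hidden_dim, input_dim + hidden_dim) for gate in "ifog"}
        self.W_out = mat(num_classes, hidden_dim)

    def step(self, x, h, c):
        z = x + h  # list concatenation, not addition
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W["g"], z)]  # candidate cell state
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frames):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frames:  # consume the clip frame by frame
            h, c = self.step(x, h, c)
        return softmax(matvec(self.W_out, h))  # classify from the final hidden state


# Placeholder clip: 16 frames of random "features" (assumed sizes, not from the paper).
rng = random.Random(1)


def fake(n):
    return [rng.random() for _ in range(n)]


frames = [fuse_frame(fake(28), fake(4), fake(4), fake(4)) for _ in range(16)]
model = TinyLSTMClassifier(input_dim=40, hidden_dim=8, num_classes=8)
probs = model.predict_proba(frames)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, round(sum(probs), 6))
```

Because the weights are random, the predicted class is meaningless. The sketch only shows how fused per-frame features flow through the recurrent step to a class distribution.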

HE Jian1,2, LIAO Jun-jie2, ZHANG Cheng2, WEI Xin2, BAI Jia-hao2, WANG Wei-dong1,2. Visual gesture recognition technology based on long short term memory and deep neural network[J]. Journal of Graphics, DOI: 10.11996/JG.j.2095-302X.2020030372.

