Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 26-34.DOI: 10.11996/JG.j.2095-302X.2024010026
• Image Processing and Computer Vision •
GUO Zongyang1, LIU Lidong1, JIANG Donghua2, LIU Zixiang1, ZHU Shukang1, CHEN Jinghua1
Received: 2023-09-06
Accepted: 2023-11-12
Online: 2024-02-29
Published: 2024-02-29
Contact: LIU Lidong (1982-), professor, Ph.D. His main research interests cover graphic image processing, computer vision, etc.
About author: GUO Zongyang (2000-), master student. His main research interests cover digital image processing, human action recognition, etc.
E-mail: gzy000119@chd.edu.cn
GUO Zongyang, LIU Lidong, JIANG Donghua, LIU Zixiang, ZHU Shukang, CHEN Jinghua. Human action recognition algorithm based on semantics guided neural networks[J]. Journal of Graphics, 2024, 45(1): 26-34.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2024010026
Fig. 5 Schematic diagram of the deformable convolution kernel ((a) The convolution kernel of an ordinary convolution; (b)~(d) Convolution kernels of deformable convolution)
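To make the idea behind Fig. 5 concrete: unlike an ordinary convolution, whose kernel samples a fixed regular grid, a deformable convolution adds a learned 2D offset to each kernel tap and reads the input via bilinear interpolation. The following is a minimal NumPy sketch of a single 3×3 deformable-convolution output, not the DCM module from the paper; the function names and the hand-set offsets are illustrative assumptions.

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinearly interpolate a 2D array img at a fractional location (y, x)."""
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    y0, x0 = max(y0, 0), max(x0, 0)
    wy, wx = y - np.floor(y), x - np.floor(x)
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

def deformable_conv_at(img, weights, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centered at (cy, cx).

    Each of the 9 taps samples at (regular grid position + learned offset)
    via bilinear interpolation. With all offsets zero this reduces exactly
    to an ordinary convolution, matching Fig. 5(a) vs. Fig. 5(b)~(d).
    offsets: shape (9, 2), a (dy, dx) pair per tap (learned in practice).
    """
    base = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for k, (dy, dx) in enumerate(base):
        sy = cy + dy + offsets[k, 0]
        sx = cx + dx + offsets[k, 1]
        out += weights[k] * bilinear(img, sy, sx)
    return out
```

With zero offsets the result equals an ordinary 3×3 convolution; non-zero offsets let the receptive field deform toward the moving joints that matter for action recognition.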
| Network | Params/M | CS/% | CV/% |
|---|---|---|---|
| SGN | 0.6590 | 89.0 | 94.5 |
| SGN+T | 0.7225 | 91.2 | 95.6 |
| SGN+ECA | 0.6591 | 92.5 | 96.3 |
| SGN+DCM | 1.8240 | 90.1 | 95.5 |
| SGN+ALL | 1.8776 | 93.0 | 96.5 |
Table 1 Parameter count and recognition accuracy on the NTURGB+D dataset before and after introducing the modules
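Table 1 shows that the ECA module improves accuracy while adding almost no parameters (0.6590 M → 0.6591 M). That is the point of ECA: it avoids dimensionality reduction and uses only a k-weight 1D convolution across channels. A minimal NumPy sketch of the mechanism follows; it is not the paper's implementation, and the uniform kernel stands in for the k learned weights.

```python
import numpy as np

def eca(x, k=3):
    """Efficient Channel Attention (ECA) over a feature map x of shape (C, H, W).

    1) Global average pooling yields one descriptor per channel.
    2) A 1D convolution of size k over the channel axis models local
       cross-channel interaction; it contributes only k parameters,
       which is why Table 1 shows almost no parameter growth.
    3) A sigmoid converts the result into per-channel gates rescaling x.
    """
    c = x.shape[0]
    desc = x.mean(axis=(1, 2))                 # (C,) channel descriptors
    kernel = np.full(k, 1.0 / k)               # stand-in for the k learned weights
    padded = np.pad(desc, k // 2, mode="edge") # same-length 1D convolution
    conv = np.array([padded[i:i + k] @ kernel for i in range(c)])
    gates = 1.0 / (1.0 + np.exp(-conv))        # sigmoid gating in (0, 1)
    return x * gates[:, None, None]
```

The output keeps the input shape; each channel is merely rescaled by its attention weight.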
| Network | Params/M | C-Sub/% | C-Set/% |
|---|---|---|---|
| SGN | 0.6590 | 79.2 | 81.5 |
| SGN+T | 0.7225 | 84.2 | 85.6 |
| SGN+ECA | 0.6591 | 87.1 | 88.3 |
| SGN+DCM | 1.8240 | 82.1 | 85.5 |
| SGN+ALL | 1.8776 | 88.5 | 89.8 |
Table 2 Parameter count and recognition accuracy on the NTURGB+D 120 dataset before and after introducing the modules
| Algorithm | CS | CV |
|---|---|---|
| VA-LSTM | 79.4 | 87.6 |
| ST-GCN | 81.5 | 88.3 |
| SR-TSL | 84.8 | 89.8 |
| HCN | 86.5 | 91.1 |
| AS-GCN | 86.8 | 94.2 |
| 2s-AGCN | 88.5 | 95.1 |
| VA-CNN | 88.7 | 94.3 |
| SGN | 89.0 | 94.5 |
| AGC-LSTM | 89.2 | 95.0 |
| DGNN | 89.9 | 96.1 |
| Shift-GCN | 90.7 | 96.5 |
| PA-ResGCN-B19 | 90.9 | 96.0 |
| Dynamic GCN | 91.5 | 96.0 |
| MS-G3D | 91.5 | 96.2 |
| EfficientGCN-B4 | 91.7 | 95.7 |
| CTR-GCN | 92.4 | 96.8 |
| PSUMNet | 92.9 | 96.7 |
| Ours | 93.0 | 96.5 |
Table 3 Performance of related algorithms on the NTURGB+D dataset/%
| Algorithm | C-Sub | C-Set |
|---|---|---|
| GCA-LSTM | 58.3 | 59.2 |
| Clips+CNN+MTLN | 58.4 | 57.9 |
| Two-Stream GCA-LSTM | 61.2 | 63.3 |
| RotClips+MTCNN | 62.2 | 61.8 |
| TSRJI | 67.9 | 59.7 |
| SGN | 79.2 | 81.5 |
| MV-IGNET | 83.9 | 85.6 |
| 4s Shift-GCN | 85.9 | 87.6 |
| MS-G3D | 86.9 | 88.4 |
| PA-ResGCN-B19 | 87.3 | 88.3 |
| EfficientGCN-B4 | 88.3 | 89.1 |
| CTR-GCN | 88.9 | 90.6 |
| PoseC3D | 86.9 | 90.3 |
| Ours | 88.5 | 89.8 |
Table 4 Performance of related algorithms on the NTURGB+D 120 dataset/%
[1] JIANG S N, CHEN E Q, ZHENG M Y, et al. Human action recognition based on ResNeXt[J]. Journal of Graphics, 2020, 41(2): 277-282 (in Chinese).
[2] AN F, DAI J, HAN Z, et al. Self-supervised optical flow estimation with attention module[J]. Journal of Graphics, 2022, 43(5): 841-848 (in Chinese).
[3] YANG S Q, YANG J T, LI Z, et al. Human action recognition based on LSTM neural network[J]. Journal of Graphics, 2021, 42(2): 174-181 (in Chinese).
[4] YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[EB/OL]. [2023-08-22]. https://arxiv.org/abs/1801.07455.
[5] LI M S, CHEN S H, CHEN X, et al. Actional-structural graph convolutional networks for skeleton-based action recognition[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3590-3598.
[6] SHI L, ZHANG Y F, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 12018-12027.
[7] THAKKAR K, NARAYANAN P J. Part-based graph convolutional network for action recognition[EB/OL]. [2023-08-22]. https://arxiv.org/abs/1809.04983.
[8] SI C Y, CHEN W T, WANG W, et al. An attention enhanced graph convolutional LSTM network for skeleton-based action recognition[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 1227-1236.
[9] WEN Y H, GAO L, FU H B, et al. Graph CNNs with motif and variable temporal block for skeleton-based action recognition[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 8989-8996.
[10] YE F F, TANG H M. Skeleton-based action recognition with JRR-GCN[J]. Electronics Letters, 2019, 55(17): 933-935.
[11] CHENG K, ZHANG Y F, HE X Y, et al. Skeleton-based action recognition with shift graph convolutional network[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 180-189.
[12] ZHANG P F, LAN C L, ZENG W J, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 1109-1118.
[13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010.
[14] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 11531-11539.
[15] WANG M S, NI B B, YANG X K. Learning multi-view interactional skeleton graph for action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(6): 6940-6954.
[16] ZHANG P F, LAN C L, XING J L, et al. View adaptive recurrent neural networks for high performance human action recognition from skeleton data[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2136-2145.
[17] LI C, ZHONG Q Y, XIE D, et al. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation[C]// The 27th International Joint Conference on Artificial Intelligence. New York: ACM, 2018: 786-792.
[18] SI C Y, JING Y, WANG W, et al. Skeleton-based action recognition with spatial reasoning and temporal stack learning[C]// European Conference on Computer Vision. Cham: Springer, 2018: 106-121.
[19] ZHANG P F, LAN C L, XING J L, et al. View adaptive neural networks for high performance skeleton-based human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1963-1978.
[20] SHI L, ZHANG Y F, CHENG J, et al. Skeleton-based action recognition with directed graph neural networks[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 7904-7913.
[21] SONG Y F, ZHANG Z, SHAN C F, et al. Stronger, faster and more explainable: a graph convolutional baseline for skeleton-based action recognition[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 1625-1633.
[22] YE F F, PU S L, ZHONG Q Y, et al. Dynamic GCN: context-enriched topology learning for skeleton-based action recognition[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 55-63.
[23] LIU Z Y, ZHANG H W, CHEN Z H, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 140-149.
[24] SONG Y F, ZHANG Z, SHAN C F, et al. Constructing stronger and faster baselines for skeleton-based action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 1474-1488.
[25] CHEN Y X, ZHANG Z Q, YUAN C F, et al. Channel-wise topology refinement graph convolution for skeleton-based action recognition[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 13339-13348.
[26] TRIVEDI N, SARVADEVABHATLA R K. PSUMNet: unified modality part streams are all you need for efficient pose-based action recognition[EB/OL]. [2023-08-22]. https://arxiv.org/abs/2208.05775.
[27] LIU J, WANG G, HU P, et al. Global context-aware attention LSTM networks for 3D action recognition[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 3671-3680.
[28] KE Q H, BENNAMOUN M, AN S J, et al. A new representation of skeleton sequences for 3D action recognition[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 4570-4579.
[29] LIU J, WANG G, DUAN L Y, et al. Skeleton-based human action recognition with global context-aware attention LSTM networks[J]. IEEE Transactions on Image Processing, 2018, 27(4): 1586-1599.
[30] KE Q H, BENNAMOUN M, AN S J, et al. Learning clip representations for skeleton-based 3D action recognition[J]. IEEE Transactions on Image Processing, 2018, 27(6): 2842-2855.
[31] CAETANO C, BRÉMOND F, SCHWARTZ W R. Skeleton image representation for 3D action recognition based on tree structure and reference joints[C]// 2019 32nd SIBGRAPI Conference on Graphics, Patterns and Images. New York: IEEE Press, 2019: 16-23.
[32] DUAN H D, ZHAO Y, CHEN K, et al. Revisiting skeleton-based action recognition[C]// The IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 2969-2978.