Journal of Graphics, 2023, Vol. 44, Issue (5): 868-878. DOI: 10.11996/JG.j.2095-302X.2023050868
Image Processing and Computer Vision
PI Jun, NIU Hou-xing, GAO Zhi-yun
Received: 2023-05-31
Accepted: 2023-08-03
Online: 2023-10-31
Published: 2023-10-31
Contact: GAO Zhi-yun (1993-), lecturer, Ph.D. Her main research interests cover image processing and pattern recognition.
About author: PI Jun (1973-), associate professor, Ph.D. His main research interests cover object detection, image processing and pattern recognition. E-mail: jpi@cauc.edu.cn
PI Jun, NIU Hou-xing, GAO Zhi-yun. Lightweight human pose estimation algorithm by integrating CA and BiFPN[J]. Journal of Graphics, 2023, 44(5): 868-878.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023050868
Method | Backbone | Input size | Params (M) | GMACS | AP (%) | AP50 (%) | AP75 (%) | APL (%) | AR (%) |
---|---|---|---|---|---|---|---|---|---|
Lightweight OpenPose | - | 368×368 | 4.1 | 18.0 | 42.8 | - | - | - | - |
EfficientHRNet-H2 | EfficientNetB2 | 448×448 | 10.3 | 15.4 | 52.9 | 80.5 | - | - | - |
EfficientHRNet-H3 | EfficientNetB3 | 416×416 | 6.9 | 8.4 | 44.8 | 76.7 | - | - | - |
EfficientHRNet-H4 | EfficientNetB4 | 384×384 | 3.7 | 4.2 | 35.7 | 69.6 | - | - | - |
baseline | Darknet_csp-d53-s | 640×640 | 12.6 | 8.6 | 54.0 | 81.1 | 58.7 | 65.5 | 59.7 |
Ours-EIoU | Darknet_csp-d53-s | 640×640 | 9.3 | 6.1 | 55.0 | 82.2 | 58.4 | 70.0 | 61.9 |
Ours-WIoU | Darknet_csp-d53-s | 640×640 | 9.3 | 6.1 | 55.8 | 82.8 | 59.9 | 69.4 | 62.4 |
Table 1 Comparison of various methods on the COCO2017 dataset
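The trade-off reported in Table 1 can be made explicit with a little arithmetic. The following is a minimal sketch, not part of the paper: it copies the baseline and Ours-WIoU rows from Table 1 and computes the relative reduction in parameters and GMACS alongside the absolute gain in AP and AR. The dictionary keys and the helper names `relative_reduction` and `summarize` are illustrative assumptions. Note that Table 1 lists 8.6 GMACS for the baseline while Table 3 lists 8.7; the Table 1 value is used here.

```python
# Values copied from Table 1 (rows "baseline" and "Ours-WIoU").
# Key names are illustrative, not taken from the paper.
baseline = {"params_m": 12.6, "gmacs": 8.6, "ap": 54.0, "ar": 59.7}
ours_wiou = {"params_m": 9.3, "gmacs": 6.1, "ap": 55.8, "ar": 62.4}


def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


def summarize(base, new):
    """Cost reductions in percent, accuracy gains in percentage points."""
    return {
        "params_reduction_pct": round(relative_reduction(base["params_m"], new["params_m"]), 1),
        "gmacs_reduction_pct": round(relative_reduction(base["gmacs"], new["gmacs"]), 1),
        "ap_gain_points": round(new["ap"] - base["ap"], 1),
        "ar_gain_points": round(new["ar"] - base["ar"], 1),
    }


summary = summarize(baseline, ours_wiou)
print(summary)
```

Under these inputs, the WIoU variant uses roughly a quarter fewer parameters and under a third fewer GMACS than the baseline while scoring higher on AP and AR.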
Fig. 8 Visual comparison of lightweight human pose estimation methods ((a) Lightweight OpenPose; (b) EfficientHRNet-H2; (c) EfficientHRNet-H3; (d) EfficientHRNet-H4; (e) Ours (WIoU))
Fig. 9 Pose estimation results on the COCO 2017 human keypoint dataset ((a) Dense crowd; (b) Occlusion by obstacles; (c) Low-light environment; (d) Overhead viewing angle)
Method | GhostNet | CA | BiFPN | EIoU | WIoU |
---|---|---|---|---|---|
① | - | - | - | - | - |
② | √ | - | - | - | - |
③ | √ | √ | - | - | - |
④ | √ | √ | √ | - | - |
⑤ | √ | √ | √ | √ | - |
⑥ | √ | √ | √ | - | √ |
Table 2 Ablation experimental design
Method | Params (M) | GMACS | AP (%) | AP50 (%) | AR (%) |
---|---|---|---|---|---|
① | 12.6 | 8.7 | 54.0 | 81.1 | 59.7 |
② | 9.0 | 5.8 | 52.7 | 80.0 | 58.1 |
③ | 9.1 | 5.8 | 54.3 | 81.0 | 59.5 |
④ | 9.3 | 6.1 | 54.1 | 81.5 | 61.0 |
⑤ | 9.3 | 6.1 | 55.0 | 82.2 | 61.9 |
⑥ | 9.3 | 6.1 | 55.8 | 82.8 | 62.4 |
Table 3 Ablation experiment results
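To read the ablation step by step, it helps to compute how AP and AR change each time a module is added. The sketch below is an illustration under stated assumptions rather than the authors' analysis: the AP and AR values are copied from Table 3, and the `parent` field encodes the design in Table 2, where configurations ⑤ (EIoU) and ⑥ (WIoU) both build on configuration ④. The names `ablation` and `step_deltas` are hypothetical.

```python
# AP/AR values copied from Table 3; "parent" follows the design in Table 2
# (configurations 5 and 6 both extend configuration 4).
ablation = {
    1: {"added": "baseline", "parent": None, "ap": 54.0, "ar": 59.7},
    2: {"added": "GhostNet", "parent": 1, "ap": 52.7, "ar": 58.1},
    3: {"added": "CA", "parent": 2, "ap": 54.3, "ar": 59.5},
    4: {"added": "BiFPN", "parent": 3, "ap": 54.1, "ar": 61.0},
    5: {"added": "EIoU", "parent": 4, "ap": 55.0, "ar": 61.9},
    6: {"added": "WIoU", "parent": 4, "ap": 55.8, "ar": 62.4},
}


def step_deltas(table):
    """Map each added module to its (AP change, AR change) versus its parent."""
    deltas = {}
    for row in table.values():
        if row["parent"] is None:
            continue  # the baseline has nothing to compare against
        parent = table[row["parent"]]
        deltas[row["added"]] = (
            round(row["ap"] - parent["ap"], 1),
            round(row["ar"] - parent["ar"], 1),
        )
    return deltas


deltas = step_deltas(ablation)
for module, (d_ap, d_ar) in deltas.items():
    print(f"{module:8s} AP {d_ap:+.1f}  AR {d_ar:+.1f}")
```

Read this way, swapping in GhostNet lowers AP by 1.3 points, CA recovers 1.6, BiFPN mainly improves AR (+1.5), and WIoU adds more AP over configuration ④ than EIoU does (+1.7 versus +0.9).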