Journal of Graphics, 2023, Vol. 44, Issue (5): 868-878. DOI: 10.11996/JG.j.2095-302X.2023050868
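PI Jun, NIU Hou-xing, GAO Zhi-yun
Received: 2023-05-31
Accepted: 2023-08-03
Online: 2023-10-31
Published: 2023-10-31
Contact: GAO Zhi-yun (1993-), lecturer, Ph.D. Her main research interests cover image processing and pattern recognition. E-mail: zhiyungao@163.com
About author: PI Jun (1973-), associate professor, Ph.D. His main research interests cover object detection, image processing and pattern recognition. E-mail: jpi@cauc.edu.cn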
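Abstract:
Existing heatmap-based human pose estimation networks have high model complexity and heavy computational demands, and are difficult to deploy on embedded platforms and mobile UAV platforms. To address these problems, a lightweight, heatmap-free human pose estimation network based on YOLOv5s6-Pose-ti-lite is proposed. The backbone is replaced with GhostNet, with the aim of producing more effective feature information from fewer computational resources, increasing detection speed, and reducing network redundancy. A lightweight coordinate attention (CA) module is incorporated into the backbone to aggregate the positional information of human keypoints in the image onto the channels, strengthening feature extraction. A weighted bidirectional feature pyramid network (BiFPN) is introduced to improve feature fusion and to balance feature information across scales. Finally, the CIoU loss function is replaced with Wise-IoU (WIoU) to further improve the regression of human keypoints. Results on the COCO2017 human keypoint dataset show that the optimized model reduces the parameter count by 26.2% and the computation by 30.0%, while raising average precision by 1.7 percentage points and average recall by 2.7 percentage points. The model meets real-time requirements, which verifies the feasibility and effectiveness of the proposed approach.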
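The abstract does not spell out how the weighted bidirectional feature pyramid balances features from different scales. As a rough illustration only, the sketch below implements the fast normalized fusion rule described in the EfficientDet paper, O = Σ w_i·I_i / (ε + Σ w_j), with each weight clamped to be non-negative. Whether the authors use exactly this rule is an assumption; the function name `fast_normalized_fusion`, the flat-list feature representation, and the example values are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    Hypothetical helper: features are plain lists of floats standing in for
    feature maps that have already been resized to a common resolution.
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU-style clamp keeps every weight non-negative.
    clamped = [max(0.0, w) for w in weights]
    # eps avoids division by zero when all weights are clamped to 0.
    norm = eps + sum(clamped)

    return [
        sum(w * f[k] for w, f in zip(clamped, features)) / norm
        for k in range(length)
    ]


# Two inputs with equal weights: the result is close to their element-wise mean.
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([top_down, lateral], [1.0, 1.0])
print(fused)
```

In a trained network the weights would be learned per fusion node; here they are fixed by hand purely to show the normalization.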
中图分类号:
PI Jun, NIU Hou-xing, GAO Zhi-yun. Lightweight human pose estimation algorithm by integrating CA and BiFPN[J]. Journal of Graphics, 2023, 44(5): 868-878.
Method | Backbone | Input size | Params (M) | GMACS | AP (%) | AP50 (%) | AP75 (%) | APL (%) | AR (%) |
---|---|---|---|---|---|---|---|---|---|
Lightweight OpenPose | - | 368×368 | 4.1 | 18.0 | 42.8 | - | - | - | - |
EfficientHRNet-H2 | EfficientNetB2 | 448×448 | 10.3 | 15.4 | 52.9 | 80.5 | - | - | - |
EfficientHRNet-H3 | EfficientNetB3 | 416×416 | 6.9 | 8.4 | 44.8 | 76.7 | - | - | - |
EfficientHRNet-H4 | EfficientNetB4 | 384×384 | 3.7 | 4.2 | 35.7 | 69.6 | - | - | - |
baseline | Darknet_csp-d53-s | 640×640 | 12.6 | 8.6 | 54.0 | 81.1 | 58.7 | 65.5 | 59.7 |
Ours-EIoU | Darknet_csp-d53-s | 640×640 | 9.3 | 6.1 | 55.0 | 82.2 | 58.4 | 70.0 | 61.9 |
Ours-WIoU | Darknet_csp-d53-s | 640×640 | 9.3 | 6.1 | 55.8 | 82.8 | 59.9 | 69.4 | 62.4 |
Table 1 Comparison of methods on the COCO2017 human keypoint dataset
Fig. 8 Visual comparison of lightweight human pose estimation methods ((a) Lightweight OpenPose; (b) EfficientHRNet-H2; (c) EfficientHRNet-H3; (d) EfficientHRNet-H4; (e) Ours (WIoU))
Fig. 9 Pose estimation results on the COCO2017 human keypoint dataset ((a) Dense crowd; (b) Occlusion by obstacles; (c) Low-light environment; (d) Top-down view)
Method | GhostNet | CA | BiFPN | EIoU | WIoU |
---|---|---|---|---|---|
① | - | - | - | - | - |
② | √ | - | - | - | - |
③ | √ | √ | - | - | - |
④ | √ | √ | √ | - | - |
⑤ | √ | √ | √ | √ | - |
⑥ | √ | √ | √ | - | √ |
Table 2 Ablation experiment design
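Configurations ⑤ and ⑥ differ only in the box-regression loss. For orientation, the sketch below follows the Wise-IoU v1 form as given in the arXiv preprint: the plain IoU loss is scaled by a distance factor R = exp(d² / (Wg² + Hg²)), where d is the distance between box centres and Wg, Hg are the width and height of the smallest enclosing box. This page does not state which WIoU version the authors trained with, so treating it as v1 is an assumption. The function names and the `(x1, y1, x2, y2)` box layout are hypothetical.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1_loss(pred, target):
    """Sketch of Wise-IoU v1: distance-based factor times the IoU loss."""
    # Width and height of the smallest box enclosing both inputs.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Offset between the two box centres.
    dx = (pred[0] + pred[2]) / 2.0 - (target[0] + target[2]) / 2.0
    dy = (pred[1] + pred[3]) / 2.0 - (target[1] + target[3]) / 2.0

    denom = wg * wg + hg * hg
    # In training the denominator is treated as a constant (no gradient);
    # that detail has no effect in this plain-Python illustration.
    factor = math.exp((dx * dx + dy * dy) / denom) if denom > 0 else 1.0
    return factor * (1.0 - iou(pred, target))


target_box = (0.0, 0.0, 2.0, 2.0)
print(wiou_v1_loss((0.0, 0.0, 2.0, 2.0), target_box))  # identical boxes
print(wiou_v1_loss((1.0, 0.0, 3.0, 2.0), target_box))  # shifted prediction
```

The factor is always at least 1, so a prediction whose centre is far from the target is penalized more than its IoU alone would suggest.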
Method | Params (M) | GMACS | AP (%) | AP50 (%) | AR (%) |
---|---|---|---|---|---|
① | 12.6 | 8.7 | 54.0 | 81.1 | 59.7 |
② | 9.0 | 5.8 | 52.7 | 80.0 | 58.1 |
③ | 9.1 | 5.8 | 54.3 | 81.0 | 59.5 |
④ | 9.3 | 6.1 | 54.1 | 81.5 | 61.0 |
⑤ | 9.3 | 6.1 | 55.0 | 82.2 | 61.9 |
⑥ | 9.3 | 6.1 | 55.8 | 82.8 | 62.4 |
Table 3 Ablation experiment results
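The headline figures in the abstract can be checked against rows ① and ⑥ of Table 3. The short script below is only a sanity check on the rounded table values; the helper name `relative_reduction` and the dictionary keys are hypothetical. Note that the rounded AP values give a gain of about 1.8 points, slightly above the 1.7 quoted in the abstract, which may simply reflect rounding of the underlying unrounded results.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# Row ① (baseline) and row ⑥ (GhostNet + CA + BiFPN + WIoU) from Table 3.
baseline = {"params_m": 12.6, "gmacs": 8.7, "ap": 54.0, "ar": 59.7}
final = {"params_m": 9.3, "gmacs": 6.1, "ap": 55.8, "ar": 62.4}

params_drop = relative_reduction(baseline["params_m"], final["params_m"])
gmacs_drop = relative_reduction(baseline["gmacs"], final["gmacs"])
ap_gain = final["ap"] - baseline["ap"]  # percentage points
ar_gain = final["ar"] - baseline["ar"]  # percentage points

print(f"Params reduced by {params_drop:.1f}%")
print(f"GMACS reduced by {gmacs_drop:.1f}%")
print(f"AP gain: {ap_gain:.1f} points, AR gain: {ar_gain:.1f} points")
```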