图学学报 (Journal of Graphics), 2024, Vol. 45, Issue (5): 1040-1049. DOI: 10.11996/JG.j.2095-302X.2024051040
李建华1,2, 韩宇1, 石开铭2, 张可嘉2, 郭红领1, 方东平1, 曹佳明2
收稿日期: 2024-06-18
修回日期: 2024-08-05
出版日期: 2024-10-31
发布日期: 2024-10-31
通讯作者: 郭红领(1978-),男,副教授,博士。主要研究方向为智能建造、建筑信息模型和数字施工安全管理等。E-mail: hlguo@tsinghua.edu.cn
第一作者: 李建华(1970-),男,正高级工程师,博士研究生。主要研究方向为工程管理。E-mail: 228996838@qq.com
LI Jianhua1,2, HAN Yu1, SHI Kaiming2, ZHANG Kejia2, GUO Hongling1, FANG Dongping1, CAO Jiaming2
Received: 2024-06-18
Revised: 2024-08-05
Published: 2024-10-31
Online: 2024-10-31
Contact: GUO Hongling (1978-), associate professor, PhD. His main research interests include intelligent construction, building information modeling, and digital construction safety management. E-mail: hlguo@tsinghua.edu.cn
First author: LI Jianhua (1970-), senior engineer, PhD candidate. His main research interest is construction management. E-mail: 228996838@qq.com
摘要 (Abstract): Accurate detection of workers from construction-site surveillance video or images provides basic support for intelligent construction safety management. However, because cameras are typically far from the monitored area, workers appear as small targets in the frame, and the complex, changeable site environment makes their detection challenging. To address this, a small-target worker detection method is proposed that fuses an improved YOLO model with the frame difference method. First, static workers are detected with an improved YOLOv5 model: slicing aided hyper inference (SAHI) is introduced to obtain clearer features of small-target workers, a small-target detection head is added to preserve the completeness of small-object features, and the efficient channel attention (ECA) mechanism is employed to improve small-target detection. Second, moving workers whose image features are weak are detected with the frame difference method, which compensates to some extent for the limitations of image-based detection. The method was validated on a self-built dataset. The results show that the improved YOLOv5 model raises the F1-score by 11.3% and the mean average precision (mAP) by 12.5%, while the combined method with frame differencing increases the detection rate of small-target workers by 3.6% to 84.2% at 6 FPS, better meeting the needs of worker detection on construction sites.
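To make the workflow described in the abstract concrete, the following minimal Python sketch shows how the two branches could be orchestrated per video frame. The callables `detect_static_workers`, `detect_moving_workers`, and `merge_detections` are illustrative placeholders and not the authors' implementation.

```python
# Minimal per-frame sketch of the fused detection pipeline described in the abstract.
# The three callables are illustrative placeholders (see the later sketches for examples).
import cv2

def process_video(video_path, detect_static_workers, detect_moving_workers, merge_detections):
    """Apply improved-YOLOv5 detection and frame differencing to each frame, then fuse."""
    cap = cv2.VideoCapture(video_path)
    ok, prev_frame = cap.read()
    per_frame_results = []
    while ok:
        ok, frame = cap.read()
        if not ok:
            break
        static_boxes = detect_static_workers(frame)               # improved YOLOv5 (SAHI + small-target head + ECA)
        motion_boxes = detect_moving_workers(prev_frame, frame)   # frame difference branch
        per_frame_results.append(merge_detections(static_boxes, motion_boxes))
        prev_frame = frame
    cap.release()
    return per_frame_results
```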
李建华, 韩宇, 石开铭, 张可嘉, 郭红领, 方东平, 曹佳明. 施工现场小目标工人检测方法[J]. 图学学报, 2024, 45(5): 1040-1049.
LI Jianhua, HAN Yu, SHI Kaiming, ZHANG Kejia, GUO Hongling, FANG Dongping, CAO Jiaming. Small-target worker detection on construction sites[J]. Journal of Graphics, 2024, 45(5): 1040-1049.
表1 YOLOv5与v8效果对比/%
Table 1 Performance comparison of YOLOv5 and YOLOv8/%

| Model | P | R | mAP50 | mAP50:95 |
|---|---|---|---|---|
| YOLOv5m | 76.7 | 64.8 | 72.1 | 35.1 |
| YOLOv8m | 70.1 | 58.1 | 63.4 | 30.5 |
表2 聚类出的锚点框尺寸/Pixel
Table 2 Sizes of the clustered anchor boxes/pixel

| Anchor \ Detection head | 20×20 | 40×40 | 80×80 | 160×160 |
|---|---|---|---|---|
| Small | (116,90) | (30,61) | (10,13) | (4,5) |
| Medium | (156,198) | (62,45) | (16,30) | (8,10) |
| Large | (373,326) | (59,119) | (33,23) | (22,18) |
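The anchors in Table 2 come from clustering the width-height pairs of labeled worker boxes and assigning three anchors to each of the four detection heads. The sketch below illustrates one way to do this with k-means; the 12-center setup, area-based sorting, and head assignment are assumptions for illustration (YOLOv5's built-in autoanchor additionally refines the centers with a genetic algorithm).

```python
# Sketch: cluster labeled box sizes into 12 anchors and assign 3 per detection head.
# The clustering setup and the sorting rule are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans

def cluster_anchors(wh, n_anchors=12, n_heads=4):
    """wh: (N, 2) array of labeled box (width, height) values in pixels."""
    wh = np.asarray(wh, dtype=float)
    centers = KMeans(n_clusters=n_anchors, n_init=10, random_state=0).fit(wh).cluster_centers_
    centers = centers[np.argsort(centers.prod(axis=1))]    # sort by area, smallest first
    per_head = n_anchors // n_heads
    heads = ["160x160", "80x80", "40x40", "20x20"]          # high-resolution head gets the smallest anchors
    return {head: np.round(centers[i * per_head:(i + 1) * per_head]).astype(int)
            for i, head in enumerate(heads)}
```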
图10 帧差法效果展示((a)原始图像;(b)采用两帧差分之后的像素差图;(c)二值化加形态学处理后的运动区域图;(d)方框标注的运动区域图)
Fig. 10 Visualization of the frame difference method ((a) The original image; (b) The pixel-difference map from two-frame differencing; (c) The motion region map after binarization and morphological processing; (d) Motion regions marked with bounding boxes)
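The processing chain in Fig. 10 (two-frame differencing, binarization, morphological processing, and box marking) maps naturally onto OpenCV primitives. A minimal sketch follows; the threshold, kernel size, and minimum region area are assumed values rather than the parameters used in the paper.

```python
# Sketch of the frame difference pipeline shown in Fig. 10 (assumed parameter values).
import cv2

def detect_moving_workers(prev_frame, frame, thresh=25, min_area=50):
    """Return bounding boxes (x1, y1, x2, y2) of moving regions between two consecutive frames."""
    g1 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g1, g2)                                        # (b) pixel-difference map
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)   # binarization
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)        # suppress isolated noise
    cleaned = cv2.dilate(cleaned, kernel, iterations=2)               # (c) consolidated motion regions
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue
        x, y, w, h = cv2.boundingRect(c)
        boxes.append((x, y, x + w, y + h))                            # (d) box marking
    return boxes
```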
表3 SAHI效果
Table 3 Performance of SAHI

| Model | P/% | R/% | mAP50/% | mAP50:95/% | FPS |
|---|---|---|---|---|---|
| YOLOv5m | 76.7 | 64.8 | 72.1 | 35.1 | 54.0 |
| SAHI-1280 | 78.5 | 68.4 | 74.3 | 35.9 | 19.0 |
| SAHI-640 | 81.2 | 71.6 | 75.2 | 36.7 | 8.0 |
| SAHI-320 | 83.9 | 73.9 | 78.1 | 37.4 | 3.1 |
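The SAHI-1280/640/320 variants in Table 3 differ only in the slice size used for inference. The sketch below illustrates the general slicing-aided inference idea: run the detector on overlapping tiles, shift tile-level boxes back to image coordinates, and merge with NMS. The `detect_fn` callable, the 20% overlap, and the NMS threshold are assumptions; the paper uses the SAHI framework rather than this hand-rolled version.

```python
# Sketch of slicing-aided inference over overlapping tiles (assumed overlap and NMS settings).
import torch
from torchvision.ops import nms

def _starts(length, size, step):
    """Top-left offsets so that overlapping slices of `size` cover `length`."""
    starts = list(range(0, max(length - size, 0) + 1, step))
    if starts[-1] + size < length:
        starts.append(length - size)
    return starts

def sliced_inference(image, detect_fn, slice_size=640, overlap=0.2, iou_thr=0.5):
    """image: HxWxC array; detect_fn(tile) -> (boxes[N,4] in xyxy, scores[N]) float tensors."""
    h, w = image.shape[:2]
    step = max(int(slice_size * (1 - overlap)), 1)
    all_boxes, all_scores = [], []
    for y0 in _starts(h, slice_size, step):
        for x0 in _starts(w, slice_size, step):
            boxes, scores = detect_fn(image[y0:y0 + slice_size, x0:x0 + slice_size])
            if len(boxes):
                all_boxes.append(boxes + torch.tensor([x0, y0, x0, y0], dtype=boxes.dtype))
                all_scores.append(scores)
    if not all_boxes:
        return torch.empty(0, 4), torch.empty(0)
    boxes, scores = torch.cat(all_boxes), torch.cat(all_scores)
    keep = nms(boxes, scores, iou_thr)    # merge duplicate detections across overlapping tiles
    return boxes[keep], scores[keep]
```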
表4 模型优化效果对比
Table 4 Comparison of model optimization effects

| Model | F1/% | P/% | R/% | mAP50/% | mAP50:95/% | FPS |
|---|---|---|---|---|---|---|
| YOLOv5m | 70.2 | 76.7 | 64.8 | 72.1 | 35.1 | 54 |
| Small-target detection head | 76.4 | 76.8 | 76.0 | 77.6 | 36.2 | 27 |
| YOLOv5m-SE | 75.7 | 78.8 | 72.8 | 78.7 | 33.7 | 43 |
| YOLOv5m-CBAM | 74.0 | 71.4 | 76.9 | 77.1 | 32.8 | 44 |
| YOLOv5m-ECA | 76.8 | 77.9 | 75.7 | 78.0 | 33.3 | 40 |
| YOLOv5m-CA | 75.6 | 79.5 | 72.0 | 77.7 | 33.0 | 41 |
| YOLOv5m-GAM | 75.9 | 75.7 | 76.1 | 79.1 | 38.4 | 25 |
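For reference, the ECA attention compared in Table 4 is the efficient channel attention module of ECA-Net: global average pooling followed by a 1-D convolution whose kernel size adapts to the channel count, producing per-channel weights. A minimal PyTorch sketch is given below; where the module is inserted into YOLOv5m follows the authors' design and is not reproduced here.

```python
# Minimal PyTorch sketch of the ECA (efficient channel attention) module.
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive 1-D convolution kernel size derived from the channel dimension (ECA-Net rule).
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (B, C, H, W)
        y = self.pool(x)                                    # (B, C, 1, 1) channel descriptor
        y = self.conv(y.squeeze(-1).transpose(-1, -2))      # cross-channel 1-D conv: (B, 1, C)
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1)) # (B, C, 1, 1) channel weights
        return x * y.expand_as(x)                           # channel-wise re-weighting
```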
表5 消融实验结果
Table 5 Ablation experiment results

| SAHI | Detection head | ECA | F1/% | P/% | R/% | mAP50/% | FPS |
|---|---|---|---|---|---|---|---|
| - | - | - | 70.2 | 76.7 | 64.8 | 72.1 | 54 |
| √ | - | - | 76.1 | 81.2 | 71.6 | 75.2 | 8 |
| - | √ | - | 76.4 | 76.8 | 76.0 | 77.6 | 27 |
| - | - | √ | 76.8 | 77.9 | 75.7 | 78.0 | 40 |
| √ | √ | - | 81.1 | 80.5 | 81.8 | 84.9 | 5 |
| √ | - | √ | 81.5 | 82.8 | 80.2 | 84.6 | 6 |
| - | √ | √ | 77.3 | 75.4 | 79.3 | 80.2 | 13 |
| √ | √ | √ | 78.2 | 80.0 | 76.4 | 82.6 | 5 |
图13 改进YOLOv5与帧差法独立使用效果对比((a)对比结果,其中图(a1)为原始图像帧,图(a2)为帧差法检测结果,图(a3)为帧差法检测出的运动区域,图(a4)为改进的YOLOv5模型检测结果;(b)局部放大)
Fig. 13 Comparison of the improved YOLOv5 model and the frame difference method used independently ((a) Comparison results, where (a1) is the original image frame, (a2) is the detection result of the frame difference method, (a3) is the motion region detected by the frame difference method, and (a4) is the detection result of the improved YOLOv5 model; (b) Enlarged local view)
表6 融合算法效果对比
Table 6 Comparison of fusion algorithm performance

| Model | R/% | T/ms |
|---|---|---|
| Improved YOLOv5 | 80.6 | 265 |
| Frame difference method | 53.1 | 37 |
| Fusion algorithm | 84.2 | 313 |
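The gain in Table 6 comes from adding motion regions that the static detector misses. A simple way to express such a fusion rule is sketched below: keep all improved-YOLOv5 boxes and add a frame-difference box only if it does not substantially overlap any of them. The IoU threshold and the rule itself are assumptions for illustration; the paper's exact fusion strategy may differ.

```python
# Sketch of a fusion rule: keep detector boxes, add non-overlapping motion boxes.
# The IoU threshold is an assumed value.

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def merge_detections(static_boxes, motion_boxes, iou_thr=0.3):
    """static_boxes: improved-YOLOv5 results; motion_boxes: frame difference results."""
    merged = list(static_boxes)
    for m in motion_boxes:
        if all(iou(m, s) < iou_thr for s in static_boxes):
            merged.append(m)        # motion region not covered by the static detector
    return merged
```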