Journal of Graphics, 2023, Vol. 44, Issue 1: 16-25. DOI: 10.11996/JG.j.2095-302X.2023010016
• Image Processing and Computer Vision •
LI Xiao-bo, LI Yang-gui, GUO Ning, FAN Zhen
Received: 2022-06-05
Revised: 2022-08-03
Online: 2023-10-31
Published: 2023-02-16
Contact: LI Yang-gui
About author: LI Xiao-bo (1998-), master's student. His main research interests are object detection and image processing. E-mail: 1336441422@qq.com
LI Xiao-bo, LI Yang-gui, GUO Ning, FAN Zhen. Mask detection algorithm based on YOLOv5 integrating attention mechanism[J]. Journal of Graphics, 2023, 44(1): 16-25.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023010016
No. | Class | Center_x | Center_y | Width | Height |
---|---|---|---|---|---|
0 (bottom-left) | 0 | 0.208 065 | 0.594 470 | 0.129 032 | 0.253 456 |
1 (top-left) | 1 | 0.301 613 | 0.395 161 | 0.103 226 | 0.168 203 |
2 (top-right) | 0 | 0.662 097 | 0.364 055 | 0.111 290 | 0.161 290 |
3 (bottom-right) | 1 | 0.789 516 | 0.597 926 | 0.104 839 | 0.191 244 |
Table 1 The contents of the YOLO format annotation file
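The rows of Table 1 follow the usual YOLO label layout: a class index followed by a box center, width, and height, each normalized by the image size. The following is a minimal sketch of how such a file could be parsed and converted to pixel corner coordinates. The function names and the 640×480 image size are assumptions made for illustration; the source does not state the image dimensions or which class index denotes a masked face.

```python
from typing import List, Tuple

# Rows copied from Table 1: class, center_x, center_y, width, height.
# All four box values are normalized to the range [0, 1].
ANNOTATION_TEXT = """\
0 0.208065 0.594470 0.129032 0.253456
1 0.301613 0.395161 0.103226 0.168203
0 0.662097 0.364055 0.111290 0.161290
1 0.789516 0.597926 0.104839 0.191244
"""

Label = Tuple[int, float, float, float, float]


def parse_yolo_labels(text: str) -> List[Label]:
    """Parse YOLO-format label lines into (class, cx, cy, w, h) tuples."""
    rows: List[Label] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        cls = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
        rows.append((cls, cx, cy, w, h))
    return rows


def to_pixel_box(cx: float, cy: float, w: float, h: float,
                 img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a normalized center-format box to (x_min, y_min, x_max, y_max) in pixels."""
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return x_min, y_min, x_max, y_max


# Hypothetical image size, chosen only to make the example concrete.
IMG_W, IMG_H = 640, 480

for cls, cx, cy, w, h in parse_yolo_labels(ANNOTATION_TEXT):
    box = to_pixel_box(cx, cy, w, h, IMG_W, IMG_H)
    print(cls, [round(v, 1) for v in box])
```

Because the stored values are normalized, the same label file stays valid if the image is resized without cropping.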
Parameter | Configuration |
---|---|
Operating system | Ubuntu 20.04.2 LTS |
CPU | Intel(R) Xeon(R) CPU E5-2603 v4 @1.70 GHz |
GPU | GeForce GTX 1080 Ti |
Programming language | Python 3.8.13 |
Deep learning framework | PyTorch 1.9.1 |
Acceleration environment | CUDA 11.1 + cuDNN 8.3.3 |
Table 2 The experimental environment configuration
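When reproducing a setup like the one in Table 2, it can help to record the versions actually present on the machine. The sketch below is one possible way to do that and is not taken from the paper. The helper name `describe_environment` is hypothetical, and the PyTorch fields are filled in only if PyTorch is installed.

```python
import platform
from typing import Dict, Optional


def describe_environment() -> Dict[str, Optional[object]]:
    """Collect OS, Python, and (if available) PyTorch/CUDA/cuDNN versions."""
    info: Dict[str, Optional[object]] = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "pytorch": None,
        "cuda": None,
        "cudnn": None,
        "gpu": None,
    }
    try:
        import torch  # optional dependency; absent on machines without PyTorch
    except ImportError:
        return info

    info["pytorch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cudnn"] = torch.backends.cudnn.version()  # integer such as 8xxx, or None
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Comparing this output against Table 2 gives a quick indication of whether a timing or accuracy difference might be due to the software stack rather than the model.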
Algorithm | Class | P | R | AP/mAP (%) | Parameters |
---|---|---|---|---|---|
YOLOv5s | nomask | 0.759 | 0.693 | 73.6 | 7 015 519 |
YOLOv5s | mask | 0.810 | 0.792 | 80.7 | 7 015 519 |
YOLOv5s | all | 0.785 | 0.743 | 77.1 | 7 015 519 |
YOLOv5s-SE | nomask | 0.841 | 0.753 | 80.9 | 7 222 879 |
YOLOv5s-SE | mask | 0.845 | 0.820 | 85.0 | 7 222 879 |
YOLOv5s-SE | all | 0.843 | 0.786 | 82.9 | 7 222 879 |
YOLOv5s-CBAM | nomask | 0.920 | 0.727 | 81.9 | 7 222 977 |
YOLOv5s-CBAM | mask | 0.915 | 0.791 | 86.2 | 7 222 977 |
YOLOv5s-CBAM | all | 0.917 | 0.759 | 84.0 | 7 222 977 |
YOLOv5s-CA | nomask | 0.905 | 0.737 | 81.4 | 7 215 759 |
YOLOv5s-CA | mask | 0.898 | 0.802 | 85.8 | 7 215 759 |
YOLOv5s-CA | all | 0.901 | 0.770 | 83.6 | 7 215 759 |
YOLOv5s-NAM | nomask | 0.883 | 0.743 | 81.1 | 7 191 135 |
YOLOv5s-NAM | mask | 0.886 | 0.822 | 86.5 | 7 191 135 |
YOLOv5s-NAM | all | 0.885 | 0.783 | 83.8 | 7 191 135 |
Table 3 The performance comparison of four attention mechanisms
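Two relationships in Table 3 can be checked with simple arithmetic: the "all" row is the mean of the two per-class AP values (to within rounding of the published figures), and each attention module adds a small number of parameters relative to the YOLOv5s baseline. The sketch below recomputes both from the table. The dictionary layout and function names are assumptions for illustration, and the overhead percentages are derived here rather than reported in the source.

```python
from typing import Dict, Sequence

# Per-class AP (%), reported mAP (%), and parameter counts copied from Table 3.
RESULTS: Dict[str, Dict[str, float]] = {
    "YOLOv5s":      {"ap_nomask": 73.6, "ap_mask": 80.7, "map": 77.1, "params": 7_015_519},
    "YOLOv5s-SE":   {"ap_nomask": 80.9, "ap_mask": 85.0, "map": 82.9, "params": 7_222_879},
    "YOLOv5s-CBAM": {"ap_nomask": 81.9, "ap_mask": 86.2, "map": 84.0, "params": 7_222_977},
    "YOLOv5s-CA":   {"ap_nomask": 81.4, "ap_mask": 85.8, "map": 83.6, "params": 7_215_759},
    "YOLOv5s-NAM":  {"ap_nomask": 81.1, "ap_mask": 86.5, "map": 83.8, "params": 7_191_135},
}


def mean_ap(ap_values: Sequence[float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_values) / len(ap_values)


def param_overhead_percent(params: float, baseline: float) -> float:
    """Relative increase in parameter count over the baseline, in percent."""
    return (params - baseline) / baseline * 100.0


baseline = RESULTS["YOLOv5s"]
for name, row in RESULTS.items():
    recomputed = mean_ap([row["ap_nomask"], row["ap_mask"]])
    overhead = param_overhead_percent(row["params"], baseline["params"])
    gain = row["map"] - baseline["map"]
    print(f"{name:13s} mAP(recomputed)={recomputed:.2f} "
          f"mAP(reported)={row['map']:.1f} gain={gain:+.1f} params=+{overhead:.2f}%")
```

On these figures, every attention variant stays within 3% of the baseline parameter count, so the accuracy differences between the variants are not explained by model size alone.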
Fig. 14 Comparison of detection effects of different algorithms ((a) Sparse targets; (b) Dense targets; (c) More dense targets; (d) Very dense targets)
Loss | AP, mask (%) | AP, nomask (%) | mAP (%) |
---|---|---|---|
CIoU | 86.2 | 81.9 | 84.0 |
GIoU | 87.9 | 83.3 | 85.6 |
Table 4 Influence of CIoU and GIoU on algorithm results
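Table 4 compares two bounding-box regression losses. As a reminder of how they differ, the sketch below computes both for a pair of axis-aligned boxes using the standard published definitions: GIoU subtracts the fraction of the smallest enclosing box not covered by the union, while CIoU adds a normalized center-distance term and an aspect-ratio consistency term. This is a plain-Python illustration under those definitions, not the authors' training code. The function names are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` with positive width and height.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou_terms(a: Box, b: Box) -> Tuple[float, float, Box]:
    """Return (IoU, union area, smallest enclosing box) for two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    enclose = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    return inter / union, union, enclose


def giou_loss(pred: Box, target: Box) -> float:
    """GIoU loss: 1 - (IoU - (C - U) / C), where C is the enclosing-box area."""
    iou, union, (ex1, ey1, ex2, ey2) = iou_terms(pred, target)
    enclose_area = (ex2 - ex1) * (ey2 - ey1)
    return 1.0 - (iou - (enclose_area - union) / enclose_area)


def ciou_loss(pred: Box, target: Box) -> float:
    """CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v."""
    iou, _, (ex1, ey1, ex2, ey2) = iou_terms(pred, target)
    # Squared distance between box centers (rho^2).
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) / 2) ** 2 \
         + (((pred[1] + pred[3]) - (target[1] + target[3])) / 2) ** 2
    # Squared diagonal of the smallest enclosing box (c^2).
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    tw, th = target[2] - target[0], target[3] - target[1]
    v = (4 / math.pi ** 2) * (math.atan(tw / th) - math.atan(pw / ph)) ** 2
    alpha = v / (1.0 - iou + v) if v > 0 else 0.0
    return 1.0 - iou + rho2 / c2 + alpha * v


pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 4.0)
print("GIoU loss:", giou_loss(pred, target))
print("CIoU loss:", ciou_loss(pred, target))
```

Both losses are zero for identical boxes and remain informative when the boxes do not overlap, which plain IoU loss does not. Which one trains better is an empirical question; Table 4 reports that GIoU gave the higher mAP in this setting.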