Journal of Graphics ›› 2025, Vol. 46 ›› Issue (6): 1327-1336. DOI: 10.11996/JG.j.2095-302X.2025061327
• Image Processing and Computer Vision •
Received: 2024-10-25
Accepted: 2025-03-16
Online: 2025-12-30
Published: 2025-12-27
About the author: WANG Haihan (1973-), first author, senior engineer, master's degree. His main research interests include structural health monitoring, image processing, and computer vision. E-mail: wanghaihan@126.com
WANG Haihan. Multi object detection method for surface defects of steel arch towers based on YOLOv8-OSRA[J]. Journal of Graphics, 2025, 46(6): 1327-1336.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025061327
| Category | Bounding box, YOLOv8n-Seg | Bounding box, YOLOv8-OSRA | Segmentation mask, YOLOv8n-Seg | Segmentation mask, YOLOv8-OSRA |
|---|---|---|---|---|
| Corrosion | 88.3 | 90.9 | 77.8 | 81.5 |
| Surface crack | 85.9 | 87.0 | 63.3 | 65.5 |
| Coating peeling | 79.8 | 81.9 | 65.6 | 67.3 |
| mAP@0.5 | 84.6 | 86.6 | 68.9 | 71.4 |
Table 1 Comparison of training results of different networks/%
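The mAP@0.5 row of Table 1 appears consistent with an unweighted mean of the three per-class values in each column. The following is a minimal Python sketch of that check. It assumes, since this page does not state it, that mAP is the plain average over classes. The names `PER_CLASS_AP`, `REPORTED_MAP`, and `mean_ap` are illustrative only and do not come from the paper.

```python
# Per-class AP@0.5 values (%) copied from Table 1.
# Column order: box YOLOv8n-Seg, box YOLOv8-OSRA, mask YOLOv8n-Seg, mask YOLOv8-OSRA
PER_CLASS_AP = {
    "corrosion": (88.3, 90.9, 77.8, 81.5),
    "surface_crack": (85.9, 87.0, 63.3, 65.5),
    "coating_peeling": (79.8, 81.9, 65.6, 67.3),
}
# mAP@0.5 row as reported in Table 1, same column order.
REPORTED_MAP = (84.6, 86.6, 68.9, 71.4)


def mean_ap(per_class):
    """Unweighted mean over classes for each column (assumed definition of mAP)."""
    columns = zip(*per_class.values())
    return [sum(col) / len(per_class) for col in columns]


computed = mean_ap(PER_CLASS_AP)
for got, reported in zip(computed, REPORTED_MAP):
    print(f"computed {got:.2f} vs reported {reported}")
```

Under this assumption the computed means agree with the reported row to within 0.1 percentage point, which is compatible with rounding or truncation of the published figures.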
| Model | GFLOPs | Parameters/M | FPS (RTX3050) |
|---|---|---|---|
| YOLOv8n-Seg | 12.0 | 6.5 | 58.8 |
| YOLOv8-OSRA | 12.1 | 10.2 | 51.5 |
Table 2 Comparison of computational complexity
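To put the cost figures in Table 2 into relative terms, the sketch below computes the percentage change of each metric and an approximate per-frame latency. It assumes that latency is simply the reciprocal of the reported FPS, which ignores any batching or pipeline effects. The helper names `relative_change` and `frame_latency_ms` are illustrative.

```python
# Values copied from Table 2.
BASELINE = {"gflops": 12.0, "params_m": 6.5, "fps": 58.8}   # YOLOv8n-Seg
IMPROVED = {"gflops": 12.1, "params_m": 10.2, "fps": 51.5}  # YOLOv8-OSRA


def relative_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def frame_latency_ms(fps):
    """Average time per frame in milliseconds, assuming latency = 1 / FPS."""
    return 1000.0 / fps


for key in BASELINE:
    change = relative_change(BASELINE[key], IMPROVED[key])
    print(f"{key}: {change:+.1f}%")

print(
    f"latency: {frame_latency_ms(BASELINE['fps']):.1f} ms -> "
    f"{frame_latency_ms(IMPROVED['fps']):.1f} ms"
)
```

On these figures the parameter count grows by roughly 57% and throughput drops by roughly 12%, while GFLOPs stay nearly unchanged.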
| Attention mechanism | Corrosion mAP@0.5/%, bounding box | Corrosion mAP@0.5/%, segmentation mask | Crack mAP@0.5/%, bounding box | Crack mAP@0.5/%, segmentation mask | Peeling mAP@0.5/%, bounding box | Peeling mAP@0.5/%, segmentation mask | GFLOPs |
|---|---|---|---|---|---|---|---|
| None | 88.3 | 77.8 | 85.9 | 63.3 | 79.8 | 65.6 | 12.0 |
| SE | 80.1 | 66.8 | 81.7 | 57.7 | 76.0 | 61.0 | 27.6 |
| CBAM | 80.9 | 65.8 | 82.7 | 57.6 | 76.5 | 62.3 | 27.7 |
| SRA | 78.9 | 64.6 | 81.6 | 57.7 | 76.0 | 62.3 | 27.8 |
| OSRA | 90.9 | 81.5 | 87.0 | 65.5 | 81.9 | 67.3 | 12.1 |
Table 3 Comparison of ablation experiments on attention mechanisms
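One way to read the ablation in Table 3 is as a per-column difference against the row without attention. The sketch below does that subtraction. The dictionary layout and the function name `deltas_vs_baseline` are illustrative choices, not part of the paper.

```python
# Rows copied from Table 3. Column order:
# corrosion box, corrosion mask, crack box, crack mask, peeling box, peeling mask
ABLATION = {
    "None": (88.3, 77.8, 85.9, 63.3, 79.8, 65.6),
    "SE": (80.1, 66.8, 81.7, 57.7, 76.0, 61.0),
    "CBAM": (80.9, 65.8, 82.7, 57.6, 76.5, 62.3),
    "SRA": (78.9, 64.6, 81.6, 57.7, 76.0, 62.3),
    "OSRA": (90.9, 81.5, 87.0, 65.5, 81.9, 67.3),
}


def deltas_vs_baseline(table, baseline="None"):
    """Difference in percentage points between each row and the baseline row."""
    base = table[baseline]
    return {
        name: tuple(round(value - ref, 1) for value, ref in zip(row, base))
        for name, row in table.items()
        if name != baseline
    }


for name, delta in deltas_vs_baseline(ABLATION).items():
    print(name, delta)
```

In the reported numbers, SE, CBAM, and SRA fall below the no-attention baseline in every column, while OSRA is above it in every column.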
| Model | mAP@0.5/%, detection box | mAP@0.5/%, segmentation mask | FPS | Params/M | GFLOPs |
|---|---|---|---|---|---|
| YOLOv5 | 73.8 | 58.0 | 59.5 | 5.6 | 11.0 |
| YOLOv7 | 81.6 | 66.5 | 58.8 | 13.6 | 47.7 |
| YOLOv8 (baseline) | 84.6 | 68.9 | 58.8 | 6.5 | 12.0 |
| YOLOv8-OSRA | 86.6 | 71.4 | 51.5 | 10.2 | 12.1 |
Table 4 Comparison of mainstream networks
Fig. 11 Comparison of detection results before and after model optimization ((a) Original image; (b) Detection result of YOLOv8n-Seg model before optimization; (c) Detection result of YOLOv8-OSRA model after optimization)