Journal of Graphics ›› 2025, Vol. 46 ›› Issue (6): 1327-1336. DOI: 10.11996/JG.j.2095-302X.2025061327
Received: 2024-10-25
Accepted: 2025-03-16
Published: 2025-12-30
Online: 2025-12-27
First author: WANG Haihan (1973-), male, senior engineer, master's degree. His main research interests include structural health monitoring, image processing, and computer vision. E-mail: wanghaihan@126.com
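Abstract: The steel arch tower is the main load-bearing structure of a steel-arch cable-stayed bridge, and early detection and assessment of its surface defects (such as corrosion, peeling, and cracks) are essential to the structural safety of the bridge. Traditional manual inspection is inefficient, subjective, and struggles to cover concealed areas at height. To address these problems, an intelligent detection method is proposed that combines an improved YOLOv8n-Seg deep learning framework with the OSRA attention mechanism. High-resolution images of the tower interior were collected with a self-developed wheel-rail inspection robot system and combined with open-source datasets to build a multi-defect dataset of 5,846 original images, which data augmentation (random cropping, mirror flipping, and brightness adjustment) expanded to 23,378 images. At the algorithm level, the OSRA attention module is embedded in the feature fusion layer of the YOLOv8n-Seg network, where its overlapping patch strategy and local refinement mechanism improve the model's ability to capture irregular boundaries and small defect features. On an independent test set, the optimized YOLOv8-OSRA model reached a bounding-box mAP@0.5 of 90.9% for corrosion (up 2.6 percentage points), 87.0% for cracks (up 1.1 points), and 81.9% for peeling (up 2.1 points) relative to the YOLOv8n-Seg baseline. Ablation experiments further show that the OSRA module outperforms attention mechanisms such as SE and CBAM while preserving computational efficiency (GFLOPs increase by only 0.8%). The results provide a lightweight solution for steel arch tower defect detection that can be deployed on mobile inspection equipment, and the proposed multi-scale feature enhancement method offers a reference for surface defect detection on complex steel structures.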
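
The overlapping patch strategy mentioned in the abstract can be illustrated in one dimension. The sketch below is not the paper's OSRA module: it only contrasts non-overlapping pooling (window equal to stride) with overlapping pooling (window larger than stride). The function name `pool_1d`, the zero padding, and the window sizes are assumptions chosen for illustration.

```python
def pool_1d(signal, window, stride):
    """Average-pool a 1D list; windows overlap whenever window > stride."""
    # Symmetric zero padding keeps the number of output tokens equal
    # to the non-overlapping case (illustrative choice).
    pad = (window - stride) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for start in range(0, len(padded) - window + 1, stride):
        chunk = padded[start:start + window]
        out.append(sum(chunk) / window)
    return out


# A thin "crack": one bright sample sitting at the edge of a patch.
signal = [0, 0, 0, 1, 0, 0, 0, 0]

non_overlap = pool_1d(signal, window=2, stride=2)  # patches share no samples
overlap = pool_1d(signal, window=4, stride=2)      # neighbouring patches share samples

print(non_overlap)
print(overlap)
```

Both settings produce four reduced tokens, but with overlap the boundary sample contributes to two neighbouring tokens instead of one, which is the intuition behind using overlapping patches for thin or irregular defects.
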
WANG Haihan. Multi-object detection method for surface defects of steel arch towers based on YOLOv8-OSRA[J]. Journal of Graphics, 2025, 46(6): 1327-1336.
Fig. 6 Data augmentation of typical damage images ((a) Original image; (b) Mirrored image; (c) Brightness adjustment; (d) Cropping)
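
The three augmentations named in the Fig. 6 caption (mirroring, brightness adjustment, cropping) can be sketched on a tiny grayscale image stored as nested lists. This is a minimal illustration, not the authors' pipeline: the function names, the brightness factor of 1.5, and the 2 x 2 crop size are all assumptions.

```python
import random


def mirror(img):
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in img]


def adjust_brightness(img, factor):
    """Scale pixel values and clamp them to the 8-bit range [0, 255]."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in img]


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(img) - crop_h)
    left = rng.randint(0, len(img[0]) - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


img = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 100, 110, 120],
       [130, 140, 150, 200]]

rng = random.Random(0)  # fixed seed so the crop position is reproducible
mirrored = mirror(img)
brighter = adjust_brightness(img, 1.5)
cropped = random_crop(img, 2, 2, rng)
augmented = [img, mirrored, brighter, cropped]
```

In a detection or segmentation dataset, the geometric transforms (mirroring and cropping) must also be applied to the bounding boxes and masks, whereas brightness adjustment leaves the labels unchanged.
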
Fig. 8 Loss curves ((a) Bounding-box loss; (b) Segmentation loss)
Table 1 Comparison of training results of different networks (%)

| Class | Bounding box, YOLOv8n-Seg | Bounding box, YOLOv8-OSRA | Segmentation mask, YOLOv8n-Seg | Segmentation mask, YOLOv8-OSRA |
|---|---|---|---|---|
| Corrosion | 88.3 | 90.9 | 77.8 | 81.5 |
| Surface crack | 85.9 | 87.0 | 63.3 | 65.5 |
| Coating peeling | 79.8 | 81.9 | 65.6 | 67.3 |
| mAP@0.5 | 84.6 | 86.6 | 68.9 | 71.4 |
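
The gains quoted in the abstract are differences in percentage points between the two models in Table 1. The short sketch below recomputes them from the table values; the dictionary layout and the names `table1` and `gains` are illustrative assumptions.

```python
# Values copied from Table 1:
# (box baseline, box OSRA, mask baseline, mask OSRA), all in %.
table1 = {
    "Corrosion": (88.3, 90.9, 77.8, 81.5),
    "Surface crack": (85.9, 87.0, 63.3, 65.5),
    "Coating peeling": (79.8, 81.9, 65.6, 67.3),
    "mAP@0.5": (84.6, 86.6, 68.9, 71.4),
}

# Gain in percentage points for bounding boxes and segmentation masks.
gains = {
    name: (round(box_new - box_old, 1), round(mask_new - mask_old, 1))
    for name, (box_old, box_new, mask_old, mask_new) in table1.items()
}

for name, (box_gain, mask_gain) in gains.items():
    print(f"{name}: box +{box_gain} pts, mask +{mask_gain} pts")
```

The bounding-box gains match the 2.6, 1.1, and 2.1 points reported in the abstract, and the segmentation-mask gains are of similar size.
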
Table 2 Comparison of computational complexity

| Model | GFLOPs | Parameters/M | FPS (RTX 3050) |
|---|---|---|---|
| YOLOv8n-Seg | 12.0 | 6.5 | 58.8 |
| YOLOv8-OSRA | 12.1 | 10.2 | 51.5 |
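
The relative cost of adding OSRA follows directly from the Table 2 values. The sketch below computes the percentage change of each metric and the per-frame latency implied by the FPS figures; the helper name `relative_change` and the dictionary keys are illustrative assumptions.

```python
# Values copied from Table 2.
baseline = {"gflops": 12.0, "params_m": 6.5, "fps": 58.8}
osra = {"gflops": 12.1, "params_m": 10.2, "fps": 51.5}


def relative_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


changes = {key: round(relative_change(baseline[key], osra[key]), 1)
           for key in baseline}

# Per-frame latency in milliseconds implied by the measured FPS.
latency_ms = {
    "YOLOv8n-Seg": round(1000 / baseline["fps"], 1),
    "YOLOv8-OSRA": round(1000 / osra["fps"], 1),
}

print(changes)
print(latency_ms)
```

The GFLOPs change agrees with the 0.8% stated in the abstract. By the same arithmetic, the parameter count grows by roughly 57% and the frame rate drops by roughly 12%.
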
Table 3 Comparison of ablation experiments on attention mechanisms

| Attention | Corrosion mAP@0.5/% (box) | Corrosion mAP@0.5/% (mask) | Crack mAP@0.5/% (box) | Crack mAP@0.5/% (mask) | Peeling mAP@0.5/% (box) | Peeling mAP@0.5/% (mask) | GFLOPs |
|---|---|---|---|---|---|---|---|
| None | 88.3 | 77.8 | 85.9 | 63.3 | 79.8 | 65.6 | 12.0 |
| SE | 80.1 | 66.8 | 81.7 | 57.7 | 76.0 | 61.0 | 27.6 |
| CBAM | 80.9 | 65.8 | 82.7 | 57.6 | 76.5 | 62.3 | 27.7 |
| SRA | 78.9 | 64.6 | 81.6 | 57.7 | 76.0 | 62.3 | 27.8 |
| OSRA | 90.9 | 81.5 | 87.0 | 65.5 | 81.9 | 67.3 | 12.1 |
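
The "@0.5" in the mAP@0.5 columns refers to an intersection-over-union (IoU) threshold of 0.5 for counting a prediction as a match, computed on boxes for the bounding-box columns and on pixels for the mask columns. The sketch below shows both IoU variants; the function names, the box format `(x1, y1, x2, y2)`, and the example coordinates are assumptions for illustration, and treating an IoU exactly equal to the threshold as a match is one common convention.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def mask_iou(a, b):
    """IoU of two binary masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def is_match(iou, threshold=0.5):
    """A prediction counts as a true positive when IoU reaches the threshold."""
    return iou >= threshold


gt_box = (2, 0, 12, 10)
good_pred = (0, 0, 10, 10)   # IoU = 80 / 120
poor_pred = (6, 0, 16, 10)   # IoU = 60 / 140

mask_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
mask_b = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(is_match(box_iou(good_pred, gt_box)), is_match(box_iou(poor_pred, gt_box)))
print(mask_iou(mask_a, mask_b))
```

Mask IoU is the stricter of the two for thin or irregular defects, because a small boundary error removes a large share of the overlapping pixels.
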
Table 4 Comparison of mainstream networks

| Model | mAP@0.5/% (box) | mAP@0.5/% (mask) | FPS | Params/M | GFLOPs |
|---|---|---|---|---|---|
| YOLOv5 | 73.8 | 58.0 | 59.5 | 5.6 | 11.0 |
| YOLOv7 | 81.6 | 66.5 | 58.8 | 13.6 | 47.7 |
| YOLOv8 (baseline) | 84.6 | 68.9 | 58.8 | 6.5 | 12.0 |
| YOLOv8-OSRA | 86.6 | 71.4 | 51.5 | 10.2 | 12.1 |
Fig. 11 Comparison of detection results before and after model optimization ((a) Original image; (b) Detection result of the YOLOv8n-Seg model before optimization; (c) Detection result of the YOLOv8-OSRA model after optimization)