Journal of Graphics, 2024, Vol. 45, Issue (4): 714-725. DOI: 10.11996/JG.j.2095-302X.2024040714
• Image Processing and Computer Vision •
HU Xin1, CHANG Yashu1, QIN Hao2, XIAO Jian3, CHENG Hongliang3
Received: 2024-02-10
Accepted: 2024-04-15
Online: 2024-08-31
Published: 2024-09-03
Contact: XIAO Jian
About the first author: HU Xin (1975-), professor, postdoc. Her main research interests cover power grid big data processing, machine learning, and deep learning. E-mail: huxin@chd.edu.cn
HU Xin, CHANG Yashu, QIN Hao, XIAO Jian, CHENG Hongliang. Binocular ranging method based on improved YOLOv8 and GMM image point set matching[J]. Journal of Graphics, 2024, 45(4): 714-725.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2024040714
Item | Configuration |
---|---|
Operating system | Windows 11 |
Programming language | Python 3.8 |
Deep learning framework | PyTorch 1.13.1 |
CPU | Intel(R) Core(TM) i7-13700H |
GPU | NVIDIA GeForce RTX 4060 (8 GB) |
CUDA | 11.6 |
Platform | PyCharm 2022, MATLAB 2021 |
Table 1 Software and hardware configuration
Fig. 9 Hook dataset under complex factors ((a) In dim light conditions; (b) In nighttime lighting environments; (c) With occlusion; (d) Small targets at long distances; (e) Complex backgrounds; (f) Under normal light conditions; (g) Negative sample 1; (h) Negative sample 2)
Model | Backbone | P | AP50 | FLOPs/G | Parameters/M |
---|---|---|---|---|---|
YOLOv3-spp | Darknet-53 | 0.944 | 0.945 | 283.1 | 104.710 |
YOLOv5s | CSP-Darknet-53 | 0.953 | 0.921 | 23.8 | 9.112 |
YOLOv6s | RepVGG | 0.964 | 0.945 | 44.0 | 16.297 |
YOLOv8s | C2f-sppf-Darknet-53 | 0.930 | 0.932 | 28.4 | 11.126 |
FS-YOLO (Ours) | FasterNet | 0.960 | 0.951 | 18.3 | 7.756 |
Table 2 Comparison of different network performance for object detection
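To make the comparison in Table 2 concrete, the following minimal sketch computes how much smaller FS-YOLO is than the YOLOv8s baseline, using only the FLOPs and parameter values reported in the table. The helper name `relative_reduction` is illustrative and does not come from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the reduction of `improved` relative to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Values copied from Table 2: YOLOv8s (baseline) versus FS-YOLO.
flops_reduction = relative_reduction(28.4, 18.3)
params_reduction = relative_reduction(11.126, 7.756)

print(f"FLOPs reduction: {flops_reduction:.1f}%")
print(f"Parameter reduction: {params_reduction:.1f}%")
```

Under these table values, FS-YOLO needs roughly 36% fewer FLOPs and about 30% fewer parameters than YOLOv8s.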
FasterNet | Slim-Neck | P | AP50 | FLOPs/G | Parameters/M |
---|---|---|---|---|---|
- | - | 0.930 | 0.928 | 28.4 | 11.126 |
√ | - | 0.934 | 0.939 | 21.7 | 8.616 |
- | √ | 0.955 | 0.941 | 25.1 | 10.265 |
√ | √ | 0.959 | 0.950 | 18.3 | 7.756 |
Table 3 Target detection algorithm ablation experiment
Model | P (60 noise points) | Recall (60) | F1 (60) | P (120 noise points) | Recall (120) | F1 (120) | P (180 noise points) | Recall (180) | F1 (180) | P (240 noise points) | Recall (240) | F1 (240) | P (300 noise points) | Recall (300) | F1 (300) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CPD (GMM) | 0.419 | 0.233 | 0.299 | 0.184 | 0.028 | 0.048 | 0.206 | 0.012 | 0.023 | 0.042 | 0.002 | 0.004 | 0.111 | 0.003 | 0.006 |
NGMM | 0.978 | 0.959 | 0.968 | 0.828 | 0.929 | 0.875 | 0.768 | 0.926 | 0.839 | 0.687 | 0.920 | 0.787 | 0.641 | 0.920 | 0.755 |
PGMM (Ours) | 0.992 | 0.966 | 0.979 | 0.970 | 0.968 | 0.969 | 0.954 | 0.966 | 0.960 | 0.927 | 0.969 | 0.948 | 0.899 | 0.965 | 0.931 |
Table 4 Performance comparison of different models for point set matching
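The F1 values in Table 4 agree with the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes F1 from the reported P and Recall for the 60-noise-point setting. It assumes the paper uses this standard definition; the helper `f1_score` and the `rows` dictionary are illustrative names, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, Recall, reported F1) at 60 noise points, copied from Table 4.
rows = {
    "CPD (GMM)": (0.419, 0.233, 0.299),
    "NGMM": (0.978, 0.959, 0.968),
    "PGMM (Ours)": (0.992, 0.966, 0.979),
}

# Recompute F1 for each model and keep it next to the reported value.
recomputed = {name: f1_score(p, r) for name, (p, r, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.3f}, reported F1 = {reported:.3f}")
```

The same check can be repeated for the other noise levels by swapping in the corresponding columns.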
Group | Actual distance/m | Computed distance/m | Error/m | Relative error/% |
---|---|---|---|---|
1 | 2.32 | 2.3363 | -0.0163 | -0.7017 |
2 | 3.98 | 4.0157 | 0.0357 | 0.8950 |
3 | 6.13 | 6.0436 | 0.0864 | 1.4085 |
4 | 7.91 | 8.1087 | -0.1987 | -2.5090 |
5 | 10.14 | 10.5532 | -0.4132 | -4.0737 |
Table 5 Experimental results of physical distance measurement for tower cranes
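The relative errors in Table 5 can be approximately reproduced from the listed distances. The sketch below computes the magnitude of the relative error as |computed − actual| / actual × 100. It is an illustration under two assumptions: the paper uses this definition, and the tabulated distances are rounded, so the recomputed values may differ from the table in the third decimal place. The function name `relative_error_percent` is hypothetical.

```python
def relative_error_percent(actual: float, computed: float) -> float:
    """Magnitude of the ranging error relative to the actual distance, in percent."""
    return abs(computed - actual) / actual * 100.0


# (actual distance in m, computed distance in m), copied from Table 5.
measurements = [
    (2.32, 2.3363),
    (3.98, 4.0157),
    (6.13, 6.0436),
    (7.91, 8.1087),
    (10.14, 10.5532),
]

rel_errors = [relative_error_percent(a, c) for a, c in measurements]

for (actual, _), err in zip(measurements, rel_errors):
    print(f"actual = {actual:5.2f} m, relative error = {err:.4f}%")
```

In this recomputation the relative error grows with distance and stays below 5% over the measured range of 2.32 m to 10.14 m, which matches the trend in the table.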