Journal of Graphics, 2026, Vol. 47, Issue (1): 68-77. DOI: 10.11996/JG.j.2095-302X.2026010068
• Image Processing and Computer Vision •
SONG Zhuo¹, LU Dehui¹, HUANG Zhichao¹, TIAN Shiyu¹, YAN Ronglong², DENG Yichuan²,³
Received: 2025-03-19
Accepted: 2025-07-23
Online: 2026-02-28
Published: 2026-03-16
Contact: DENG Yichuan
SONG Zhuo, LU Dehui, HUANG Zhichao, TIAN Shiyu, YAN Ronglong, DENG Yichuan. Performance evaluation of construction site object detection under drone-captured perspective[J]. Journal of Graphics, 2026, 47(1): 68-77.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2026010068
| Dataset | Modality | Images (k) | Image size (pixels) | Objects (k) | Classes | Task scenario | Open source |
|---|---|---|---|---|---|---|---|
| CARPK | Visible light | 1.45 | 1280×720 | 89.78 | 1 | Vehicle counting | Yes |
| UAVDT | Visible light | 80.00 | 1080×540 | 840.00 | 3 | Vehicle detection and tracking | Yes |
| VisDrone | Visible light | 10.21 | 2000×1500 | 540.00 | 10 | Multi-class object detection | Yes |
| DAC-SDC | Visible light | 150.00 | 640×360 | ─ | 95 | Multi-class object detection | Yes |
| AU-Air | Multimodal | 32.82 | 1920×1080 | 132.00 | 8 | Traffic monitoring | Yes |
| UVSD | Visible light | 5.87 | 960×540 to 5280×2970 | 58.60 | 1 | Vehicle detection | Yes |
| MOHR | Visible light | 10.63 | 5472×3078 / 7360×4192 / 8688×5792 | 90.01 | 5 | Multi-class object detection | No |
| DroneVehicle | Visible light, infrared | 56.88 | 840×712 | 819.00 | 5 | Vehicle detection | Yes |
| SeaDronesSee | Multispectral | 54.00 | 3840×2160 to 5456×3632 | 400.00 | 6 | Maritime person detection | Yes |
| ManipalUAV | Visible light | 13.46 | 1280×720 | 153.11 | 1 | Pedestrian detection | Yes |
Table 1 Survey of drone-captured image datasets
Fig. 2 Comparison of human posture from frontal and downward perspectives ((a) Human posture from the frontal perspective; (b) Human posture from the downward perspective)
| Model | Pretrained weights used | Training epochs | Batch size | Initial learning rate | Momentum |
|---|---|---|---|---|---|
| YOLO series | No | 100 | 16 | 0.0100 | 0.9 |
| Faster-RCNN | No | 10000 | 256 | 0.0010 | 0.9 |
| RetinaNet | No | Frozen stage: 50; unfrozen stage: 50 | Frozen stage: 16; unfrozen stage: 8 | 0.0001 | 0.9 |
| DETR | Yes | 100 | 4 | 0.0001 | 0.9 |
Table 2 Training parameters of the algorithms
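The settings in Table 2 can be captured in a small configuration structure so that the per-model differences are easy to compare. The following is a minimal sketch, not the authors' training code: the names `Stage`, `TrainConfig`, and `CONFIGS` are hypothetical, the values are copied from the table as listed, and RetinaNet's frozen and unfrozen phases are modelled as two consecutive stages.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical helper type)."""
    epochs: int       # training epochs for this stage, as listed in Table 2
    batch_size: int   # batch size for this stage


@dataclass(frozen=True)
class TrainConfig:
    """Training settings for one model row of Table 2 (illustrative only)."""
    model: str
    pretrained: bool
    stages: Tuple[Stage, ...]
    initial_lr: float
    momentum: float = 0.9  # the same value is listed for every model

    @property
    def total_epochs(self) -> int:
        # Sum over stages so two-phase schedules are counted once each.
        return sum(stage.epochs for stage in self.stages)


# Values transcribed from Table 2.
CONFIGS = {
    "YOLO series": TrainConfig("YOLO series", False, (Stage(100, 16),), 0.01),
    "Faster-RCNN": TrainConfig("Faster-RCNN", False, (Stage(10000, 256),), 0.001),
    # RetinaNet: frozen stage first, then unfrozen stage.
    "RetinaNet": TrainConfig("RetinaNet", False, (Stage(50, 16), Stage(50, 8)), 0.0001),
    "DETR": TrainConfig("DETR", True, (Stage(100, 4),), 0.0001),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(f"{name}: pretrained={cfg.pretrained}, "
              f"total epochs={cfg.total_epochs}, lr={cfg.initial_lr}")
```

Keeping the two RetinaNet phases as separate stages preserves the change in batch size between them, which a single flat record would lose.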
| Algorithm | Person | Car | Concrete mixer truck | Truck | Concrete pump truck | Rotary drilling rig | Excavator | Crane truck | Trencher | mAP |
|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv8 | 0.888 | 0.921 | 0.981 | 0.973 | 0.987 | 0.994 | 0.993 | 0.985 | 0.931 | 0.961 |
| YOLOv9 | 0.887 | 0.913 | 0.980 | 0.968 | 0.988 | 0.994 | 0.992 | 0.984 | 0.934 | 0.960 |
| YOLOv10 | 0.888 | 0.921 | 0.981 | 0.973 | 0.987 | 0.994 | 0.993 | 0.985 | 0.931 | 0.961 |
| YOLO11 | 0.885 | 0.911 | 0.980 | 0.970 | 0.984 | 0.994 | 0.993 | 0.982 | 0.911 | 0.957 |
| Faster-RCNN | 0.484 | 0.743 | 0.888 | 0.754 | 0.784 | 0.871 | 0.812 | 0.799 | 0.733 | 0.763 |
| RetinaNet | 0.360 | 0.654 | 0.885 | 0.666 | 0.838 | 0.757 | 0.877 | 0.705 | 0.744 | 0.721 |
| DETR | 0.802 | 0.951 | 0.973 | 0.940 | 0.979 | 0.990 | 0.988 | 0.979 | 0.979 | 0.953 |
Table 3 Comparison of detection performance metrics across algorithms
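The mAP column of Table 3 can be cross-checked against the per-class columns. The sketch below assumes that each per-class entry is an average precision (AP) and that mAP is the unweighted mean over the nine classes; the function and variable names are illustrative and are not taken from the paper.

```python
# Per-class values copied from Table 3, in column order:
# person, car, concrete mixer truck, truck, concrete pump truck,
# rotary drilling rig, excavator, crane truck, trencher.
PER_CLASS_AP = {
    "YOLOv8":      [0.888, 0.921, 0.981, 0.973, 0.987, 0.994, 0.993, 0.985, 0.931],
    "YOLOv9":      [0.887, 0.913, 0.980, 0.968, 0.988, 0.994, 0.992, 0.984, 0.934],
    "YOLOv10":     [0.888, 0.921, 0.981, 0.973, 0.987, 0.994, 0.993, 0.985, 0.931],
    "YOLO11":      [0.885, 0.911, 0.980, 0.970, 0.984, 0.994, 0.993, 0.982, 0.911],
    "Faster-RCNN": [0.484, 0.743, 0.888, 0.754, 0.784, 0.871, 0.812, 0.799, 0.733],
    "RetinaNet":   [0.360, 0.654, 0.885, 0.666, 0.838, 0.757, 0.877, 0.705, 0.744],
    "DETR":        [0.802, 0.951, 0.973, 0.940, 0.979, 0.990, 0.988, 0.979, 0.979],
}

# mAP values as reported in the last column of Table 3.
REPORTED_MAP = {
    "YOLOv8": 0.961, "YOLOv9": 0.960, "YOLOv10": 0.961, "YOLO11": 0.957,
    "Faster-RCNN": 0.763, "RetinaNet": 0.721, "DETR": 0.953,
}


def mean_ap(ap_values):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(ap_values) / len(ap_values)


def check_table(per_class, reported, tol=5e-4):
    """Map each model to (computed mAP, whether it agrees with the reported value)."""
    result = {}
    for model, values in per_class.items():
        computed = mean_ap(values)
        # tol of half a unit in the third decimal allows for rounding in the table.
        result[model] = (computed, abs(computed - reported[model]) <= tol)
    return result


if __name__ == "__main__":
    for model, (value, ok) in check_table(PER_CLASS_AP, REPORTED_MAP).items():
        print(f"{model}: computed {value:.3f}, "
              f"reported {REPORTED_MAP[model]:.3f}, match={ok}")
```

Under that assumption, the computed mean for every row rounds to the reported mAP, which suggests the mAP column is a plain average over the nine classes.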
[1] ZHU M. Recognition of high-risk scenarios in building construction based on image semantics[D]. Dalian: Dalian University of Technology, 2020 (in Chinese).
[2] CUI Z Q, YANG S J, YU D H. Research progress on the application of artificial intelligence in the field of building construction[J]. Journal of Shandong Jianzhu University, 2023, 38(4): 117-125, 134 (in Chinese).
[3] PANERU S, JEELANI I. Computer vision applications in construction: current state, opportunities & challenges[J]. Automation in Construction, 2021, 132: 103940.
[4] WU Y Q, TONG K. Research advances on deep learning-based small object detection in UAV aerial images[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(3): 30848 (in Chinese).
[5] YIN D. Study of intelligent construction site management based on UAV and computer vision[D]. Changsha: Hunan University, 2022 (in Chinese).
[6] SHI Z Q. Research on unsafe behavior detection and safety state analysis of construction site based on UAV remote sensing data[D]. Yichang: China Three Gorges University, 2023 (in Chinese).
[7] GIRSHICK R, DONAHUE J, DARRELL T, et al. Region-based convolutional networks for accurate object detection and segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(1): 142-158.
[8] GIRSHICK R. Fast R-CNN[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 1440-1448.
[9] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[10] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[11] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 936-944.
[12] KHANAM R, HUSSAIN M. YOLOv11: an overview of the key architectural enhancements[EB/OL]. [2025-03-05]. https://arxiv.org/pdf/2410.17725.
[13] JOCHER G. Ultralytics YOLO[EB/OL]. [2025-03-05]. https://github.com/ultralytics/ultralytics.
[14] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// The 14th European Conference on Computer Vision. Cham: Springer, 2016: 21-37.
[15] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327.
[16] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 213-229.
[17] WU X, LI W, HONG D F, et al. Deep learning for unmanned aerial vehicle-based object detection and tracking: a survey[J]. IEEE Geoscience and Remote Sensing Magazine, 2022, 10(1): 91-124.
[18] TANG G Y, NI J J, ZHAO Y H, et al. A survey of object detection for UAVs based on deep learning[J]. Remote Sensing, 2024, 16(1): 149.
[19] XIANG T Z, XIA G S, ZHANG L P. Mini-unmanned aerial vehicle-based remote sensing: techniques, applications, and prospects[J]. IEEE Geoscience and Remote Sensing Magazine, 2019, 7(3): 29-63.
[20] DING J J, ZHANG J H, ZHAN Z Q, et al. A precision efficient method for collapsed building detection in post-earthquake UAV images based on the improved NMS algorithm and faster R-CNN[J]. Remote Sensing, 2022, 14(3): 663.
[21] CHEN F C, JAHANSHAHI M R. ARF-Crack: rotation invariant deep fully convolutional network for pixel-level crack detection[J]. Machine Vision and Applications, 2020, 31(6): 47.
[22] ZHOU Y Z, YE Q X, QIU Q, et al. Oriented response networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 4961-4970.
[23] HU G S, YAO P, WAN M Z, et al. Detection and classification of diseased pine trees with different levels of severity from UAV remote sensing images[J]. Ecological Informatics, 2022, 72: 101844.
[24] BASHIR S M A, WANG Y. Small object detection in remote sensing images with residual feature aggregation-based super-resolution and object detector network[J]. Remote Sensing, 2021, 13(9): 1854.
[25] JIANG W Q, GAO H Y, ZHENG J Q, et al. Review of researches on the application of UAV in the civilian industry[J]. Mechanical & Electrical Engineering Technology, 2025, 54(9): 119-124, 183 (in Chinese).
[26] VARGA L A, KIEFER B, MESSMER M, et al. SeaDronesSee: a maritime benchmark for detecting humans in open water[C]// 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2022: 3686-3696.
[27] DENG J N, SHI Z G, ZHUO C. Energy-efficient real-time UAV object detection on embedded platforms[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(10): 3123-3127.
[28] HSIEH M R, LIN Y L, HSU W H. Drone-based object counting by spatially regularized regional proposal network[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 4165-4173.
[29] DU D W, QI Y K, YU H Y, et al. The unmanned aerial vehicle benchmark: object detection and tracking[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 375-391.
[30] ZHU P F, WEN L Y, BIAN X, et al. Vision meets drones: a challenge[EB/OL]. [2025-03-05]. https://arxiv.org/abs/1804.07437.
[31] XU X W, ZHANG X Y, YU B, et al. DAC-SDC low power object detection challenge for UAV applications[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 392-403.
[32] BOZCAN I, KAYACAN E. AU-AIR: a multi-modal unmanned aerial vehicle dataset for low altitude traffic surveillance[C]// 2020 IEEE International Conference on Robotics and Automation. New York: IEEE Press, 2020: 8504-8510.
[33] ZHANG W, LIU C S, CHANG F L, et al. Multi-scale and occlusion aware network for vehicle detection and segmentation on UAV aerial images[J]. Remote Sensing, 2020, 12(11): 1760.
[34] ZHANG H J, SUN M S, LI Q, et al. An empirical study of multi-scale object detection in high resolution UAV images[J]. Neurocomputing, 2021, 421: 173-182.
[35] SUN Y M, CAO B, ZHU P F, et al. Drone-based RGB-infrared cross-modality vehicle detection via uncertainty-aware learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(10): 6700-6713.
[36] AKSHATHA K R, KARUNAKAR A K, SHENOY B S, et al. Manipal-UAV person detection dataset: a step towards benchmarking dataset and algorithms for small object detection[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2023, 195: 77-89.
[37] AHMED I, AHMAD M, ADNAN A, et al. Person detector for different overhead views using machine learning[J]. International Journal of Machine Learning and Cybernetics, 2019, 10(10): 2657-2668.
[38] CAO Z, KOOISTRA L, WANG W S, et al. Real-time object detection based on UAV remote sensing: a systematic literature review[J]. Drones, 2023, 7(10): 620.
[39] SUN Z Q, CAO S C, YANG Y M, et al. Rethinking transformer-based set prediction for object detection[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 3591-3600.
[40] SZELISKI R. Computer vision: algorithms and applications[M]. 2nd ed. New York: Springer, 2022: 30-35.
[41] DEAN J, CORRADO G S, MONGA R, et al. Large scale distributed deep networks[C]// The 26th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2012: 1223-1231.