UAV-YOLOv5: A Lightweight Object Detection Algorithm on Drone-captured Scenarios
Object Detection; Small Object; Drone; YOLO.
Abstract
To address the problems that common object detection algorithms face in drone-captured scenarios, such as oversized models, difficult deployment, and low accuracy on small-scale objects, this paper proposes a series of improvements based on YOLOv5. First, a new dual-branch CSPNet (DB-CSPNet) structure reduces the complexity and computation of the model. Second, a new feature fusion path (FS-FPN) improves the detection accuracy of the model. Third, an attention mechanism (ACmix) is integrated to further improve performance. The experimental results show that the proposed methods significantly improve the accuracy of object detection in drone-captured scenarios. With methods 2 and 3, mAP@0.5 and mAP@0.5:0.95 improve by 2.5% and 1.6%, respectively. Method 1 also achieves a good lightweight effect, reducing model parameters and FLOPs by 26.6% and 30.4%, respectively. UAV-YOLOv5, which combines all of the methods in this paper, achieves a good balance between accuracy and model size: compared with the default YOLOv5s, mAP@0.5 and mAP@0.5:0.95 increase by 1.5% and 1.0%, while parameters and FLOPs decrease by 3.7% and 7.0%, respectively.
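The parameter and FLOPs figures above read as changes relative to a baseline model. As a minimal sketch, assuming they are computed as (new - baseline) / baseline, the arithmetic looks like the following. The function name `relative_change` and the counts used are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
def relative_change(baseline: float, new: float) -> float:
    """Return the signed change of `new` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical placeholder counts, NOT values reported in the paper.
baseline_params = 1_000_000
reduced_params = 734_000

params_change = relative_change(baseline_params, reduced_params)
print(f"parameter change: {params_change:.1f}%")  # prints "parameter change: -26.6%"
```

Whether the reported mAP gains are relative changes of this kind or absolute differences in percentage points is not stated in the abstract, so the sketch applies only to the parameter and FLOPs figures under the stated assumption.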
Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV), 2014, pp. 740–755.
Jocher, G. 6.2 - YOLOv5 Classification Models, Apple M1, Reproducibility, ClearML and integrations, 2022. https: //
Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A New Backbone That Can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2020.
Pan, X.; Ge, C.; Lu, R.; Song, S.; Chen, G.; Huang, Z.; Huang, G. On the Integration of Self-Attention and Convolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 815–825.
Ding, X.; Zhang, X.; Ma, N.; Han, J.; Ding, G.; Sun, J. RepVGG: Making VGG-Style ConvNets Great Again. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021, pp. 13733–13742.
Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv preprint arXiv:2004.10934, 2020.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
Ghiasi, G.; Lin, T.Y.; Le, Q.V. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
Zhu, P.; Wen, L.; Du, D.; Bian, X.; Fan, H.; Hu, Q.; Ling, H. Detection and Tracking Meet Drones Challenge. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021, pp. 1–1.
Copyright (c) 2024 Scientific Journal of Intelligent Systems Research
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.