Enhanced Real-Time Object Detection using YOLOv7 and MobileNetv3

Authors

  • Sara Ennaama, SIGL LAB., ENSA of Tetouan, Abdelmalek Essaadi University, Tetouan, Morocco
  • Hassan Silkan, Department of Computer Science, Laboratory LAROSERI, Faculty of Sciences, University of Chouaib Doukkali, El Jadida, Morocco
  • Ahmed Bentajer, SIGL LAB., ENSA of Tetouan, Abdelmalek Essaadi University, Tetouan, Morocco
  • Abderrahim Tahiri, SIGL LAB., ENSA of Tetouan, Abdelmalek Essaadi University, Tetouan, Morocco
Volume: 15 | Issue: 1 | Pages: 19181-19187 | February 2025 | https://doi.org/10.48084/etasr.8777

Abstract

Object detection is a core task in computer vision and increasingly relies on deep learning. Among existing methods, the YOLO series has gained recognition as an effective solution. This research enhances object detection by combining YOLOv7 with MobileNetv3, which is known for its efficiency and feature extraction capability. The integrated model was evaluated on the COCO dataset, which contains over 164,000 images across 80 categories, and achieved an mAP of 0.61. Confusion matrix analysis further confirmed its accuracy, especially in detecting common objects such as 'person' and 'car' with minimal misclassifications. The results demonstrate the potential of the proposed model to handle the complexities of real-world scenarios and highlight its applicability in various scientific and industrial domains.
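
For readers unfamiliar with the mAP metric reported above, the following is a minimal Python sketch of how average precision is commonly computed for one class and then averaged over classes. It is an illustration only and not the authors' evaluation code: the function names (iou, average_precision, mean_average_precision), the (x1, y1, x2, y2) box format, and the single IoU threshold of 0.5 are all assumptions. The abstract does not state which mAP variant was used, and the official COCO protocol additionally averages over several IoU thresholds.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class on one image (hypothetical helper).

    detections: list of (score, box); ground_truths: list of boxes.
    Each detection, taken in descending score order, is matched to the
    best-overlapping ground-truth box that is still unmatched.
    """
    if not ground_truths:
        return 0.0
    matched = [False] * len(ground_truths)
    flags = []  # 1 for a true positive, 0 for a false positive
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_threshold:
            matched[best_idx] = True
            flags.append(1)
        else:
            flags.append(0)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp_count = [], [], 0
    for rank, flag in enumerate(flags, start=1):
        tp_count += flag
        precisions.append(tp_count / rank)
        recalls.append(tp_count / len(ground_truths))

    # Make precision non-increasing from left to right (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP as the plain mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

In this simplified setting, a score such as the reported 0.61 would correspond to the mean of the per-class AP values over the 80 COCO categories.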

Keywords:

real-time object detection, deep learning, YOLOv7, MobileNetv3, computer vision

How to Cite

[1]
Ennaama, S., Silkan, H., Bentajer, A. and Tahiri, A. 2025. Enhanced Real-Time Object Detection using YOLOv7 and MobileNetv3. Engineering, Technology & Applied Science Research. 15, 1 (Feb. 2025), 19181–19187. DOI:https://doi.org/10.48084/etasr.8777.
