Q_YOLOv5m: A Quantization-based Approach for Accelerating Object Detection on Embedded Platforms

Authors

  • Nizal Alshammry, Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Taoufik Saidani, Center for Scientific Research and Entrepreneurship, Northern Border University, 73213, Arar, Saudi Arabia
  • Nasser S. Albalawi, Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Sami Mohammed Alenezi, Department of Information Technology, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Fahd Alhamazani, Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Sami Aziz Alshammari, Department of Information Technology, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Mohammed Aleinzi, Department of Information Systems, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Abdulaziz Alanazi, Department of Information Systems, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
  • Mahmoud Salaheldin Elsayed, Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia
Volume: 15 | Issue: 1 | Pages: 19749-19755 | February 2025 | https://doi.org/10.48084/etasr.9441

Abstract

Deploying deep learning models on resource-constrained embedded platforms is challenging because computational power, memory, and energy budgets are limited. To address this issue, this study proposes a novel quantization method tailored to accelerating object detection, built on a quantized version of the YOLOv5m model called Q_YOLOv5m. The method reduces the model's computational complexity and memory footprint, enabling faster inference and lower power consumption, which makes it well suited to real-time applications on embedded systems. It combines weight and activation quantization techniques to balance inference speed against accuracy, adjusting numerical precision dynamically according to hardware capabilities. The efficacy of Q_YOLOv5m was confirmed: inference speed increased substantially and model size decreased, with negligible loss in object detection accuracy. The findings underscore the suitability of Q_YOLOv5m for edge applications, including autonomous vehicles, intelligent surveillance, and IoT-based monitoring systems.
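
For readers unfamiliar with the idea, the following is a minimal sketch of uniform affine quantization, the general family that weight and activation quantization usually belong to. It is not the authors' Q_YOLOv5m procedure, which this page does not describe in detail. The function names (`affine_params`, `quantize`, `dequantize`) and the `num_bits` parameter are hypothetical, chosen for illustration; varying `num_bits` only hints at how precision might be matched to a target device.

```python
from typing import List, Tuple


def affine_params(values: List[float], num_bits: int = 8) -> Tuple[float, int]:
    """Derive a scale and zero point mapping the value range onto unsigned integers.

    The range is widened to include 0.0 so that zero is represented exactly.
    """
    qmax = 2 ** num_bits - 1
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any scale works
        return 1.0, 0
    scale = (hi - lo) / qmax
    zero_point = int(round(-lo / scale))
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             num_bits: int = 8) -> List[int]:
    """Round each value to the nearest integer level and clamp to [0, 2^bits - 1]."""
    qmax = 2 ** num_bits - 1
    return [min(qmax, max(0, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values: List[int], scale: float, zero_point: int) -> List[float]:
    """Map integer levels back to approximate real values."""
    return [scale * (q - zero_point) for q in q_values]


# Illustrative values only (assumed, not taken from the paper).
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
scale, zero_point = affine_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Storing `q_weights` as 8-bit integers instead of 32-bit floats is where the memory saving comes from in schemes of this kind; the gap between `weights` and `restored` is the rounding error that such methods try to keep small.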

Keywords:

object detection, quantization, embedded systems, deep learning

How to Cite

[1]
Alshammry, N., Saidani, T., Albalawi, N.S., Alenezi, S.M., Alhamazani, F., Alshammari, S.A., Aleinzi, M., Alanazi, A. and Elsayed, M.S. 2025. Q_YOLOv5m: A Quantization-based Approach for Accelerating Object Detection on Embedded Platforms. Engineering, Technology & Applied Science Research. 15, 1 (Feb. 2025), 19749–19755. DOI:https://doi.org/10.48084/etasr.9441.
