Advanced Gesture Recognition in Gaming: Implementing EfficientNetV2-B1 for "Rock, Paper, Scissors"
Received: 29 January 2025 | Revised: 27 February 2025 | Accepted: 6 March 2025 | Online: 6 May 2025
Corresponding author: Biswaranjan Acharya
Abstract
This study introduces a gesture recognition system for the classic "Rock, Paper, Scissors" game based on a modified EfficientNetV2-B1 architecture. The dataset comprises 2,700 images evenly divided among the three classes: "Rock", "Paper", and "Scissors". Leveraging the efficiency and accuracy of EfficientNetV2-B1 in image recognition tasks, the system was trained to classify these gestures. After fine-tuning, it achieved an accuracy of 98.89% and an Area Under the Curve (AUC) of approximately 1.0, indicating near-perfect classification across all classes. This performance highlights the potential of EfficientNetV2-B1 for real-time gesture recognition, with applications in interactive gaming and other gesture-based user interfaces. The proposed system also offers a foundation for further research and development in gesture recognition technologies.
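The page does not include the training code, but the setup described in the abstract (an ImageNet-pretrained EfficientNetV2-B1 backbone fine-tuned on 2,700 rock/paper/scissors images) can be sketched in TensorFlow/Keras. This is a minimal sketch, not the authors' reported configuration: the directory layout ("data/{rock,paper,scissors}"), the 80/20 split, the input resolution, and all hyperparameters below are illustrative assumptions.

import tensorflow as tf

IMG_SIZE = (240, 240)   # EfficientNetV2-B1's default input resolution (assumed here)
NUM_CLASSES = 3         # rock, paper, scissors

# Assumed layout: data/{rock,paper,scissors}/*.jpg with 2,700 images in total.
# label_mode defaults to "int", so labels are sparse integer class indices.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=32)

# ImageNet-pretrained backbone without its 1000-class head. The Keras
# EfficientNetV2 models include input preprocessing by default, so raw
# [0, 255] pixel values can be fed in directly.
base = tf.keras.applications.EfficientNetV2B1(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
base.trainable = False  # stage 1: train only the new classification head

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = base(inputs, training=False)  # keep batch-norm statistics frozen
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# Stage 2: unfreeze the backbone and fine-tune at a much lower learning rate.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

Freezing the backbone first and then unfreezing it at a reduced learning rate is a common two-stage transfer-learning recipe; calling the base with training=False keeps its batch-normalization layers in inference mode throughout, which is the standard Keras fine-tuning pattern.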
Keywords: rock-paper-scissors, deep learning, EfficientNetV2-B1, gesture recognition
License
Copyright (c) 2025 Chander Prabha, Retinderdeep Singh, Meena Malik, Manas Ranjan Pradhan, Biswaranjan Acharya

This work is licensed under a Creative Commons Attribution 4.0 International License.