Efficient Job Scheduling in Cloud Environments using Reinforcement Learning Actor-Critic Models
Received: 12 June 2024 | Revised: 24 July 2024 | Accepted: 26 July 2024 | Online: 9 October 2024
Corresponding author: Archana Naik
Abstract
Scheduling job execution on cloud virtual machines is an optimization problem in which efficient resource usage and a short makespan are the main objectives. Balancing the workload across all available virtual machines improves scheduling performance. Reinforcement learning suits this problem because it adapts to dynamic environments and balances exploration and exploitation. This work applies an Actor-Critic reinforcement learning algorithm to balance the job-scheduling workload. The algorithm's performance is analyzed on the Alibaba cloud dataset. Policy constraints govern the number of tasks assigned to the scheduler. During the learning phase, the rewards are negative; after the learning phase, they stabilize, and the algorithm produces positive rewards. A 5% reduction in the makespan of job execution demonstrates the improvement in scheduling and resource use.
Keywords:
cloud resource scheduling, deep reinforcement learning, learning algorithm, task scheduling
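To make the idea in the abstract concrete, the following is a minimal tabular sketch of a one-step Actor-Critic scheduler that assigns jobs to virtual machines. It is not the authors' implementation. The state encoding (index of the least-loaded VM), the reward (negative growth in makespan, so it is never positive, unlike the reward reported in the abstract), the learning rates, and all function names are assumptions chosen for illustration, and the job durations are made-up values rather than Alibaba trace data.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def makespan(loads):
    """Makespan is the finishing time of the most heavily loaded VM."""
    return max(loads)


def run_episode(jobs, n_vms, theta, values, rng,
                alpha=0.1, beta=0.1, gamma=0.95, learn=True):
    """Schedule every job once; optionally update the actor and the critic.

    theta[s][a] holds actor preferences, values[s] holds critic estimates.
    The state s is the index of the least-loaded VM (an assumed encoding).
    """
    loads = [0.0] * n_vms
    total_reward = 0.0
    for i, duration in enumerate(jobs):
        state = loads.index(min(loads))
        probs = softmax(theta[state])
        action = rng.choices(range(n_vms), weights=probs)[0]

        before = makespan(loads)
        loads[action] += duration
        # Assumed reward: penalize any growth of the makespan.
        reward = -(makespan(loads) - before)
        total_reward += reward

        if learn:
            done = i == len(jobs) - 1
            next_state = loads.index(min(loads))
            target = reward + (0.0 if done else gamma * values[next_state])
            td_error = target - values[state]
            # Critic update: move V(s) toward the one-step TD target.
            values[state] += beta * td_error
            # Actor update: policy-gradient step weighted by the TD error.
            for a in range(n_vms):
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += alpha * td_error * grad
    return makespan(loads), total_reward


def train(jobs, n_vms, episodes=500, seed=0):
    """Train on shuffled copies of the job list and return (theta, values)."""
    rng = random.Random(seed)
    theta = [[0.0] * n_vms for _ in range(n_vms)]
    values = [0.0] * n_vms
    for _ in range(episodes):
        shuffled = jobs[:]
        rng.shuffle(shuffled)
        run_episode(shuffled, n_vms, theta, values, rng)
    return theta, values


if __name__ == "__main__":
    jobs = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]  # illustrative durations
    theta, values = train(jobs, n_vms=3)
    span, total = run_episode(jobs, 3, theta, values,
                              random.Random(1), learn=False)
    print("makespan:", span, "episode reward:", total)
```

Because the per-step rewards telescope, the episode reward in this sketch equals the negative of the final makespan, so maximizing reward and minimizing makespan coincide. A deep variant would replace the two tables with neural networks over a richer state.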
License
Copyright (c) 2024 Archana Naik, Kavitha Sooda
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.