TY - JOUR
T1 - A bi-objective learn-and-deploy scheduling method for bursty and stochastic requests on heterogeneous cloud servers
AU - Cai, Xinye
AU - Xu, Haiyang
AU - Li, Xiaoping
AU - Wang, Kang
AU - Chen, Long
AU - Ruiz García, Rubén
AU - Zhang, Qingfu
PY - 2022/12
Y1 - 2022/12
N2 - In this article, we consider the dynamic allocation of bursty requests stochastically arriving at heterogeneous servers with uncertain setup times. A lower expected response time and lower power consumption are the desired objectives of users and service providers, respectively. However, the sudden increases and decreases in cloud servers caused by bursty requests make it challenging to reach an appropriate trade-off between these two conflicting objectives, both of which are closely related to the launched servers. The heterogeneity of the cloud servers makes it even more difficult to decide how to switch servers on and off and how to allocate bursty requests effectively and efficiently while balancing the objectives. Based on a Markov decision process, a real-time bilevel decision-making model is constructed for unallocated requests; it decides whether to launch a server and which type of server to launch. A learn-and-deploy algorithm framework with two complementary stages is proposed. In the first stage, an effective offline bi-objective optimization algorithm is proposed to learn a set of policies, which provides helpful trade-off information for a decision-maker to choose a preferred policy a posteriori. Given the system status, a policy decides whether to launch a server according to a state-action table and which server to launch according to a server priority sequence. In the second stage, a computationally efficient policy deployment method is proposed to look up the corresponding action in the selected policy based on the current system status and apply it to the real-time system. Experimental studies over a large number of random and real instances have been conducted to validate the effectiveness of the proposed bilevel model and algorithm. Compared with the most recent existing method, the proposed approach achieves up to an 80% improvement in power consumption and up to a 20% improvement in response time.
AB - In this article, we consider the dynamic allocation of bursty requests stochastically arriving at heterogeneous servers with uncertain setup times. A lower expected response time and lower power consumption are the desired objectives of users and service providers, respectively. However, the sudden increases and decreases in cloud servers caused by bursty requests make it challenging to reach an appropriate trade-off between these two conflicting objectives, both of which are closely related to the launched servers. The heterogeneity of the cloud servers makes it even more difficult to decide how to switch servers on and off and how to allocate bursty requests effectively and efficiently while balancing the objectives. Based on a Markov decision process, a real-time bilevel decision-making model is constructed for unallocated requests; it decides whether to launch a server and which type of server to launch. A learn-and-deploy algorithm framework with two complementary stages is proposed. In the first stage, an effective offline bi-objective optimization algorithm is proposed to learn a set of policies, which provides helpful trade-off information for a decision-maker to choose a preferred policy a posteriori. Given the system status, a policy decides whether to launch a server according to a state-action table and which server to launch according to a server priority sequence. In the second stage, a computationally efficient policy deployment method is proposed to look up the corresponding action in the selected policy based on the current system status and apply it to the real-time system. Experimental studies over a large number of random and real instances have been conducted to validate the effectiveness of the proposed bilevel model and algorithm. Compared with the most recent existing method, the proposed approach achieves up to an 80% improvement in power consumption and up to a 20% improvement in response time.
KW - Analytical models
KW - bursty and stochastic requests
KW - Cloud computing
KW - dynamic programming
KW - heterogeneous servers
KW - learn-and-deploy
KW - Markov decision process
KW - multiobjective optimization
KW - Optimization
KW - Power demand
KW - Real-time systems
KW - Servers
KW - Time factors
UR - http://www.scopus.com/inward/record.url?scp=85136884500&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85136884500&origin=recordpage
U2 - 10.1109/TPDS.2022.3196475
DO - 10.1109/TPDS.2022.3196475
M3 - RGC 21 - Publication in refereed journal
SN - 1045-9219
VL - 33
SP - 4547
EP - 4562
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 12
ER -