320 likes | 432 Views
Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds. Future Generation Computer Systems(FGCS.J) journal homepage: www.elsevier.com/locate/fgcs Saeid Abrishami a,∗, Mahmoud Naghibzadeha , Dick H.J. Epemab. Tai , Yu-Chang 4/29/2013.
E N D
Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds • Future Generation Computer Systems(FGCS.J) • journal homepage: www.elsevier.com/locate/fgcs SaeidAbrishami a,∗, MahmoudNaghibzadeha, Dick H.J. Epemab Tai, Yu-Chang 4/29/2013
Introduction • Scheduling system model • IaaScloud partial critical paths algorithms • An illustrative example • Time complexity • Performance evaluation • Conclusions Outline
Clouds are different from utility Grids - on-demand resource provisioning - homogeneous networks - the pay-as-you-go pricing model • consider the benefits of using Cloud computing for executing scientific workflows -there exist several commercial Clouds, such as Amazon Introduction
Infrastructure as a Service (IaaS) Clouds, has some potential benefits for executing scientific workflows 1. users can dynamically obtain and release resources on demand, and charged on a pay-as-you-go basis 2.resource provisioning 3. illusion of unlimited resources important parameter : economiccost -faster resources are more expensive than slower ones -time-cost tradeoff in selecting appropriate services -belongs to the multi-criteria optimization problems minimize the execution cost of the workflow, while completing the workflow before the user specified deadline IaaSCloud Partial Critical Paths (IC-PCP) IaaSCloud Partial Critical Paths with Deadline Distribution (IC-PCPD2) Introduction
An application is modeled by a directed acyclic graph G(T , E) • T is a set of n tasks {t1, t2, . . . , tn} • E is a set of dependencies ei,j=(ti,tj) • two dummy tasks tentry and texitto the beginning and the end of the workflow (zero execution time and they are connected with zero-weight dependencies to the actual entry and exit tasks) Scheduling system model
services S = {s1,s2,…,sm} with different QoS parameters such as CPU type and memory size, and different prices • The pricing model is based on a pay-as-you-go basis similar to the current commercial Clouds, i.e., the users are charged based on the number of time intervals that they have used the resource, even if they have not completely used the last time interval c1=5 c2=2 c3=1 Scheduling system model
ET(ti, sj) :execution time of task tion computation service sj • average bandwidth between the computation services is roughly equal • TT(ei,j) : data transfer time of a dependency ei,j • MET(ti) : Minimum Execution Time of a task ti -execution time of task ti on a service sj∈ S which has the minimum ET(ti, sj) between all available services p p ti ti c c
SS (ti) = sj,k: Selected Service for each scheduled task ti • sj,k :kth instance of service sj. • AST (ti) : Actual Start Time of ti • assigned node :has already been assigned to (scheduled on) a service • Critical Parent : of a node ti is the unassigned parent of ti that has the latest data arrival time at ti, that is, it is the parent tp of ti for which EFT(tp) + TT(ep,i), is maximal • PCP: The Partial Critical Path of a node ti is: - empty if ti does not have any unassigned parents -consists of the Critical Parent tp of ti and the Partial Critical Path of tp if has any unassigned parents
IC-PCP Algorithm1
10 20 30 0 1 2 3 0~2__19 3~7__24 8~13__30 5 8 11 2 5 8 4 6 10 example D=30 0~5__16 7~10__23 14~17__30 5 12 16 3 8 11 3 6 8 0~3__16 14~19__30 7~11__22 5 8 14 4 8 11 3 5 9
10 20 30 0 1 Path{t2,t6,t9} 2 S2,1 28 2 3 0~2__19 3~7__24 8~13__30 5 8 11 2 5 8 4 6 10 example 0~12__14 14~17__23 21~24__23 D=30 0~5__16 7~10__23 14~17__30 5 12 16 3 8 11 3 6 8 0~3__12 12~20__23 20~28__30 0~3__16 14~19__30 7~11__22 5 8 14 4 8 11 3 5 9
10 20 30 0 1 2 S2,1 2 28 Path{t3} 9 1 3 S3,1 0~2__19 3~7__24 8~13__30 5 8 11 2 5 8 4 6 10 D=30 0~12__14 14~17__23 21~24__30 5 12 16 3 8 11 3 6 8 0~9__12 0~3__12 12~20__22 20~28__30 5 8 14 4 8 11 3 5 9
10 20 30 0 1 2 28 2 S2,1 Path{t5,t8} S2,2 14 6 0~2__18 3~7__23 9 1 3 S3,1 0~2__19 3~7__24 8~13__30 5 8 11 2 5 8 4 6 10 14~22__24 22~28__30 D=30 0~12__14 14~17__23 21~24__30 5 12 16 3 8 11 3 6 8 0~9__12 12~20__22 20~28__30 5 8 14 4 8 11 3 5 9
10 20 30 0 1 2 28 Path{t1,t4} 2 S2,1 S2,2 14 6 0~8__13 8~18__23 19~24__30 9 1 0~2__18 3 3~7__23 S3,1 8~13__30 5 8 11 2 5 8 4 6 10 2 S3,2 18 D=30 0~12__14 14~22__24 22~28__30 5 12 16 3 8 11 3 6 8 0~9__12 12~20__22 20~28__30 5 8 14 4 8 11 3 5 9
10 20 30 0 5 1 2 2 28 2 S2,1 Path{t7} COST=2*5+1*4=14 S2,2 14 6 18~29__30 8~18__23 9 1 0~8__13 1 19~24__30 3 S3,1 5 8 11 2 5 8 4 6 10 2 S3,2 18 1 11 S3,3 D=30 0~12__14 14~22__24 22~28__30 5 12 16 3 8 11 3 6 8 0~9__12 12~20__22 20~28__30 5 8 14 4 8 11 3 5 9
Applicable • applicable instance for a path if it satisfies two conditions: - The path can be scheduled on the instance such that each task of the path is finished before its latest finish time - The new schedule uses (a part of) the extra time of the instance,which is the remaining time of the last time interval of thatinstance. P C Cost=zero C P
IC-PCPD2 Call PLANNING(G(T,E)) Algorithm2 Assign subdeadline on PCP node (assigned node)
0 10 20 30 S1,1 S2,1 t1 tentety t1 t2 t3 t4 S3,1 0~5__6 6~10__24 11~16__30 0~2__6 3~7__24 8~13__30 5 8 11 2 5 8 4 6 10 0~5__7 7~10__13 14~17__30 5 12 16 3 8 11 sb=0 D=30 3 6 8 0~3__16 14~19__30 7~11__17 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t2 S2,1 t1 tentety t1 t2 t3 t4 t5 S3,1 0~5__6 6~10__24 11~16__30 5 8 11 2 5 8 4 6 10 0~5__7 7~10__13 14~17__30 5 12 16 3 8 11 sb=0 D=30 3 6 8 0~3__16 14~19__30 7~11__17 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t2 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 S3,1 t3 0~5__6 6~10__24 11~16__30 5 8 11 2 5 8 4 6 10 0~5__7 7~10__13 14~17__30 5 12 16 3 8 11 sb=0 D=30 3 6 8 0~9__16 11~15__17 18~23__30 0~3__16 14~19__30 7~11__17 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t2 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 t7 S3,1 t3 S3,2 t4 6~16__24 17~22__30 0~5__6 6~10__24 11~16__30 5 8 11 2 5 8 4 6 10 17~20__30 0~5__7 7~10__13 14~17__30 5 12 16 3 8 11 sb=0 D=30 3 6 8 0~9__16 11~15__17 18~23__30 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t5 t2 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 t7 t8 S3,1 t3 S3,2 t4 17~22__30 6~16__24 0~5__6 5 8 11 2 5 8 4 6 10 17~20__30 0~5__7 7~10__13 5 12 16 3 8 11 sb=0 D=30 3 6 8 11~15__17 0~9__16 18~23__30 5 8 14 4 8 11 3 5 9
0 10 30 20 S1,1 t5 t2 S1,2 t6 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 t7 t8 t9 S3,1 t3 S3,2 t4 17~22__30 6~16__24 0~5__6 5 8 11 2 5 8 4 6 10 17~20__30 0~5__7 7~10__13 5 12 16 3 8 11 sb=0 D=30 3 6 8 11~15__17 0~9__16 18~23__30 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t5 t2 S1,2 t6 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 t7 t8 t9 S3,1 t3 t7 S3,2 t4 16~28__30 17~29__30 17~22__30 6~16__24 0~5__6 5 8 11 2 5 8 4 6 10 17~20__30 0~5__7 7~10__13 5 12 16 3 8 11 sb=0 D=30 3 6 8 11~15__17 0~9__16 18~23__30 5 8 14 4 8 11 3 5 9
0 10 20 30 S1,1 t5 t2 S1,2 t6 S2,1 t1 tentety t1 t2 t3 t4 t5 t6 t7 t8 t9 S3,1 t3 t7 S3,2 t4 S3,3 t8 16~28__30 6~16__24 0~5__6 5 8 11 2 5 8 4 6 10 17~25__30 17~20__30 0~5__7 7~10__13 5 12 16 3 8 11 sb=0 D=30 3 6 8 11~15__17 0~9__16 18~23__30 5 8 14 4 8 11 3 5 9
0 10 20 30 5 S1,1 t5 t2 S1,2 t6 2 S2,1 t1 S2,2 t9 tentety t1 t2 t3 t4 t5 t6 t7 t8 t9 1 S3,1 t3 t7 COST=5*2+2*2+1*4=18 S3,2 t4 S3,3 t8 16~28__30 6~16__24 0~5__6 5 8 11 2 5 8 4 6 10 17~25__30 0~5__7 7~10__13 5 12 16 3 8 11 sb=0 D=30 3 6 8 18~26__30 11~15__17 0~9__16 18~23__30 5 8 14 4 8 11 3 5 9
O(n+e)~O(n^2) IC-PCP=O(n^2) O(n) O(n^2) O(n-1) Time complexity O(m*n)=O(n^2)
Time complexity O(n+e)~O(n^2) Call PLANNING(G(T,E)) O(n^2) O(n^2) IC-PCPD2=O(n^2) Assign subdeadline on PCP node O(n)
Algo1 IC-PCP Algo2 IC-PCPD2 Algo3 IC-LOSS Fastest schedule : scheduling each workflow task on a distinct instance of the fastest computation service, while all data transmission times are considered to be zero MF =makespan of the Fastest schedule deadline factor α set the deadline = α・MF -Since the problem has no solution for α = 1, we let α ranges from 1.5 to 5 in our experiments, with a step length equal to 0.5 evaluation Cheapest schedule : scheduling all workflow tasks on a single instance of the cheapest computation service normalize the total cost of each workflow execution
Algo1 IC-PCP Algo2 IC-PCPD2 Algo3 IC-LOSS evaluation 1>2>3 1>2>3 1≈2>3 1>2>3 2>1>3 1>2>3 1>2>3 1≈2>3 1>2>3 2>1>3
The new algorithms consider the main features of the current commercial Clouds such as on-demand resource provisioning, homogeneous networks, and the pay-as-you-go pricing model • The time complexity of both algorithms is O(n2), The polynomial time complexity makes them suitable options for the large workflows • IC-PCP outperforms both, IC-PCPD2 and IC-Loss in most cases • experiments show that the computation times of the algorithms are very low, less than 500 ms for the large workflows • intend to improve our algorithms for the real Cloud environments Conclusions
The end Thanks~