A Probabilistic QoS Model and Computation Framework for Web Services-Based Workflows

San-Yih Hwang, H. Wang, J. Srivastava • National Sun Yat-sen U., Taiwan • Univ. of Minnesota, USA

Overview • Introduction • QoS metrics and modeling • WS-Workflow QoS Computation • Performance evaluation • Conclusions

Web Service based workflows • Web Service: a modular and self-described application that uses Web technologies to interact with other services. • Workflow: a process in which a series of activities is executed in a specific sequence. • WS-Workflow (composite web service): a workflow of activities, each of which is wrapped as a web service. • e.g., a “Travel Planner” may aggregate multiple Web services for flight booking, travel insurance, accommodation booking, car rental, etc. • Quality of Service (QoS): non-functional measures of a service, which often determine a user's satisfaction with the service.

The problem • The goal is to compute the QoS measures of a WS-workflow from those of its constituent web services. • Four instance-level QoS metrics are considered: • Response time: the time elapsed from the submission of a request to the receipt of the response. • Cost: the amount of money paid. • Reliability: the probability that the service is successfully delivered. • Fidelity: reputation rating. • How do we represent a QoS measure of a web service? • A single value, used in most previous research • A probability distribution, the approach adopted in this work

The problem (Cont.) • Why not use a single value for a QoS measure of a web service? • Instance-level QoS measures are inherently probabilistic. • Choosing a single value for a QoS measure (e.g., the average case) may yield an incorrect result. • e.g., workflow W executes A1 and A2 in parallel. Response time of A1: N(10, 10). Response time of A2: N(10, 10). The average response time of W is NOT 10, because W finishes only when the slower of the two activities finishes.
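
The parallel example can be checked numerically. The following is a minimal simulation sketch, not part of the original slides. It assumes that the second parameter of N(10, 10) is the variance and that A1 and A2 are independent; the function name `mean_parallel_time` and its parameters are hypothetical. For two independent, identically distributed normal variables the expected maximum is mu + sigma/sqrt(pi), so the estimate should land well above 10.

```python
import random

def mean_parallel_time(mu=10.0, variance=10.0, trials=100_000, seed=0):
    """Estimate E[max(A1, A2)] for two independent N(mu, variance) response times."""
    rng = random.Random(seed)
    # Assumption: the second parameter of N(., .) on the slide is the variance.
    sigma = variance ** 0.5
    total = 0.0
    for _ in range(trials):
        # W finishes when the slower of the two parallel activities finishes.
        total += max(rng.gauss(mu, sigma), rng.gauss(mu, sigma))
    return total / trials

estimate = mean_parallel_time()
print(round(estimate, 2))  # noticeably larger than 10
```

If the second parameter is instead read as the standard deviation, the gap from 10 is even larger, so the conclusion on the slide holds under either reading.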

The problem (Cont.) • Why not use a normal distribution to model each QoS measure of a Web service? • Some QoS measures may not follow a normal distribution. • Even if the QoS measures of all activities follow normal distributions, the QoS of a WS-workflow may not, e.g., under parallel execution or conditional selection. • e.g., workflow W conditionally executes either A1 or A2, each with probability 0.5. Response time of A1: N(10, 5). Response time of A2: N(20, 5). The response time of W is then a mixture of two normal distributions, which is not itself normal.

Probabilistic Modeling of WS QoS • A QoS measure of a web service is modeled as a discrete random variable X. • Probability Mass Function (PMF): let the sample space of X be Dom(X); then $f_X(x) = \Pr(X = x)$ for each $x \in \mathrm{Dom}(X)$, and $\sum_{x \in \mathrm{Dom}(X)} f_X(x) = 1$. • e.g., suppose the probabilities of a web service w being completed in one, two, and three days are 0.2, 0.6, and 0.2, respectively. The PMF of its response time is: • f_response_time(w)(1) = 0.2 • f_response_time(w)(2) = 0.6 • f_response_time(w)(3) = 0.2
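
One simple way to hold such a PMF in code is a dictionary from value to probability. The sketch below is illustrative only; the helper names `make_pmf` and `expectation` are assumptions, not names from the framework. It encodes the three-day example above and checks that the probabilities form a valid PMF.

```python
import math

def make_pmf(pairs):
    """Build a PMF as {value: probability}, validating the probabilities."""
    pmf = {}
    for value, prob in pairs:
        if prob < 0:
            raise ValueError("probabilities must be non-negative")
        # Accumulate in case the same value is listed more than once.
        pmf[value] = pmf.get(value, 0.0) + prob
    if not math.isclose(sum(pmf.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    return pmf

def expectation(pmf):
    """Expected value of the discrete random variable described by pmf."""
    return sum(value * prob for value, prob in pmf.items())

# Response time (in days) of web service w from the example above.
response_time_w = make_pmf([(1, 0.2), (2, 0.6), (3, 0.2)])
print(expectation(response_time_w))
```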

WS-workflow QoS framework • [Figure: architecture of the WS-workflow QoS framework. Components: WS-workflow QoS Model, WS-workflow QoS Computation, WS-workflow QoS Monitoring, WS selection, WS-workflow enactment. Inputs: WS invocation log, WS SLA spec, WS-workflow QoS Objective Spec. Actors: owner, user. The enactment invokes the Web services.]

Computing QoS Values of WS Compositions • A structured workflow can be constructed recursively from the following five basic constructs. • Sequential • Parallel • Conditional • Fault-tolerant • Loop

Sequential • a1, a2, …, an are executed one after another. • QoS measures computed for this construct: cost, time, reliability, and fidelity.

Parallel a1, a2, …, an are executed concurrently.

Conditional Only one of the activities is executed.

Fault-tolerant All the activities are executed concurrently, but only one of them needs to finish. Each activity ai has an associated probability of being the first to finish.

Loop • We model the number of iterations as a discrete random variable, e.g., one iteration with probability 0.3 and two iterations with probability 0.7. • A loop can be converted to an equivalent conditional structure.
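
The conversion of a loop into a conditional structure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: iterations are taken to be independent with the same response-time PMF, the body PMF is borrowed from the earlier three-day example, and the names `pmf_sum` and `loop_pmf` are hypothetical. Each possible iteration count n becomes one branch of a conditional, and that branch is a sequence of n copies of the loop body.

```python
def pmf_sum(fx, fy):
    """PMF of X + Y for independent discrete X and Y."""
    fz = {}
    for x, px in fx.items():
        for y, py in fy.items():
            fz[x + y] = fz.get(x + y, 0.0) + px * py
    return fz

def loop_pmf(body, iterations):
    """body: PMF of one iteration; iterations: PMF of the iteration count."""
    result = {}
    for n, pn in iterations.items():
        # Branch for n iterations: n back-to-back executions of the body.
        total = {0: 1.0}
        for _ in range(n):
            total = pmf_sum(total, body)
        # Weight the branch by the probability of iterating n times.
        for t, pt in total.items():
            result[t] = result.get(t, 0.0) + pn * pt
    return result

body = {1: 0.2, 2: 0.6, 3: 0.2}      # assumed response time of the loop body
iterations = {1: 0.3, 2: 0.7}        # iteration counts from the example above
loop_time = loop_pmf(body, iterations)
```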

Operations of discrete random variables • Basic operations: • Sum • Multiplication • MAX • MIN • Conditional selection • Let X, Y be independent random variables: • Dom(X)={x1, x2, …, xm} , PMF: fX(x) • Dom(Y)={y1, y2, …, yn} , PMF: fY(y)

Z=X+Y • PMF: $f_Z(z) = \sum_{x + y = z} f_X(x)\, f_Y(y)$ • e.g., Dom(X)={1, 2}, fX(1)=0.3, fX(2)=0.7; Dom(Y)={10, 20}, fY(10)=0.4, fY(20)=0.6; Dom(Z)={11, 12, 21, 22}; fZ(11)=0.3*0.4=0.12, fZ(12)=0.7*0.4=0.28, fZ(21)=0.3*0.6=0.18, fZ(22)=0.7*0.6=0.42 • Note: Multiplication is similar, except that z is the product of some x and y.
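
The sum operation amounts to a discrete convolution over all value pairs. A minimal sketch, assuming X and Y are independent and each PMF is stored as a dictionary (the function name `pmf_sum` is an assumption), reproduces the numbers of the example above:

```python
from collections import defaultdict

def pmf_sum(fx, fy):
    """PMF of Z = X + Y for independent discrete random variables X and Y."""
    fz = defaultdict(float)
    for x, px in fx.items():
        for y, py in fy.items():
            # Every pair (x, y) contributes its joint probability to z = x + y.
            fz[x + y] += px * py
    return dict(fz)

fx = {1: 0.3, 2: 0.7}
fy = {10: 0.4, 20: 0.6}
fz = pmf_sum(fx, fy)
for z in sorted(fz):
    print(z, round(fz[z], 2))
```

Replacing `x + y` with `x * y` gives the multiplication operation mentioned in the note.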

Z=Max(X, Y) • Dom(Z) = Dom(X) ∪ Dom(Y) • Note: the MIN operation is very similar to MAX.
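
MAX and MIN can share one routine that applies a binary operator to every pair of values. The sketch below assumes X and Y are independent; the helper name `combine` and the example PMFs are made up for illustration and do not come from the slides.

```python
def combine(fx, fy, op):
    """PMF of op(X, Y) for independent discrete random variables X and Y."""
    fz = {}
    for x, px in fx.items():
        for y, py in fy.items():
            z = op(x, y)
            fz[z] = fz.get(z, 0.0) + px * py
    return fz

# Illustrative PMFs (not taken from the slides).
fx = {1: 0.5, 3: 0.5}
fy = {2: 0.5, 4: 0.5}

f_max = combine(fx, fy, max)   # e.g., slower of two parallel activities
f_min = combine(fx, fy, min)   # e.g., faster of two redundant activities
```

Every value of Z comes from Dom(X) or Dom(Y), which is why the domain of the result is contained in their union.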

Conditional selection: exclusive selection of one random variable according to the associated probabilities. pi: the probability that Xi is selected.
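
Under this definition the resulting PMF is a probability-weighted mixture of the branch PMFs, i.e. $f_Z(z) = \sum_i p_i f_{X_i}(z)$. A minimal sketch follows; the function name `conditional_select` and the two branch PMFs are assumptions chosen for illustration.

```python
import math

def conditional_select(branches):
    """branches: list of (p_i, pmf_i) pairs; returns the mixture PMF."""
    if not math.isclose(sum(p for p, _ in branches), 1.0):
        raise ValueError("selection probabilities must sum to 1")
    fz = {}
    for p, pmf in branches:
        for x, px in pmf.items():
            # Branch i contributes p_i * f_Xi(x) to the probability of x.
            fz[x] = fz.get(x, 0.0) + p * px
    return fz

# Two illustrative branches, each selected with probability 0.5.
x1 = {9: 0.5, 11: 0.5}
x2 = {19: 0.5, 21: 0.5}
mixture = conditional_select([(0.5, x1), (0.5, x2)])
```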

An example WS-workflow

Tree representation of a workflow • A structured workflow can be represented by a tree. • A composite activity can be substituted by an equivalent atomic activity. • Use a bottom-up method to compute the workflow QoS.
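
The bottom-up computation can be sketched as a recursive walk over the tree, shown here for response time only. This is an illustration under assumptions, not the framework's code: nodes are encoded as hypothetical `(kind, payload)` tuples, activities are independent, a sequence adds response times, a parallel block waits for its slowest child, and a conditional node mixes its children by their selection probabilities.

```python
from functools import reduce

def combine(fx, fy, op):
    """PMF of op(X, Y) for independent discrete random variables X and Y."""
    fz = {}
    for x, px in fx.items():
        for y, py in fy.items():
            z = op(x, y)
            fz[z] = fz.get(z, 0.0) + px * py
    return fz

def response_time(node):
    """Replace each composite node by the PMF of an equivalent atomic activity."""
    kind, payload = node
    if kind == "atomic":
        return payload
    if kind == "seq":      # assumed: response times add up
        children = [response_time(c) for c in payload]
        return reduce(lambda a, b: combine(a, b, lambda x, y: x + y), children)
    if kind == "par":      # assumed: the slowest child determines the time
        children = [response_time(c) for c in payload]
        return reduce(lambda a, b: combine(a, b, max), children)
    if kind == "cond":     # payload is a list of (probability, child) pairs
        mix = {}
        for p, child in payload:
            for t, pt in response_time(child).items():
                mix[t] = mix.get(t, 0.0) + p * pt
        return mix
    raise ValueError(f"unknown node kind: {kind}")

# Illustrative workflow: one activity followed by two parallel activities.
workflow = ("seq", [
    ("atomic", {1: 0.5, 2: 0.5}),
    ("par", [("atomic", {1: 0.5, 3: 0.5}), ("atomic", {2: 1.0})]),
])
result = response_time(workflow)
```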

Sample space reduction • When combining the random variables, the sample space size of the resultant variable may increase dramatically. • To keep the sample space at a reasonable size after some operation, we have to group the elements in the sample space. Each group is represented by one value. • Find an optimal grouping scheme which minimizes the error.

The optimal solution • Dynamic programming: • Let e(i, j, k) be the optimal aggregate error of partitioning xi, xi+1, …, xj into k subsequences. • Recursive function: $e(i, j, k) = \min_{i \le t < j,\; 1 \le k' < k} \left[ e(i, t, k') + e(t+1, j, k - k') \right]$ if $j - i + 1 > k$ and $k > 1$; $e(i, j, k) = 0$ if $j - i + 1 = k$; $e(i, j, 1) = \mathrm{error}(i, j)$. • Time complexity: $O(s^3 m^2)$, where s is the number of elements in the original domain and m is the number of elements in the domain after reduction.
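
The following sketch illustrates the idea of an optimal contiguous partition. It makes two assumptions that are not in the slides. First, the slides do not define error(i, j), so the sketch uses the probability-weighted squared distance to the group's weighted mean. Second, it uses a simpler "last group" recurrence over prefixes instead of the e(i, j, k) formulation; with an error that is additive over groups, both search the same set of contiguous partitions. The function names are hypothetical.

```python
def group_error(values, probs, i, j):
    """Assumed error of merging elements i..j (inclusive, 0-based) into one value."""
    vs, ps = values[i:j + 1], probs[i:j + 1]
    mass = sum(ps)
    mean = sum(v * p for v, p in zip(vs, ps)) / mass
    return sum(p * (v - mean) ** 2 for v, p in zip(vs, ps))

def optimal_partition_error(values, probs, m):
    """Minimal total error of splitting the sorted domain into m contiguous groups."""
    s = len(values)
    inf = float("inf")
    # best[k][j]: minimal error of splitting the first j elements into k groups.
    best = [[inf] * (s + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(k, s + 1):
            for t in range(k - 1, j):
                # The last group covers elements t..j-1; the rest use k-1 groups.
                cand = best[k - 1][t] + group_error(values, probs, t, j - 1)
                if cand < best[k][j]:
                    best[k][j] = cand
    return best[m][s]

values = [1, 2, 10, 11]              # illustrative sorted domain
probs = [0.25, 0.25, 0.25, 0.25]
print(optimal_partition_error(values, probs, 2))
```

For the illustrative data the best two-group split is {1, 2} and {10, 11}.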

Heuristic method • Algorithm: • Find the pair of adjacent elements in the domain that has the least error when merged. • Replace the two elements with a new element. • Repeat until the target size is reached. • A priority queue can be used to find the pair of elements with the least error in O(log s) time. • Time complexity: O(s log s)
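
The greedy steps above can be sketched with a binary heap and lazy deletion of stale pairs. Several details are assumptions, since the slides do not specify them: the merged element takes the probability-weighted mean of the two values, the merge error is the weighted squared distance to that mean, and the names `merge_cost` and `greedy_reduce` are hypothetical.

```python
import heapq

def merge_cost(a, b):
    """Assumed error of merging adjacent elements a and b, each (value, prob)."""
    (xa, pa), (xb, pb) = a, b
    mean = (xa * pa + xb * pb) / (pa + pb)
    return pa * (xa - mean) ** 2 + pb * (xb - mean) ** 2

def greedy_reduce(pmf, target_size):
    items = sorted(pmf.items())
    n = len(items)
    nodes = dict(enumerate(items))                 # live elements by id
    prev = {i: i - 1 for i in range(n)}            # -1 marks "no neighbour"
    nxt = {i: (i + 1 if i + 1 < n else -1) for i in range(n)}
    heap = [(merge_cost(items[i], items[i + 1]), i, i + 1) for i in range(n - 1)]
    heapq.heapify(heap)
    next_id = n
    while len(nodes) > target_size and heap:
        _, a, b = heapq.heappop(heap)
        if a not in nodes or b not in nodes:
            continue                               # stale pair: a or b was merged
        (xa, pa), (xb, pb) = nodes.pop(a), nodes.pop(b)
        merged = ((xa * pa + xb * pb) / (pa + pb), pa + pb)
        c, next_id = next_id, next_id + 1
        nodes[c] = merged
        left, right = prev[a], nxt[b]
        prev[c], nxt[c] = left, right
        if left != -1:
            nxt[left] = c
            heapq.heappush(heap, (merge_cost(nodes[left], merged), left, c))
        if right != -1:
            prev[right] = c
            heapq.heappush(heap, (merge_cost(merged, nodes[right]), c, right))
    return dict(sorted(nodes.values()))

reduced = greedy_reduce({1: 0.25, 2: 0.25, 10: 0.25, 11: 0.25}, 2)
```

Each merge pushes at most two new pairs onto the heap, so the total number of heap operations stays linear in s, which is consistent with the O(s log s) bound stated above.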

Performance evaluations • Sample space size reduction • Performance metric: cumulative distribution function (CDF). The CDF of Z, denoted as FZ(z), is defined as Pr(Z ≤ z). • The following figure shows the difference between the CDF produced by each method (DP and greedy) and the theoretical value when the size of the PMF is reduced from 400 to 30. • Mean error: • DP: 0.001494, Greedy: 0.002136
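
A CDF-based error of this kind can be computed as sketched below. The slides do not say exactly how the mean error is averaged, so this sketch assumes the mean absolute difference between the two CDFs over a chosen set of evaluation points. The function names and the small PMFs are illustrative, not the experimental data.

```python
def cdf(pmf, z):
    """F_Z(z) = Pr(Z <= z) for a PMF stored as {value: probability}."""
    return sum(p for x, p in pmf.items() if x <= z)

def mean_cdf_error(pmf_a, pmf_b, points):
    """Assumed metric: mean absolute CDF difference over the given points."""
    return sum(abs(cdf(pmf_a, z) - cdf(pmf_b, z)) for z in points) / len(points)

original = {1: 0.25, 2: 0.25, 10: 0.25, 11: 0.25}   # illustrative PMF
reduced = {1.5: 0.5, 10.5: 0.5}                     # the same PMF after grouping
error = mean_cdf_error(original, reduced, sorted(original))
print(error)
```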

Response time accuracy • Computation of response time of the experimental WS-workflow “PC order fulfillment”.

Response time accuracy • The following figure shows the difference between the CDF of the greedy method and the simulation result. • Maximum error: 0.008 • The greedy method is a thousand times faster than the simulation.

Cost accuracy

Fidelity accuracy

Related work • Estimating the QoS measures of a WS-workflow by assuming a single QoS value for each web service. [Menace02] [Cardoso02] [Zeng03]. • Estimating the QoS measures of a WS-workflow using simulation [Gillmann02][Cardoso02] • Project management (CPM/PERT) for estimating response time

Summary • We propose a probability-based WS-workflow QoS model and its computation framework. • The computation framework is efficient and accurate.

Ongoing work • Online QoS monitoring • Online QoS estimation for a WS-workflow instance • Pre-computation is needed for efficiency. • Defining and ensuring QoS objectives • A QoS objective is a 5-tuple, e.g., (“PC order fulfillment”, “response time”, “no larger”, “7 days”, 90%) • Define a checkpoint for each atomic web service.

Ongoing work • Web service selection • Each activity has a set of candidate web services that provide the same function but different QoS measures. • Choose a web service for each activity such that some constraints are satisfied and the goal is optimized. Both the constraints and the goal are specified in a probabilistic manner. • e.g., the probability that the entire WS-workflow can be completed in 10 days is at least 90%.
