Autonomous Resource Provisioning for Multi-Service Web Applications. Jiang Dejun, Guillaume Pierre, Chi-Hung Chi. WWW '10: Proceedings of the 19th International Conference on World Wide Web. Agenda: Introduction, Related Work, Autonomous Provisioning, Evaluation, Conclusion, Comments.
Autonomous Resource Provisioning for Multi-Service Web Applications • Jiang Dejun, Guillaume Pierre, Chi-Hung Chi • WWW '10: Proceedings of the 19th International Conference on World Wide Web
Agenda • Introduction • Related Work • Autonomous Provisioning • Evaluation • Conclusion • Comments
Introduction(1/5) • Major web sites such as Amazon.com and eBay are not designed as monolithic 3-tier applications but as complex groups of independent services querying each other • A service is a self-contained application providing elementary functionality, for example • a database holding customer information • an application serving search requests
Introduction(2/5) • To provide acceptable performance to their customers • providers often impose a Service Level Agreement (SLA) on themselves • They apply dynamic resource provisioning to respect the SLA target • adding resources when the SLA is violated • removing resources when possible without violating the SLA
Introduction(3/5) • An essential question in resource provisioning of multi-service web applications • Which service(s) should be (de-)provisioned so that the whole application maintains acceptable performance at minimal cost? • This is a challenge because multi-service applications involve a large number of components that have complex relationships with each other
Introduction(4/5) • Two possible approaches • Model the entire application as a single queuing network • Too complex to capture all service relationships • Assign a fixed SLA to each service separately • May waste resources • The authors' position: only the front-end service should be given an SLA • Each service should be autonomously responsible for its own provisioning • by collaboratively negotiating its performance objectives with other services to maintain the front-end’s response time • “What-if analysis”
Introduction(5/5) • The authors present a system that can effectively provision resources for both • traditional multi-tier web applications • complex multi-service applications
Related Work • Resource provisioning for single-tier [1,4] or multi-tier Web applications [7,10,12,14,15,17] • The work in [18] models workflow patterns within multi-service applications to predict future workloads for each service component • The works in [15,16] focus on when resources should be provisioned • But allocating new resources has become much faster
Autonomous Provisioning – System Model(1/5) • A service • a single-tier functional service with an HTTP or SOAP interface hosted in an application server • a single-tier data service with an SQL interface hosted in a database server • Within a multi-service application • Services are commonly organized as a directed acyclic graph • assume that the services of one application are not used simultaneously by other applications
Autonomous Provisioning – System Model(3/5) • Assume that some machines are always available to be added to an application • Each resource can be assigned to only one service at a time • Such a resource may be a physical machine or a virtualized instance with performance isolation
Autonomous Provisioning – System Model(4/5) • [Figure: system model diagram with stages labeled Predict, Negotiation, Global Decision]
Autonomous Provisioning – System Model(5/5) • Step 1: • Each service carries out “what-if analysis” to predict its future performance • if it were assigned an extra machine or had one removed • It periodically sends the result to its parent node • Step 2: • Each intermediate node selects the maximum performance gain and minimum loss among its children nodes and itself • Finally: • The root node selects which service(s) to provision when the SLA is (about to be) violated
Autonomous Provisioning – Performance Model(1/3) • use an M/M/n/PS queue to capture the performance of an n-core machine • Expected Response Time: • Rserver : the average response time of the service • n : the number of CPU cores assigned to service • λ: the average request rate • Sserver : the mean service time
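A minimal sketch of how a service might run its what-if prediction with the symbols listed above. The closed form R = S / (1 − λS/n) is an assumption (the slide text names only the symbols), and the function names and the 4-cores-per-machine figure in the usage note are hypothetical.

```python
import math

def predicted_response_time(service_time, request_rate, cores):
    """Assumed closed form R = S / (1 - lambda * S / n); not taken from the slide."""
    utilization = request_rate * service_time / cores
    if utilization >= 1.0:
        return math.inf  # the model gives no finite answer once the service saturates
    return service_time / (1.0 - utilization)

def what_if(service_time, request_rate, cores, cores_per_machine):
    """Predict the response time now, with one more machine, and with one fewer."""
    fewer = cores - cores_per_machine
    return {
        "current": predicted_response_time(service_time, request_rate, cores),
        "add_one": predicted_response_time(
            service_time, request_rate, cores + cores_per_machine),
        "remove_one": (
            predicted_response_time(service_time, request_rate, fewer)
            if fewer > 0 else math.inf),
    }
```

For example, `what_if(0.1, 30.0, 8, 4)` compares a service running on two 4-core machines against the same service on three machines and on one.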
Autonomous Provisioning – Performance Model(2/3) • A service may also use caches to offload some of the incoming requests from the service itself. • Adding caches potentially improves response time for two reasons • First, cache hits are processed faster than cache misses. • Second, the service itself and all children nodes receive fewer requests, and can thus process them faster
Autonomous Provisioning – Performance Model(3/3) • The performance model calculates the caching impact on the response time as follows • R(m) : the response time of the backend server across m CPU cores • Scache : the cache service time • ρn : the expected cache hit ratio with n nodes
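One plausible reading of the caching model, given the three symbols above, is a hit-ratio-weighted average of the cache service time and the backend response time. The sketch below assumes that form; the function name is hypothetical and the weighting is an assumption, not a formula stated on the slide.

```python
def response_time_with_cache(hit_ratio, cache_service_time, backend_response_time):
    """Assumed form: R = rho_n * S_cache + (1 - rho_n) * R(m)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits cost only the cache service time; misses pay the backend response time.
    return (hit_ratio * cache_service_time
            + (1.0 - hit_ratio) * backend_response_time)
```

With a 50% hit ratio, a 2 ms cache and a 200 ms backend, this gives roughly 101 ms. In a fuller model the backend response time R(m) would itself be computed at the reduced request rate that the misses generate.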
Autonomous Provisioning – Model parameterization(1/5) • Most of the model parameters can be measured offline or monitored at runtime. • The request rate can be monitored by the administrative tools of application servers and database servers • The cache service time can be obtained by measuring cache response time offline • But the expected cache hit ratio (ρn) and the mean service time (Sserver) are harder to measure
Autonomous Provisioning – Model parameterization(2/5) • Expected cache hit ratio (ρn) • Estimated using virtual caches • A virtual cache stores only metadata, such as the list of objects in the cache and their sizes • It receives all requests directed to the service and applies the same operations as a real cache with the same configuration would
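The virtual cache idea can be sketched as follows. This is an illustration only: it assumes LRU replacement and a byte-based capacity, neither of which is specified on the slide, and the class and method names are hypothetical.

```python
from collections import OrderedDict

class VirtualCache:
    """Tracks only object keys and sizes, never the cached objects themselves."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> size, least recently used first
        self.hits = 0
        self.requests = 0

    def access(self, key, size):
        """Replay one request; return True if a real cache would have hit."""
        self.requests += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        if size <= self.capacity_bytes:
            # Evict least recently used entries until the new object fits.
            while self.used_bytes + size > self.capacity_bytes:
                _, evicted_size = self.entries.popitem(last=False)
                self.used_bytes -= evicted_size
            self.entries[key] = size
            self.used_bytes += size
        return False

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0
```

Running one such instance sized like the current cache plus one extra node, alongside the real cache, would give an estimate of the hit ratio for that larger configuration without allocating the memory.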
Autonomous Provisioning – Model parameterization(3/5) • Mean service time (Sserver) • Previous work measures the service time via profiling under low workload • But the authors found that as the workload increases, the prediction error becomes higher • To achieve acceptable prediction results, the authors apply a classical feedback control loop to adjust the service time at runtime. • The system continuously estimates the service’s response time under the current conditions and computes the error between the predicted response time and the measured one.
Autonomous Provisioning – Model parameterization(5/5) • Define a threshold as a configuration parameter • If the error rate exceeds the threshold, recompute the service time • S’server : the corrected service time • Rserver : the latest measured response time • n : the number of current CPU cores • λ: the current request rate
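A sketch of this feedback step, assuming the response-time model has the form R = S / (1 − λS/n), so that solving for S gives S' = R / (1 + λR/n). Both formulas are assumptions consistent with the symbols listed, and the function names are hypothetical; the 3% default matches the threshold used in the experiments.

```python
import math

def corrected_service_time(measured_response_time, request_rate, cores):
    """Invert the assumed model R = S / (1 - lambda * S / n) to recover S."""
    return measured_response_time / (
        1.0 + request_rate * measured_response_time / cores)

def adjust_service_time(service_time, measured_response_time,
                        request_rate, cores, threshold=0.03):
    """Return the service time to use next; measured_response_time must be > 0."""
    utilization = request_rate * service_time / cores
    predicted = (math.inf if utilization >= 1.0
                 else service_time / (1.0 - utilization))
    error = abs(predicted - measured_response_time) / measured_response_time
    if error > threshold:
        return corrected_service_time(measured_response_time, request_rate, cores)
    return service_time  # prediction is close enough, keep the current estimate
```

For instance, with a profiled service time of 0.1 s, 5 req/s and one core, a measured response time of 0.4 s yields a corrected service time of about 0.133 s.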
Autonomous Provisioning – Resource Provisioning of service instances (1/3) • Each service reports performance promises to its parent on behalf of its children and itself: • it reports the best performance gain (loss) possible by adding (removing) a server to (from) a service of the subtree consisting of its children nodes and itself. • Assume a service i has k immediate children services • Vi,j : the average number of service executions on service j caused by one request from service i
Autonomous Provisioning – Resource Provisioning of service instances (2/3)
Autonomous Provisioning – Resource Provisioning of service instances (3/3)
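The reporting of performance promises up the service graph can be sketched as a recursive aggregation. This is a hedged illustration: scaling a child's gain by Vi,j before comparing it with the node's own gain is an assumption, and the data layout (`own_gain`, `children`, `visits`) is hypothetical.

```python
def best_gain(service, own_gain, children, visits):
    """Return (gain, service_to_provision) for the subtree rooted at `service`.

    own_gain[s]    : response-time gain at s if s itself gets one more server
    children[s]    : list of the immediate children of s
    visits[(i, j)] : average executions on j caused by one request from i (Vi,j)
    """
    best = (own_gain[service], service)
    for child in children.get(service, []):
        child_gain, chosen = best_gain(child, own_gain, children, visits)
        # A child's gain is felt at the parent once per invocation of the child.
        scaled = visits[(service, child)] * child_gain
        if scaled > best[0]:
            best = (scaled, chosen)
    return best
```

The root would call this when the SLA is (about to be) violated and give the extra machine to the returned service; the minimum-loss report for de-provisioning would follow the same pattern with `min` in place of the comparison.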
Autonomous Provisioning – Resource Provisioning of cache instances (1/3) • Provisioning cache instances is harder • It not only changes the performance of the concerned service, but also changes the traffic to its children, which in turn affects their performance • Each service periodically informs its children of the relative workload decrease (increase) it would send them if it were given one more (one less) cache instance
Autonomous Provisioning – Resource Provisioning of cache instances (2/3) • To calculate expected traffic • EIR : expected invocation ratio = expected cache miss rate • Vi,j : the average number of service executions on service j caused by one request from service i • Wi : the request rate of node i • K : the number of predecessors in the graph
Autonomous Provisioning – Resource Provisioning of cache instances (3/3)
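A sketch of the expected-traffic computation using the symbols defined for cache provisioning. It assumes the expected request rate at service j is the sum, over its K predecessors i, of Wi × EIRi × Vi,j; this product form is an assumption, and the function name and tuple layout are hypothetical.

```python
def expected_request_rate(predecessors):
    """Sum the traffic each predecessor is expected to forward to this service.

    predecessors: iterable of (request_rate, expected_invocation_ratio, visits),
    i.e. (W_i, EIR_i, V_ij) for each of the K predecessors of service j.
    """
    return sum(rate * eir * visits for rate, eir, visits in predecessors)
```

For example, a predecessor receiving 10 req/s with a 50% expected miss rate and 2 calls per request, plus an uncached predecessor at 4 req/s with 1 call per request, would send about 14 req/s in total.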
Autonomous Provisioning – Shifting Resource Among Services • In many cases, it can be more efficient to simply reorganize resource assignments within the application without retrieving machines from the resource pool • Vi,j may change due to an update in the application code or a change in user behavior • Shifting resources may lead to oscillating behavior • To prevent it, one should define a performance threshold as the criterion for deciding whether to shift
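The threshold criterion might look like the check below. It is a sketch under the assumption that the decision compares the net response-time improvement against the threshold; the function and parameter names are hypothetical.

```python
def should_shift(gain_at_receiver, loss_at_donor, threshold):
    """Shift a machine only if the net improvement clearly exceeds the threshold.

    gain_at_receiver : predicted response-time reduction from adding the machine
    loss_at_donor    : predicted response-time increase from removing it
    threshold        : minimum net improvement, in the same unit, to act on
    """
    return gain_at_receiver - loss_at_donor > threshold
```

With a threshold of 20 ms, a shift that gains 50 ms and costs 10 ms goes ahead, while one that gains 25 ms and costs 10 ms does not, which keeps small fluctuations from moving machines back and forth.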
Evaluation – Experimental Setup(1/3) • All experiments are performed on the DAS-3 cluster at VU University Amsterdam. • The cluster consists of 85 nodes, each with • a dual-CPU/dual-core 2.4GHz AMD Opteron DP 280 • 4GB RAM • a 250 GB IDE hard drive • Nodes are connected with a 10Gbps LAN • The network latency between nodes is negligible • The prediction error threshold for dynamically adjusting the service time is set to 3%.
Evaluation – Experimental Setup(2/3) • The authors implemented the local performance monitor on the application server using the MBean servlet from JBoss. • The database server monitoring is based on performance data collected by the admin tool of MySQL. • The authors developed the negotiation agent in Java using plain sockets.
Evaluation – Model Validation(1/2) • Compare predicted values with the measured response times • using the “XSLT” and “Product” services from Figure 7(c) separately • Set the SLA of each service to a maximum response time of 400ms • Initially assign one server to each
Evaluation – Comparison(1/2) • Comparison with “Analytic” • Figure 7(a) • A well-known model • Set an SLA of 500ms for the whole application • Analytic does not address multi-service applications
Evaluation – Comparison(2/2) • Comparison with per-service SLA • Figure 7(b) • Set SLA to 500ms • [Figure: results annotated with request-rate labels 12, 16, 23, 12, 25 and 16 req]
Evaluation – Provisioning under varying load intensity(1/2) • Using Figure 7(c) and 7(d) • Set an SLA of 500ms • The workload first increases from 2 req/s to 22 req/s, then decreases back to 2 req/s
Evaluation – Provisioning under varying load intensity(2/2) • [Figure: results annotated with request-rate labels 10, 18, 16 and 8 req]
Evaluation – Provisioning under varying load distribution(1/2) • Add workload to services 2 and 3 at the same rate • At time 35, the workload of service 3 decreases while the workload of service 2 increases by the same amount
Evaluation – Provisioning under varying load distribution (2/2)
Evaluation – Provisioning under varying load locality (1/2) • Define locality as the hit rate for a cache holding 10,000 objects • Increase the workload until time 25, when the SLA is violated, and then change the locality of service 3
Conclusion • The paper takes a different stand and demonstrates that provisioning resources for multi-service applications can be achieved in a decentralized way, where each service is autonomously responsible for its own provisioning • It proposes giving an SLA only to the front-end service • To the authors' best knowledge, no other published resource provisioning algorithm can match or outperform their approach
Comments • The paper uses physical machines as resources • With virtual machines, we can dynamically set each VM’s CPU upper bound • This may let us achieve a lower cost than in the paper • The paper is more complex than what we want to do now, but there are some things we can draw on • For example, the approach of adjusting the profiled service time at runtime
The End • Thanks