Analysis of Caching and Replication Strategies for Web Applications
Presented by: Sudarsan Maddi, Graduate Student
Authors: Swaminathan Sivasubramanian, Guillaume Pierre, Maarten van Steen

Topics That We Will Be Seeing
• Introduction
• Techniques to scale Web applications
• Performance Analysis
• Choosing the Right Strategy

Introduction
• In this paper, the authors present a qualitative and quantitative analysis of replication and caching techniques for hosting Web applications.
• Their analysis shows that the best mechanism depends heavily on the data workload and the application's characteristics.

Introduction (continued)
• Web sites are slow for many reasons; one of the main ones is the dynamic generation of Web documents.
• Web page caching: fragments of the HTML pages that the application generates are cached to serve future requests.
• Content-delivery networks such as Akamai do this by deploying edge servers around the Internet, which reduces the network latency of requests.

Introduction (continued)
• The limitations of page caching have given rise to different approaches for scalable Web applications, classified broadly into:
  • Replicating the application code
  • Caching database records
  • Caching query results
  • Replicating the entire database
• In this article, the authors give an overview of these scaling techniques and compare and analyze their features and performance.

Techniques to Scale Web Applications
• The techniques covered are:
  • Edge Computing
  • Data Replication
  • Content-Aware data Caching (CAC)
  • Content-Blind data Caching (CBC)

Edge Computing
• In this technique, the application code is replicated at multiple edge servers while the data stays centralized.
• Akamai and ACDN use this technique.
• Centralizing the data creates two problems:
  • If the edge servers are located worldwide, each data access incurs WAN latency.
  • The central database becomes a performance bottleneck as the load increases.

Data Replication
• One solution to the data-access problems of edge computing is to place the data at each edge server.
• Database replication (REPL) techniques can help maintain identical copies of the database at multiple locations.

Data Replication (continued)
• The problem with this approach arises when the database is updated.
• Each update must be propagated to every replica, which creates heavy network traffic and performance overhead.

Content-Aware data Caching (CAC)
• Instead of maintaining full copies of the database, CAC systems cache database query results as the application code issues them.
• Query containment check: when the application running at an edge server issues a query, the local database checks whether it holds enough data to answer the query locally.
• If the containment check is positive, the query is executed locally; otherwise it is sent to the central database, and the result is inserted into the local database.

An Example of CAC
• CAC stores query results efficiently.
• For example:
  Query Q1: SELECT * FROM items WHERE price < 50
  Query Q2: SELECT * FROM items WHERE price < 20
• Every row that satisfies Q2 also satisfies Q1, so once Q1's result is cached, Q2 can be answered locally.
• Query template QT1: "SELECT * FROM items WHERE price < ?"

Content-Aware data Caching (CAC) (continued)
• The query containment check is computationally expensive because it must compare the new query against all previously cached queries.
• To reduce this cost, CAC uses query templates. A query template is a parameterized SQL query whose parameter values are passed at runtime.
• In CAC systems, update queries are always executed at the central database.
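
The containment check for template QT1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `TemplateCache` class, the `central_query` stand-in for the central database, and the sample `ITEMS` rows are all hypothetical, and the sketch ignores invalidation when the central data is updated. For this one template, a new query is contained in the cache whenever its bound is no larger than the largest bound already cached.

```python
# Hypothetical sample data standing in for the central "items" table.
ITEMS = [{"id": 1, "price": 10}, {"id": 2, "price": 30}, {"id": 3, "price": 70}]


def central_query(bound):
    """Stand-in for running QT1 ("... WHERE price < ?") at the central database."""
    return [row for row in ITEMS if row["price"] < bound]


class TemplateCache:
    """Content-aware cache for the single template 'price < ?' (illustrative only)."""

    def __init__(self):
        self.max_bound = None  # largest bound whose result has been cached
        self.rows = {}         # merged rows keyed by id, so no row is stored twice
        self.hits = 0
        self.misses = 0

    def query(self, bound):
        # Containment check: "price < bound" is answerable locally
        # if some cached query used an equal or larger bound.
        if self.max_bound is not None and bound <= self.max_bound:
            self.hits += 1
        else:
            self.misses += 1
            for row in central_query(bound):
                self.rows[row["id"]] = row  # insert the result into the local store
            self.max_bound = bound
        # Answer from the local store in both cases.
        return [r for r in self.rows.values() if r["price"] < bound]


cache = TemplateCache()
q1 = cache.query(50)  # Q1: miss, fetched from the central database
q2 = cache.query(20)  # Q2: contained in Q1, answered locally
```

Because the template fixes the shape of the query, the check reduces to comparing one parameter instead of comparing the new query against every cached query.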

Content-Blind data Caching (CBC)
• Here, edge servers don't need to run a database at all.
• Instead, they store the results of remote database queries independently of one another.
• Query results are not merged, so the cache may hold redundant information, and a hit occurs only when the application issues exactly the same query again. Hit rates therefore tend to be low.

Content-Blind data Caching (CBC) (continued)
• CBC has some advantages over CAC:
  • It incurs very little computational load.
  • It caches query results as result sets instead of database records, so it can return results immediately.
  • Inserting a new element into the cache doesn't require a query rewrite.
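
A content-blind cache can be sketched as a plain dictionary keyed by the exact query text. The following Python is a minimal illustration, not the system evaluated in the paper: `ContentBlindCache`, `fake_remote`, and the canned result sets are hypothetical names and data introduced for this example.

```python
class ContentBlindCache:
    """Stores whole result sets keyed by the exact query string (illustrative only)."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable standing in for the remote database
        self.results = {}   # query text -> result set, never merged
        self.hits = 0
        self.misses = 0

    def query(self, sql):
        if sql in self.results:
            self.hits += 1  # hit only on an exact textual match
        else:
            self.misses += 1
            self.results[sql] = self.fetch(sql)
        return self.results[sql]


# Hypothetical canned answers standing in for the central database.
CANNED = {
    "SELECT * FROM items WHERE price < 50": [(1, 10), (2, 30)],
    "SELECT * FROM items WHERE price < 20": [(1, 10)],
}


def fake_remote(sql):
    return CANNED[sql]


cbc = ContentBlindCache(fake_remote)
cbc.query("SELECT * FROM items WHERE price < 50")  # miss
cbc.query("SELECT * FROM items WHERE price < 50")  # hit: same text
cbc.query("SELECT * FROM items WHERE price < 20")  # miss, although Q1 contains it
```

The lookup is a single dictionary access, which matches the low computational load described above, while the row `(1, 10)` ends up stored twice, which illustrates the redundancy.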

Figure: Scalable Web hosting with (a) edge computing, (b) content-aware caching, (c) content-blind caching, and (d) data replication.

Performance Analysis
• To compare the four techniques, the authors used two applications:
  • RUBBoS, a bulletin-board benchmark application that models Slashdot.org, http://jmob.objectweb.org/rubbos.html
  • TPC-W, an industry-standard e-commerce benchmark that models an online bookstore such as Amazon.com, http://pgfoundry.org/projects/tpc-w-php/

Performance Analysis (continued)
• They measured the end-to-end client latency, which is the sum of the network latency and the internal latency.
• For RUBBoS, the results show that CBC performed best in terms of client latency, whereas edge computing (EC) performed worst.
• For TPC-W, REPL performed best and EC was again the worst.

Performance Results
Figure: (a) RUBBoS benchmark, (b) TPC-W browsing, (c) TPC-W ordering.

Choosing the Right Strategy
• According to the authors, Web designers should choose a scaling technique by carefully analyzing their Web application's characteristics.
• They suggest that the best strategy is the one that minimizes the application's end-to-end client latency.
• This latency is affected by many parameters, such as hit ratio, database query execution time, and application server execution time.
• To estimate these parameters, they propose a concept called virtual caches (VC).

Choosing the Right Strategy (continued)
• A VC behaves just like a real cache but stores only metadata, such as the list of objects in the cache and their sizes, so it requires less memory than a real cache.
• With the help of VCs, one can obtain the hit ratios and server execution times and use them to estimate the end-to-end latency.
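
The idea of a virtual cache can be sketched as follows. This is a rough Python illustration under several assumptions that do not come from the slides: the LRU eviction policy, the byte-based capacity, the `VirtualCache` and `estimate_latency` names, and the weighted-average latency estimate are all hypothetical choices made for this example.

```python
from collections import OrderedDict


class VirtualCache:
    """Tracks only keys and sizes, never the cached data itself (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.sizes = OrderedDict()  # key -> size, least recently used first
        self.used = 0
        self.hits = 0
        self.requests = 0

    def access(self, key, size):
        """Record one request and return True if it would have been a hit."""
        self.requests += 1
        if key in self.sizes:
            self.hits += 1
            self.sizes.move_to_end(key)  # mark as most recently used
            return True
        self.sizes[key] = size
        self.used += size
        # Assumed LRU eviction: drop the oldest entries until the metadata fits.
        while self.used > self.capacity and self.sizes:
            _, evicted_size = self.sizes.popitem(last=False)
            self.used -= evicted_size
        return False

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


def estimate_latency(hit_ratio, local_ms, remote_ms):
    """Assumed model: hits cost local_ms, misses cost remote_ms."""
    return hit_ratio * local_ms + (1 - hit_ratio) * remote_ms


vc = VirtualCache(capacity_bytes=100)
for key, size in [("a", 60), ("b", 60), ("b", 60), ("a", 60)]:
    vc.access(key, size)
estimated = estimate_latency(vc.hit_ratio(), local_ms=10, remote_ms=110)
```

Running one such virtual cache per candidate strategy over the same request trace would give a hit ratio for each, which can then be plugged into a latency estimate to compare strategies without deploying real caches.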

Thank You.