Web Caching
By Amisha Thakkar and Alpa Shah
Overview • What is a Web Cache? • Caching Terminology • Why use a cache? • Disadvantages of Web Cache • Other Features • Caching Rules
Overview • Caching Architectures • Comparison of Architectures • Cache Deployment Scheme • Client Side Cache Cooperation • Active Caching
What is a Web Cache? • A cache is a place where temporary copies of objects are stored • Cached information is generally closer to the requester than the permanent information is • Objects: HTML pages, images, files
What is a Web Cache? (diagram)
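Below is a minimal sketch (not from the slides) of the core idea: keep temporary copies of objects keyed by URL so repeat requests are served locally instead of from the origin server. The `cache` dictionary and `fetch_via_cache` helper are illustrative names only.

```python
import urllib.request

cache = {}  # url -> object body: the temporary copy kept close to the requester

def fetch_via_cache(url):
    if url in cache:                              # cache hit: no trip to the origin server
        return cache[url]
    with urllib.request.urlopen(url) as resp:     # cache miss: fetch from the origin
        body = resp.read()
    cache[url] = body                             # keep a temporary copy for later requests
    return body

# The first call fetches from the origin; the second is served from the cache.
page = fetch_via_cache("http://example.com/")
page_again = fetch_via_cache("http://example.com/")
```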
Caching Terminology • Client - an application program that establishes connections for sending requests • Server - an application program that accepts connections in order to service requests by sending back responses • Origin Server - the server on which the given resource resides or is to be created
Caching Terminology • Proxy - an intermediary program that acts as both a server and a client, making requests on behalf of other clients • A proxy is not necessarily a cache * A proxy does not always cache the replies passing through it * It may be used on a firewall to monitor accesses
Why use a cache? • To reduce latency • To reduce network traffic • To reduce the load on origin servers • To isolate end users from network failures
Disadvantages of Web Cache • With cached data there is always a chance of receiving stale information • Content providers lose access counts when cache hits are served • Manual configuration is often required • Operating a cache requires additional resources • In some situations the cache can be a single point of failure
Other Features • Depending on the perspective, the following may be good or bad * The cache makes requests on behalf of clients; the servers never see the clients' IP addresses * The cache provides an easy opportunity to monitor and analyze browsing activities * The cache can be used to block certain requests
Types of Web Caches • Proxy caches * Serve a large number of users * Large corporations and ISPs often set them up on their firewalls * They are a type of shared cache • Browser caches * Use a section of the computer's hard disk to store objects that you have seen
Caching Rules • Rules on which caches work: * Some are set in protocols * Some are set by the cache administrator • Most common rules: * If the object is authenticated or secure, it won't be cached * The object's headers indicate whether or not it is cacheable
Caching Rules * An object is considered fresh when: it has an expiry time or other age-controlling directive set and is still within its freshness period; the browser cache has already seen the object and has been set to check it only once per session; or a proxy cache has seen the object recently and it was last modified relatively long ago
Caching Rules * Fresh documents are served directly from the cache, without checking with the origin server
Caching Rules * For a stale object, the origin server will be asked to validate the object, or to tell the cache whether the copy is still good * The most common validator is the time the object was last changed
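A hedged Python sketch of these rules follows. The header parsing is simplified, the 10%-of-time-since-last-modification heuristic is a common convention rather than something stated in the slides, and `is_fresh` / `revalidate` are illustrative names.

```python
import email.utils
import time
import urllib.error
import urllib.request

def is_fresh(resp_headers, stored_at):
    """Decide whether a stored copy can be served without contacting the origin."""
    cc = resp_headers.get("Cache-Control", "")
    if "no-store" in cc or "private" in cc:
        return False                                   # headers mark the object non-cacheable
    age = time.time() - stored_at
    for directive in cc.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):           # explicit age-controlling directive
            return age < int(directive.split("=", 1)[1])
    if "Expires" in resp_headers:                      # explicit expiry time
        expires = email.utils.parsedate_to_datetime(resp_headers["Expires"]).timestamp()
        return time.time() < expires
    if "Last-Modified" in resp_headers:
        # Heuristic: an object modified relatively long ago is assumed fresh
        # for a fraction (here 10%) of the time since it last changed.
        last_mod = email.utils.parsedate_to_datetime(resp_headers["Last-Modified"]).timestamp()
        return age < 0.1 * (stored_at - last_mod)
    return False

def revalidate(url, resp_headers):
    """Stale object: ask the origin whether the cached copy is still good,
    using the time the object was last changed as the validator."""
    req = urllib.request.Request(url)
    if "Last-Modified" in resp_headers:
        req.add_header("If-Modified-Since", resp_headers["Last-Modified"])
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read()            # 200: a fresh copy is returned
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return 304, None                           # Not Modified: the cached copy is still good
        raise
```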
Caching Architectures: Hierarchical / Simple Cache • Browser-cache interaction is the same as browser-host interaction, i.e. a TCP connection is made and the item is requested • If the item is not found, the request is sent to the parent cache (a lookup sketch follows the diagram below) • The hierarchy is built up so that each level indirectly serves a wider community of users
Caching Architectures: Hierarchical / Simple Cache (diagram: institutional networks → regional networks → national network)
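The lookup order can be sketched as below. The `HierarchicalCache` class and level names are hypothetical; real deployments forward misses over HTTP rather than by direct method calls.

```python
import urllib.request

class HierarchicalCache:
    def __init__(self, name, parent=None):
        self.name = name          # e.g. "institutional", "regional", "national"
        self.parent = parent      # next level up in the hierarchy, or None at the root
        self.store = {}

    def get(self, url):
        if url in self.store:                     # hit at this level
            return self.store[url]
        if self.parent is not None:               # miss: forward the request to the parent cache
            body = self.parent.get(url)
        else:                                     # root of the hierarchy: contact the origin server
            with urllib.request.urlopen(url) as resp:
                body = resp.read()
        self.store[url] = body                    # keep a copy on the way back down
        return body

# Each level indirectly serves a wider community of users.
national = HierarchicalCache("national")
regional = HierarchicalCache("regional", parent=national)
institutional = HierarchicalCache("institutional", parent=regional)
page = institutional.get("http://example.com/")
```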
Caching Architectures: Distributed / Co-operating Cache • Decentralized (cache mesh) • Multiple servers cooperate in such a way that they share their individual caches to create one large distributed cache • Simply put, caching proxies communicate with each other to serve different users • On a cache miss, a proxy checks with the other proxy caches before contacting the origin server
Caching Architectures: Distributed / Co-operating Cache • Caches communicate amongst themselves using a protocol such as ICP (Internet Cache Protocol) • Caches can be selected on the basis of * distance from the end user * specialization in particular URLs (location hints)
Caching Architectures: Distributed / Co-operating Cache • Why distributed - limitations of the hierarchy: * Width of the hierarchy: caches at the same level are inaccessible to each other * The LRU policy assumes sufficient disk space * Cost of replicating disk storage * The amount of disk space required depends on the number of users served and the breadth of their reading
Caching Architectures: Distributed / Co-operating Cache * The more users served, the more disk space is needed higher in the hierarchy * Exponential growth in the number of documents on the WWW
Caching Architectures: Distributed / Co-operating Cache • Caching close to the user is more effective: the higher the level, the lower the efficiency • Can be created for load balancing • Most effective when serving a community of interest
Caching Architectures: Distributed / Co-operating Cache • First, a UDP packet is sent as a cache inquiry • The cache selection decision is determined by RTT • Potential problem: network congestion caused by the UDP traffic • In favor: a UDP exchange takes 2 IP packets, while TCP takes at least 8 packets
Caching Architectures: Distributed / Co-operating Cache * The UDP reply from a cache can indicate a. presence b. speed c. availability of the requested documents
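A hedged sketch of such a UDP inquiry follows. The plain-text HIT/MISS messages are a simplification only (real ICP uses a compact binary packet format), and the sibling addresses are hypothetical; 3130 is the port conventionally used for ICP.

```python
import select
import socket
import time

def query_siblings(url, siblings, timeout=0.2):
    """Send a small UDP inquiry to each sibling cache and return the address
    of the first one reporting a HIT (i.e. the one with the lowest RTT)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    for addr in siblings:                           # one small datagram out per sibling
        sock.sendto(url.encode(), addr)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None                             # no sibling answered in time: go to the origin
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            return None
        reply, addr = sock.recvfrom(1024)
        if reply.decode().startswith("HIT"):        # first positive reply wins (lowest RTT)
            return addr

# Hypothetical sibling caches listening on the conventional ICP port 3130.
best = query_siblings("http://example.com/", [("10.0.0.2", 3130), ("10.0.0.3", 3130)])
```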
Caching Architectures: Hybrid Cache (diagram, with a note on ICP)
Comparison of Architectures • Hierarchical: caches placed at multiple levels • Distributed: caches only at the bottom level; no intermediate caches
Comparison of Architectures • Performance parameters: * Connection time (Tc) is defined as the time from when the document is requested until the first data byte is received * Transmission time (Tt) is defined as the time taken to transmit the document * Total latency = Tc + Tt * Bandwidth usage
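As a back-of-the-envelope illustration of Total latency = Tc + Tt (not a model from the paper; the fixed connection time and single bottleneck bandwidth are assumptions):

```python
def total_latency(connection_time_s, doc_size_bytes, bandwidth_bytes_per_s):
    transmission_time = doc_size_bytes / bandwidth_bytes_per_s   # Tt
    return connection_time_s + transmission_time                 # Tc + Tt

# Small documents are dominated by Tc, large documents by Tt.
print(total_latency(0.3, 10 * 1024, 125_000))      # ~0.38 s: mostly connection time
print(total_latency(0.3, 1000 * 1024, 125_000))    # ~8.5 s: mostly transmission time
```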
Comparison of Architectures • Fig 3 - Connection time for documents of different popularity
Comparison of Architectures • For unpopular documents the connection time is high • As the number of requests increases, the average connection time decreases • For extremely popular documents, distributed caching has smaller connection times
Comparison of Architectures • Fig 4 - Network traffic generated
Comparison of Architectures • At the lower levels, distributed caching practically doubles the network bandwidth usage • Around the root node in the national network, the network traffic is reduced to half • Distributed caching uses all possible network shortcuts between institutional caches, generating more traffic in the less congested low network levels
Comparison of Architectures • Fig 5a - Uncongested national network
Comparison of Architectures • The only bottleneck on the path from the client to the origin server is the international path, hence transmission times are similar for both architectures
Comparison of Architectures • Fig 5b - Congested national network
Comparison of Architectures • Both have higher transmission times than in the previous case • Distributed caching gives shorter transmission times than hierarchical caching because many requests travel through the lower network levels
Comparison of Architectures • Fig 6 - Average total latency
Comparison of Architectures • For large documents, transmission time matters more than connection time • Hierarchical caching gives lower latencies for documents smaller than 200 KB, due to lower connection times • Distributed caching gives lower latencies for larger documents, due to lower transmission times
Comparison of Architectures • The size threshold depends on the degree of congestion in the national network • The higher the congestion, the lower the size threshold • Under heavy congestion, distributed caching has lower latencies than hierarchical caching
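Purely as an illustration of this threshold behaviour (the 200 KB endpoint comes from the previous slide, but the linear scaling with congestion is an assumption, not a result from the paper):

```python
def preferred_architecture(doc_size_kb, congestion):
    """congestion in [0, 1]; higher congestion lowers the size threshold."""
    threshold_kb = 200 * (1 - congestion)   # 200 KB endpoint from the slides; the scaling is assumed
    return "hierarchical" if doc_size_kb < threshold_kb else "distributed"
```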
Comparison of Architectures: With Hybrid Scheme • Fig 7 - Connection time
Comparison of Architectures: With Hybrid Scheme • Fig 8
Comparison of Architectures: With Hybrid Scheme • In the hybrid scheme, if the number of cooperating caches (kc) is very small, the connection time is high • As the number of cooperating caches increases, the connection time decreases to a minimum • If the number increases beyond that threshold, the connection time increases very fast
Comparison of Architectures: With Hybrid Scheme • Fig 9 - Transmission time
Comparison of Architectures: With Hybrid Scheme • For an uncongested network, the number of cooperating caches (kt) at every level hardly influences Tt • If the number of cooperating caches is very small, Tt is high, and vice versa • If the number increases above the threshold, Tt increases • The optimum number of caches depends on the number of caches reachable without traversing congested links
Comparison of Architectures: With Hybrid Scheme • Fig 10
Comparison of Architectures: With Hybrid Scheme • Fig 11 - Total latency
Comparison of Architectures: With Hybrid Scheme • The number of cooperating caches (kopt) at every level that minimizes the total latency depends on the document size • For small documents the optimum number is closer to kc • For large documents the optimum number is closer to kt
Comparison of Architectures: With Hybrid Scheme • Fig 12
Comparison of Architectures: With Hybrid Scheme • For any document, the optimum kopt that minimizes the total latency satisfies kc ≤ kopt ≤ kt
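A purely illustrative interpolation between those two endpoints (the linear weighting and the 200 KB reference size are assumptions, not results from the paper):

```python
def k_opt(doc_size_kb, k_c, k_t, reference_kb=200):
    """Illustrative only: move from k_c (small documents) toward k_t (large documents)."""
    w = min(doc_size_kb / reference_kb, 1.0)   # 0 for tiny documents, 1 beyond the reference size
    return round(k_c + w * (k_t - k_c))        # stays within k_c <= k_opt <= k_t
```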
Cache Deployment Schemes • Proxy caching
Cache Deployment Schemes • Advantages * Clients point all web requests directly at the cache: no effect on non-web traffic * The cost of upgrading hardware and software is limited * Administration of the caches is limited to basic configuration
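A minimal sketch of a client pointing its web requests at a proxy cache; the address 10.0.0.1:3128 is a placeholder (3128 is the port Squid-style proxies commonly listen on).

```python
import urllib.request

# Point all outgoing web requests at the proxy cache; non-web traffic is unaffected.
proxy = urllib.request.ProxyHandler({"http": "http://10.0.0.1:3128"})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)           # subsequent urlopen() calls go via the cache

page = urllib.request.urlopen("http://example.com/").read()
```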