Web Prefetching Lili Qiu Microsoft Research March 27, 2003
Overview of Prefetching
• Over 40% of cache misses are compulsory misses [PQ00]
• Client response time
  • DNS query + connection establishment + server processing time + data transfer + client rendering time
• Prefetching DNS name resolution
  • Idea: send a DNS query before a request
  • Cost: higher load on the network and DNS servers
• Prefetching TCP connections
  • Idea: establish a TCP connection before a request
  • Cost: higher load on the network, client, server
• Prefetching HTTP responses
  • Idea: transfer web content before a request
  • Cost: higher load on the network, client, server
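As a rough illustration of which response-time components each kind of prefetching can remove, here is a minimal Python sketch. The class name, field names, and millisecond values are hypothetical. It assumes that a prefetched TCP connection implies the DNS lookup is already done, and that a prefetched HTTP response leaves only client rendering.

```python
from dataclasses import dataclass, asdict


@dataclass
class ResponseTime:
    """Components of client response time, in milliseconds (illustrative)."""
    dns: float
    connect: float
    server: float
    transfer: float
    render: float

    def total(self, prefetched: str = "none") -> float:
        # Steps assumed to be already completed for each prefetching level.
        skipped = {
            "none": (),
            "dns": ("dns",),
            "tcp": ("dns", "connect"),
            "http": ("dns", "connect", "server", "transfer"),
        }[prefetched]
        return sum(v for k, v in asdict(self).items() if k not in skipped)


# Made-up numbers, only to show the relative effect of each level.
rt = ResponseTime(dns=50, connect=80, server=30, transfer=200, render=40)
for level in ("none", "dns", "tcp", "http"):
    print(level, rt.total(level))
```

The sketch only models the benefit when the prediction is right; the cost side (extra load on the network, client, and server) is not modeled.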
Questions in Prefetching
• What are the performance metrics?
  • Response time, network bandwidth, server load, client load
• Who prefetches?
  • Client-based: based on requests from a single client
  • Server-based: based on requests from all clients to a single server and on web content (e.g., hyperlink structure)
  • Proxy-based: based on aggregated clients' requests to aggregated servers
• What to prefetch?
  • DNS query
  • TCP connection
  • HTTP responses
    • Entire content
    • A part of the content (e.g., a prefix)
    • Metadata (e.g., HEAD request)
Questions in Prefetching (Cont.)
• How to predict?
  • Content-based: hyperlink structure, topics, …
  • Access-based: popularity, request sequence, …
• How to prefetch?
  • How to coordinate transfers of prefetched content vs. requested content?
  • How to allocate cache space to prefetched content vs. requested content?
• How to get and use feedback?
  • Notify the proxy/server when there is a cache hit
Prediction Algorithms
• Letizia [Lie95] prefetches the documents pointed to by the requested document
  • The depth of hyperlinks to follow is controllable
  • Prefetch all or only some of the hyperlinks (e.g., those that appear earlier in the page or are more popular)
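The link-following idea can be sketched in Python as below. This is not Letizia's actual implementation; it is a minimal illustration assuming a hypothetical `fetch(url)` callable that returns a page's HTML as a string. `depth` controls how many hyperlink levels to follow, and `max_links` keeps only the links that appear earliest on each page. Relative URLs are not resolved here.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def links_to_prefetch(url, fetch, depth=1, max_links=None):
    """Breadth-first walk of hyperlinks, up to `depth` levels from `url`."""
    seen = {url}
    frontier = [url]
    result = []
    for _ in range(depth):
        next_frontier = []
        for page in frontier:
            links = extract_links(fetch(page))
            if max_links is not None:
                links = links[:max_links]  # keep links that appear earlier
            for link in links:
                if link not in seen:
                    seen.add(link)
                    result.append(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return result


# Toy in-memory "site" standing in for real HTTP fetches.
site = {
    "/a": '<a href="/b">b</a><a href="/c">c</a>',
    "/b": '<a href="/d">d</a>',
    "/c": '<a href="/a">a</a>',
}
print(links_to_prefetch("/a", lambda u: site.get(u, ""), depth=2))
```

Ranking links by popularity instead of page position would need access statistics, which this sketch leaves out.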
Prediction Algorithms (Cont.)
• Predictive prefetching [PM96]
  • Dependency graph in which the weight of edge A → B is the ratio of the # of accesses to B within a window after A to the # of accesses to A
  • When A is accessed, prefetch B if the associated edge has a weight larger than a threshold
  • Other optimizations
    • Adjust weights using aging, and consider the time window in which data is useful
[Figure: example with documents A, B, C, D, H, I]
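The edge-weight rule described above can be sketched as follows. This is a simplified illustration, not the code from [PM96]: it assumes a single client's access stream, measures the window in number of subsequent accesses, and omits aging. The function names are hypothetical.

```python
from collections import defaultdict


def build_dependency_graph(accesses, window=2):
    """Return {(A, B): weight}, where weight is
    (# accesses to A followed by B within `window` accesses) / (# accesses to A)."""
    access_count = defaultdict(int)
    follow_count = defaultdict(int)
    for i, a in enumerate(accesses):
        access_count[a] += 1
        # Count each distinct follower at most once per access to `a`.
        for b in set(accesses[i + 1 : i + 1 + window]):
            if b != a:
                follow_count[(a, b)] += 1
    return {(a, b): n / access_count[a] for (a, b), n in follow_count.items()}


def predict(graph, current, threshold=0.5):
    """Documents to prefetch after `current`: edges with weight > threshold,
    strongest first."""
    candidates = [(w, b) for (a, b), w in graph.items()
                  if a == current and w > threshold]
    return [b for w, b in sorted(candidates, key=lambda x: (-x[0], x[1]))]


accesses = ["A", "B", "C", "A", "B", "D", "A", "C"]
graph = build_dependency_graph(accesses, window=1)
print(predict(graph, "A", threshold=0.5))  # A was followed by B in 2 of 3 accesses
```

Raising the threshold trades coverage for fewer wasted transfers; a larger window captures looser dependencies at the cost of noisier edges.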
Prediction Algorithms (Cont.)
• Prefetching between low-bandwidth clients and proxies [FJCL99]
  • Prediction by Partial Matching (PPM)
    • m: # of past accesses used to predict
    • l: # of steps ahead the algorithm tries to predict
    • t: threshold used to weed out candidates
[Figure: example over documents A through L]
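To make the roles of m, l, and t concrete, here is a small Python sketch of a PPM-style predictor. It is a simplification under stated assumptions, not the implementation from [FJCL99]: it counts every run of up to m + l consecutive accesses in one access stream, estimates the probability of a continuation as count(context + continuation) / count(context), keeps candidates whose probability is at least t, and takes the maximum over matching contexts. The function names are hypothetical.

```python
from collections import defaultdict


def train(accesses, m, l):
    """Count every run of 1..(m + l) consecutive accesses."""
    counts = defaultdict(int)
    for i in range(len(accesses)):
        for n in range(1, m + l + 1):
            if i + n <= len(accesses):
                counts[tuple(accesses[i : i + n])] += 1
    return counts


def predict(counts, history, m, l, t):
    """Return {url: probability} for URLs expected within the next l steps."""
    predictions = {}
    # Try every context made of the last 1..m accesses.
    for k in range(1, min(m, len(history)) + 1):
        context = tuple(history[-k:])
        base = counts.get(context, 0)
        if base == 0:
            continue
        for seq, n in counts.items():
            steps_ahead = len(seq) - k
            if 1 <= steps_ahead <= l and seq[:k] == context:
                p = n / base
                if p >= t:  # weed out unlikely candidates
                    url = seq[-1]
                    predictions[url] = max(predictions.get(url, 0.0), p)
    return predictions


stream = list("ABCABCABD")
counts = train(stream, m=2, l=1)
print(predict(counts, ["A", "B"], m=2, l=1, t=0.5))
```

A production predictor would store the counts in a tree keyed by context rather than scanning all entries, and would bound the size of that history structure.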
Prediction Algorithms: PPM (Cont.)
• Implementation variations
  • No proxy notification upon browser cache hits reduces prediction accuracy by 5%
  • Prefetching without knowledge of the content of browser caches increases wasted transfer
  • Limiting the size of history structure reduces prediction accuracy slightly
Transfer Prefetched Content
• Prefetching has serious performance impacts even under perfect prediction [CB98]
  • Assumption: prefetching imposes significant network load
  • Straightforward prefetching may increase traffic burstiness, which lengthens network queues and thus increases network delay
  • Prefetching under rate control helps to smooth traffic (compared to no prefetching) and reduces response time
• It is important to schedule prefetched content vs. requested content appropriately
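One simple way to picture scheduling prefetched against requested content is a per-tick link model in which demand traffic is always served first and prefetch traffic is capped at a fixed rate. The Python sketch below is an illustrative assumption, not the rate-control scheme evaluated in [CB98]; the function name, parameters, and units (bytes per tick) are hypothetical.

```python
def schedule(demand_per_tick, prefetch_backlog, capacity, prefetch_cap):
    """Return a list of (demand_sent, prefetch_sent) per tick.

    demand_per_tick:  bytes of requested content arriving in each tick
    prefetch_backlog: total bytes of content waiting to be prefetched
    capacity:         bytes the link can carry per tick
    prefetch_cap:     maximum bytes of prefetch traffic allowed per tick
    """
    sent = []
    demand_queue = 0
    for arriving in demand_per_tick:
        demand_queue += arriving
        # Requested content gets the link first.
        d = min(demand_queue, capacity)
        demand_queue -= d
        # Prefetch uses leftover capacity, limited by the rate cap.
        p = min(prefetch_backlog, prefetch_cap, capacity - d)
        prefetch_backlog -= p
        sent.append((d, p))
    return sent


print(schedule([5, 0, 12, 0], prefetch_backlog=10, capacity=10, prefetch_cap=4))
```

With the cap in place, prefetch traffic fills idle periods instead of adding to the bursts caused by demand requests.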
Content Distribution
• Prefetching can be applied in combination with caching and content distribution
• A CDN is similar in flavor to caching, with the following differences
  • CDN: server oriented
    • Improves performance of requests to a server
    • Eases cache updates
  • Caching: client oriented
• CDN research
  • Server selection: locality + server load + content availability
  • Redirection techniques: DNS-based redirection, HTTP redirection, URL rewriting, …
  • Server/content placement [QPV01]
References
• [CB98] M. Crovella and P. Barford. The Network Effects of Prefetching, February 7, 1997. In Proc. of IEEE INFOCOM '98. Available at http://cs-www.bu.edu/faculty/crovella/paper-archive/infocom98.ps
• [FJCL99] Li Fan, Quinn Jacobson, Pei Cao and Wei Lin. Web Prefetching Between Low-Bandwidth Clients and Proxies: Potential and Performance. In Proc. of SIGMETRICS '99. Available at http://citeseer.nj.nec.com/fan99web.html
• [KR01] Bala Krishnamurthy and Jennifer Rexford. Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement. Addison Wesley, 2001.
• [Lie95] H. Lieberman. Letizia: An Agent That Assists Web Browsing. In Proc. International Joint Conference on Artificial Intelligence, Aug. 1995. Available at http://lieber.www.media.mit.edu/people/lieber/Lieberary/Letizia/Letizia.html
References (Cont.)
• [PM96] Venkata N. Padmanabhan and Jeffrey C. Mogul. Using Predictive Prefetching to Improve World Wide Web Latency. ACM Computer Communication Review, 26(3):22-36, Jul. 1996. Available at http://www.research.microsoft.com/~padmanab/papers/ccr-july96.ps
• [PQ00] Venkata N. Padmanabhan and Lili Qiu. The Content and Access Dynamics of a Busy Web Site: Findings and Implications. In Proceedings of SIGCOMM 2000, Stockholm, Sweden, August 2000. Available at http://research.microsoft.com/~liliq/papers/pub/sigcomm2000.ps
• [QPV01] Lili Qiu, Venkata N. Padmanabhan, and Geoffrey M. Voelker. On the Placement of Web Server Replicas. In Proceedings of INFOCOM 2001, Anchorage, AK, USA, April 2001. Available at http://research.microsoft.com/~liliq/papers/pub/infocom2001.ps