Studying the Impact of More Complete Server Information on Web Caching Craig E. Wills and Mikhail Mikhailov Worcester Polytechnic Institute {cew,mikhail}@cs.wpi.edu http://www.cs.wpi.edu/~{cew,mikhail} Presented by Mikhail Mikhailov May 23, 2000
Outline of Talk • Observations • Proposed approach • Experiments • Methodology • Test sets • Results • Conclusions • Future work
Observations • Heterogeneous dynamic content • Monolithic pages, loss of information • Changes are predictable, can be localized • Heuristic approaches to caching (many validations)
Proposed Approach • Object classification by type and change characteristics • Preserve object identities • Object Composition (vs. monolithic approach) • Object Relationships • Piggybacking
Exp1: Methodology (content reuse) • Popular sites (100hot.com) and popular URLs (NLANR proxy logs) • Unconditionally GET HTML and embedded images each day at the same time for 11 days • Catalogue resources, compute MD5 • Analyze changes with Chunking Tool
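The cataloguing step above can be sketched as follows. This is a minimal illustration, not the authors' tooling: it assumes each day's retrieval is already in memory as a mapping from URL to response body, and the names `catalogue` and `changed_urls` are hypothetical. The Chunking Tool itself is not reproduced here; the sketch only detects whether a whole resource changed between two days.

```python
import hashlib


def md5_hex(body: bytes) -> str:
    """Return the MD5 digest of a response body as a hex string."""
    return hashlib.md5(body).hexdigest()


def catalogue(snapshot: dict[str, bytes]) -> dict[str, str]:
    """Map each retrieved URL to the MD5 digest of its body (hypothetical helper)."""
    return {url: md5_hex(body) for url, body in snapshot.items()}


def changed_urls(prev: dict[str, str], curr: dict[str, str]) -> list[str]:
    """List URLs present on both days whose digests differ."""
    return sorted(url for url in prev.keys() & curr.keys() if prev[url] != curr[url])


# Made-up example data standing in for two daily retrievals.
day1 = catalogue({"/index.html": b"<html>news A</html>", "/logo.gif": b"GIF89a..."})
day2 = catalogue({"/index.html": b"<html>news B</html>", "/logo.gif": b"GIF89a..."})
print(changed_urls(day1, day2))
```

Comparing digests rather than bodies keeps the catalogue small, at the cost of saying nothing about which part of a page changed.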
Exp1: Test Sets (content reuse) • Cnt300 (7 NLANR logs) • Top50 (50 most popular sites, 100hot.com) • ECom (50 largest business-to-consumer shopping sites, 100hot.com) • Srcheng (11 top search engines) • EComQ (2 queries, top 10 of the ECom set) • SrchengQ (2 queries, Srcheng set)
Exp2: Methodology (eliminating validation requests) • NLANR proxy logs • For each 304 response, look for a 200 response from the same server within a given window (10 sec on each side) • Focus on 304 responses for images
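The window search above can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Entry` record and its field names are hypothetical and do not reflect the actual NLANR log format, and the restriction to image responses is left out for brevity.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical, simplified proxy log record."""
    time: float   # seconds
    server: str
    status: int   # HTTP status code


def count_covered_304s(entries: list[Entry], window: float = 10.0) -> tuple[int, int]:
    """Count 304 responses that have a 200 response from the same server
    within `window` seconds on either side. Returns (covered, total)."""
    ok_times: dict[str, list[float]] = defaultdict(list)
    for e in entries:
        if e.status == 200:
            ok_times[e.server].append(e.time)
    for times in ok_times.values():
        times.sort()

    covered = total = 0
    for e in entries:
        if e.status != 304:
            continue
        total += 1
        times = ok_times.get(e.server, [])
        # First 200 at or after the start of the window, if any.
        i = bisect_left(times, e.time - window)
        if i < len(times) and times[i] <= e.time + window:
            covered += 1
    return covered, total


# Made-up log entries for illustration only.
log = [
    Entry(100.0, "a.example", 200),
    Entry(105.0, "a.example", 304),  # 200 from same server 5 s earlier
    Entry(120.0, "a.example", 304),  # nearest 200 is 20 s away
    Entry(101.0, "b.example", 304),  # no 200 from this server
]
print(count_covered_304s(log))
```

A 304 counted as covered is one whose validation could, in principle, have been answered by information piggybacked on the nearby 200 response.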
Exp3.1: Methodology / Results (object change characteristics) • Dynamic, Access Dependent objects (Top50, R,R,15min,R) • Most short-term changes occur immediately
Exp3.2: Methodology / Results (object change characteristics) • Dependency-based objects (SrchengQ, EComQ, same query, retrieved daily) • Some changes may be attributed to dynamic/access dependent objects; further study needed
Exp3.3.1: Methodology / Results (object change characteristics) • Input Dependent objects (SrchengQ, EComQ, different queries, retrieved daily)
Exp3.3.2: Methodology / Results (object change characteristics) • Input Dependent objects (objects with cookies from Cnt300, Top50, ECom; obtain 2 cookies for each object: R-cookie1, R-cookie2)
Conclusions • Proposed techniques have potential to: • increase content reuse • reduce number of validation requests
Future Work • Combine object types and change characteristics with object relationships • Extend web server and proxy caching software to support proposed techniques
Object classification by change characteristics • Periodic (changes at regular intervals: hour, day, etc.) • Dependency-based (depends on a file or DB changing) • Dynamic (different on every access, can’t be prefetched) • Access Dependent (different on every access, can be prefetched) • Input Dependent (query, cookies) • Relatively Dynamic (changes frequently) • Static (never changes) • Relatively Static (changes infrequently)
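The classes above can be written down as an enumeration, with a rough first-pass guess derived from observed digests. This is only a sketch: the function names and the `frequent` threshold are assumptions, not part of the talk. Digest history alone cannot separate Dynamic from Access Dependent objects, and it cannot identify Periodic, Dependency-based, or Input Dependent objects, which need server-side knowledge.

```python
from enum import Enum


class ChangeClass(Enum):
    PERIODIC = "periodic"
    DEPENDENCY_BASED = "dependency-based"
    DYNAMIC = "dynamic"
    ACCESS_DEPENDENT = "access dependent"
    INPUT_DEPENDENT = "input dependent"
    RELATIVELY_DYNAMIC = "relatively dynamic"
    STATIC = "static"
    RELATIVELY_STATIC = "relatively static"


def change_rate(digests: list[str]) -> float:
    """Fraction of consecutive retrievals whose digests differ."""
    if len(digests) < 2:
        raise ValueError("need at least two retrievals")
    changes = sum(a != b for a, b in zip(digests, digests[1:]))
    return changes / (len(digests) - 1)


def rough_class(digests: list[str], frequent: float = 0.5) -> ChangeClass:
    """First-pass guess from observations only (hypothetical heuristic).

    `frequent` is an assumed cut-off between "changes frequently" and
    "changes infrequently"; the talk does not define one.
    """
    rate = change_rate(digests)
    if rate == 0.0:
        return ChangeClass.STATIC  # unchanged over the observed retrievals only
    if rate == 1.0:
        # Different on every access: could equally be ACCESS_DEPENDENT.
        return ChangeClass.DYNAMIC
    if rate >= frequent:
        return ChangeClass.RELATIVELY_DYNAMIC
    return ChangeClass.RELATIVELY_STATIC


print(rough_class(["d1", "d1", "d1", "d2"]))
```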