
A Secure, Publisher-Centric Web Caching Infrastructure

April 19th, 2001. Selcuk Uluagac, Aravind Pavuluri. Outline: Dynamic Caching; Motivation & Gemini; Security Issues; Incremental Deployment; Design & Implementation; Performance; Conclusions & Discussion.


Presentation Transcript


  1. A Secure, Publisher-Centric Web Caching Infrastructure. April 19th, 2001. Selcuk Uluagac, Aravind Pavuluri

  2. Outline • Dynamic Caching • Motivation & Gemini • Security Issues • Incremental Deployment • Design & Implementation • Performance • Conclusions & Discussion 18845-01

  3. Outline (not finished yet) • Active Cache: Caching Dynamic Contents on the Web (Pei Cao et al.) • A Publishing System for Efficiently Creating Dynamic Data (Arun Iyengar et al.)

  4. Dynamic Web Caching? • Content generated on every request • Scripting languages (Perl, CGI, Java, VBScript, etc.) • Personalization and e-commerce transactions • Presently not cached

  5. General Approach

  6. Gemini & Motivation • Drawbacks of the current cache infrastructure • Incapable of reporting access statistics • Not able to handle dynamic content • Loss of publisher control over the content • Not publisher-centric • Solution: Gemini

  7. Key Elements of Gemini Architecture • Node (cache) • Security architecture • Incremental deployment strategy • [Diagram: Gemini node with a Control Plane and a Data Plane; labels: Consistency Control, Filtering, Logging & Reporting, Versioning, QoS, Sandboxed VM, Access Control]

  8. Security Issues • Why a new security approach? • Caches are active participants, so end-to-end security is not enough • The cache is responsible for reporting logs • Design goals • Protect the publisher as well as the cache • The publisher decides whom to trust • Publishers and clients eventually find out about attacks • The system should be incrementally deployable

  9. Security Background • RSA (Rivest, Shamir, Adleman) • Encryption • Public key / private key • Public Key Infrastructure (X.509) • Digital signature • Verification • Certificate • Certificate authority

  10. A New Trust Model • Cache Authorization • Publishers explicitly specify which content a cache can generate • Cache Verification • Publishers and clients verify that authorized caches are performing correctly

  11. Authorization & Content Generation Steps • PKI distributes keys to clients, caches, and publishers • The publisher’s certificate identifies its web site and public key • Certificate: {P, K_P, Valid, Expires, CA} signed with K_CA^-1 • The publisher lists the authorized caches for an object • ACL: {URL, K_1, K_2, ..., K_n, Valid, Expires, P} signed with K_P^-1 • The publisher gives the cache: ACL, {Headers, Body} signed with K_P^-1 • Uses the Pragma header field so as not to confuse legacy caches • The cache generates the content using the Body • The cache sends the client: ACL, {URL, Cache, Client, H(Request), CurrDate, Body} signed with K_cache^-1 • Here K^-1 denotes the private key that matches public key K

  12. Authorization & Content Generation Steps (cont.) • The client can check the signature on the ACL and verify that the cache is authorized • The client verifies that • The cache is in the ACL and the cache’s signature is valid • Purpose of the cache signature • Tamper detection by the client • Identifies the cache that generated the content • Non-repudiation • The cache can perform access control on the content as the publisher requires (cookies, etc.)
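
The exchange on slides 11 and 12 can be sketched in Java roughly as follows. This is a minimal illustration, not Gemini's actual code: the class and method names, the "|"-separated message layout, the example URL, and the choice of SHA256withRSA are all assumptions, and the Valid/Expires fields and the certificate chain back to the CA are left out for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;
import java.util.List;

// Hypothetical sketch of the ACL / signed-reply check; names and formats are assumptions.
public class GeminiAuthSketch {

    public static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signs a message with a private key (the slides' "signed with K^-1"). */
    public static byte[] sign(PrivateKey key, String message) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(message.getBytes(StandardCharsets.UTF_8));
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks a signature with the matching public key; any failure counts as "not verified". */
    public static boolean verify(PublicKey key, String message, byte[] signature) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update(message.getBytes(StandardCharsets.UTF_8));
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            return false;
        }
    }

    /** Text form of a public key, used as the K_i entries of the ACL. */
    public static String keyId(PublicKey key) {
        return Base64.getEncoder().encodeToString(key.getEncoded());
    }

    /** Assumed serialization of the ACL body: URL plus the authorized cache keys. */
    public static String aclText(String url, List<String> cacheKeyIds) {
        return "ACL|" + url + "|" + String.join(",", cacheKeyIds);
    }

    /** Client-side checks from slide 12: ACL signed by publisher, cache listed, reply signed by cache. */
    public static boolean clientAccepts(PublicKey publisherKey, String url, List<String> cacheKeyIds,
            byte[] aclSignature, PublicKey cacheKey, String reply, byte[] replySignature) {
        boolean aclGenuine = verify(publisherKey, aclText(url, cacheKeyIds), aclSignature);
        boolean cacheListed = cacheKeyIds.contains(keyId(cacheKey));
        boolean replyGenuine = verify(cacheKey, reply, replySignature);
        return aclGenuine && cacheListed && replyGenuine;
    }

    /** Runs one exchange; with tamper=true the body is altered after the cache signed it. */
    public static boolean demo(boolean tamper) {
        KeyPair publisher = newKeyPair();
        KeyPair cache = newKeyPair();
        String url = "http://publisher.example/news";
        List<String> acl = List.of(keyId(cache.getPublic()));
        byte[] aclSignature = sign(publisher.getPrivate(), aclText(url, acl));

        String reply = url + "|2001-04-19|<html>hello</html>";
        byte[] replySignature = sign(cache.getPrivate(), reply);
        String received = tamper ? reply.replace("hello", "spoof") : reply;
        return clientAccepts(publisher.getPublic(), url, acl, aclSignature,
                cache.getPublic(), received, replySignature);
    }

    public static void main(String[] args) {
        System.out.println("untampered reply accepted: " + demo(false));
        System.out.println("tampered reply accepted:   " + demo(true));
    }
}
```

Because the reply is signed by the cache rather than the publisher, a client that detects a bad reply can name the cache responsible, which is the non-repudiation property listed on slide 12.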

  13. Verification • The client sends feedback to the publisher about a misbehaving cache • Similarly, inconsistencies in cache log reporting can be detected • The publisher removes the cache from the ACL • When to question cache responses? • Publisher-initiated (fake clients) • Client-initiated

  14. Protecting the Cache • Publishers may send malicious code to caches • To prevent that: • The publisher’s code runs inside a sandboxed JVM • A limited API is exposed to the publisher’s code • Resource restrictions using OS-level controls counter denial-of-service attacks

  15. Incremental Deployment Strategy: Principles • Cache and document heterogeneity • Transparency to clients • Transparency to legacy caches • Proximity (leaf cache)

  16. Discovering Gemini Documents • Publishers explicitly notify Gemini caches about documents that have associated Gemini documents • A notification contains • Server name • Pattern to match • Transformation • Notifications are piggybacked on HTTP responses • Caches store notifications as soft state
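
One way to picture the notification table is the Java sketch below. The slides do not say how patterns or transformations are encoded, so the regular expression, the `$1`-style replacement string, the explicit expiry time, and every name here are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical lookup table of publisher notifications, kept as soft state.
public class NotificationTableSketch {

    /** One notification: server name, pattern to match, transformation, and an assumed expiry. */
    private static final class Notification {
        final String server;
        final Pattern pattern;
        final String replacement;
        final long expiresAtMillis;

        Notification(String server, Pattern pattern, String replacement, long expiresAtMillis) {
            this.server = server;
            this.pattern = pattern;
            this.replacement = replacement;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final List<Notification> entries = new ArrayList<>();

    /** Records a notification that arrived piggybacked on an HTTP response. */
    public void store(String server, String regex, String replacement, long expiresAtMillis) {
        entries.add(new Notification(server, Pattern.compile(regex), replacement, expiresAtMillis));
    }

    /** Maps a regular request path to a Gemini document path if a live notification matches. */
    public Optional<String> translate(String server, String path, long nowMillis) {
        // Soft state: expired entries are simply dropped and must be re-learned.
        entries.removeIf(n -> n.expiresAtMillis <= nowMillis);
        for (Notification n : entries) {
            Matcher m = n.pattern.matcher(path);
            if (n.server.equals(server) && m.matches()) {
                return Optional.of(m.replaceFirst(n.replacement));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        NotificationTableSketch table = new NotificationTableSketch();
        table.store("www.example.com", "/news/(.*)\\.html", "/gemini/news/$1.gem", 1000L);
        System.out.println(table.translate("www.example.com", "/news/today.html", 500L));
        System.out.println(table.translate("www.example.com", "/news/today.html", 2000L));
    }
}
```

Keeping the table as soft state means a cache that loses or expires an entry simply forwards the request unchanged until a later response carries the notification again.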

  17. Serving a Request

  18. Leaf Discovery • Leaf cache: the Gemini cache that translates a request for a regular document into a request for a Gemini document • With security, the leaf cache becomes the first cache that both has the proper lookup-table entry and is authorized by the publisher

  19. Scalability • Leverages thousands of legacy caches to help deliver Gemini documents • Computational burden is pushed as close to the edge of the network as possible

  20. Node Design & Implementation

  21. Node Design & Implementation (cont.) • Platform: built on top of Squid • Runtime language: Java • Platform independent • Allows sandboxing • Partitioning of functionality • Squid process • Lookup table • Fetching Gemini documents • Forwarding Gemini requests • Gemini process • JVM • Security

  22. Node Operation • The Squid front end receives the request from the client • It hands the request to the Gemini process via IPC • Gemini threads process the request (Dispatcher, Checker, Worker) • The output is signed by the worker thread and sent to the client • The request is logged

  23. Performance Evaluation • Response time degrades by a factor of 5 to 15 for non-active Gemini documents • Signing the reply accounts for 90% of processing time

  24. Performance Evaluation (cont.)

  25. Conclusions & Discussion • Gemini addresses the security issues in dynamic web caching • Provides a node implementation • Provides a publisher-centric architecture • End-user performance?

  26. A Publishing System for Efficiently Creating Dynamic Data. Arun Iyengar et al., IBM Research, T.J. Watson Research Center

  27. Problems with Dynamic Caching at First Glance • Several problems with dynamic data generation • Expensive to create • Overhead • Consistent update (we already know this!) • More?

  28. Little Fragments • Fragments • Objects • Atomic vs. complex objects • Object Dependence Graph (ODG) • Dynamic pages • Embedded fragments are updated automatically • Atomic vs. incremental publication • Problems? • 3 proposed algorithms

  29. Publishing Process • Immediate fragments • Quality-controlled fragments • The trigger monitor is notified • It fetches new copies from the source • The ODG is updated • Graph traversal algorithms are applied • Bundles of web pages are written to the sink
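
The role of the ODG in this process can be sketched in Java as a plain reachability walk: when a fragment changes, every page or fragment that embeds it, directly or indirectly, has to be rebuilt. This is only an illustration of the idea; it is not one of the three algorithms the paper proposes, which the slides do not describe, and the class, method, and object names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical object dependence graph; an edge a -> b means "b embeds a".
public class OdgSketch {

    private final Map<String, List<String>> dependents = new HashMap<>();

    /** Records that the object named embeddedIn includes the given fragment. */
    public void addDependency(String fragment, String embeddedIn) {
        dependents.computeIfAbsent(fragment, k -> new ArrayList<>()).add(embeddedIn);
    }

    /** Breadth-first walk from the changed objects; returns everything that must be republished. */
    public Set<String> affectedBy(List<String> changed) {
        Set<String> stale = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : dependents.getOrDefault(current, List.of())) {
                if (stale.add(next)) { // visit each dependent only once
                    queue.add(next);
                }
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        OdgSketch odg = new OdgSketch();
        odg.addDependency("medal-table", "country-page");
        odg.addDependency("medal-table", "home-page");
        odg.addDependency("country-page", "home-page");
        odg.addDependency("weather", "venue-page");
        System.out.println(odg.affectedBy(List.of("medal-table")));
    }
}
```

In this toy graph a change to the medal table marks the country page and the home page as stale, while the venue page, which only embeds the weather fragment, is left alone.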

  30. Sample Screen

  31. Performance • Deployed on the 2000 Olympic Games web site

  32. Performance • Easier to design web sites • Users specify and modify relationships among web pages and fragments • Performance improvement • Incremental publication • Faster with the 3 algorithms

  33. Active Cache: Caching Dynamic Contents on the Web. April 19th, 2001. Selcuk Uluagac, Aravind Pavuluri

  34. Motivation and Active Cache • Dynamic documents constitute an increasing percentage of content on the web • This affects the scalability of the web • No approaches at present for dynamic content caching • Solution: Active Cache

  35. Brief Overview • Migrates parts of the server processing on each user request to the caching proxy via “cache applets” • A cache applet is server-supplied code attached to a URL • On a user request, the proxy invokes the cache applet • Cache applets allow servers to obtain the benefit of proxy caching without losing the capability to track user accesses and tailor the content presentation

  36. The Active Cache Protocol • The web server associates a cache applet with a URL-named document by sending a new entity header, “CacheApplet”, with the document • CacheApplet: code = “code.class”, archive=“code.jar”, codebase=“codebase.url” • For security reasons, the applet’s codebase must have the same server URL as the document
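
A proxy-side check of this header might look like the Java sketch below. It is an illustration under stated assumptions: the class and method names are hypothetical, the header value is assumed to use plain ASCII double quotes (the curly quotes on the slide are presumably a presentation artifact), and "same server" is interpreted here as "same host name", which is a guess.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the CacheApplet entity header and its same-server rule.
public class CacheAppletHeaderSketch {

    // Matches name = "value" pairs such as code="Banner.class".
    private static final Pattern FIELD = Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Splits the header value into its code, archive, and codebase fields. */
    public static Map<String, String> parse(String headerValue) {
        Map<String, String> fields = new HashMap<>();
        Matcher m = FIELD.matcher(headerValue);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    /** True when the applet's codebase is served by the same host as the document. */
    public static boolean sameServer(String documentUrl, String headerValue) {
        String codebase = parse(headerValue).get("codebase");
        if (codebase == null) {
            return false;
        }
        String documentHost = URI.create(documentUrl).getHost();
        String appletHost = URI.create(codebase).getHost();
        return documentHost != null && documentHost.equalsIgnoreCase(appletHost);
    }

    public static void main(String[] args) {
        String header = "code = \"Banner.class\", archive=\"banner.jar\", "
                + "codebase=\"http://www.example.com/applets/\"";
        System.out.println(parse(header));
        System.out.println(sameServer("http://www.example.com/index.html", header));
        System.out.println(sameServer("http://other.example.org/index.html", header));
    }
}
```

A proxy that finds the hosts differ would presumably ignore the applet and treat the document as uncacheable or forward requests to the server, but the slides do not spell out that case.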

  37. The Active Cache Protocol (cont.) • Active Cache obligations • If a document is cached, the proxy will either invoke the cache applet or send the request directly to the server • If an applet’s execution fails for any reason, the request is sent to the server • If the applet’s execution succeeds, the proxy takes the appropriate action based on the return value of the FromCache method • Each applet can deposit information in a log object, and the proxy will send the log object back to the server

  38. The Proxy Decides • Whether to cache a document • Whether to invoke the applet • The cache applet may not process every request for the document • Some requests may go to the original server • Which document or applet to evict from the cache at any time

  39. Active Cache Interface • A cache applet must implement the “ActiveCacheInterface” • FromCache(user_http_request, client_ip, client_name, cache_file, new_file) • A cache applet can only call the ActiveProxy class to perform its functions • ActiveProxy provides methods for file access, cache query, locking and unlocking, and sending requests to the server

  40. Active Cache Interface: Methods in ActiveProxy • Boolean is_in_cache( string url) • Public int open(string url, int mode) • Public int close(int fd) • Public int create(string url, int mode) • Public int read(int fd, byte[] buf, int size) • Public int lock(int fd) • Public string curtime()

  41. Cache Applet Examples • Logging user requests • Logs are eventually sent to the server • Advertising banner rotation • Decides which banner to show according to the specifications • Access permission checking • The applet verifies whether the server signed the document • Client-specific information distribution • www.my.yahoo.com
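
The banner rotation and logging examples can be combined in a short Java sketch. The interface below is a deliberately simplified, hypothetical stand-in for the ActiveCacheInterface named on slide 39: it passes strings instead of file descriptors and returns the page directly, since the slides do not give the real return values. The `<!--BANNER-->` placeholder and the in-memory counter are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cache applet that rotates banners and logs each access.
public class BannerAppletSketch {

    /** Simplified stand-in for the ActiveCacheInterface; not the real signature. */
    public interface SimpleCacheApplet {
        String fromCache(String request, String clientIp, String cachedDocument);
    }

    /** Picks banners round-robin and keeps a log for later reporting to the server. */
    public static final class BannerRotator implements SimpleCacheApplet {
        private final List<String> banners;
        private final List<String> log = new ArrayList<>();
        private int next = 0;

        public BannerRotator(List<String> banners) {
            this.banners = List.copyOf(banners);
        }

        @Override
        public String fromCache(String request, String clientIp, String cachedDocument) {
            String banner = banners.get(next);
            next = (next + 1) % banners.size();
            // Record who saw which banner, so the server still gets its access statistics.
            log.add(clientIp + " " + request + " " + banner);
            return cachedDocument.replace("<!--BANNER-->", banner);
        }

        /** Entries the proxy would eventually send back to the server. */
        public List<String> log() {
            return List.copyOf(log);
        }
    }

    public static void main(String[] args) {
        BannerRotator applet = new BannerRotator(List.of("a.gif", "b.gif"));
        String page = "<body><!--BANNER--></body>";
        System.out.println(applet.fromCache("GET /", "10.0.0.1", page));
        System.out.println(applet.fromCache("GET /", "10.0.0.2", page));
        System.out.println(applet.log());
    }
}
```

A real applet would need to keep its counter in the cache's file storage through ActiveProxy rather than in memory, because the implementation on slide 43 handles each request in a separate process.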

  42. Security Mechanisms • Language-based protection • The ActiveProxy class implements the constraints • Java’s built-in security measures • Prevents illegal access to information belonging to other web servers • Resource accounting • The proxy keeps track of an applet’s resource consumption in terms of storage size, disk bandwidth, network bandwidth, CPU usage, and virtual memory size • Upper limits on resources are set using setrlimit • Prevents denial-of-service attacks

  43. Implementation • Extended the CERN httpd proxy • Handles each request in a separate process • Makes it easy to set limits on the resources • Implements the Active Cache Protocol and the security mechanisms

  44. Performance • Degrades performance by at least 50-75% • Increases client latency by a factor of 1.5 to 4 • CPU becomes the bottleneck

  45. Conclusions • Active Cache trades local CPU resources for network bandwidth savings • $6K-$10K/month for a T1 line vs. • $2K for a high-end computer with sufficient CPU • Improves the object hit ratio and byte hit ratio from 35% and 30% to 55% and 41%, respectively

