DotSlash: Handling Web Hotspots at Dynamic Content Web Sites

DotSlash: Handling Web Hotspots at Dynamic Content Web Sites. Weibin Zhao, Henning Schulzrinne ({zwb,hgs}@cs.columbia.edu), Department of Computer Science, Columbia University. Global Internet 2005, March 19, 2005.

Presentation Transcript


  1. DotSlash: Handling Web Hotspots at Dynamic Content Web Sites • Weibin Zhao, Henning Schulzrinne • {zwb,hgs}@cs.columbia.edu • Department of Computer Science, Columbia University • Global Internet 2005, March 19, 2005

  2. Web Hotspots [Figure: web server, Internet] • A well-identified problem • Flash crowds, the Slashdot effect • 15 minutes of fame • Examples • Slashdotting, featured Google search, special events, breaking news, … DotSlash

  3. The Challenge • Short-term, dramatic surge in request rate • Large & quick increase • Lasts for a short period • Existing mechanisms are not sufficient • Capacity planning, CDNs • Good for the long term, not cost-effective for hotspots • Caching • Not fully controlled by the origin server • Service degradation, admission control • Last resort, not user friendly

  4. Dynamic Content Web Sites • More vulnerable to hotspots • CPU-bound, supported request rate is low • Hard to cache dynamic content • A much harder problem • Different bottlenecks • Database server: on-line bookstore (Amazon) • Web server: auction (eBay), bulletin board (Slashdot) • Caching & consistency control

  5. Our Approach • DotSlash counteracts the Slashdot effect • Rescue system • Triggered automatically when load spikes • Mutual-aid model: for different web sites • Cost effective: for rare events • Automated rescue process • Self-configuring: builds an adaptive distributed web server system on the fly • Techniques: service discovery, dynamic virtual hosting, adaptive overload control, dynamic script replication

  6. Rescue Relationship [Figure: rescuing relationships among servers S1 to S8] • Can provide rescue to multiple servers: S3 • Can get rescue from multiple servers: S1 • Cannot provide/get rescue simultaneously • Origin Server: S1, S2 • Rescue Server: S3, S4, S5, S6

  7. Service Discovery • DotSlash directory services • Enable web servers from different sites to learn about each other: register/query • Built upon mSLP (Mesh-enhanced Service Location Protocol): replicated Directory Agents (DAs) • Discover mSLP DAs • dot-slash.net DNS domain • DNS SRV for dot-slash.net • query_name=_slpda._tcp.dot-slash.net, query_type=SRV
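The SRV lookup named on this slide can be illustrated at the wire level. The following is a minimal sketch, not taken from the DotSlash implementation: it only encodes a standard DNS question for `_slpda._tcp.dot-slash.net` with query type SRV (type code 33), using the Python standard library. The function name `build_srv_query` and the transaction ID are hypothetical, and sending the packet to a resolver and parsing the answer are left out.

```python
import struct

def build_srv_query(name, txid=0x1234):
    """Encode a single-question DNS query for SRV records (illustrative helper)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=33 (SRV), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 33, 1)

query = build_srv_query("_slpda._tcp.dot-slash.net")
```

In practice a resolver library would normally issue this query; the hand-built packet is shown only to make the query name and query type from the slide concrete.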

  8. Workload Monitoring • Bottlenecks & Metrics • Network (static content): outbound HTTP traffic • CPU (dynamic content): /proc/stat • Moving average filter • Load regions • Desired • Configurable: [40%, 60%] • Trigger rescue actions [Figure: heavily loaded region, desired load region, lightly loaded region]
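The monitoring idea on this slide (CPU counters from /proc/stat, a moving average filter, and three load regions around a desired band of [40%, 60%]) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the class name `LoadMonitor`, the window size, and the choice to count only the `idle` field as idle time are all hypothetical.

```python
from collections import deque

def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Use at most the first 8 counters (user .. steal); older kernels report fewer.
    fields = [int(x) for x in line.split()[1:9]]
    return fields[3], sum(fields)

def utilization(prev, cur):
    """CPU utilization in percent between two (idle, total) samples."""
    idle = cur[0] - prev[0]
    total = cur[1] - prev[1]
    return 0.0 if total <= 0 else 100.0 * (1 - idle / total)

class LoadMonitor:
    """Moving-average filter plus classification into three load regions."""

    def __init__(self, window=5, low=40.0, high=60.0):
        self.samples = deque(maxlen=window)  # keeps only the last `window` samples
        self.low, self.high = low, high

    def update(self, pct):
        self.samples.append(pct)
        avg = sum(self.samples) / len(self.samples)
        if avg > self.high:
            return avg, "heavy"    # above the desired band: ask for rescue
        if avg < self.low:
            return avg, "light"    # below the desired band: spare capacity
        return avg, "desired"
```

A caller would read `/proc/stat` periodically, feed each utilization value to `update`, and act only on the smoothed region rather than on individual samples.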

  9. DotSlash Rescue Protocol • Application-level request/response • Requests: SOS, RATE, SHUTDOWN • SOS: initiate a rescue • origin → rescue • RATE: adjust allowed redirect (data) rate • rescue → origin • SHUTDOWN: end a rescue • origin → rescue

  10. Rescue Control [Figure: state diagram. States: Normal, SOS, Rescue. SOS-side labels: get rescue, request more rescue, release rescue, increase Pr, decrease Pr. Rescue-side labels: provide rescue, provide more rescue, shutdown some rescue, shutdown last rescue, increase Rr, decrease Rr.]

  11. Request Redirection • Origin server • Offload client requests to rescue servers • Two-level redirection • DNS-RR • Add/remove rescue server IP addresses via dynamic DNS update • HTTP redirect • Use rescue server aliases • Don’t redirect requests from rescue servers • Redirect policies • WRR based on rescue server capacity
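The slide names weighted round-robin (WRR) by rescue server capacity as the redirect policy but does not say which WRR variant is used. As one possible illustration, the sketch below uses the "smooth" WRR variant, which interleaves picks instead of sending bursts to the largest server. The function name, the alias names, and the capacity values are hypothetical.

```python
from itertools import islice

def smooth_wrr(capacities):
    """Yield server names in smooth weighted round-robin order.

    `capacities` maps a rescue server alias to a positive integer weight.
    """
    if not capacities or any(w <= 0 for w in capacities.values()):
        raise ValueError("need at least one server with a positive capacity")
    total = sum(capacities.values())
    current = {name: 0 for name in capacities}
    while True:
        # Every server earns credit equal to its weight on each round...
        for name, weight in capacities.items():
            current[name] += weight
        # ...and the server holding the most credit serves this request.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen

# Hypothetical aliases and capacities, for illustration only.
picker = smooth_wrr({"www-vh1.rescue-a.example": 2, "www-vh1.rescue-b.example": 1})
first_six = list(islice(picker, 6))
```

Over any full cycle each server receives a share of redirects proportional to its capacity, which is the property the slide's policy asks for.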

  12. Dynamic Virtual Hosting • Rescue server • Serve new content (origin server) on the fly • Alias • Generate dynamically, and register via dynamic DNS update • Mapping: request → itself / origin server • Based on the Host header in the request • Three cases • Its configured name: www.rescue.com → itself • An alias: www-vh1.rescue.com (HTTP redirect) → origin • An origin server name: www.origin.com (DNS-RR) → origin • Handle expired mapping
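The three Host-header cases on this slide amount to a small lookup. The sketch below is an illustration only: the function name `resolve_target` and the two tables are hypothetical, and since the slide does not say how an expired mapping is handled, the function simply reports it and leaves the decision to the caller. The port stripping is simplified and does not handle IPv6 literals.

```python
def resolve_target(host_header, own_name, alias_map, origin_names):
    """Decide whose content a rescue server should serve for a request.

    alias_map:    alias -> origin server name (reached via HTTP redirect)
    origin_names: origin server names currently being rescued (reached via DNS-RR)
    """
    # Normalize: drop an optional port, trailing dot, and letter case.
    host = host_header.split(":")[0].strip().lower().rstrip(".")
    if host == own_name:
        return ("self", own_name)            # case 1: its configured name
    if host in alias_map:
        return ("origin", alias_map[host])   # case 2: a dynamically generated alias
    if host in origin_names:
        return ("origin", host)              # case 3: an origin server name
    return ("expired_or_unknown", None)      # mapping no longer (or never) valid

aliases = {"www-vh1.rescue.com": "www.origin.com"}
origins = {"www.origin.com"}
```

For example, `resolve_target("www-vh1.rescue.com:80", "www.rescue.com", aliases, origins)` maps the alias back to `www.origin.com`.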

  13. DotSlash for Dynamic Content • Remove the web server bottleneck • Dynamic Script Replication • LAMP configuration [Figure: numbered steps (1) to (8) connecting client, origin server, rescue server, and database; component labels: Apache, PHP, MySQL]

  14. Dynamic Script Replication • Rescue server • Map a redirected URI to a script file • Trigger the 404 handler if the script file is not found • Retrieve the script file • Handle file inclusions • Set query variables • Run the script by invoking native include • Origin server • If a request is from a rescue server and for dynamic content, return the script file
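The rescue-server steps listed on this slide (map the URI to a script file, fetch it on a miss, set query variables) can be sketched as below. The actual system is an Apache module and PHP extension; this Python version only mirrors the control flow. The names `local_script_path` and `ensure_script`, the cache layout, and the injected `fetch` callable are all assumptions made for illustration, and the final step of running the script is omitted.

```python
import os
from urllib.parse import urlsplit, parse_qs

def local_script_path(cache_root, origin, uri):
    """Map a redirected URI to a file under a per-origin cache directory."""
    base = os.path.normpath(os.path.join(cache_root, origin))
    full = os.path.normpath(os.path.join(base, urlsplit(uri).path.lstrip("/")))
    # Refuse paths that would escape the per-origin directory.
    if not full.startswith(base + os.sep):
        raise ValueError("URI escapes the cache directory")
    return full

def ensure_script(cache_root, origin, uri, fetch):
    """Return (script_path, query_vars), retrieving the script on a miss.

    `fetch(origin, path)` is a caller-supplied function returning the script
    bytes; it stands in for the request the rescue server sends to the origin.
    """
    target = local_script_path(cache_root, origin, uri)
    if not os.path.exists(target):  # the "script file not found" branch
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(fetch(origin, urlsplit(uri).path))
    # Query variables to hand to the script before it runs.
    return target, parse_qs(urlsplit(uri).query)
```

Only the first request for a given script pays the retrieval cost; later requests find the replicated file locally.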

  15. Handle File Inclusions • The problem • A replicated script may include files that are located at the origin server • Assume: included files are under DocumentRoot • Approaches • Renaming inclusion statements • Need to parse scripts • Customized error handler • Catch inclusion errors
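The first approach on this slide, renaming inclusion statements, can be illustrated with a rough sketch. The slide notes that this approach requires parsing scripts; the version below substitutes a regular expression for a real parser, so it only handles `include`/`require` statements whose argument is a plain string literal and leaves dynamic includes untouched. The function name and the replica directory are hypothetical.

```python
import re

# Matches include/require(_once) followed by a quoted literal path and ';'.
INCLUDE_RE = re.compile(
    r"""\b(include|require)(_once)?\s*\(?\s*(['"])([^'"]+)\3\s*\)?\s*;"""
)

def rename_inclusions(script, replica_root):
    """Rewrite literal include paths so they point under `replica_root`."""
    def repl(match):
        keyword = match.group(1) + (match.group(2) or "")
        quote, path = match.group(3), match.group(4)
        new_path = replica_root.rstrip("/") + "/" + path.lstrip("/")
        return f"{keyword}({quote}{new_path}{quote});"
    return INCLUDE_RE.sub(repl, script)

source = "<?php include('lib/db.php'); require_once \"header.php\"; include($f); ?>"
rewritten = rename_inclusions(source, "/cache/www.origin.com")
```

The second approach on the slide, a customized error handler that catches inclusion errors, avoids this parsing step entirely, at the cost of discovering missing files only when the script runs.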

  16. Implementation • Apache module, PHP extension • Dynamic DNS: dot-slash.net • Service discovery: enhanced SLP [Figure: architecture diagram with labels Apache, Mod_dots, SHM, Dotsd, other Dotsd, DSRP, HTTP, client, SLP, DNS, BIND, mSLP]

  17. Evaluation • Experimental Setup • Linux machines: Red Hat 9.0 • HC: 2 GHz CPU, 1 GB memory • LC: 1 GHz CPU, 512 MB memory • Apache: 2.0.48, DotSlash module • PHP: 4.3.6, DotSlash extension • MySQL: 4.0.18 • Benchmark • RUBBoS (Rice U.) bulletin board • 19 scripts: 1 KB to 7 KB • 439 MB database

  18. Increasing Max Request Rate: R • Configuration: Origin (HC), DB (HC), 9 × Rescue (LC) • No rescue: R=118 • CPU: Origin=100%, DB=45% • With rescue: R=245 • #rescue servers: 9 • CPU: Origin=55%, DB=100% • 245/118>2

  19. Effectiveness • Another configuration: Origin (LC), DB (HC), 10 × Rescue (LC) • No rescue: R=49 • With rescue: R=245 • #rescue servers: 10 • Comparison: 245/49=5 • Conclusion: remove web server bottleneck

  20. CPU Utilization Control

  21. Workload Migration

  22. Conclusions • DotSlash framework, prototype, evaluation • Fully automated rescue system, transparent to clients • Scalable • Get 10-fold improvement (static content) • Remove web server bottleneck (dynamic content) • Future work • Remove database server bottleneck • For further information • http://www.cs.columbia.edu/~zwb/project/dotslash • WCW’04
