The CoDeeN Content Distribution Network
Vivek S. Pai, Limin Wang, KyoungSoo Park, Ruoming Pang, Larry Peterson
Princeton University
August 12, 2003
Content Distribution Networks
• Replicates Web content broadly
• Redirects clients to “best” copy
  • Load, locality, proximity
• Offloads work from origin servers
• Multiplexes load spikes
• Reduces overprovisioning
• Ex: Akamai, Mirror Image, Speedera
CoDeeN Overview - IRIS/PlanetLab
What Does It Do?
• An Academic Content Distribution Network
  • Redirects/caches HTTP requests
  • Based on our OSDI 2002 paper on CDN performance
• An Open Proxy Network
  • Probably the largest in existence
Who Is The Target Audience?
• Now
  • Users wanting better performance
  • People seeking “anonymity”
• Next
  • Content providers seeking load sharing
• Later
  • General support for absorbing flash crowds
  • Avoid the “Slashdot Effect”
How Does It Work?
• Server surrogates (proxies) on most North American sites
  • Originally everywhere, but we cut back
• Clients specify proxy to use
  • Cache hits served locally
  • Cache misses forwarded to CoDeeN nodes
    • Maybe forwarded to origin servers
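The hit/miss path on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how a peer is chosen, so the sketch assumes a hash of the URL against each peer name (rendezvous-style hashing). The names `Node`, `pick_peer`, `handle_client`, `handle_peer`, and `fetch_origin` are hypothetical and not taken from the CoDeeN code.

```python
import hashlib


def pick_peer(url, peer_names):
    """Pick one peer for a URL by highest hash weight (an assumed policy)."""
    def weight(peer):
        digest = hashlib.sha1(f"{peer}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(peer_names, key=weight)


class Node:
    """Hypothetical proxy node with a local cache and a table of peers."""

    def __init__(self, name, peers, fetch_origin):
        self.name = name
        self.peers = peers              # dict: name -> Node (may include self)
        self.cache = {}                 # url -> response body
        self.fetch_origin = fetch_origin

    def handle_client(self, url):
        # Cache hits are served locally.
        if url in self.cache:
            return self.cache[url], "local-hit"
        # Cache misses are forwarded to the node chosen for this URL.
        peer = self.peers[pick_peer(url, sorted(self.peers))]
        body, where = peer.handle_peer(url)
        self.cache[url] = body
        return body, where

    def handle_peer(self, url):
        # The chosen node answers from its cache if it can ...
        if url in self.cache:
            return self.cache[url], "peer-hit"
        # ... and only otherwise contacts the origin server.
        body = self.fetch_origin(url)
        self.cache[url] = body
        return body, "origin"
```

With three such nodes sharing one peer table, the first request for a URL reaches the origin once; later requests through any node are answered from a cache.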
[Figure: Request Forwarding]
When Will It Be Ready?
• January – development started
  • Reliability & stability major concerns
• March – stable enough for daily use
• April – security problems begin
  • Shut down for one month
• June – Restarted “beta”
  • Expecting “production” soon
Decisions – Good & Bad
• Use commercial proxy with API [USITS 2003]
  • Good – mostly layer 7 concerns
  • Bad – limits deployment size (donated licenses)
• Deployment on PlanetLab
  • Good – otherwise impossible
  • “Bad” – vulnerable to other experiments
• Allow open access
  • Good – generates real traffic
  • Bad – some traffic just plain mean
Lots of Malicious Traffic
• Spammers
  • SMTP tunnels, POST forms, IRC channels
• Bandwidth hogs
  • Google crawls, steganographers, X-Pacific
• Hackers & Spreaders
  • Yahoo dictionary attacks, IIS vuln tests
• Content thieves
  • E-journals/databases, local content
Defenses:
• Restrict ports & HTTP methods
• Multi-scale request & bandwidth accounting
• Signature database & robot test
• Determine location & privilege
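Two of the defenses named on this slide, restricting ports and HTTP methods and multi-scale request accounting, could look roughly like the sketch below. Everything concrete here is an assumption made for illustration: the allowed methods and ports, the window sizes, the per-window caps, and the names `MultiScaleLimiter` and `admit` do not come from the slides. Bandwidth accounting is not shown; it could follow the same pattern with byte counts in place of request counts.

```python
from collections import defaultdict, deque

# Hypothetical policy values; the slides do not list the real ones.
ALLOWED_METHODS = {"GET", "HEAD"}
ALLOWED_PORTS = {80}


class MultiScaleLimiter:
    """Per-client request caps enforced over several time windows at once."""

    def __init__(self, limits):
        # limits maps window length in seconds -> max requests in that window.
        self.limits = dict(limits)
        self.horizon = max(self.limits)
        self.history = defaultdict(deque)   # client -> accepted timestamps

    def allow(self, client, now):
        q = self.history[client]
        # Drop timestamps older than the longest window.
        while q and q[0] <= now - self.horizon:
            q.popleft()
        # Reject if any window, short or long, is already at its cap.
        for window, cap in self.limits.items():
            recent = sum(1 for t in q if t > now - window)
            if recent >= cap:
                return False
        q.append(now)   # only accepted requests are recorded
        return True


def admit(limiter, client, method, port, now):
    """Apply the method/port restriction first, then the rate accounting."""
    if method.upper() not in ALLOWED_METHODS or port not in ALLOWED_PORTS:
        return False
    return limiter.allow(client, now)
```

Checking several windows together is what lets a short window catch bursts while a long window catches clients that stay just under the burst limit all day.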
[Figure: Protecting Privilege]
[Figure: Attempted SMTP Tunnels/Day]
By The Numbers…
• Restarted in late May
• In continuous operation
• Stats from first 8 weeks
• Over 59,000 unique IPs as clients
• Over 24 million requests serviced
• Valid rates up to 15K reqs/hour
• Roughly 1 million reqs/day aggregate
More Production Info
• About 2000 lines of code
  • About ¼ is actual decision logic
• Uptimes limited by upgrades
  • Generally 1-2 times/week
  • Downtimes of 20 seconds/node
• Currently on ~40 nodes
[Figure: Daily Requests (Serviced)]
[Figure: Welcome]
[Figure: Avoiding, sorted by # avoiding]
[Figure: Load, sorted by load average]
[Figure: Total, sorted by total req rate]
[Figure: Users, sorted by # users]
The Troubles We’ve Caused
• Routinely trigger open proxy alerts
  • Educating sysadmins, others
• Resource checks generate noise
  • Got onto planetlab-support
• Really good honeypots
  • 6000 SMTP flows/minute at CMU
  • Spammers do ~1M HTTP ops/day
What We’ve Learned
• Parallel ssh is a must
  • General commands/queries
  • Basis for parallel scp
  • Used to detect out-of-date files
• Monitoring is a must
  • Too hard to see anomalies in 40+ nodes
  • Almost looks like a demo
• Be careful accepting outside requests
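As a rough idea of what a parallel ssh helper for "general commands/queries" might look like, here is a minimal Python sketch. It is not the tool the authors used, which the slides do not describe. It assumes key-based ssh login is already set up, and the names `ssh_runner`, `run_on_nodes`, and `out_of_date` are hypothetical. The `runner` parameter exists so the fan-out logic can be exercised without real hosts.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command, timeout=30):
    """Run one command on one host over ssh; returns (exit_code, stdout)."""
    try:
        proc = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
             host, command],
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.returncode, proc.stdout.strip()
    except subprocess.TimeoutExpired:
        return None, ""   # treat a hung node as a failure


def run_on_nodes(hosts, command, runner=ssh_runner, max_workers=16):
    """Run the same command on every host concurrently."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda h: runner(h, command), hosts))
    return dict(zip(hosts, results))


def out_of_date(results, expected):
    """Hosts whose command failed or whose output differs from `expected`."""
    return sorted(
        host for host, (code, out) in results.items()
        if code != 0 or out != expected
    )
```

One way to use it for the "detect out-of-date files" case would be to run a checksum command on every node and pass the checksum of the current release as `expected`; nodes that are unreachable or report a different value are returned for follow-up.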
What We Still Need
• Better layer 4 tools
  • Hard to tell why things die
  • Building complete heartbeats isn’t fun
• Better isolation on most resources
  • CPU/OS: Java, VServers, ???
  • Others: FD exhaustion, disk space
What We Wouldn’t Mind…
• Customizable DNS mapping
  • Map project.planet-lab.org to some node
• Projects could provide feedback
  • Node availability, utility, etc
• Most IP geolocation seems locked up
More Info
http://codeen.cs.princeton.edu