Cooperative Caching and Kill-Bots Presented by: Michael Earnhart
On the Scale and Performance of Cooperative Web Proxy Caching By: Alec Wolman, Geoffrey Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, Henry Levy
Intuitive Benefits of Cooperative Caching • Larger population • Better coverage of web objects • Higher request rate • Less bandwidth used for requests to third-party websites • More responsible use of the Internet • Distributing the web load across several proxies (the BitTorrent model)
Terms • Cacheable: objects that can be cached with current proxy technology • Ideal caching: caching of all shared objects • Popular: the most frequently requested objects, which account for 40% of all requests
Network Traces • Physical connection in the network: UW - connected to the outgoing switches; MS - ? • Clients: UW - 22,984; MS - 60,233 • Destination servers: UW - 244,211; MS - 360,586 • Duration: 168 ± 18 hours
Simple Cooperative Caching Algorithm (decision flowchart): on a request, if the object is cached locally and current, return it; otherwise, if it is cached and current at a cooperating proxy, return it from the co-op cache; otherwise retrieve it from the source.
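A minimal Python sketch of that lookup flow (the class and method names are hypothetical, not from the paper):

```python
# Sketch of the simple cooperative caching lookup: local cache first,
# then the cooperating proxies, then the origin server.
class CoopProxy:
    def __init__(self, local_cache, coop_proxies, origin):
        self.local_cache = local_cache      # dict: url -> (object, is_current)
        self.coop_proxies = coop_proxies    # list of other proxies' caches
        self.origin = origin                # callable: url -> object

    def get(self, url):
        # 1. Local hit: return the object if it is cached and still current.
        entry = self.local_cache.get(url)
        if entry and entry[1]:
            return entry[0]
        # 2. Co-op hit: ask the cooperating proxies for a current copy.
        for cache in self.coop_proxies:
            entry = cache.get(url)
            if entry and entry[1]:
                self.local_cache[url] = entry   # optionally cache locally too
                return entry[0]
        # 3. Miss everywhere: retrieve from the origin server and cache it.
        obj = self.origin(url)
        self.local_cache[url] = (obj, True)
        return obj
```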
Hit Rate • Large benefits for small population • Similar shape regardless of caching • “Knee” at 2500 clients
Latency • ~0 slope • Mean is significantly higher than median • Large delays dominate (as with DNS lookups) • Can Co-op proxy help … No
Bandwidth • Caching helps preserve bandwidth • Independent of population
When is Co-op Caching Useful? • Cooperative caching across several small organizations: ideal +17%, cacheable +9% hit rate • The 978-client organization is clearly losing; otherwise there is no clear winner
Locality • Randomly fill 15 organizations of equal size • Hit rate is 4% lower than with the real organizations
Large Company Co-op Caching • A preloaded MS cache was used as a second-level proxy cache for UW • Popularity is universal • UW cacheable hit rate increased 4.2% • MS cacheable hit rate increased 2.1%
Trace Data Conclusions • Cooperative caching is essentially only useful for organizations of fewer than 2,500 clients • Only a 2.7% hit-rate improvement (MS) from cooperative caching beyond 2,500 clients • Specialized grouping for cooperative caching is also ineffective
Analytical Model of Web Accesses • Long term analysis of web caching • Infinite storage per proxy • In theory this proxy setup could cache 100% of the Web objects available • Optimal caching occurs when: Creation Rate + Change Rate < Request Rate
The Model • N clients that act independently • Total number of objects is n • Zipf-like distribution, where pi denotes the popularity of object i • λ is the average client request rate • Time between changes to an object is exponential with parameter μ
The Model (Cont.) • pc is the probability that an object is cacheable • Average object size is E(S) • Average last byte latency is E(L)
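A rough numerical sketch of how these parameters interact (an approximation for illustration, not the paper's exact closed-form expressions; the Zipf exponent, rates, and pc value below are assumptions):

```python
import numpy as np

def zipf_popularity(n, alpha=0.8):
    """Zipf-like popularity distribution over n objects (alpha is assumed)."""
    ranks = np.arange(1, n + 1)
    p = 1.0 / ranks**alpha
    return p / p.sum()

def hit_rate(N, n, lam, mu, pc, alpha=0.8):
    """
    Rough steady-state hit-rate approximation: a request for object i hits
    in the shared cache if someone has requested the object since it last
    changed, i.e. with probability N*lam*p_i / (N*lam*p_i + mu).
    Only the cacheable fraction pc of requests can ever hit.
    """
    p = zipf_popularity(n, alpha)
    fresh = (N * lam * p) / (N * lam * p + mu)
    return pc * np.dot(p, fresh)

# Example: the hit rate grows with population N but saturates near pc.
for N in (100, 2500, 250_000):
    print(N, round(hit_rate(N, n=1_000_000, lam=0.01, mu=1 / 86400, pc=0.6), 3))
```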
Simulation Results • Initial region (< 2,500 clients): object changes dominate the request rate • Middle region (< 250,000 clients): unpopular documents begin to hit in the cache • Final region (> 250,000 clients): the request rate dominates even fast-changing objects
Latency • Hit rate determines latency • Assume a 10 ms response time for a hit • Mean latency asymptotically approaches (1 − pc)E(L)
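A one-line sketch of where that asymptote comes from, assuming a fixed per-hit latency t_hit (the 10 ms above) and an overall hit rate C that approaches pc for large populations:

```latex
E[\text{latency}] \;=\; C\, t_{\text{hit}} \;+\; (1 - C)\, E(L)
\;\;\longrightarrow\;\; (1 - p_c)\, E(L)
\qquad \text{as } C \to p_c \text{ and } t_{\text{hit}} \ll E(L).
```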
Change Rate • Unpopular objects account for 60% of requests and 99.7% of all objects • Change rate has a large impact on unpopular objects • A change interval of ≥ 1 day yields nearly perfect caching of popular objects • Notes: 250,000 clients; change interval determined by the HTTP header; when changes dominate the request rate, the hit rate drops to its minimum
Positive Conclusions • Hit rate depends on: population size (cooperation can increase the effective population), the rate of change and creation of new objects, and the request rate (a larger population increases the request rate)
Other Conclusions: Cooperative Caching vs. Simple Caching • Bandwidth: no significant benefits • Object latency: no significant benefits • Specialized groups: no significant benefits • Large populations (> 250k): no significant benefits
Discussion • Is cooperative caching useful now? • Will it become more or less useful due to future Web traffic trends? • Rate of object change • Request rate • Number of clients N • Size of the web (in terms of objects) n
Botz-4-Sale: Surviving Organized DDoS Attacks That Mimic Flash Crowds By: Srikanth Kandula, Dina Katabi, Matthias Jacob, Arthur Berger
Botz Problem • Worms can spread to 30,000 clients per day • Botnets for hire have become a reality • HTTP servers need to be able to handle highly sophisticated attacks with 10,000 or more coordinated attackers
What is Kill-Bots? • A kernel extension to a web server • Provides: load-activated authentication, IP-address admission control, and load balancing between Kill-Bots authentication and HTTP service
What is a CAPTCHA? • A graphical puzzle used to distinguish a human user from a computer or automated client
Stage 1 • Enters the SUSPECTED_ATTACK state when the HTTP server load exceeds a threshold (K1) • In SUSPECTED_ATTACK, CAPTCHA puzzles are served to all admitted incoming connections • Puzzle serving is done at minimal cost: no dedicated sockets, no worker processes • One test per session, for usability • Cryptographic support to validate the client • Per-cookie fairness: limit set to 8 HTTP requests per cookie
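A minimal user-space sketch of the cookie mechanism (Kill-Bots itself implements this in the kernel; the HMAC construction, lifetime, and function names here are assumptions for illustration):

```python
import hashlib, hmac, os, time
from collections import defaultdict

SECRET = os.urandom(32)            # server-side key (assumption: rotated periodically)
COOKIE_LIFETIME = 30 * 60          # seconds a cookie stays valid (assumed value)
MAX_REQS_PER_COOKIE = 8            # per-cookie request limit from the slide

requests_per_cookie = defaultdict(int)   # small per-cookie counter for fairness

def issue_cookie(client_ip: str) -> str:
    """Issue a cookie once the client answers a CAPTCHA correctly.
    The MAC binds the client IP and a timestamp, so cookies forged or
    replayed from other hosts fail verification."""
    ts = str(int(time.time()))
    mac = hmac.new(SECRET, f"{client_ip}|{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{ts}|{mac}"

def admit_request(cookie: str, client_ip: str) -> bool:
    """Serve a request only if the cookie verifies, is fresh, and has not
    exceeded its per-cookie request budget."""
    try:
        ts, mac = cookie.split("|")
    except ValueError:
        return False
    expected = hmac.new(SECRET, f"{client_ip}|{ts}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return False
    if time.time() - int(ts) >= COOKIE_LIFETIME:
        return False
    requests_per_cookie[cookie] += 1
    return requests_per_cookie[cookie] <= MAX_REQS_PER_COOKIE
```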
Stage 2 • Uses a Bloom filter to count, per IP address, the number of failed authentications • Once all of an IP address's counters in the Bloom filter reach a threshold, its packets are dropped • Dropping attackers decreases server load, and the server ceases authentication when Load ≤ K2 < K1
What is a Bloom Filter? • Hash a value to a certain number of positions • Set (or increment) those positions in the Bloom filter vector • Collision-resistant hashing is required
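A minimal counting-Bloom-filter sketch of the Stage 2 idea above: each failed authentication increments an IP's counters, and the IP is dropped only once all of its counters pass a threshold (the sizes, number of hashes, and threshold are illustrative assumptions):

```python
import hashlib

class CountingBloomFilter:
    """Minimal counting Bloom filter keyed by IP address (sizes are illustrative)."""
    def __init__(self, size=1 << 20, num_hashes=4, threshold=32):
        self.counters = [0] * size
        self.size = size
        self.num_hashes = num_hashes
        self.threshold = threshold

    def _positions(self, ip: str):
        # Derive num_hashes positions from one digest (a common trick).
        digest = hashlib.sha256(ip.encode()).digest()
        for i in range(self.num_hashes):
            chunk = int.from_bytes(digest[4 * i: 4 * i + 4], "big")
            yield chunk % self.size

    def record_failure(self, ip: str):
        """Called when an IP fails (or ignores) the CAPTCHA."""
        for pos in self._positions(ip):
            self.counters[pos] += 1

    def should_drop(self, ip: str) -> bool:
        """Drop the IP only if ALL of its counters reach the threshold, which
        keeps false positives from hash collisions from blocking legitimate users."""
        return all(self.counters[pos] >= self.threshold for pos in self._positions(ip))
```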
Admission Control • Attempt authentication with probability α • Clearly, admission control is required • Optimal admission control is highly desired • Adaptation is required
Adaptive Admission Control • A balancing act between serving HTTP requests and serving authentication puzzles • Aim for point B: difficult to identify its location on the BC segment (pi = 0) • Settle for E: a fraction of idle time
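A sketch of one way such an adaptation loop could look (not Kill-Bots' actual update rule; the target idle fraction and step size are assumptions): admit more clients while the server still has idle cycles, and back off once idle time disappears.

```python
def adapt_admission_probability(alpha, idle_fraction,
                                target_idle=0.05, step=0.1,
                                alpha_min=0.01, alpha_max=1.0):
    """One adaptation step for the admission probability alpha: keep a small
    but nonzero fraction of idle time, rather than chasing the exact optimum."""
    if idle_fraction > target_idle:
        alpha = alpha * (1 + step)      # spare capacity: admit more clients
    else:
        alpha = alpha * (1 - step)      # saturated: shed load by admitting fewer
    return max(alpha_min, min(alpha_max, alpha))

# Example: each measurement period, feed in the observed idle fraction.
alpha = 0.5
for idle in (0.20, 0.10, 0.02, 0.0):
    alpha = adapt_admission_probability(alpha, idle)
    print(round(alpha, 3))
```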
Attacks • Social Engineering • Attack: use people to circumvent authentication • Solution: Kill-Bots puzzles expire in 4 minutes • Polluting the Bloom Filter • Attack: spoof IP addresses to fill up the filter • Solution: SYN cookies prevent IP spoofing • Copy Attacks • Attack: solve one puzzle, then copy the cookie to zombies • Solution: 8 simultaneous connections per cookie
Attacks Cont. • Replay Attacks • Attack: reuse authentication information • Solution: cookies are time-stamped and hashed • Solution: cookies are based on the puzzle answer • Database Attack • Attack: learn all the possible puzzles • Solution: use a rotating set of puzzles • Breaking CAPTCHA • Attack: decipher the puzzles automatically • Solution: create a different type of puzzle
Attack Strategies • a = 4,000 req/s • N = 25,000 clients • Quick exhaust: fresh IP - 2.5 s • Slow exhaust: fresh IP - 5.0 s
Experimental Environment • Web server: 2.0 GHz P4 with 1 GB RAM; hosted two websites with mathopd (a Debian mirror and the CSAIL web server); 100 Mbps Ethernet connection • Attack: 100 PlanetLab nodes, 256 attackers per node (25,600 total attackers)
Metrics • Goodput of legitimate clients • # of bytes delivered to all legitimate clients • Response times of legitimate clients • Elapsed time to complete a request (<60s) • Total number of legitimate requests dropped
PlanetLab Results • CyberSlam attack • a=4000 req/s • Attack lasts 1800s • 60% of legitimate users solve puzzle correctly
PlanetLab Results Cont. • Flash crowd (non-attack): f = 2,000 req/s (normal load = 300 req/s) • Kill-Bots improves performance • The base server wastes throughput on retries and incomplete transfers
Discussion • Willingness to solve puzzles? 60% • A research group's web page is NOT a standard audience • Solving puzzles is not possible for text browsers or the visually impaired • NAT/Proxy solution: requires zombies to be x−1 times as active as legitimate users • Arbitrary parameter values • Flash crowds: the base server has no connection limiting, which is not realistic