Streaming Algorithms for Robust, Real-Time Detection of DDoS Attacks
S. Ganguly (Indian Inst. of Tech., India), M. Garofalakis (Yahoo! Research, USA), R. Rastogi (Bell Labs, India), K. Sabnani (Bell Labs, USA)
ICDCS'07: 27th International Conference on Distributed Computing Systems
Introduction • Distributed Denial-of-Service (DDoS): A DDoS attack directs hundreds or even thousands of “zombie” hosts against a single victim
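Introduction (cont.)
• TCP-SYN flooding attack
[Figure: TCP three-way handshake under attack. 1. The attacker sends a SYN carrying a fake source IP. 2. The victim's SYN-Ack goes to the fake IP. 3. The final Ack never arrives (×), so half-open connections pile up until the victim runs out of memory and crashes.]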
Problem Formulation
• A stream of flow updates: (source, dest, ±1)
  • Handshake as updates: 1. SYN → +1; 2. SYN-Ack; 3. Ack → -1
• Bad guy: a source u with Occur(u, v, +1) > Occur(u, v, -1), i.e. more SYNs than Acks toward v
• Distinct source frequency f_v = # of bad guys to v
• Continuously track the top-k destinations by distinct source frequency over the stream of flow updates
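As a point of reference, the tracked quantity can be computed exactly if memory proportional to the number of distinct (source, dest) pairs is available, which is the cost the streaming sketch is meant to avoid. The following is a minimal sketch of that baseline, not the authors' method; the function names and the sample updates are illustrative assumptions.

```python
from collections import defaultdict

def distinct_source_frequencies(updates):
    """Exact f_v: number of sources u with more +1 than -1 updates toward v."""
    net = defaultdict(int)  # (u, v) -> Occur(u, v, +1) - Occur(u, v, -1)
    for u, v, delta in updates:
        net[(u, v)] += delta
    freq = defaultdict(int)  # v -> number of "bad guy" sources
    for (u, v), count in net.items():
        if count > 0:
            freq[v] += 1
    return dict(freq)

def top_k(freq, k):
    """Destinations with the k largest f_v (ties broken by name)."""
    return sorted(freq, key=lambda v: (-freq[v], v))[:k]

# Hypothetical updates: sources a and b never acknowledge; c completes its handshake.
updates = [
    ("a", "v", +1),
    ("b", "v", +1),
    ("c", "v", +1), ("c", "v", -1),
    ("a", "w", +1),
]
freq = distinct_source_frequencies(updates)
```

Here `freq` maps v to 2 and w to 1, so `top_k(freq, 1)` returns `["v"]`.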
Main idea of the solution: Sampling
• Directly sample from the stream?
  • For estimating the counts of an item: OK
  • For counting the number of distinct items: NO
• Instead, construct a synopsis of the stream and then sample from the synopsis
• Example: the stream a, a, a, a, a, a, a, a, a, a, b has the synopsis (a, 10), (b, 1)
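The difference between the two sampling strategies can be made concrete on the slide's example stream. This small illustration (variable names are assumptions) computes the probability that a single uniform draw returns each item, first from the raw stream and then from the synopsis of distinct items.

```python
from collections import Counter
from fractions import Fraction

stream = ["a"] * 10 + ["b"]

# Drawing one position uniformly from the raw stream: an item is picked in
# proportion to how often it occurs, so frequent items dominate.
counts = Counter(stream)
p_stream = {x: Fraction(c, len(stream)) for x, c in counts.items()}

# Drawing one entry uniformly from the synopsis {(a, 10), (b, 1)}: every
# distinct item is equally likely, regardless of its multiplicity.
p_distinct = {x: Fraction(1, len(counts)) for x in counts}
```

From the raw stream, b is drawn with probability 1/11; from the synopsis, with probability 1/2. A sample that is uniform over distinct items is what distinct counting needs.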
Distinct-Count Sketch: structure
• Domain of IP addresses: [m] = {0, …, m-1}
• (source, dest) pairs: [m^2]
• First-level hash function h: [m^2] → {0, …, Θ(log m)} with Pr[h(x) = l] = 1/2^(l+1)
  • 1/2 of the distinct values in [m^2] map to bucket 0
  • 1/4 of the distinct values in [m^2] map to bucket 1
  • 1/8 of the distinct values in [m^2] map to bucket 2
• Second-level hash functions g_i: [m^2] → [s], uniform
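One common way to obtain a hash with Pr[h(x) = l] = 1/2^(l+1) is to take the position of the lowest set bit of a uniform hash value. The sketch below assumes that construction and uses BLAKE2b from Python's standard library as a stand-in for the hash family; the slides do not say which family the authors use, and the helper names are hypothetical.

```python
import hashlib

def _hash64(u, v, salt):
    """Deterministic 64-bit hash of the pair (u, v); salt selects the function."""
    data = f"{salt}:{u}:{v}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def first_level(u, v, num_levels):
    """Geometric bucket index: position of the lowest set bit of the hash.

    Bit 0 is set with probability 1/2, bit 0 clear and bit 1 set with
    probability 1/4, and so on; the index is capped at num_levels - 1.
    """
    x = _hash64(u, v, "h")
    level = 0
    while level < num_levels - 1 and (x >> level) & 1 == 0:
        level += 1
    return level

def second_level(u, v, j, s):
    """g_j(u, v): approximately uniform bucket in {0, ..., s-1} for table j."""
    return _hash64(u, v, f"g{j}") % s

# Empirical check: roughly half of the pairs should land in bucket 0.
levels = [first_level(u, 7, 32) for u in range(20000)]
share0 = levels.count(0) / len(levels)
share1 = levels.count(1) / len(levels)
```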
Distinct-Count Sketch: structure (cont.)
[Figure: sketch layout. h(u, v) = b selects one of the Θ(log m) first-level buckets; each first-level bucket holds r hash tables of s second-level buckets, indexed by g_1(u, v), …, g_r(u, v); each second-level bucket stores a count signature made of one total element count and 2 log m bit location counts.]
• Total element count: the total number of tuples hashed into the bucket
• Bit location counts: for j = 1, …, 2 log m, the total number of tuples hashed into the bucket with BIT_j(u, v) = 1, where BIT_j is the j-th bit of the binary representation of (u, v)
• χ[i, j, k, l]: the i-th first-level bucket, the j-th hash table, the k-th second-level bucket, the l-th count-signature location
Distinct-Count Sketch: maintenance
• For each incoming update/tuple (u, v, ±1), update its corresponding count signatures:
  • For all j = 1 to r
    • χ[h(u, v), j, g_j(u, v), 0] = χ[h(u, v), j, g_j(u, v), 0] ± 1
    • For each l = 1 to 2 log m
      • If BIT_l(u, v) = 1
        • χ[h(u, v), j, g_j(u, v), l] = χ[h(u, v), j, g_j(u, v), l] ± 1
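The maintenance step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: addresses are small integers of `addr_bits` bits, the pair (u, v) is encoded by concatenating the two addresses, the hash functions are built from BLAKE2b, and the class and parameter names are hypothetical.

```python
import hashlib

class DistinctCountSketch:
    def __init__(self, addr_bits=8, r=3, s=16):
        self.addr_bits = addr_bits
        self.pair_bits = 2 * addr_bits    # 2 log m bits per (u, v) pair
        self.levels = self.pair_bits + 1  # Theta(log m) first-level buckets
        self.r, self.s = r, s
        # X[b][j][k] is a count signature: index 0 is the total element
        # count, indices 1..2 log m are the bit location counts.
        self.X = [
            [[[0] * (self.pair_bits + 1) for _ in range(s)] for _ in range(r)]
            for _ in range(self.levels)
        ]

    def _hash(self, x, salt):
        data = f"{salt}:{x}".encode()
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def h(self, x):
        """First-level bucket with Pr[h(x) = l] close to 1/2^(l+1)."""
        hv, b = self._hash(x, "h"), 0
        while b < self.levels - 1 and (hv >> b) & 1 == 0:
            b += 1
        return b

    def g(self, j, x):
        """Second-level bucket of x in hash table j."""
        return self._hash(x, f"g{j}") % self.s

    def update(self, u, v, delta):
        """Apply a flow update (u, v, delta) with delta in {+1, -1}."""
        x = (u << self.addr_bits) | v     # binary representation of (u, v)
        b = self.h(x)
        for j in range(self.r):
            sig = self.X[b][j][self.g(j, x)]
            sig[0] += delta               # total element count
            for l in range(1, self.pair_bits + 1):
                if (x >> (l - 1)) & 1:    # BIT_l(u, v) = 1
                    sig[l] += delta

sk = DistinctCountSketch()
sk.update(5, 9, +1)   # SYN from source 5 to destination 9
```

Because every counter is updated by the same ±1, a later `sk.update(5, 9, -1)` exactly cancels the insertion, which is how the sketch handles completed handshakes.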
Top-k Frequency Estimation
[Figure: example count signatures. A bucket whose nonzero bit location counts all equal its total element count holds a single pair, here (u, v) = 1010; a bucket with mixed counts indicates a collision.]
• Generate a distinct sample (dSample) from the distinct-count sketch
• Scan the first-level buckets b in decreasing order, continuing while |dSample| < (1+ε)s/16 and b ≥ 0
  • Check the count signature of each second-level bucket χ[b, j, k, ·]
  • If, for all l = 1 to 2 log m, either χ[b, j, k, l] = χ[b, j, k, 0] or χ[b, j, k, l] = 0, the bucket holds a single distinct pair: bit l of (u, v) is 1 in the first case and 0 in the second
  • Add the recovered (u, v) to dSample
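The count-signature check can be illustrated on bare signatures. The sketch below assumes a signature is a list whose entry 0 is the total element count and whose entry l is the l-th bit location count, that net counts are never negative, and that the scan runs from the highest first-level bucket downward; the function names and the tiny example sketch are hypothetical.

```python
def decode_singleton(sig):
    """Return the encoded pair if the bucket holds one distinct pair, else None."""
    total = sig[0]
    if total <= 0:
        return None                  # empty bucket
    x = 0
    for l in range(1, len(sig)):
        if sig[l] == total:
            x |= 1 << (l - 1)        # every element has bit l set
        elif sig[l] != 0:
            return None              # mixed counts: distinct pairs collided
    return x

def build_dsample(X, s, eps):
    """Scan first-level buckets downward until the sample is large enough."""
    target = (1 + eps) * s / 16
    dsample = set()
    b = len(X) - 1
    while b >= 0 and len(dsample) < target:
        for table in X[b]:           # the r hash tables of bucket b
            for sig in table:        # the s second-level buckets
                x = decode_singleton(sig)
                if x is not None:
                    dsample.add(x)
        b -= 1
    return dsample

# Two first-level buckets, one hash table each, two second-level buckets each.
X = [
    [[[8, 0, 8, 0, 8], [3, 2, 1, 0, 3]]],   # level 0: singleton 0b1010, collision
    [[[1, 1, 0, 0, 0], [0, 0, 0, 0, 0]]],   # level 1: singleton 0b0001, empty
]
sample = build_dsample(X, s=16, eps=0.5)
```

With a target of 1.5 pairs, the scan visits both levels and `sample` is `{1, 10}`; the colliding and empty buckets contribute nothing.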
Top-k Frequency Estimation (cont.)
• After obtaining the dSample, count the sampled pairs per destination
  • dSample: (a, v), (u, v), (m, v), (a, w), (b, w), (c, w), (d, w), …
  • f_w in dSample = 4, f_v in dSample = 3, …
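The per-destination count over the sample is a simple group-by. The snippet below uses only the seven pairs listed on the slide (the slide's trailing "…" is left out) and a hypothetical helper name.

```python
from collections import Counter

d_sample = [
    ("a", "v"), ("u", "v"), ("m", "v"),
    ("a", "w"), ("b", "w"), ("c", "w"), ("d", "w"),
]

def top_k_destinations(d_sample, k):
    """Count distinct sampled sources per destination and keep the k largest."""
    freq = Counter(dest for _, dest in set(d_sample))
    return freq.most_common(k)

ranking = top_k_destinations(d_sample, 2)
```

For these pairs, `ranking` is `[("w", 4), ("v", 3)]`, matching the counts on the slide.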
Error guarantee
• Input: flow-update stream, k, error ε, and confidence δ
• Output: a continuously tracked list L of k destination IP addresses such that, with probability at least 1-δ:
  1. Any destination address v in L has frequency f_v ≥ (1-ε) f_{v_k}, where f_{v_k} is the k-th largest distinct source frequency
  2. For any destination address v in L,
• n = the upper bound on the number of update tuples in the stream
Conclusion
• The approach seems to combine the FM sketch with the Count-Min sketch to reduce collisions, and then uses bit operations on the count signatures to identify the destination addresses