Measurement Algorithms: Bloom Filters and Beyond. George Varghese, University of California, San Diego.
Network Evolution? • 1. Basic: stateless, transparent. • Tools: protocol design (e.g., soft-state) • 2. Active: customizable, re-configurable • Tools: Code safety (e.g., sandboxing) • 3. Introspective: pattern detection/response • Tools: Streaming algorithms, statistical inference (e.g., Bloom filters, sampling) • Hawkeye enables introspection for measurement & security
What is Introspection? Detecting patterns in data traffic, either in real-time or based on packet logs. Examples: • Measurement Introspection: Identify resource usage patterns for better resource management • Security Introspection: Identify attack patterns to mitigate or prevent attacks. • Fault Introspection: Identify fault or anomaly patterns to allow automated fault repair. Motivated by market pull and technology push
Market Pull [Figure: network connecting Customer Sites 1, 2, and 3; annotation "reroute or add B/W".] • Better ROI: Optimize network resources (BGP policy, OSPF weights, light up fibers, add bandwidth) based on resource usage patterns. • Better security: Allowing an organization to stay open for business during mass or targeted attacks is a major differentiator. • Better Fault Detection: Many performance anomalies can be detected by better measurement primitives (e.g., Goldman-Sachs)
Technology Push: Streaming Algorithms and Hardware Gates • Algorithms: Recent major thrust in streaming algorithms in database, web analysis, theory, networks • Hardware: Memory accesses remain expensive (< 100) and SRAM not scaling as fast as number of connections (< 32 Mbits), but gates are plentiful. • Mapping: Many randomized streaming algorithms (e.g., Bloom Filters, Min-wise hashing) developed to find patterns in disk logs map well to network ASICs. • Opportunity: Invent or adapt streaming algorithms for networking patterns.
Concerns about Network Introspection • Speed: Can hardware run fast enough? • Recall IP lookups in 1990’s, surprisingly complex things (branch predictors, TCP Offload) being done routinely today. • Most of the algorithms described below are being implemented at 24 Gbps in Hawkeye • Inflexible: Hardware not easy to change. • Design hardware to identify useful “primitive” patterns that can be combined. (Exactly what Hawkeye does) • Network Processors can offer flexibility & speed. • End-to-end argument: Not simple, stateless core. • Not required for correctness of basic forwarding, but only as an optimization or value-add.
Introspection as Pattern Detection [Figure: a ROUTER observing the packet sequence S1 S2 S2 S5 S2 S1.] • Within-Packet Patterns: Prefix matches, classification, signature detection (e.g., Code Red payload) • Across-Packet Patterns: Scheduling, timing, membership checks, heavy-hitters, large flows, partial completion, counting flows
Pattern Detection Algorithm Requirements • Low memory: On-chip SRAM is limited to around 10-32 Mbits. This is not constant, but it is not scaling with the number of concurrent conversations. May need to replicate. • Small processing: For wire-speed at 40 Gbps with 40-byte packets, there are 8 nsec per packet. With 1 nsec SRAM, that allows 8 memory accesses. A factor of 30 in parallelism buys 240 accesses.
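The per-packet budget quoted on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name and its parameters are illustrative, not from the talk.

```python
def accesses_per_packet(link_gbps, packet_bytes, sram_ns, parallelism=1):
    """Memory accesses that fit into one packet time at wire speed."""
    packet_time_ns = packet_bytes * 8 / link_gbps  # bits / (Gbit/s) gives nanoseconds
    return packet_time_ns / sram_ns * parallelism

print(accesses_per_packet(40, 40, 1))                  # 8.0
print(accesses_per_packet(40, 40, 1, parallelism=30))  # 240.0
```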
Talk Outline • Part 1: Motivation • Part 2: Basic Patterns and Algorithms (membership checks, heavy-hitters, many flows, partial completion) • Part 3: Combining patterns to solve useful application problems • Part 4: Conclusions.
Pattern 1: Membership Check Membership Check: In a measurement interval (e.g., 10 minutes), detect the flows (e.g., sources) on a link that belong to a pre-specified set (e.g., a black list). [Figure: packet sequence S2 S6 S2 S5 S1 S8.] Set contains only S2, S5. B. Bloom, Comm. ACM, July 1970
Membership Check via Bloom Filter [Figure: Field Extraction feeds Hash 1, Hash 2, and Hash 3, which index bitmap Stages 1, 2, and 3. Set members have their bits set; each stage tests "Equal to 1?"; ALERT! if all bits are set.]
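A minimal software sketch of the staged membership check in the figure, assuming one bitmap per stage and a separate hash per stage. The class name, the use of SHA-256 as the hash, and the one-byte-per-bit storage are illustrative choices, not the hardware design from the talk.

```python
import hashlib

class StagedBloomFilter:
    """One bitmap per stage, each indexed by its own hash function."""

    def __init__(self, num_stages=3, bits_per_stage=10_000):
        self.bits_per_stage = bits_per_stage
        # One byte per bit keeps the sketch simple; hardware would pack bits.
        self.stages = [bytearray(bits_per_stage) for _ in range(num_stages)]

    def _index(self, stage, key):
        # Salting the key with the stage number gives a different hash per stage.
        digest = hashlib.sha256(f"{stage}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.bits_per_stage

    def add(self, key):
        for stage, bitmap in enumerate(self.stages):
            bitmap[self._index(stage, key)] = 1

    def __contains__(self, key):
        # Member (or false positive) only if the bit is set in every stage.
        return all(bitmap[self._index(stage, key)]
                   for stage, bitmap in enumerate(self.stages))

blacklist = StagedBloomFilter()
for src in ("S2", "S5"):
    blacklist.add(src)

for src in ["S2", "S6", "S2", "S5", "S1", "S8"]:
    if src in blacklist:
        print("ALERT", src)
```

Sources that were added are always reported; a source outside the set is reported only if it collides with set bits in every stage.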
Trivial Bloom Filter Analysis Assume set of size 1000. Bound probability that a flow F not in set gets through 4 stages of size 10,000 each. • Why trouble?: F can pass a stage if it hashes to a bit set by some real member of the set. • Single stage probability: At most 1000 of the 10,000 bits can be set. Thus the probability of F passing a stage is at most 1000/10,000 = 0.1. • Multistage probability: To be branded, F must be unlucky in all 4 stages, with a probability of no more than 0.1^4 = 0.0001, which is very small. Can play with numbers.
Accurate Bloom Filter Analysis Assume set of size 1000. Bound probability that a flow F not in set gets through 4 stages of size 10,000 each. Previous analysis ignores bit collisions. • Single stage probability: Probability of F passing a stage is s = 1 - (1 - 1/10,000)^1000 ≈ 1 - e^{-0.1} ≈ 0.095. • Multistage probability: To be branded, F must be unlucky in all 4 stages, with a probability of about s^4 ≈ 8.2 × 10^-5, which is very small.
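To "play with numbers", the two analyses can be compared directly. The sketch below uses the parameters stated in the slide setup (a set of 1000 and 4 stages of 10,000 bits); the variable names are illustrative.

```python
import math

n, m, k = 1000, 10_000, 4  # set size, bits per stage, number of stages

trivial_stage = n / m                  # upper bound: at most n bits are set
exact_stage = 1 - (1 - 1 / m) ** n     # accounts for members hashing to the same bit
approx_stage = 1 - math.exp(-n / m)    # the e^{-n/m} approximation of the line above

for name, p in [("trivial", trivial_stage),
                ("exact", exact_stage),
                ("approx", approx_stage)]:
    print(f"{name}: per-stage {p:.5f}, all {k} stages {p ** k:.3e}")
```

Changing `n`, `m`, or `k` shows how quickly the false-positive probability falls as stages are added or bitmaps grow.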
Applications • Replacement for a hash table: useful when storage is important, identifiers are long, false positives are acceptable, & membership check suffices • Example 1: String Matching: exact matching of up to 4000 strings of 40 bytes each using only on-chip SRAM. • Example 3: Reporting
Example 1: String Matching [Figure: the String Database to Block (ST0, ST1, ST2, ..., STn) with Anchor Strings (A0, A1, A2, ..., An), a Hash Function, and a Multi Stage Filter.] Sushil Singh, G. Varghese, J. Huber, Sumeet Singh, Patent Application
String Matching Continued: String Grouping [Figure: strings ST0, ST1, ST2, ..., STn with anchors A0, A1, A2, ..., An mapped by a Hash Function into Hash Bucket-0, Hash Bucket-1, ..., Hash Bucket-m.]
String Matching Continued: Bit Trees [Figure: the strings in a single hash bucket (ST2/A2, ST8/A8, ST11/A11, ST17/A17) arranged in a tree with branches labeled 0 and 1 and nodes labeled LOC L1, LOC L2, and LOC L3.]
Example 3: Scalable Reporting (Carousel, NSDI 2010) • Problem: When a worm breaks out, how do we report all infected machines? Logging packets with the pattern can result in millions of sources and many duplicates. • Solution: Use a sampled Bloom filter and a "more" bit. Start with no sampling. Any source IP in a worm packet is reported and placed in a Bloom filter of size B to suppress duplicates. Stop when B are reported and set the "more" bit. • Recursive Solution: If the "more" bit is set, repeat the algorithm twice, for LSB of Hashed(SourceIP) = 0 and 1. If there is still more, repeat it four times. Nearly optimal solution.
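The recursive partitioning can be sketched as follows. This is a simplified illustration, not the published Carousel implementation: it assumes the same traffic can be observed again on every pass (infected sources keep sending), uses a Python set as a stand-in for the Bloom filter of size B, and sets the "more" bit only when a new source arrives after B have already been recorded. The names `carousel`, `hash_bits`, and `capacity` are hypothetical.

```python
import hashlib

def hash_bits(source):
    """Hash of the source address; its low-order bits select a partition."""
    return int.from_bytes(hashlib.sha256(source.encode()).digest()[:8], "big")

def carousel(packet_sources, capacity):
    """Return (distinct sources reported, number of hash bits used to partition)."""
    k = 0  # 2**k partitions; k = 0 means no sampling
    while True:
        reported, more = [], False
        for part in range(2 ** k):       # one pass over the traffic per partition
            seen = set()                 # stand-in for a Bloom filter of size B
            for src in packet_sources:
                if hash_bits(src) % (2 ** k) != part or src in seen:
                    continue             # other partition, or duplicate suppressed
                if len(seen) == capacity:
                    more = True          # filter is full: set the "more" bit
                    break
                seen.add(src)
                reported.append(src)
            if more:
                break
        if not more:
            return reported, k
        k += 1                           # split every partition in two and retry

sources = [f"10.0.0.{i % 50}" for i in range(500)]  # 50 distinct, many duplicates
reported, k = carousel(sources, capacity=8)
print(len(reported), "sources reported using", 2 ** k, "partitions")
```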
Timed Bloom Filters • Question: How can we add a notion of time to Bloom Filters without lots of memory? • Solution: Use 2 Bloom Filters, Old and New. • Insert: Insert into New • Search: Search in both New and Old • Age every T seconds: Old := New; New := Empty • Property: Any entry not refreshed for 2T is deleted. Any entry refreshed within the last T is present. U.S. Patent Application: Paul Owen & Andy Fingerhut et al
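The Old/New scheme maps directly onto a small class. The sketch below is an assumption-laden illustration: the `Bloom` helper, its sizes, and the SHA-256 hashing are placeholders, and `age()` is expected to be called by some external timer every T seconds.

```python
import hashlib

class Bloom:
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits)

    def _indexes(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for i in self._indexes(key):
            self.array[i] = 1

    def __contains__(self, key):
        return all(self.array[i] for i in self._indexes(key))

class TimedBloom:
    def __init__(self, bits=1024, hashes=3):
        self._params = (bits, hashes)
        self.old, self.new = Bloom(*self._params), Bloom(*self._params)

    def insert(self, key):
        self.new.add(key)                    # Insert: into New only

    def search(self, key):
        return key in self.new or key in self.old  # Search: New and Old

    def age(self):
        # Old := New; New := Empty (call every T seconds)
        self.old, self.new = self.new, Bloom(*self._params)

tb = TimedBloom()
tb.insert("flow-A")
tb.age()
print(tb.search("flow-A"))  # True: still present in Old
tb.age()
print(tb.search("flow-A"))  # False: not refreshed for two aging periods
```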
Pattern 2a: Heavy-hitters with Threshold Heavy-hitters: In a measurement interval (e.g., 10 minutes), detect the flows (e.g., sources) on a link that send more than a threshold T (say 1% of the traffic). [Figure: packet sequence S2 S6 S2 S5 S1 S2.] Source S2 is 30 percent of the traffic. Estan, Varghese, ACM TOCS 2003
Heavy Hitters with Multistage Filters [Figure: Field Extraction feeds Hash 1, Hash 2, and Hash 3, which index counter Stages 1, 2, and 3. Each packet increments one counter per stage; each stage tests "Equal to T?"; ALERT! if all counters are above threshold T.]
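A minimal software sketch of the multistage filter in the figure: each packet adds its size to one counter per stage, and a flow is flagged once all of its counters reach the threshold. The class name, the SHA-256 hashing, and the byte counts in the usage example are illustrative assumptions.

```python
import hashlib

class MultistageFilter:
    def __init__(self, threshold, num_stages=3, counters_per_stage=1000):
        self.threshold = threshold
        self.counters = [[0] * counters_per_stage for _ in range(num_stages)]

    def _index(self, stage, flow):
        digest = hashlib.sha256(f"{stage}:{flow}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.counters[stage])

    def add(self, flow, size):
        """Count a packet; return True if every stage's counter has reached the threshold."""
        hot = True
        for stage, row in enumerate(self.counters):
            i = self._index(stage, flow)
            row[i] += size
            hot = hot and row[i] >= self.threshold
        return hot

mf = MultistageFilter(threshold=3000)   # bytes per interval (hypothetical value)
flagged = set()
for _ in range(10):
    for src in ["S2", "S6", "S2", "S5", "S1", "S2"]:
        if mf.add(src, 100):            # every packet is 100 bytes in this toy trace
            flagged.add(src)
print(flagged)
```

A flow that really sends at least the threshold is always flagged, since its own traffic alone fills its counters; small flows are flagged only if they share hot counters in every stage.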
Multistage filters in Action [Figure: counters in Stages 1, 2, and 3 shown against a threshold line. Legend: grey = other flows, yellow = small flow, green = large flow.]
Multistage Filter Analysis Assume 1 percent threshold. Bound probability that a flow F of 0.1% or less gets through 6 stages of size 1000 each. • Why trouble?: F can fall into a "hot" bucket if and only if the sum of traffic of all other flows in that bucket is more than 0.9%. • Single stage probability: At most 100/0.9 = 111 buckets can be over 0.9% before we bring on F. Thus the probability that F falls in a "hot" bucket is less than 111/1000 = 0.111. • Multistage probability: To be branded, F must be unlucky in all 6 stages, with a probability of no more than 0.111^6 (about 1.9 × 10^-6), which is very small. Thus at most 1000 false positives with very high probability.
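The bound can be reproduced numerically with the slide's parameters. This is just the arithmetic of the argument, with illustrative variable names.

```python
threshold_pct, flow_pct = 1.0, 0.1   # threshold and size of flow F, in percent of traffic
stages, buckets = 6, 1000

others_needed = threshold_pct - flow_pct     # other flows must contribute over 0.9%
max_hot_buckets = int(100 / others_needed)   # at most 111 buckets can hold that much
p_stage = max_hot_buckets / buckets          # chance F lands in a hot bucket
p_all = p_stage ** stages                    # F must be unlucky in every stage

print(max_hot_buckets, p_stage, p_all)
```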
Pattern 2b: Top K heavy-hitters Heavy-hitters: In a measurement interval (e.g., 10 minutes), detect the flows (e.g., sources) that are the top K talkers on a link. [Figure: packet sequence S2 S1 S2 S5 S1 S2.] Sources S2 and S1 are the top K talkers. Bonomi, Prabhakar, Zhang, Wu, Cisco Internal
Two simpler proposals • SIFT (Prabhakar) and Sample-and-Hold (Estan-Varghese) both suggest sampling a packet with small probability p. Once sampled, place in CAM and watch all packets • Idea: large flows are more likely to be sampled, and then we get exact counts • Problem: CAM quickly gets muddied with mice and then elephants can be lost.
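The sampling idea can be sketched in a few lines, following the slide's per-packet description. A dictionary stands in for the CAM, and the function name and `seed` parameter are illustrative; the sketch does not bound the table size, which is exactly where the "mice" problem noted above would appear.

```python
import random

def sample_and_hold(packets, p, seed=0):
    """packets: iterable of (flow_id, size). Returns bytes counted per watched flow."""
    rng = random.Random(seed)
    table = {}                       # stands in for the CAM
    for flow, size in packets:
        if flow in table:
            table[flow] += size      # already watched: count every packet exactly
        elif rng.random() < p:
            table[flow] = size       # sampled: start watching this flow
    return table

trace = [("S2", 100), ("S6", 100), ("S2", 100), ("S5", 100), ("S1", 100), ("S2", 100)]
print(sample_and_hold(trace, p=0.3))
```

Counts are exact from the moment a flow is sampled, so large flows, which offer many chances to be sampled, are both more likely to be caught and more accurately counted.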
Pattern 3: Partial Completion Partial Completion: In a measurement interval, detect the flows (e.g., destinations) which have several Start packets (e.g., SYN) without the corresponding End (e.g., FIN). Example sequence: SYNx SYNy SYNz FINy SYNx SYNx FINz. Destination X has 3 partial completions in the sequence. Kompella, Singh, Varghese, IMC 2003
Partial Completion Filters [Figure: Field Extraction feeds Hash 1, Hash 2, and Hash 3, which index counter Stages 1, 2, and 3. Increment for SYN, decrement for FIN; each stage tests "Equal to T?"; ALERT! if all counters are above threshold.]
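A minimal sketch of a partial completion filter, using the same staged layout as a multistage filter but with signed counters. The class and method names, the SHA-256 hashing, and the threshold of 3 are illustrative assumptions.

```python
import hashlib

class PartialCompletionFilter:
    def __init__(self, threshold, num_stages=3, counters_per_stage=1000):
        self.threshold = threshold
        self.counters = [[0] * counters_per_stage for _ in range(num_stages)]

    def _index(self, stage, dest):
        digest = hashlib.sha256(f"{stage}:{dest}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.counters[stage])

    def update(self, dest, kind):
        delta = {"SYN": 1, "FIN": -1}[kind]   # increment for SYN, decrement for FIN
        for stage, row in enumerate(self.counters):
            row[self._index(stage, dest)] += delta

    def suspicious(self, dest):
        # Flag only if the counter has reached the threshold in every stage.
        return all(row[self._index(stage, dest)] >= self.threshold
                   for stage, row in enumerate(self.counters))

pcf = PartialCompletionFilter(threshold=3)
for kind, dest in [("SYN", "X"), ("SYN", "Y"), ("SYN", "Z"), ("FIN", "Y"),
                   ("SYN", "X"), ("SYN", "X"), ("FIN", "Z")]:
    pcf.update(dest, kind)
print(pcf.suspicious("X"))  # True: three SYNs to X with no matching FIN
```

Completed connections cancel out, so Y and Z contribute nothing to any counter by the end of the sequence, even where they share a counter with X.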
Analysis 1: Benign but Malformed Connections [Figure: timeline of Intervals 1 to 4 showing a long-lived connection (SYNx and FINx in different intervals), SYNy retransmissions, and FINz retransmissions.] Model benign but malformed connections as adding an extra SYN or FIN to an interval with probability 0.5
Analysis 2: using Gaussian approximation [Figure: probability curve over counter values (axes: Probability, Counter Values) with a region marked "Greater than 6".] Probability of false positives = 0.0013. Probability of false negatives = 0.0013.
Pattern 4: Many Flows Many Flows: In a measurement interval, find if the number of tuples exceeds a threshold. Example: the sequence S2 S6 S2 S5 S1 S2 has 6 packets but only 4 distinct sources. Estan, Fisk, Varghese, IMC 2003, ACM TONS to appear
Simple Bitmap counting [Figure: flow F hashed to one bit of a bitmap in which several bits are set to 1.] Hash based on flow identifier. Estimate: based on the number of bits set. Problem: bitmap takes too much memory to count a large number of flows.
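A sketch of simple bitmap counting. The slide does not spell out the estimator; the code assumes the standard linear-counting correction, n ≈ b ln(b/z) for a bitmap of b bits with z bits still zero, which compensates for several flows hashing to the same bit. The function name and SHA-256 hashing are illustrative.

```python
import hashlib
import math

def bitmap_estimate(flow_ids, num_bits=1024):
    """Estimate the number of distinct flow identifiers with a single bitmap."""
    bitmap = bytearray(num_bits)
    for flow in flow_ids:
        digest = hashlib.sha256(str(flow).encode()).digest()
        bitmap[int.from_bytes(digest[:8], "big") % num_bits] = 1  # duplicates hit the same bit
    zeros = num_bits - sum(bitmap)
    if zeros == 0:
        return float("inf")      # bitmap saturated: too many flows for this size
    return num_bits * math.log(num_bits / zeros)

print(bitmap_estimate(["S2", "S6", "S2", "S5", "S1", "S2"]))
```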
Sampled Bitmap counting [Figure: a bitmap of which only a small part is kept.] Solution: keep only a sample of the bitmap. Estimate: scale up sampled count. Problem: inaccurate if too few or too many flows.
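One way to sketch the sampled bitmap: hash flows into a large virtual bitmap but store only its first slice, then scale the linear-counting estimate of that slice by the sampling factor. The estimator h ln(b/z), for virtual size h, stored size b, and z zero bits, is an assumption consistent with that scaling; the function name and sizes are illustrative.

```python
import hashlib
import math

def sampled_bitmap_estimate(flow_ids, physical_bits=256, virtual_bits=4096):
    """Estimate distinct flows while storing only physical_bits of a virtual bitmap."""
    bitmap = bytearray(physical_bits)
    for flow in flow_ids:
        digest = hashlib.sha256(str(flow).encode()).digest()
        pos = int.from_bytes(digest[:8], "big") % virtual_bits
        if pos < physical_bits:          # only this slice of the virtual bitmap is kept
            bitmap[pos] = 1
    zeros = physical_bits - sum(bitmap)
    if zeros == 0:
        return float("inf")              # too many flows for this sampling factor
    # Linear counting on the slice, b*ln(b/z), scaled up by virtual_bits/physical_bits.
    return virtual_bits * math.log(physical_bits / zeros)

print(sampled_bitmap_estimate(f"flow-{i}" for i in range(2000)))
```

With very few flows the stored slice may see none of them, and with very many it saturates, which is the inaccuracy the slide points out.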
Multi-resolution Bitmap counting [Figure: three bitmaps covering 1-10, 10-100, and 100-1000 flows.] Solution: multiple bitmaps, each covering a different range. Estimate: use the first bitmap that has less than 93.1% of its bits set, count, scale.
Scalable Bitmap counting [Figure: at time 0, start with scale = 1 (covers 1-10 flows); later use with scale = 100 (covers 100-1000 flows).] Solution: one bitmap with an additional scale factor that is increased when all bits are set. Estimate: count bits, correct, multiply by scale factor. Can count to 1 million using a 15-bit scale factor and a 32-bit vector.
Scaled Multi-resolution Bitmap counting [Figure: three bitmaps, for 1-10 flows (scale = 2), 10-100 flows (scale = 8), and 100-1000 flows (scale = 5).] Solution: multiple bitmaps, each covering a different range but each with a scale factor. Estimate: use the first bitmap that has less than 93.1% of its bits set, count, scale. F. Shahid et al, U.S. Patent Application
Pattern 5: Concurrent Approximate State Machines State Machine: In a measurement interval, detect the flows which hit a specified state machine (a Bloom filter is the special case where the state machine is a membership check). [Figure: packet sequence with B, I, and P frames from flows X and Y.] Flow X has two packets in B frame. Bonomi, Mitzenmacher, Panigrahy, Singh, Varghese 2006
Concurrent State Machines • Implementation: We know 3 good implementations. The best of these uses a good hash table implementation (d-left) and simply substitutes the identifier of a flow with a smaller signature for the flow. • Results: For 64 K flows, we need roughly 1 Mbit of memory. • Applications: First, for video congestion control. We found good results by dropping B-frames during congestion and then tail-dropping till the next I-frame. Can handle twice the loss rates with same quality. Second, for P2P identification.
Outline of Talk • Part 1: Motivation • Part 2: Basic Patterns and Algorithms • Part 3: Combining base patterns to solve useful application problems (traffic matrix, DoS, worms) • Part 4: Conclusions.
Application 1: Traffic Matrix [Figure: ISP connecting Customer Sites 1, 2, and 3; annotation "reroute or add B/W".] • Each entry router uses a multistage filter on traffic to destination prefixes to isolate subnets to which there is large traffic. • Aggregating across all entry routers gives the "dominant" part of the traffic matrix. AT&T reports an 80-20 rule for prefixes.
Application 2: DoS Attacks • Bandwidth attacks (e.g., Smurf): Pound victim with large traffic of a certain type. • Use heavy-hitter pattern relative to traffic type (e.g., ICMP) to find attacked destinations • Partial Completion attacks (e.g., TCP SYN-Flood): May not be unusual bandwidth but characterized by partial connections. • Use partial completion pattern as a front-end for Riverhead Guard module in Jaffa.
Application 4: Worm Detection [Figure: ISP network with hosts labeled Infected 1, Infected N, New Victim, and Inactive Address.] • Manual signature extraction: slow and enormous effort for each new worm. • Automatic signature extraction of a specific worm by automatically detecting an abstract worm. Sumeet Singh, G. Varghese, C. Estan, S. Savage, OSDI 2004, more in next class
Abstract Worm Definition and Detection • F1, Content Repetition: Payload of worm is seen frequently at router. • Use heavy-hitter pattern with hash H of content as index. • NetSift used large multistage filters. A variant of elephant traps invented by John Huber and Sumeet Singh seems to be the best solution for Hawkeye. • F2, Increasing Infection Levels: Same content is dispersed to an increasing number of distinct source-destination pairs. • Use many flows pattern with content hash H as index
Hashing Implementation • Need a hash function, especially for content, that is easy to compute and random. • NetSift used a Rabin hash function, but that requires multiplies. For Bloom filters, can compute one large hash and take a portion per stage. • Much nicer hash function using Galois multiplication (XOR and shift)
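The "one large hash, one portion per stage" idea can be sketched as follows. SHA-256 is used here only as a convenient wide hash for illustration; it is not the Rabin or Galois-multiplication hash the slide refers to, and the function name and parameters are hypothetical.

```python
import hashlib

def stage_indexes(key, num_stages, bits_per_index):
    """Compute one wide hash and cut it into num_stages indexes of bits_per_index bits."""
    digest = hashlib.sha256(key).digest()
    if num_stages * bits_per_index > len(digest) * 8:
        raise ValueError("hash is too narrow for the requested stages")
    wide = int.from_bytes(digest, "big")
    mask = (1 << bits_per_index) - 1
    # Stage s uses bits [s*bits_per_index, (s+1)*bits_per_index) of the wide hash.
    return [(wide >> (s * bits_per_index)) & mask for s in range(num_stages)]

print(stage_indexes(b"example payload", num_stages=4, bits_per_index=14))
```

Each index addresses a stage of 2^14 entries, so only one hash computation is needed per packet regardless of the number of stages.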
Conclusions • Introspection/Pattern detection can be useful for the next generation of networks. Beyond faster-cheaper • Can implement base patterns at high speeds. • Base patterns can be combined to solve useful application issues (traffic matrix, DoS, worms, etc.) • Only scratching surface: need to build a library of patterns.
Introspection at UCSD: Ramana Kompella, Cristian Estan, Sumeet Singh