
A Resource-minimalist Flow Size Histogram Estimator

A Resource-minimalist Flow Size Histogram Estimator. Bruno Ribeiro and Don Towsley (UMass Amherst), Tao Ye (Sprint). At an Internet core router carrying TCP flows, the flow size is, e.g., the number of packets in a TCP flow. The flow size histogram is used for traffic profiling and anomaly detection.


Presentation Transcript


  1. A Resource-minimalist Flow Size Histogram Estimator. Bruno Ribeiro, Don Towsley (UMass Amherst); Tao Ye (Sprint)

  2. Flow size histogram
     Internet core router: TCP flows
     • Flow size: e.g., number of packets in a TCP flow
     • Flow size histogram used for:
       • Traffic profiling
       • Anomaly detection
     • Histogram is hard to obtain
       • TCP flows: hundreds of millions of flows/hour (OC-48 router)
     • Estimating flow size histograms
       • Random packet sampling is inaccurate [Ribeiro et al. 2006]
       • Flow sampling: more memory, and an accurate tail needs packet sampling
       • Current data streaming methods have slow estimators
     Bruno Ribeiro, Tao Ye, Don Towsley, "A Resource-minimalist flow size histogram estimator"

  3. Outline
     • Related work
     • Our resource-minimalist approach
     • Experiment
     • Conclusions

  4. Related work [Kumar et al. 2004]
     • Sketch phase (router): each packet is mapped by a universal hash function to a counter, e.g., counters 0 1 2 1 1 0 2 0 0; distinct flows can share a counter (hash collision)
     • Estimation phase (powerful backend server): flow size histogram estimated from the counters and hash collisions
     • Complexity: O((maximum flow size)^3)
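The sketch phase on this slide can be illustrated with a minimal Python sketch. It is not the code of [Kumar et al. 2004]: the function names, the integer flow IDs, and the Carter-Wegman style hash `((a*x + b) mod p) mod m` are assumptions chosen for illustration.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime; flow IDs are assumed to be ints below it


def make_universal_hash(num_counters, rng):
    """Draw one hash function from a Carter-Wegman style family (illustrative choice)."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(0, PRIME)

    def h(flow_id):
        return ((a * flow_id + b) % PRIME) % num_counters

    return h


def sketch(packets, num_counters, seed=0):
    """Sketch phase: one counter increment per packet, indexed by the flow's hash."""
    h = make_universal_hash(num_counters, random.Random(seed))
    counters = [0] * num_counters
    for flow_id in packets:
        counters[h(flow_id)] += 1  # colliding flows add into the same counter
    return counters


# Hypothetical packet stream: flow 1 has 3 packets, flow 2 has 2, flow 3 has 1.
counters = sketch([1, 1, 1, 2, 2, 3], num_counters=9)
```

The counter array is all the router keeps; separating colliding flows is left to the estimation phase on the backend.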

  5. Resource-minimalist Approach
     • Insight: Don’t need to count every flow size
     • Idea: Group large flow sizes into bins
       • Fine-grained flow histogram < k packets
       • Coarse-grained flow histogram > k packets
     • Approach: Probabilistic counting
       • Reduces counters to 6 bits
       • Requires: Low collision probability (e.g., counters/flow = 2/1)
       • Result: O(k^3 + log(W)) estimator, e.g., k = 16 and W = 10^7
     • Problem: Low collision → more memory (2 counters/flow)
     • Approach: Counter folding
       • Negligible increase in estimator error
       • Requires one extra bit/counter
       • Result: Reduces number of counters by half

  6. Group large flow sizes & Probabilistic counting [Morris 78]
     • Counter increments are probabilistic: with m_a = 2^a, a 6-bit counter bins flow sizes up to W = 10^14
     • Figure: hash counter states 0, 1, 2, …, k-1, k, k+1, k+2, …
       • k → k+1 with probability p = 1/m_1 (m_1 arrived packets on average)
       • k+1 → k+2 with probability p = 1/m_2 (m_2 arrived packets on average)
     • Counter value k → flow sizes = [k, k+m_1-1]
     • Counter value k+1 → flow sizes = [k+m_1, k+m_1+m_2-1]
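A minimal Python sketch of the counter on this slide, under stated assumptions: the counter counts exactly below k, moves from k+a-1 to k+a with probability 1/m_a where m_a = 2^a, and saturates at the largest 6-bit value. The function names and the saturation behaviour are illustrative, not taken from the slides.

```python
import random


def increment(counter, k=16, bits=6, rng=random):
    """Return the counter value after one more packet arrives."""
    max_value = (1 << bits) - 1
    if counter >= max_value:
        return counter  # saturate (assumption; the slides do not say)
    if counter < k:
        return counter + 1  # fine-grained region: exact counting
    a = counter - k + 1  # index of the transition (k+a-1) -> (k+a)
    m_a = 1 << a  # m_a = 2^a
    return counter + 1 if rng.random() < 1.0 / m_a else counter


def bin_range(counter, k=16):
    """Flow sizes represented by a counter value, following the bin rule on the slide."""
    if counter < k:
        return (counter, counter)
    j = counter - k
    low = k + (1 << (j + 1)) - 2  # k + m_1 + ... + m_j
    high = k + (1 << (j + 2)) - 3  # low + m_(j+1) - 1
    return (low, high)


# Feed 1000 packets of one hypothetical flow into a single counter.
value = 0
for _ in range(1000):
    value = increment(value)
low, high = bin_range(value)
```

With k = 16 the largest 6-bit value, 63, starts its bin at 16 + 2^48 - 2, which is above 10^14 and so consistent with the W quoted on the slide.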

  7. Counter folding: Detecting some collisions
     • Maximum hash value = M
     • M/2 counters
     • If hash(packet) < M/2 → red
     • Otherwise (hash(packet) mod M/2) → blue
     • Figure labels: detectable blue-red collision (1 bit required); undetectable collision; flows 7, 8, 9; counters (M/2 counters): 6 1 2 0 2 1 6 0 0

  8. Counter folding
     • Collision policy:
       • “red flow cannot increment blue counter”
       • “blue flow overwrites red counter”
       • counters = 0 are red
     • Example:
       Counters:                   6 1 2 0 2 1 3 0 0
       Counter colors (extra bit): 1 1 0 0 1 0 1 0 0
     • Result, e.g., if 1 counter/flow:
       • All red counters are also blue counters = 0
       • Virtually expands the hash table by ≈50% (virtual 2 counters/flow)
       • Blue counters evict red counters
       • Flow sampling effect: discards 15% of flows at random
     • Folding: interesting fact (figure: number of foldings; policy: evict newest flow, color = flow ID; flow sampling)
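The folding rule and collision policy from the two counter-folding slides can be sketched in Python as follows. The class name, the plain integer counters (no probabilistic counting), and restarting an overwritten counter at 1 are assumptions made for illustration; the slides only state the policy itself.

```python
RED, BLUE = 0, 1  # the extra bit per counter


class FoldedCounters:
    """M/2 counters shared by a hash range of size M (illustrative sketch)."""

    def __init__(self, max_hash):
        self.half = max_hash // 2
        self.counts = [0] * self.half
        self.colors = [RED] * self.half  # counters equal to 0 are red

    def update(self, hash_value):
        """Process one packet whose flow hashes to hash_value in [0, M)."""
        color = RED if hash_value < self.half else BLUE
        i = hash_value % self.half  # fold the upper half onto the lower half
        if color == self.colors[i]:
            self.counts[i] += 1  # same color: ordinary increment
        elif color == BLUE:
            self.counts[i] = 1  # blue flow overwrites red counter (restart value assumed)
            self.colors[i] = BLUE
        # otherwise: red flow cannot increment a blue counter, packet is ignored


fc = FoldedCounters(max_hash=8)
for h in (1, 1, 5, 1, 5):  # hash values 1 (red) and 5 (blue) fold onto counter 1
    fc.update(h)
```

In this run the red flow's two packets are evicted by the first blue packet, the later red packet is ignored, and counter 1 ends up blue with a count of 2.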

  9. Experiment
     • Evaluated with simulations
     • Our worst result with Internet core traces:
       • 9.5 million flows
       • 8MB of memory
       • k = 16
       • W = 10^14
     • Same accuracy without counter folding requires 13MB of memory

  10. Conclusions
      Insights
      • Group large flow sizes using probabilistic counters
      • Counter folding
      • Fast quasi-random sampling
      Our Estimator
      • Time complexity
        • Sketch phase: universal hash cost, two additions, one subtraction
        • Estimation phase: O(k^3 + log(W))
      • Space complexity
        • ≈ 1/4 memory usage of [Kumar et al. 2004]
