Joint Data Streaming and Sampling Techniques for Detection of Super Sources and Destinations. Qi (George) Zhao, Abhishek Kumar, Jun (Jim) Xu. College of Computing, Georgia Institute of Technology. Internet Measurement Conference 2005. Speaker: Yongming Chen, Dec. 07, 2006.
Outline • Introduction • Motivation • Algorithms • Evaluation • Conclusion
Introduction • goal: detect super sources/destinations at high link speeds (10 to 40 Gbps) in real time • super source: a source with a large fan-out • fan-out: number of distinct destinations a source contacts • data streaming: sequentially process every packet passing through • a better alternative to sampling for monitoring high-speed links • NO connection with multimedia streaming!
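The fan-out definition above can be made concrete with an exact, per-source computation. This is only a minimal baseline sketch, not the technique from the talk: it keeps a set of destinations for every source, which is precisely the kind of state that does not fit at 10 to 40 Gbps. The function names, the threshold parameter, and the sample packets are hypothetical.

```python
from collections import defaultdict


def exact_fanout(packets):
    """Return {source: number of distinct destinations} for (src, dst) pairs."""
    dests = defaultdict(set)
    for src, dst in packets:
        dests[src].add(dst)  # repeated packets to the same destination count once
    return {src: len(d) for src, d in dests.items()}


def super_sources(packets, threshold):
    """Sources whose exact fan-out exceeds the (hypothetical) threshold."""
    return sorted(s for s, f in exact_fanout(packets).items() if f > threshold)


# Hypothetical packet stream: (source, destination) pairs.
packets = [
    ("10.0.0.1", "a"),
    ("10.0.0.1", "b"),
    ("10.0.0.1", "b"),
    ("10.0.0.1", "c"),
    ("10.0.0.2", "a"),
]
```

Here `exact_fanout(packets)` gives a fan-out of 3 for `10.0.0.1` and 1 for `10.0.0.2`, so only `10.0.0.1` exceeds a threshold of 2.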
Motivation • detecting super sources and destinations is useful • network monitoring and security • hot-spot or flash-crowd detection • traditional per-flow schemes cannot scale to high-speed links • packet sampling loses information • FlowScan: maintains per-flow state in a hash table • network data streaming: uses a small but well-organized data structure
Outline • Introduction • Motivation • Algorithms • Simple Scheme • Advanced Scheme • Evaluation • Conclusion
Simple Scheme • traditional hash-based flow sampling • reduces the amount of incoming traffic, e.g., with a 25% sampling rate, a 10M pkt/s link can be handled by a module that processes one packet in 400 ns • this approach fails under traffic bursts • filtering after sampling
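The 400 ns figure follows from simple arithmetic: 25% of 10M pkt/s is 2.5M sampled pkt/s, i.e. one sampled packet every 400 ns on average. The sketch below shows that calculation together with one possible form of hash-based flow sampling. The choice of SHA-256, the flow-label encoding, and all function names are assumptions made for illustration; the slides do not specify a hash function.

```python
import hashlib


def flow_hash(src, dst):
    """Map a flow label to a pseudo-uniform value in [0, 1] (assumed hash: SHA-256)."""
    digest = hashlib.sha256(f"{src}|{dst}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def is_sampled(src, dst, rate):
    """Hash-based flow sampling: every packet of a flow gets the same decision."""
    return flow_hash(src, dst) < rate


def per_packet_budget_ns(link_pps, rate):
    """Average time available per sampled packet, in nanoseconds."""
    return 1e9 / (link_pps * rate)


budget = per_packet_budget_ns(10_000_000, 0.25)  # 400 ns
```

Because the decision depends only on the flow label, a flow is either fully sampled or fully dropped. The budget is an average, which is why a burst of sampled packets can still overrun the processing module.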
Simple Scheme • at most one packet from each sampled flow needs to be processed • a threshold u_min is needed to reduce estimation error • typically set to w/2 • extremely low storage complexity • only 128 KB of SRAM is needed for OC-192 links with 25% flow sampling
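A rough sketch of "filtering after sampling" follows, under several assumptions that go beyond the slide: the filter is a bit array of w bits indexed by a hash of the flow label, a packet is processed only if it flips a bit from 0 to 1, and the threshold umin (written `u_min` in the code) is read as the minimum number of zero bits allowed before the filter is considered too full for the current measurement epoch. The class, its methods, and the per-source counter are hypothetical, and the counter is a raw count rather than the estimator used in the paper.

```python
import hashlib


class SampledFlowFilter:
    """Bit-array filter applied to packets that already passed flow sampling."""

    def __init__(self, w, u_min=None):
        self.w = w
        # Assumption: u_min defaults to w/2, as stated on the slide.
        self.u_min = w // 2 if u_min is None else u_min
        self.bits = bytearray(w)  # one byte per bit, for readability only
        self.zeros = w            # number of bits still set to 0
        self.new_flows = {}       # source -> packets that passed the filter

    def _index(self, src, dst):
        digest = hashlib.sha256(f"{src}|{dst}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.w

    def offer(self, src, dst):
        """Return True if the packet is the first one seen for its filter bit."""
        i = self._index(src, dst)
        if self.bits[i]:
            return False  # flow (or a colliding flow) already recorded
        self.bits[i] = 1
        self.zeros -= 1
        self.new_flows[src] = self.new_flows.get(src, 0) + 1
        return True

    def epoch_full(self):
        """Assumed role of u_min: stop once too few zero bits remain."""
        return self.zeros <= self.u_min
```

With this structure, later packets of an already recorded flow are rejected by a single bit lookup, which is what keeps the per-packet work and the memory footprint small.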
Advanced Scheme • separation of counting and identity gathering • streaming module encodes fan-out information • sampling module captures candidate sources
Estimation Module • F_s: fan-out of source s • A_i, i = 1 to k: the columns whose indices are obtained by hashing s with h_1 to h_k • T_i: the set of packets hashed into A_i • U_{T_i}: number of '0' bits in A_i • D_{T_i}: a fairly accurate estimator of |T_i|
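The slide does not give the formula behind the per-column estimator. A common choice for estimating a count from the number of zero bits in a bitmap is linear counting, D = m ln(m / U), where m is the number of bits in the column and U is the number of zero bits. The sketch below assumes that form; the symbol m and the function names are introduced here for illustration. How the k per-column estimates are combined into an estimate of the fan-out of s is not shown on the slide and is not attempted here.

```python
import math


def zero_bits(column):
    """U: number of '0' bits in one column of the bit array."""
    return sum(1 for b in column if b == 0)


def linear_counting_estimate(column):
    """D = m * ln(m / U), the assumed linear-counting estimator for one column."""
    m = len(column)
    u = zero_bits(column)
    if u == 0:
        # A saturated column carries no usable information for this estimator.
        raise ValueError("column is saturated; estimate is unbounded")
    return m * math.log(m / u)


# Hypothetical 8-bit column with 4 zero bits: estimate is 8 * ln(2), about 5.5.
column = [1, 0, 1, 1, 0, 0, 1, 0]
estimate = linear_counting_estimate(column)
```

The estimate exceeds the number of set bits (4 in this example) because the formula corrects for distinct items that collided on the same bit.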
Outline • Introduction • Motivation • Algorithms • Evaluation • Conclusion
Settings • packet header traces • IPKS+ and IPKS- : OC192c • USC : Los Nettos tracing facility at USC • UNC : 1Gbps • flow label • (<src_ip, src_port>, <dst_ip>) • (<src_ip>, <dst_ip, dst_port>)
Conclusion • combines the power of data streaming and sampling to detect super sources and destinations efficiently and accurately • however, the evaluation includes no comparison with other approaches
Reference • Q. Zhao, A. Kumar, and J. Xu. Joint Data Streaming and Sampling Techniques for Detection of Super Sources and Destinations. Internet Measurement Conference 2005. • Q. Zhao, A. Kumar, and J. Xu. Joint Data Streaming and Sampling Techniques for Detection of Super Sources and Destinations. Technical Report, July 2005. • FlowScan: http://www.usenix.org/events/lisa2000/full_papers/plonka/plonka_html/