Internet Measurement, Section 6.3 Analyzing Network Traffics. Mohammad Hassan Hajiesmaili ECE Department, University of Tehran Fall 2009. Outline. Packet Capture Data Management Data Reduction Inference. Packet Capture. Passive Traffic Management
Internet Measurement, Section 6.3 Analyzing Network Traffics Mohammad Hassan Hajiesmaili ECE Department, University of Tehran Fall 2009
Outline • Packet Capture • Data Management • Data Reduction • Inference
Packet Capture • Passive Traffic Management • Packet Capture in General Purpose Systems • Packet Capture in Special Purpose Systems • Control Plane Traffic
Packet Capture in General Purpose Systems • Libpcap • Promiscuous mode • Reports all packets received • Packet Filter • Specifies how much of each packet should be captured • Parsing tools • Tcpdump • Ethereal • Commercial products
Packet Capture in General Purpose Systems • Broadcast LANs • Switched LANs • Port Mirroring • Libpcap • Unrestricted Access • Pktd & scriptroute • Set Access policies by system Admin
Packet Capture in Special Purpose Systems • Monitoring Core Network Links • In optical scenarios the link speed is faster than the PCI bus • Specialized Network Interface and Interface Driver • Some Examples • OC3MON • OC12MON
Control Plane Traffic • Capture control packet traffic • Example • Local view of BGP system • Establishing a session with a BGP-speaking router • Tools • GNU Zebra • Quagga • Can capture other routing traffic • OSPF • RIP
Data Management • Full Packet Capture is challenging • Limited bus bandwidth • Limited memory access speed • Limited speed and capacity of disk arrays • Limited processing power • Specialized tools • Using sophisticated algorithms for operating on large streams of data • Example: • Smacq • Windmill
Data Management • Database management problem • Very large data sets • Arriving incrementally over time • Continuous queries • Solution • Data Stream Management • Example: • Tribeca • STREAMS • TelegraphCQ • Gigascope • Queries are expressed in GSQL
Looking at the traffic • Too much data for a human • Do something smarter!
Data Reduction • Traffic Counters • Flow Capture • Sampling • Summarization • Dimensionality Reduction • Probabilistic Model
Counter • Use aggregation to form time series of counts of traffic statistics • Bytes or packets per unit time • Time series are constructed by periodic polling • Generically called SNMP counts • Benefits: • Capturing without much performance impact on router • Extremely compact compared to traffic traces
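As a minimal sketch of how polled counters become a time series, the snippet below turns successive (timestamp, counter value) readings into per-interval rates. The function name `counter_rates` and the 32-bit counter width are assumptions made for illustration; the modulo step tolerates at most one counter wrap between polls.

```python
def counter_rates(samples, counter_bits=32):
    """Turn periodic (timestamp_seconds, counter_value) polls into rates.

    Assumes a monotonically increasing counter that wraps at 2**counter_bits
    and wraps at most once between two consecutive polls.
    """
    modulus = 1 << counter_bits
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = (c1 - c0) % modulus   # difference, corrected for one wrap
        rates.append(delta / (t1 - t0))
    return rates


# Hypothetical byte-counter polls taken every 10 seconds.
polls = [(0, 100), (10, 1100), (20, 3100)]
print(counter_rates(polls))
```

Each output value is an average over one polling interval, which is why counters are compact but coarse-grained.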
Counters - Drawback • SNMP transport is via UDP • Measurement packets can be lost • Difficulty in obtaining synchronized time series across multiple interfaces • Too coarse-grained for many needs
Flow Capture • Counters provide basic information • Almost all traffic semantics are absent • Capture and store packet trains or flows
Packet Train • A burst of packets arriving from the same source and heading to the same destination. • If the spacing between two packets exceeds some inter-train gap, they are said to belong to different trains.
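The inter-train gap rule can be sketched as follows for the packets of a single source/destination pair. The function name `split_into_trains` and the sample timestamps are illustrative assumptions, not part of the original slides.

```python
def split_into_trains(timestamps, max_gap):
    """Group packet arrival times (one source/destination pair) into trains.

    A new train starts whenever the spacing to the previous packet
    exceeds max_gap (the inter-train gap).
    """
    trains = []
    for ts in sorted(timestamps):
        if trains and ts - trains[-1][-1] <= max_gap:
            trains[-1].append(ts)     # still inside the current train
        else:
            trains.append([ts])       # gap exceeded: open a new train
    return trains


arrivals = [0.0, 0.1, 0.2, 5.0, 5.1]
print(split_into_trains(arrivals, max_gap=1.0))
```

The choice of `max_gap` determines how many trains a trace yields: a larger gap merges bursts, a smaller one splits them.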
Capture Packet Train • Can be used for • Monitoring basic network activity • Monitoring users and applications • Network planning • Security analysis • Accounting and Billing • Tools for capturing are present in all major routers
Packet Train Record Content • IP header (5-tuple) • Source IP address, Source port, Destination IP address, Destination port, Protocol ID • Start Time • End Time • Number of packets • Number of bytes contained in the packet train • Dramatic decrease in trace size compared to full packet capture
Packet Flows • Capturing packet flows rather than packet trains • Requires higher-level software for processing and interpreting the raw data
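A rough sketch of how such records might be built from a packet trace is shown below. The `TrainRecord` class, the `build_records` function, and the input layout (timestamp, 5-tuple key, size in bytes) are hypothetical names chosen for this example; router implementations differ in detail.

```python
from dataclasses import dataclass


@dataclass
class TrainRecord:
    key: tuple        # (src IP, src port, dst IP, dst port, protocol)
    start: float      # time of first packet
    end: float        # time of last packet
    packets: int
    bytes: int


def build_records(packets, max_gap):
    """packets: iterable of (timestamp, five_tuple, size_in_bytes)."""
    open_trains = {}
    finished = []
    for ts, key, size in sorted(packets, key=lambda p: p[0]):
        rec = open_trains.get(key)
        if rec is not None and ts - rec.end > max_gap:
            finished.append(rec)      # gap exceeded: close the old train
            rec = None
        if rec is None:
            open_trains[key] = TrainRecord(key, ts, ts, 1, size)
        else:
            rec.end = ts
            rec.packets += 1
            rec.bytes += size
    finished.extend(open_trains.values())
    return sorted(finished, key=lambda r: r.start)


key = ("10.0.0.1", 1234, "10.0.0.2", 80, 6)
trace = [(0.0, key, 100), (0.5, key, 200), (10.0, key, 50)]
for record in build_records(trace, max_gap=1.0):
    print(record)
```

Three packets collapse into two fixed-size records here, which is the source of the reduction in trace size.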
Flow Capture • IETF standards for flow capture • IP Flow Information Export effort (IPFIX) • For Exporters: Providers of flow data • For Collectors: Consumers of flow data • Real Time Flow Metering (RTFM) • Meter MIB that can be accessed via SNMP • Meter Readers: Collect flow data • Managers: Coordinate meters and meter readers
Sampling • In a sampling scheme, a subset of packets is chosen for capture • Two important questions • How should packets be chosen for sampling? • How should one correct or compensate for the sampling process when performing analysis? • Basic packet sampling • Trajectory sampling
Basic Packet Sampling • The sampling process is performed independently on each link being monitored • Two categories: • Variable rate sampling • Constant rate sampling • Random sampling • Deterministic sampling • Stratified sampling
Trajectory Sampling • Basic Packet Sampling • Packet capture at multiple points in a network • Cannot obtain per-packet delay • Trajectory Sampling Idea • If a packet is chosen for sampling at any point in the network, it is chosen at all points in the network. • Idea of implementation • Calculation of a hash function for each packet
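The three constant-rate schemes can be sketched as below, each keeping on average one packet in `n`. The function names are illustrative assumptions, and `estimate_total` shows the simplest form of compensation: scaling the sample size back up by the sampling interval.

```python
import random


def deterministic_sample(packets, n):
    """Keep every n-th packet."""
    return packets[::n]


def random_sample(packets, n, rng):
    """Keep each packet independently with probability 1/n."""
    return [p for p in packets if rng.random() < 1.0 / n]


def stratified_sample(packets, n, rng):
    """Split the stream into buckets of n packets; keep one at random from each."""
    return [rng.choice(packets[start:start + n])
            for start in range(0, len(packets), n)]


def estimate_total(sample, n):
    """Compensate for 1-in-n sampling when estimating the packet count."""
    return n * len(sample)


rng = random.Random(0)            # fixed seed only to make the demo repeatable
stream = list(range(100))         # stand-in for 100 packets
print(deterministic_sample(stream, 10))
print(stratified_sample(stream, 10, rng))
print(estimate_total(deterministic_sample(stream, 10), 10))
```

Deterministic and stratified sampling return a fixed number of packets per interval, while simple random sampling returns a variable number.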
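A minimal sketch of the hashing idea follows. It assumes each monitor can extract the same byte string from a packet, built from fields that do not change from hop to hop; the use of SHA-256 and the names `is_sampled` and `observe` are choices made for this example rather than anything prescribed by the slides.

```python
import hashlib


def is_sampled(invariant_bytes, one_in=100):
    """Decide from the packet content alone whether to sample it.

    invariant_bytes must be identical at every monitoring point,
    so hop-dependent fields such as TTL must be excluded.
    """
    digest = hashlib.sha256(invariant_bytes).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0


def observe(link_packets, one_in=100):
    """Set of packets a monitor on one link would record."""
    return {p for p in link_packets if is_sampled(p, one_in)}


packets = [f"pkt-{i}".encode() for i in range(10000)]
ingress = observe(packets)
egress = observe(packets[:5000])    # only half the packets cross this link
print(len(ingress), egress <= ingress)
```

Because every monitor applies the same function, a packet sampled on one link is sampled on every link it traverses, so its path and per-hop timestamps can be matched up afterwards.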
Trajectory Sampling - Advantages • Easy to obtain metrics on customer performance • Per-customer packet delay • Detect routing loops • Trace denial of service attacks
Summarization • Form compact summaries of large volumes of data • Bloom Filter • Sketches • Other Approaches
Review: Bloom Filters • Given a set S = {x1, x2, x3, …, xn} on a universe U, want to answer queries of the form: is y in S? • Bloom filter provides an answer in • “Constant” time (time to hash). • Small amount of space. • But with some probability of being wrong. • Alternative to hashing with interesting tradeoffs.
Bloom Filters • Start with an m-bit array B, filled with 0s. • Hash each item xj in S k times. If Hi(xj) = a, set B[a] = 1. • To check if y is in S, check B at Hi(y). All k values must be 1. • Possible to have a false positive: all k values are 1, but y is not in S. • Parameters: n items, m = cn bits, k hash functions • [Figure: 16-bit array B, initially all 0s; after insertions B = 0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0]
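The insert and lookup steps can be sketched as a small class. Deriving the k hash functions by salting SHA-256 with the index i is an assumption made for this example; any k reasonably independent hash functions would serve.

```python
import hashlib


class BloomFilter:
    def __init__(self, m, k):
        self.m = m                  # number of bits
        self.k = k                  # number of hash functions
        self.bits = [0] * m

    def _positions(self, item):
        # H_i(item) for i = 0..k-1, simulated by salting one hash with i.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for a in self._positions(item):
            self.bits[a] = 1

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[a] for a in self._positions(item))


bf = BloomFilter(m=1024, k=5)
bf.add("10.0.0.1")
print("10.0.0.1" in bf)
```

A lookup never produces a false negative, because every bit set by `add` stays set.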
False Positive Probability • Pr(specific bit of filter is 0) is $p = (1 - 1/m)^{kn} \approx e^{-kn/m}$ • If r is the fraction of 0 bits in the filter, then the false positive probability is $(1 - r)^k \approx (1 - p)^k \approx (1 - e^{-kn/m})^k$ • Find the optimum at k = (ln 2)m/n by calculus. • So the optimal FPP is about $(0.6185)^{m/n}$
Example • m/n = 8 • Optimal k = 8 ln 2 = 5.545...
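To make the m/n = 8 example concrete, the sketch below evaluates the standard approximation for the false positive probability, (1 - e^(-kn/m))^k, at the two integer values of k nearest the optimum. The function names are illustrative assumptions.

```python
import math


def false_positive_rate(bits_per_item, k):
    """Approximate FPP (1 - e^(-k n / m))^k, with bits_per_item = m / n."""
    return (1.0 - math.exp(-k / bits_per_item)) ** k


def optimal_k(bits_per_item):
    """Real-valued k that minimizes the approximation: (ln 2) * m / n."""
    return math.log(2) * bits_per_item


print(optimal_k(8))
for k in (5, 6):
    print(k, false_positive_rate(8, k))
print(0.6185 ** 8)
```

Since k must be an integer in practice, either 5 or 6 hash functions would be used, and both give a false positive probability a little above the (0.6185)^8 bound.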
Handling Deletions • Bloom filters can handle insertions, but not deletions. • If deleting xi means resetting 1s to 0s, then deleting xi will also “delete” any xj that shares a bit with it. • [Figure: xi and xj hashing to overlapping positions in B = 0 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0]
Counting Bloom Filters • Start with an array B of m counters, filled with 0s. • Hash each item xj in S k times. If Hi(xj) = a, add 1 to B[a]. • To delete xj, decrement the corresponding counters. • Can obtain a corresponding Bloom filter by reducing to 0/1. • [Figure: counter array B, initially all 0s, shown after insertions and deletions with small integer counts in each cell]
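A counting variant can be sketched by replacing the bits with integer counters. As in any such sketch, the salted SHA-256 hashing and the method names are assumptions; Python integers do not overflow, so the counter-width question is not modeled here.

```python
import hashlib


class CountingBloomFilter:
    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.counts = [0] * m       # one counter per position

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for a in self._positions(item):
            self.counts[a] += 1

    def remove(self, item):
        # Only valid for items that were previously added.
        for a in self._positions(item):
            self.counts[a] -= 1

    def __contains__(self, item):
        return all(self.counts[a] > 0 for a in self._positions(item))

    def to_bits(self):
        """Reduce to the corresponding plain Bloom filter bit array."""
        return [1 if c > 0 else 0 for c in self.counts]


cbf = CountingBloomFilter(m=64, k=3)
cbf.add("a")
cbf.add("b")
cbf.remove("a")
print("b" in cbf)
```

Deleting "a" leaves "b" intact even if the two items share a position, because the shared counter only drops from 2 to 1.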
Counting Bloom Filters: Overflow • Must choose counters large enough to avoid overflow. • Poisson approximation suggests 4 bits/counter. • Average load per counter using k = (ln 2)m/n hash functions is ln 2. • Probability a counter has load at least 16: $\approx e^{-\ln 2}(\ln 2)^{16}/16! \approx 6.78 \times 10^{-17}$
Bloom Filters in Networking • Summarizing the contents of web caches to facilitate sharing cache content • Peer-to-Peer Systems • Routing • Active Queue Management • Topology Discovery • An Extension • Multistage Bloom filters for storage of multisets
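The overflow estimate can be checked numerically. The sketch below assumes the counter load is Poisson with mean ln 2 and evaluates both the leading term for a load of exactly 16 and the full tail for loads of 16 or more; the function names are illustrative.

```python
import math


def poisson_pmf(load, mean):
    """Probability that a Poisson(mean) variable equals load."""
    return math.exp(-mean) * mean ** load / math.factorial(load)


def poisson_tail(min_load, mean, terms=60):
    """Probability of a load of at least min_load (truncated sum)."""
    return sum(poisson_pmf(j, mean) for j in range(min_load, min_load + terms))


mean_load = math.log(2)             # average load with the optimal k
print(poisson_pmf(16, mean_load))
print(poisson_tail(16, mean_load))
```

A 4-bit counter overflows only when its load reaches 16, and the tail is dominated by its first term, so the probability is negligible for any realistic filter size.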
Sketches • x: histogram of flow counts observed (a vector of length n) • xi: number of packets observed for flow i • Random lower-dimension projection • For m << n • m × n matrix P • Each column of P has a single 1 in a randomly chosen row • All other entries in the column are 0 • Forming the product Px is equivalent to constructing a single counter set of the multistage filter
Sketches • Dimension-reducing random projection • Linear • Well suited to processing via wavelets • Applications • Traffic compression • Traffic similarity detection • Heavy-hitter estimation • Drawback • The set of keys encoded cannot easily be retrieved directly from the data structure
Summarization approaches • Trie data structure • Constructed based on address prefix • Each node in the trie stores the traffic volume corresponding to all addresses contained in the prefix • Approaches used to count the number of distinct values in a traffic trace • Probabilistic counting • Bitmap algorithms
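The equivalence between the matrix product and a set of hashed counters can be sketched directly. Here the projection is stored as one row index per column (the position of that column's single 1), and `as_matrix` builds the explicit 0/1 matrix only to check the result; all names are assumptions for this example.

```python
import random


def random_projection_rows(m, n, rng):
    """For each of the n columns, pick the row that holds its single 1."""
    return [rng.randrange(m) for _ in range(n)]


def as_matrix(rows, m):
    """Explicit m x n 0/1 matrix with one 1 per column."""
    return [[1 if rows[col] == r else 0 for col in range(len(rows))]
            for r in range(m)]


def project(rows, m, x):
    """Compute the product of the projection matrix and x as m counters."""
    y = [0] * m
    for flow, count in enumerate(x):
        y[rows[flow]] += count      # flow's packets land in one counter
    return y


rng = random.Random(1)
rows = random_projection_rows(m=4, n=10, rng=rng)
x = [3, 0, 7, 1, 0, 0, 2, 9, 4, 5]  # hypothetical per-flow packet counts
print(project(rows, 4, x))
```

Because the projection is linear, the sketch of the sum of two traffic histograms equals the sum of their sketches, which is what allows sketches from different intervals or links to be combined.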
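As a sketch of the prefix-based summary, the code below accumulates byte counts for every prefix that contains each address. For brevity it stores the trie nodes in a dictionary keyed by (prefix, length) rather than as linked nodes, and it assumes IPv4 addresses; the function names and sample traffic are made up for illustration.

```python
import ipaddress
from collections import defaultdict


def prefix_volumes(traffic):
    """traffic: iterable of (IPv4 address string, bytes).

    Returns a map from (prefix as int, prefix length) to total bytes,
    i.e. the volume stored at each node of a binary prefix trie.
    """
    volumes = defaultdict(int)
    for addr, nbytes in traffic:
        value = int(ipaddress.IPv4Address(addr))
        for length in range(33):
            shift = 32 - length
            prefix = (value >> shift) << shift   # zero the host bits
            volumes[(prefix, length)] += nbytes
    return volumes


def volume_of(volumes, cidr):
    net = ipaddress.IPv4Network(cidr)
    return volumes.get((int(net.network_address), net.prefixlen), 0)


traffic = [("10.0.0.1", 100), ("10.0.0.2", 50),
           ("10.1.0.1", 25), ("192.168.1.1", 10)]
vols = prefix_volumes(traffic)
print(volume_of(vols, "10.0.0.0/24"), volume_of(vols, "10.0.0.0/8"))
```

Reading the volume at a short prefix gives a coarse aggregate, and descending to longer prefixes refines it, which is how a report can pick its own granularity.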
Summarization approaches • Approaches that maintain traffic summary for a while • Landmark window model • Sliding window model
Dimensionality Reduction • Approaches for solving the problem of high dimensionality in traffic measurements. • Dimension reduction approaches • Tend to find an alternate representation of data that exposes the true (low-dimensional) structure in the data • Clustering • Principal Component Analysis
Looking at traffic aggregates • Aggregating on individual packet header fields gives useful results but • Traffic reports are not always at the right granularity (e.g. individual IP address, subnet, etc.) • Cannot show aggregates defined over multiple fields (e.g. which network uses which application) • The traffic analysis tool should automatically find aggregates over the right fields at the right granularity • [Figure: header fields available for aggregation (Src. IP, Dest. IP, Src. port, Dest. port, Protocol, Src. net, Dest. net), annotated with “Where does the traffic come from?”, “What apps are used?”, “Which network uses web and which one kazaa?”, “Most traffic goes to the dorms …”]
Ideal traffic report • Web is the dominant application • The library is a heavy user of web • That’s a big flash crowd! • This is a Denial of Service attack!! • This paper is about giving the network administrator insightful traffic reports
Clustering • Similarity metrics • Defined on the set of traffic features • Specific form of Vector Quantization • Challenges • Discovering a set of cluster definitions that succinctly describe the traffic • Search problem in high dimensional space
Principal Component Analysis • Clustering • Nonlinear dimensionality reduction • PCA • Linear • Optimal, in the sense of capturing maximum variability in the data using a minimum number of dimensions
Principal Component Analysis (PCA) • A way of • Identifying “patterns” in data • Expressing the data so as to highlight correlations, i.e. similarities and dissimilarities • Why important? • Patterns in high-dimensional data are hard to visualize • How to take advantage of PCA • Compress data by reducing the number of dimensions, hopefully without losing much information
Background Mathematics • Linear Algebra • Matrix representation of data • Statistics Concepts • Mean: expectation of the data distribution • Covariance: how two dimensions of the data vary together • Build Covariance Matrix (CM) • The covariance matrix describes the correlation between pairs of dimensions • CMij > 0 -> when the ith dimension increases, the jth tends to increase • CMij < 0 -> when the ith dimension increases, the jth tends to decrease • CMij = 0 -> the two dimensions are uncorrelated (which does not by itself imply independence)
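A small sketch of building the covariance matrix from raw records is given below, using the sample covariance (division by n - 1). The function name and the three-dimensional toy data are assumptions for illustration.

```python
def covariance_matrix(rows):
    """rows: list of records, each a sequence of d numeric dimensions.

    Returns the d x d sample covariance matrix as nested lists.
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)]
            for i in range(d)]


# Dimension 1 rises with dimension 0; dimension 2 falls as dimension 0 rises.
data = [(1, 2, 3), (2, 4, 2), (3, 6, 1)]
for row in covariance_matrix(data):
    print(row)
```

The signs of the off-diagonal entries show the direction of the relationship between two dimensions, and the diagonal entries are the per-dimension variances.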
PCA (1) • For any given dataset, PCA finds a new coordinate system that maps maximum variability in the data to a minimum number of coordinates • New axes are called Principal Axes or Components
Principal Component Analysis • Suppose that the original variables X1, X2, . . . , Xm form a coordinate system in m-dimensional space. • Each variable Xi represents an n × 1 vector, where n is the number of records. • The standardized variable Zi is the n × 1 vector Zi = (Xi − µi)/σii, where µi is the mean of Xi and σii is the standard deviation of Xi • In matrix notation: $Z = (V^{1/2})^{-1}(X - \mu)$, where $V^{1/2}$ is a diagonal matrix (nonzero entries only on the diagonal) holding the standard deviations σii
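The standardization step, followed by extraction of the first principal axis, can be sketched in plain Python. Power iteration is used here only as one simple way to find the dominant eigenvector; it is not claimed to be the method the slides had in mind, and the function names and data are assumptions.

```python
import math
import statistics


def standardize(columns):
    """columns: list of m variables, each a list of n values. Returns the Z_i."""
    z = []
    for x in columns:
        mu = statistics.mean(x)
        sigma = statistics.stdev(x)          # sample standard deviation
        z.append([(v - mu) / sigma for v in x])
    return z


def correlation_matrix(z):
    """Covariance of standardized variables, i.e. the correlation matrix."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


def first_component(matrix, iterations=200):
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    size = len(matrix)
    v = [1.0] + [0.5] * (size - 1)           # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


columns = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10.5]]   # two strongly related variables
z = standardize(columns)
corr = correlation_matrix(z)
value, axis = first_component(corr)
print(value, axis)
```

With two strongly correlated variables, nearly all of the variance falls along a single axis, so one coordinate captures almost everything the original two did.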