Towards a Packet Classification Benchmark ARL Current Research Talk 20 October 2003
Packet Classification Example • Data services: Reserved bandwidth, AES security, VLANs • Multi-Service Routers: Filter databases updated manually or automatically based on service agreements; Services applied based on classification results • Example query (figure callout): Packet from 12.34.244.1 going to 168.92.44.32 using TCP from port 1200 to port 1450 • Result at one router: Encrypt packet using AES; Send copy of header to usage accounting with userID 110; Transmit packet on port 5 • Result at the other router: Decrypt all packets using AES; Transmit packet on port 3
Formal Problem Statement • Given a packet P containing fields P_j and a collection of filters F with each filter F_i containing fields F_i,j, select the highest priority exclusive filter and the k highest priority non-exclusive filters such that, for each selected filter F_i: for all j, F_i,j matches P_j • Performance tradeoffs commonly characterized by the point location problem in computational geometry • For n regions defined in j dimensions, j > 3, a point may be located in multi-dimensional space in O(log n) time with O(n^j) space, or in O(log^(j-1) n) time with O(n) space • [Figure: example with n = 13, j = 2; a packet header maps to a point in 2-D space with Source Address and Destination Address axes]
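The matching semantics above can be made concrete with a brute-force reference sketch (field layout and names are hypothetical, and lower priority numbers are assumed to mean higher priority; none of the optimized schemes discussed later are used here):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Filter:
    priority: int                   # lower number = higher priority (assumption)
    exclusive: bool
    src_prefix: Tuple[int, int]     # (value, prefix length) over a 32-bit address
    dst_prefix: Tuple[int, int]
    src_ports: Tuple[int, int]      # inclusive range [lo, hi]
    dst_ports: Tuple[int, int]
    protocol: Optional[int] = None  # None = wildcard

def prefix_match(addr: int, prefix: Tuple[int, int]) -> bool:
    value, length = prefix
    return length == 0 or (addr >> (32 - length)) == (value >> (32 - length))

def matches(f: Filter, pkt) -> bool:
    src, dst, sport, dport, proto = pkt
    return (prefix_match(src, f.src_prefix) and prefix_match(dst, f.dst_prefix)
            and f.src_ports[0] <= sport <= f.src_ports[1]
            and f.dst_ports[0] <= dport <= f.dst_ports[1]
            and (f.protocol is None or f.protocol == proto))

def classify(filters, pkt, k=1):
    """Highest-priority exclusive filter plus the k highest-priority
    non-exclusive filters whose every field matches the packet."""
    hits = sorted((f for f in filters if matches(f, pkt)), key=lambda f: f.priority)
    exclusive = next((f for f in hits if f.exclusive), None)
    return exclusive, [f for f in hits if not f.exclusive][:k]
```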
Motivation for a Benchmark • No benchmark currently exists in industry or the research community • Performance of the two most effective packet classification solutions depends on the composition of filters in the filter set • TCAM capacity depends on port range specifications • Range conversion to prefixes may cause a single filter to occupy [2(w-1)]^k TCAM slots (900 slots in the worst case for TCP & UDP source/destination ports) • w = number of bits required to represent a point in the range • k = number of fields specified by ranges • Observed expansion factors range from 40% to 520% • Fastest algorithms leverage heuristics and optimize average performance • Cutting algorithms (E-TCAMs, HiCuts, HyperCuts) • Tuple-space algorithms • Plethora of new packet classification products: network processors, packet processors, traffic managers, TCAMs • Intel, IBM, Silicon Access, Mosaid, IDT (Solidium), SiberCore, Cypress, etc.
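The expansion factor above comes from splitting an arbitrary range into prefixes. A minimal sketch of the standard greedy conversion (not tied to any particular TCAM vendor's tooling) illustrates the [2(w-1)]^k worst case:

```python
def range_to_prefixes(lo, hi, width=16):
    """Cover the inclusive range [lo, hi] with aligned power-of-two blocks,
    each expressible as a single prefix/mask TCAM entry."""
    prefixes = []
    while lo <= hi:
        # Largest block aligned at lo (limited by lo's lowest set bit) ...
        size = lo & -lo if lo else 1 << width
        # ... that still fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        prefixes.append((lo, width - (size.bit_length() - 1)))  # (value, prefix length)
        lo += size
    return prefixes

# Worst case for a 16-bit port field: [1, 65534] needs 2(w-1) = 30 prefixes, so a
# filter with such a range on both source and dest ports occupies 30*30 = 900 entries.
print(len(range_to_prefixes(1, 65534)))   # -> 30
```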
Motivation for a Benchmark (2) • Security and confidentiality concerns limit access to “real” databases for study and performance evaluation • Well-connected researchers have gained access but are unable to share • Lack of large “real” databases due to limited deployment of high-performance packet classification solutions • Performance evaluations with “real” databases are limited by the size and structure of the samples • Goal: develop a benchmark capable of capturing relevant characteristics of “real” databases while providing structured mechanisms for augmenting database composition and analyzing performance effects • Should have value for three distinct communities: researchers, product vendors, product consumers
Related Work • IETF Benchmarking Working Group (BMWG) developed benchmark methodologies for Forwarding Information Base (FIB) routers and firewalls • FIB focuses on performance evaluation of routers at transmission interfaces • Firewall methodology is a high-level testing methodology with no detailed recommendations of filter composition • Network Processing Forum has a benchmarking initiative • Produced IP lookup and switch fabric benchmarks • Thus far, only IBM and Intel have published results for IP lookup • No details or announcements re: packet classification • Performance evaluation by researchers • Most randomly select prefixes from forwarding tables and use existing protocol, port range combinations • Baboescu & Varghese added refinements for controlling the number of zero-length prefixes and prefix nesting
Related Work (2) • Woo [Infocom 2000] provided strong motivation for a benchmark • Provided a high-level overview of filter composition for various environments • ISP Peering Router, ISP Core Router, Enterprise Edge Router, etc. • Generated large synthetic databases but provided few details regarding database construction • No mechanisms for varying filter composition
Understanding Filter Composition • Most complex packet filters typically appear in firewall and edge router filter sets • Heterogeneous applications: network address translation (NAT), virtual private networks (VPNs), and resource reservation • Firewall filters are created manually by a system admin using standard tools such as Cisco Firewall MC • Model of filter construction: specify communicating subnets, specify application (or set of applications) • TCP and UDP identify applications via 16-bit port numbers • Provide services to unknown clients via “contact ports” in the range of well-known (or system) ports assigned by IANA • Since 1993, the system port range is [0:1023] • Established sessions typically use a unique port in the ephemeral port range [1024:65535] • IANA manages a list of user registered ports in the range [1024:49151] • Limited number of protocols in use, dominated by TCP and UDP
Analyzing Database Structure • Engaged in an iterative process of analyses in order to identify useful metrics • Accurately capture database structure • Goal: identify methods and metrics useful for constructing synthetic databases • Defined new metrics • Joint address prefix length distributions • Scope: metric used to assess the specificity of filters on a logarithmic scale • Skew: metric used to assess the number of subnets covered by a given filter set • Quantifies branching in the binary tree representation of address prefixes
Scope Definition • From a geometric perspective, a filter defines a region in 5-d space • Volume of the region is the product of the 1-d “lengths” specified by the filter fields • e.g. Number of addresses covered by source address prefix • Points in 5-d space correspond to packet headers • Filter properties are commonly defined as a tuple specification, or a vector with fields: • t[0], source address prefix length, [0…32] • t[1], destination address prefix length, [0…32] • t[2], source port range width, [0…2^16] • t[3], destination port range width, [0…2^16] • t[4], protocol specification, Boolean [specified, not specified]
Scope Distributions • Scope distribution characterizes the specificity of filters in the database • Exact match filters have scope = 0 • Default filters have scope = 104 • Notable “spikes” near low end of distribution • Wide variance
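The endpoints quoted above (scope 0 for an exact-match filter, 104 for the fully wildcarded default filter) are consistent with reading scope as the log2 of the 5-d volume from the previous slide, i.e. the sum of the log2 “lengths” of the five fields. A sketch under that assumption:

```python
import math

def scope(t):
    """t = (src_prefix_len, dst_prefix_len, src_range_width, dst_range_width,
            protocol_specified), as on the 'Scope Definition' slide.
    Assumed reading: scope = lg of the number of headers the filter covers."""
    src_len, dst_len, src_width, dst_width, proto_specified = t
    return ((32 - src_len) + (32 - dst_len)
            + math.log2(max(src_width, 1)) + math.log2(max(dst_width, 1))
            + (0 if proto_specified else 8))

print(scope((32, 32, 1, 1, True)))         # exact-match filter -> 0
print(scope((0, 0, 2**16, 2**16, False)))  # default filter -> 104.0
```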
Joint Prefix Length Distributions • Observe large spikes in the joint distribution along the “edges” • Unlike forwarding tables, /0 and /32 prefixes are common in prefix length pairs • Strong motivation for capturing the joint distribution • Observe a correlation with port range specifications (not shown)
Joint Prefix Length Distributions (2) • For synthetic database generation, we want to: • Select a prefix length pair based on total prefix length • Total length specified by diagonals in joint distribution • Allow distribution to be modified • Represent joint distribution by a collection of 1d distributions • Build a total length distribution [0…64] • bin = sum of prefix lengths • For each non-empty bin in total length distribution, build a source length distribution for the prefix pairs in the bin • (destination address prefix length) = (total length) – (source address prefix length) • Allows for high-level input parameter for address scope adjustment
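A minimal sketch of the two-stage selection described above, with a hypothetical parameter-file fragment standing in for real seed data:

```python
import random

def sample_prefix_pair(total_len_dist, src_len_dists):
    """total_len_dist: {total_length: weight}, total = src_len + dst_len, 0..64
    src_len_dists:     {total_length: {src_length: weight}} for non-empty bins"""
    totals, weights = zip(*total_len_dist.items())
    total = random.choices(totals, weights=weights)[0]
    src_lens, src_weights = zip(*src_len_dists[total].items())
    src_len = random.choices(src_lens, weights=src_weights)[0]
    return src_len, total - src_len   # (source length, destination length)

# Hypothetical fragment: most filters fully specify both addresses.
total_len_dist = {64: 0.7, 48: 0.2, 32: 0.1}
src_len_dists = {64: {32: 1.0}, 48: {16: 0.5, 32: 0.5}, 32: {0: 0.5, 32: 0.5}}
print(sample_prefix_pair(total_len_dist, src_len_dists))
```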
Skew Definition • Want a high-level characterization of address space coverage by filters (also want to anonymize IP addresses) • A complete statistical model is infeasible • Imagine a binary tree with a branching probability for each node • Employ a suitable approximation that captures important characteristics such as prefix containment • Build two binary trees from the source and destination address prefixes in the filters • At each node, define the weight of the left child and right child as the number of filters specifying a prefix reached by taking the left child and right child, respectively • Let heavy = max[weight of left child, weight of right child] • Let light = min[weight of left child, weight of right child] • Skew = 1 - (light / heavy), so evenly weighted children give skew near 0 and a single-path node gives skew 1
Skew Distributions • For each level in the tree, compute the average skew for the nodes at that level • Low skew ⇒ evenly “weighted” children, doubling of address space coverage • High skew ⇒ asymmetrically “weighted” children, containment of address space coverage • Skew = 1 means a node has a single path
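A sketch of the per-level skew computation as described on these two slides; the skew = 1 - light/heavy form is an assumption consistent with skew 0 for evenly weighted children and skew 1 for a single-path node:

```python
from collections import defaultdict

def average_skew_per_level(prefixes, width=32):
    """prefixes: list of (value, length) address prefixes taken from the filters.
    Returns {level: average skew over nodes at that level that have children}."""
    weight = defaultdict(int)    # weight[(level, path)] = filters whose prefix
    for value, length in prefixes:                 # passes through or ends at node
        for level in range(length + 1):
            weight[(level, value >> (width - level))] += 1

    skews = defaultdict(list)
    for (level, path), _ in weight.items():
        left = weight.get((level + 1, path << 1), 0)
        right = weight.get((level + 1, (path << 1) | 1), 0)
        if left or right:
            heavy, light = max(left, right), min(left, right)
            skews[level].append(1 - light / heavy)  # assumed skew definition
    return {lvl: sum(v) / len(v) for lvl, v in sorted(skews.items())}

# Two /2 prefixes under the same /1: the root is skewed (1.0), its child balanced (0.0).
print(average_skew_per_level([(0b00 << 30, 2), (0b01 << 30, 2)]))
```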
Designing a Flexible Benchmark • Provide mechanism for defining database structure • Structure could be based on analysis of seed databases • Construct a set of benchmark database structures to use as a departure point for performance evaluation • Provide high-level controls for augmenting database structure • Observe effects on search and capacity performance • Scale the database while preventing redundant filters • Adjust the specificity or scope of filters • Introduce “entropy” into the database • A structured mechanism for straying from the database structure • Difficult to provide meaningful adjustments for application specifications (protocol, port ranges)
Parameter Files • Defines the general database structure via the requisite statistics • May be extracted from seed databases using an analysis tool • Goal: compile a set of benchmark parameter files that characterize various packet classification application environments (as proposed by Woo) • Protocol and port pair class distribution • Distribution of protocol specifications • For each protocol, specify a port pair class distribution for filters specifying the given protocol • Port pair class defines the structure of port range pairs • 25 port pair classes: all possible ordered pairs of five port classes • WC = [0:65535], WR1 = [0:1023], WR2 = [1024:65535], AR (arbitrary range), EM (exact match) • Port range distributions • Arbitrary range and exact port distributions • Limited set of arbitrary ranges observed in real databases
Parameter Files (2) • Joint prefix length distributions for each “port pair class” • 25 distributions, each containing a total length distribution and the associated source address prefix length distributions • Preserves correlation between port pair class and prefix length pairs in directional filters • Address skew distributions for source and destination addresses • Source/destination prefix “correlation” distribution • Specifies the “distance” between communicating subnets specified by filter • Probability that the address prefixes of a filter continue to be identical at a given prefix length • Consider a filter with address prefix length pair (16,25) • Consider walking the source and destination address prefix trees in parallel • Assume that the prefixes are identical for the first 8 bits • The “correlation” probability at level 9 specifies the probability that the next bit in the prefixes will be the same • Once prefixes diverge or prefix length is reached, the distribution is irrelevant
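A sketch of generating one source/destination prefix pair under the correlation model described above (per-level correlation probabilities and the bit-selection rule are simplified placeholders; the real tool also applies the skew distributions when choosing bits):

```python
import random

def gen_correlated_prefix_pair(src_len, dst_len, corr):
    """corr[level] = probability that bit `level` of the two prefixes is still
    identical, given that all earlier bits were identical (levels start at 1)."""
    src_bits, dst_bits, identical = [], [], True
    for level in range(1, max(src_len, dst_len) + 1):
        s = random.getrandbits(1) if level <= src_len else None
        d = random.getrandbits(1) if level <= dst_len else None
        if identical and s is not None and d is not None:
            if random.random() < corr.get(level, 0.0):
                d = s                # prefixes remain identical at this level
            else:
                d = 1 - s            # prefixes diverge; correlation no longer applies
                identical = False
        if s is not None:
            src_bits.append(s)
        if d is not None:
            dst_bits.append(d)
    return src_bits, dst_bits

# E.g. a (16, 25) filter whose prefixes are certain to agree for the first 8 bits:
corr = {lvl: 1.0 for lvl in range(1, 9)}
corr[9] = 0.5
print(gen_correlated_prefix_pair(16, 25, corr))
```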
Synthetic Database Generator • Reads in parameter file • Trivial option to generate a completely random filter database • Takes three high-level input parameters • size = target size for synthetic database • Resulting size may be less than target • Tool generates filters using statistical model then post-processes database to remove redundant filters • Favorable for assessing scalability of parameter files • Smoothing (r) = number of bits by which synthetic filters may stray from points in prefix length pair distribution • Structured “entropy” mechanism for introducing new prefix length pairs • Models aggregation and/or increased flow segregation • Scope (s) = bias to more or less specific filters • Adjusts the shape of the address length distributions without adding or removing bins
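The overall flow implied above, as a sketch with a trivial stand-in for the real statistical model (the real per-filter sampling draws from the parameter-file distributions described on the previous slides):

```python
import random

def generate_filter(params):
    """Placeholder for sampling one filter from the parameter-file model:
    protocol, port pair class, prefix length pair, skewed address bits, ..."""
    return (random.getrandbits(32), random.randrange(33),
            random.getrandbits(32), random.randrange(33),
            random.choice([6, 17, None]))

def generate_database(params, size):
    filters = [generate_filter(params) for _ in range(size)]
    unique = list(dict.fromkeys(filters))   # post-process: remove redundant filters
    return unique                           # resulting size may be less than `size`

print(len(generate_database(params=None, size=1000)))
```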
Understanding Scaling Effects • Readily scale a seed database by 30x to 40x • Larger seed databases provide for larger synthetic databases • rules6 (~1500 filters) is approximately 6x larger than rules1 and rules5 • As the “limit” of the seed parameter file is reached, the average filter scope shifts toward more specific filters
Smoothing Adjustment • Smoothing (r) = number of bits by which synthetic filters may stray from points in prefix length pair distribution • Apply a symmetric binomial spreading to each spike in the joint prefix length distribution • For each joint distribution in parameter file: • Apply binomial spreading to each spike in total length distribution • For each source prefix length distribution: • Apply binomial spreading to each spike in source length distribution • Tricky details like adjusting the width of the source spreading as you move away from the original spike • Truncate and normalize distribution to allow for spreading of spikes at the edges • Let k = 2r
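A sketch of the binomial spreading applied to a single 1-D distribution (spike weight redistributed over [center - r, center + r] with B(2r, 0.5) weights, truncated at the ends of the index range and renormalized so no mass is lost):

```python
from math import comb

def smooth(dist, r):
    """dist: list of weights indexed by prefix length (or total length)."""
    if r == 0:
        return dist[:]
    n, k = len(dist), 2 * r
    binom = [comb(k, i) / 2 ** k for i in range(k + 1)]   # symmetric over [-r, +r]
    out = [0.0] * n
    for center, w in enumerate(dist):
        if w == 0:
            continue
        spread = {}
        for i, b in enumerate(binom):
            idx = center + i - r
            if 0 <= idx < n:
                spread[idx] = spread.get(idx, 0.0) + b
        total = sum(spread.values())          # truncation drops mass at the edges...
        for idx, b in spread.items():
            out[idx] += w * b / total         # ...so renormalize each spike
    return out

dist = [0.0] * 33
dist[16] = 1.0                                # single spike at prefix length 16
print([round(x, 3) for x in smooth(dist, 4)][12:21])
```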
Smoothing Example: Single Spike • All prefix lengths are 16 bits • Database target size = 64,000 filters • No scope adjustment, s = 0 • Generate databases for various values of the smoothing adjustment, r • [Figure: (a) r = 0, (b) r = 0, top view]
Single Spike with r = 8 • r = 8 ⇒ maximum Manhattan “distance” of 8 from the original spike • Observe symmetric binomial distribution across total prefix length (diagonal) and source prefix length • [Figure: (a) r = 8, (b) r = 8, top view]
Single Spike with r = 32 • r = 32 ⇒ maximum Manhattan “distance” of 32 from the original spike • Observe symmetric binomial distribution across total prefix length (diagonal) and source prefix length • [Figure: (a) r = 32, (b) r = 32, top view]
Smoothing with Seed Parameter File • r = 16 • Appears to be the sensible limit to smoothing for real databases • Spreading is cumulative, adjacent spikes may spread into each other creating new dominant spikes
Understanding Smoothing Effects • High sensitivity for small values of smoothing adjustment, r • Believe that this is due to dominance of spikes at the “more specific” edges of the joint distributions in seed databases • Truncation causes a slight drift to a larger average scope
Smoothing: Contrived Distributions • Constructed two contrived distributions to verify hypothesis • Spikes = all joint distributions have two points (0,0) and (32,32) • Uniform = uniform total length distribution • Observed identical drift for spikes distribution and no drift for uniform distribution
Scope Adjustment • Scope (s) = bias to more or less specific filters, [-1:1] • Adjusts the shape of the address length distributions without adding or removing bins • s > 0 : decrease scope, increase specificity (prefix length) • s < 0 : increase scope, decrease specificity (prefix length) • Utilize a bias function on the random number used to select from the cumulative distributions • Bias function computes the area under a line whose slope is defined by s • Prevents laborious recomputation of each prefix length distribution • [Figure: bias functions for s = -1, s = 0, and s = 1]
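One consistent reading of the bias function: draw a uniform u, then map it through the cumulative area under a line whose slope is proportional to s before indexing the cumulative prefix-length distribution. The sign convention below (s > 0 pushes weight toward the long-prefix end, assuming the CDF is ordered from shortest to longest prefix) and the exact linear form are assumptions; the slide only fixes the endpoints and the s = 0 identity:

```python
import bisect
import random

def biased(u, s):
    """Area under a line of slope -2s through (0.5, 1), evaluated at u.
    s = 0 is the identity; s = 1 maps 0.5 -> 0.75; s = -1 maps 0.5 -> 0.25."""
    return u + s * u * (1 - u)

def sample_length(lengths, cdf, s=0.0):
    """lengths: bins in increasing prefix-length order; cdf: their cumulative weights.
    Biasing the lookup value reshapes the selection without recomputing any
    distribution or adding/removing bins."""
    u = biased(random.random(), s)
    return lengths[min(bisect.bisect_left(cdf, u), len(lengths) - 1)]

# E.g. a three-bin length distribution; s = 1 shifts picks toward length 32.
lengths, cdf = [0, 24, 32], [0.2, 0.6, 1.0]
print(sample_length(lengths, cdf, s=1.0))
```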
Scope Example: Uniform Distribution • Uniform distribution, r = 0, s = 1 • Weight is pushed to more specific address prefixes
Scope: Contrived Distributions • Maximum bias of ~12 bits longer or shorter in total prefix length • Provides for a 4096x increase or decrease in the average coverage of the filters in the database • As expected, negligible difference between the two distributions • No change in bins, only a shift in weight
Scope: Real Distributions • Observed maximum bias of ~6 bits longer or shorter in total prefix length • Provides for a 64x increase or decrease in the average coverage of the filters in the database • Sensitivity is dependent upon the parameter file
Synthetic Database Generation Summary • Solid foundation for a packet classification benchmark • May be beneficial to have a high-level skew adjustment or skew compensation coupled with scaling • Allow more branching for larger databases • Need more sample databases from other application environments in order to compile benchmark suite of parameter files • Alternately, formulate parameter files manually from more detailed extensions of Woo’s descriptions
Trace Generator • Problem: given a filter database, construct an input trace of packet headers that query the database at all “interesting” points and an associated output trace of best-matching (or all-matching) filters for each packet header • We can define “interesting” in various ways… • A point in each 5-d polyhedron formed by the intersections of the 5-d rectangles specified by the filters in the database (optimal solution) • Appears to be an O((n log n)^5) problem using fancy data structures • Optimizations may exist and amortized performance may be better • A random selection of points (least favorable solution) • A pseudo-random selection of points (most feasible solution?) • For each filter, choose a few random points covered by the filter • Might be able to develop some heuristics to choose points that are and are not likely to be overlapped by other filters • Post-process the input trace in order to generate the output trace • Could feed back results of the post-processing step in order to choose points for filters not appearing in the output trace
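A sketch of the pseudo-random option above: a few random points inside each filter's 5-d rectangle form the input trace, and a brute-force post-processing pass over the full database produces the output trace. The filter layout is hypothetical, and `matches(filter, header)` is any full 5-field matcher for that representation (e.g. along the lines of the problem-statement sketch):

```python
import random

def random_covered_header(f):
    """f = (src_val, src_len, dst_val, dst_len, (sp_lo, sp_hi), (dp_lo, dp_hi), proto);
    prefix values are assumed to have zero bits beyond the prefix length."""
    src_val, src_len, dst_val, dst_len, (sp_lo, sp_hi), (dp_lo, dp_hi), proto = f
    src = src_val | random.randrange(1 << (32 - src_len))
    dst = dst_val | random.randrange(1 << (32 - dst_len))
    protocol = proto if proto is not None else random.choice([1, 6, 17])
    return (src, dst, random.randint(sp_lo, sp_hi), random.randint(dp_lo, dp_hi), protocol)

def build_traces(filters, matches, points_per_filter=4):
    """Input trace: pseudo-random headers covered by each filter.
    Output trace: indices of all filters matching each header (brute force)."""
    headers = [random_covered_header(f) for f in filters for _ in range(points_per_filter)]
    output = [[i for i, f in enumerate(filters) if matches(f, h)] for h in headers]
    return headers, output
```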
The next step… • Finalize trace generator design, implement, and analyze (if necessary) • Run several packet classification algorithms through the benchmark • Use results to refine tools and develop benchmarking methodology that extracts salient features • Investigate ways to generate broad interest in the benchmark • Publication • Web-based scripts • Pitch to the IETF • Comments, critiques, suggestions, questions?