HEXA: Compact Data Structures for Faster Packet Processing
Sailesh Kumar, Jonathan Turner, Patrick Crowley, Michael Mitzenmacher
HEXA
• HEXA (History-based Encoding, eXecution and Addressing)
• Novel representation for:
  • IP lookup tries (directed acyclic graphs)
  • Simple finite automata such as Aho-Corasick string matchers
• Space efficient
• Challenges the assumption that graph structures must store log2(n)-bit pointers to identify successor nodes
• Requires only 2-bit pointers versus 20-bit pointers (for 1 million nodes)
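The 20-bit figure above follows from the number of bits needed to name one node out of n. A minimal sketch of that arithmetic, where `pointer_bits` is a helper name introduced here for illustration:

```python
def pointer_bits(num_nodes: int) -> int:
    """Bits a conventional pointer needs to name any one of num_nodes nodes."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (num_nodes - 1).bit_length()


conventional = pointer_bits(1_000_000)  # width of a conventional node pointer
hexa = 2                                # width quoted on the slide for HEXA
print(conventional, hexa)
```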
Tries - Traditional Implementation
• Five IP prefixes: 1* (P1), 00* (P2), 11* (P3), 011* (P4), 0100* (P5)
• [Figure: binary trie with nine nodes numbered 1-9; nodes 3, 4, 6, 8 and 9 hold prefixes P1, P2, P3, P4 and P5]
• There are nine nodes, so we need 4-bit node identifiers
• Each trie node requires 9 bits in memory:
  - a flag indicating whether the node is a prefix
  - a 4-bit left child pointer
  - a 4-bit right child pointer
• Total memory = 9 x 9 bits = 81 bits
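The node count and memory figure can be checked with a small sketch that builds the pointer-based trie from the five prefixes. The `Node` class and `build_trie` function are illustrative names, not part of the original slides.

```python
PREFIXES = {"1": "P1", "00": "P2", "11": "P3", "011": "P4", "0100": "P5"}


class Node:
    def __init__(self):
        self.prefix = None          # name of the prefix ending here, if any
        self.child = [None, None]   # [left child (bit 0), right child (bit 1)]


def build_trie(prefixes):
    """Insert every prefix bit by bit; return the root and the node count."""
    root, count = Node(), 1
    for bits, name in prefixes.items():
        node = root
        for b in bits:
            i = int(b)
            if node.child[i] is None:
                node.child[i] = Node()
                count += 1
            node = node.child[i]
        node.prefix = name
    return root, count


root, num_nodes = build_trie(PREFIXES)
pointer_bits = (num_nodes - 1).bit_length()   # bits to name one of the nodes
bits_per_node = 1 + 2 * pointer_bits          # prefix flag + two child pointers
total_bits = num_nodes * bits_per_node
print(num_nodes, pointer_bits, bits_per_node, total_bits)
```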
HEXA based Implementation
• Five IP prefixes: 1* (P1), 00* (P2), 11* (P3), 011* (P4), 0100* (P5)
• [Figure: the same nine-node trie, with edges labeled 0 (left) and 1 (right)]
• Define the HEXA identifier of a node as the path that leads to it from the root
• HEXA identifiers of nodes 1-9: 1. - 2. 0 3. 1 4. 00 5. 01 6. 11 7. 010 8. 011 9. 0100
• Properties of HEXA identifiers:
  - Unique for every node
  - Implicit (need not be stored)
  - Can replace node pointers
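Because a HEXA identifier is just the path from the root, the nine identifiers can be derived directly from the prefixes. The sketch below assumes the root is represented by the empty string (shown as "-" on the slide); `hexa_identifiers` is an illustrative name.

```python
PREFIXES = ["1", "00", "11", "011", "0100"]


def hexa_identifiers(prefixes):
    """Every node of the trie is a leading substring of some prefix."""
    ids = {""}  # the root has the empty path
    for p in prefixes:
        for k in range(1, len(p) + 1):
            ids.add(p[:k])
    # Order by depth, then left before right, to match the slide's numbering.
    return sorted(ids, key=lambda s: (len(s), s))


ids = hexa_identifiers(PREFIXES)
for number, ident in enumerate(ids, start=1):
    print(number, ident or "-")
```

Since each identifier is recomputed from the bits consumed so far during a lookup, none of these strings has to be stored in memory.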
HEXA based Implementation
• Hash (HEXA identifier) = memory address
• If we have a minimal perfect hash function f
  - A function that maps elements to unique locations
• Then we can store the trie at the following addresses: f(-) = 4, f(0) = 7, f(1) = 9, f(00) = 2, f(01) = 8, f(11) = 1, f(010) = 5, f(011) = 3, f(0100) = 6
• Example lookup for IP addr. 1 1 0 0 x x x: begin the lookup at the root node and follow the address bits until the prefix we were looking for is reached
• We use only 3 bits per node in the fast path:
  - Valid prefix flag
  - Left child flag
  - Right child flag
• Properties of HEXA identifiers:
  - Unique for every node
  - Implicit (need not be stored)
  - Can act as memory address
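A minimal sketch of this lookup, using the f values listed on the slide. The dictionary `F` stands in for the minimal perfect hash function (a real implementation would compute f rather than look it up), and `memory` and `lookup` are names chosen here for illustration.

```python
# Stand-in for the slide's minimal perfect hash f; "" is the root ("-").
F = {"": 4, "0": 7, "1": 9, "00": 2, "01": 8, "11": 1,
     "010": 5, "011": 3, "0100": 6}
PREFIXES = {"1", "00", "11", "011", "0100"}

# memory[address] = (valid prefix flag, left child flag, right child flag)
memory = {}
for ident, addr in F.items():
    memory[addr] = (ident in PREFIXES, ident + "0" in F, ident + "1" in F)


def lookup(addr_bits):
    """Return the HEXA identifier of the longest matching prefix, or None."""
    path, best = "", None          # path = bits consumed so far (the identifier)
    for i in range(len(addr_bits) + 1):
        is_prefix, has_left, has_right = memory[F[path]]
        if is_prefix:
            best = path
        if i == len(addr_bits):
            break                  # no address bits left to consume
        bit = addr_bits[i]
        if not (has_right if bit == "1" else has_left):
            break                  # no child in that direction
        path += bit
    return best


print(lookup("1100"))  # walks -, 1, 11 and stops: longest match is 11* (P3)
```

The walk never reads a pointer: the next address is recomputed from the path each time, which is why three flag bits per node are enough.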
Devising One-to-one Mapping
• Finding a minimal perfect hash function is difficult
• One-to-one mapping is essential for HEXA to work
• Use discriminator bits
  • Attach c bits that we can modify to every HEXA identifier
  • Thus a node can have 2^c choices of identifiers
  • We now need to store these c bits for every child instead of a single flag
• With multiple choices of HEXA identifiers for a node, the problem reduces to bipartite graph matching
  • We need to find a perfect matching in the graph to map nodes to unique memory locations
Devising One-to-one Mapping
• Use 2-bit discriminators
• Each node has four choices of HEXA identifier (discriminator followed by the input labels), and each choice hashes to a choice of memory location; for example:
  - Node 1 (-): h(00) = 0, h(01) = 4, h(10) = 1, h(11) = 5
  - Node 2 (0): h(000) = 1, h(010) = 5, h(100) = 2, h(110) = 6
• [Figure: bipartite graph connecting nodes 1-9 (identifiers -, 0, 1, 00, 01, 11, 010, 011, 0100) to memory locations 0-8 through their four choices]
• PERFECT MATCHING: pick appropriate discriminators so that every node maps to a unique memory location
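The matching step can be sketched with a standard augmenting-path bipartite matching. The slides do not say which matching algorithm was used, so this is one possible choice. Everything specific here is an assumption made for illustration: `toy_hash` is a made-up hash, the table has 10 locations, and only discriminator values 1-3 are tried (the Results slide reserves 00 for "no child").

```python
IDENTS = ["", "0", "1", "00", "01", "11", "010", "011", "0100"]
NUM_SLOTS = 10                 # assumed: nine nodes plus one spare location
DISCRIMINATORS = (1, 2, 3)     # assumed: value 0 is kept for "no child"


def toy_hash(disc, ident):
    """Made-up hash of (discriminator, HEXA identifier) to a memory location."""
    code = int("1" + ident, 2)  # leading 1 keeps "0", "00", ... distinct
    return (code + disc) % NUM_SLOTS


def find_matching(idents):
    """Return {identifier: (discriminator, slot)} or None if no perfect matching."""
    owner = {}  # slot -> (identifier, discriminator) currently placed there

    def place(ident, visited):
        for disc in DISCRIMINATORS:
            slot = toy_hash(disc, ident)
            if slot in visited:
                continue
            visited.add(slot)
            # Take a free slot, or evict the owner if it can move elsewhere.
            if slot not in owner or place(owner[slot][0], visited):
                owner[slot] = (ident, disc)
                return True
        return False

    for ident in idents:
        if not place(ident, set()):
            return None
    return {ident: (disc, slot) for slot, (ident, disc) in owner.items()}


matching = find_matching(IDENTS)
for ident in IDENTS:
    disc, slot = matching[ident]
    print(ident or "-", "discriminator", disc, "-> location", slot)
```

The discriminator selected for each node is what its parent would store in place of a child flag.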
HEXA based Implementation
• Store each child's discriminator instead of a single flag for the left and right children
• Here we use only 5 bits per node in the fast path:
  - Valid prefix flag
  - Left discriminator (2 bits)
  - Right discriminator (2 bits)
• HEXA identifiers of nodes 1-9: 1. - 2. 0 3. 1 4. 00 5. 01 6. 11 7. 010 8. 011 9. 0100
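A sketch of the lookup with stored discriminators. The hash `toy_hash`, the 10-location table and the discriminator values in `DISC` are all assumptions chosen for illustration (the values form one valid one-to-one mapping under this toy hash); a discriminator of 0 is taken to mean "no child".

```python
NUM_SLOTS = 10


def toy_hash(disc, ident):
    """Made-up hash of (discriminator, HEXA identifier) to a memory location."""
    return (int("1" + ident, 2) + disc) % NUM_SLOTS


# Assumed discriminator per node, picked so that all locations are distinct.
DISC = {"": 2, "0": 3, "1": 3, "00": 3, "01": 3, "11": 2,
        "010": 1, "011": 3, "0100": 2}
PREFIXES = {"1", "00", "11", "011", "0100"}

# Each entry is 1 + 2 + 2 = 5 bits:
# (valid prefix flag, left child discriminator, right child discriminator)
memory = [None] * NUM_SLOTS
for ident, disc in DISC.items():
    slot = toy_hash(disc, ident)
    assert memory[slot] is None, "mapping must be one-to-one"
    memory[slot] = (ident in PREFIXES,
                    DISC.get(ident + "0", 0),
                    DISC.get(ident + "1", 0))

ROOT_DISC = DISC[""]  # the root's discriminator is kept outside the table


def lookup(addr_bits):
    """Return the HEXA identifier of the longest matching prefix, or None."""
    path, disc, best = "", ROOT_DISC, None
    for i in range(len(addr_bits) + 1):
        is_prefix, left_disc, right_disc = memory[toy_hash(disc, path)]
        if is_prefix:
            best = path
        if i == len(addr_bits):
            break
        bit = addr_bits[i]
        disc = right_disc if bit == "1" else left_disc
        if disc == 0:
            break              # 0 means there is no child in that direction
        path += bit
    return best


print(lookup("1100"))
```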
Results
• 3 choices are sufficient to find a perfect matching (with 10% memory over-provisioning)
• Thus 2-bit discriminators suffice (the value 00 is reserved for no child)
• Significant reduction: 2 bits per node versus log2(n) bits
• Data set: 32 Eatherton tries, each containing 100-120k prefixes
Incremental Updates
• IP table updates are very frequent
• When a node is removed and another added, we must ensure that only a few memory operations are needed
• In the new bipartite graph, a new perfect matching can be found
  • Quickly (O(n 2^c) time in the worst case, typically constant time)
• The new matching is only slightly different from the previous matching
  • Typically around 10 different edges; experimental worst case: 18
• Thus fewer than 18 memory operations are needed for an update
HEXA for Pattern Matching
• HEXA can be used to compress Aho-Corasick string matching automata
  • Directed graph
• In the future, HEXA may become useful for general finite automata
  • Reg-ex acceleration