
A Dynamic Load-Balanced Hashing Scheme for Networking Applications


Presentation Transcript


  1. A Dynamic Load-Balanced Hashing Scheme for Networking Applications Author: N. Sertac Artan, Haowei Yuan, and H. Jonathan Chao Publisher/Conf: IEEE GLOBECOM 2008 Speaker: Chen Deyu Date: 2009.10.14

  2. Background (1/2) Network applications such as IP traceback, route lookup, TCP flow state monitoring, and malware detection often require large data storage resources, fast queries, and frequent updates. Hash tables are traditional data structures that allow large amounts of data to be stored, queried, and updated in a space- and time-efficient manner on average.

  3. Background (2/2) However, in the worst case, which is critical for network applications, hash tables perform poorly. This poor performance is due to hash collisions. To reduce the impact of hash collisions on worst-case performance, the hash table can be modified to store multiple keys, say up to k keys, in a single hash bucket. Unfortunately, even when multiple keys are allowed to be stored in one bucket, occasional overflows cannot be prevented.
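As a background illustration only (not the paper's scheme), here is a minimal sketch of a hash table whose buckets hold up to k keys, with a plain overflow list standing in for the small CAM used later in the talk; the class and attribute names are illustrative assumptions.

```python
# Minimal sketch: buckets hold up to k keys; keys that do not fit anywhere
# go to a small overflow store (the role the CAM plays in the co-processor).

class BucketedHashTable:
    def __init__(self, num_buckets: int, keys_per_bucket: int):
        self.keys_per_bucket = keys_per_bucket
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = []  # stand-in for the on-chip CAM

    def insert(self, key) -> bool:
        bucket = self.buckets[hash(key) % len(self.buckets)]
        if len(bucket) < self.keys_per_bucket:
            bucket.append(key)
            return True
        self.overflow.append(key)  # occasional overflows cannot be prevented
        return False

    def lookup(self, key) -> bool:
        if key in self.overflow:
            return True
        return key in self.buckets[hash(key) % len(self.buckets)]
```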

  4. Co-processor architecture (figure, panels (a) and (b)). The LB hash table reduces the number of overflows compared to the naïve hash table.

  5. Hash Co-Processor Architecture The hash co-processor consists of four parts: (1) Group Counts Table (GCT), (2) Bin-Table (BT), (3) Hash Function, and (4) CAMs.

  6. Hash Co-Processor Architecture The GCT consists of g entries, where each entry GCTi shows the number of keys currently in group i. Both the GCT and the BT are divided into equal-sized segments. Let the number of groups in a single segment of the GCT be Gs and the number of bins in a single segment of the BT be s.
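A minimal data-structure sketch, assuming a straightforward reading of these definitions: per-group counters (GCT), a bin table mapping each bin to a group (BT), a group store standing in for the off-chip hash table, and a small CAM overflow list. The class name, the assumption that a BT entry holds a group index, and the assumption that GCT and BT segments line up are all illustrative choices, not taken from the paper.

```python
from dataclasses import dataclass, field

# Illustrative layout following the slide's notation (assumptions flagged):
#   gct[i]    - number of keys currently in group i (on-chip GCT)
#   bt[j]     - group that bin j currently maps to, or None (on-chip BT)
#   groups[i] - keys stored in group i; stands in for the off-chip hash table,
#               where one group is read or written as a single burst
#   cam       - small overflow store standing in for the CAMs
# Segments are assumed to line up: the s bins of one BT segment map only to
# the Gs groups of the corresponding GCT segment.

@dataclass
class HashCoProcessor:
    g: int          # total number of groups
    num_bins: int   # total number of bins
    Gs: int         # groups per GCT segment
    s: int          # bins per BT segment
    Gmax: int       # maximum number of keys per group
    gct: list = field(init=False)
    bt: list = field(init=False)
    groups: list = field(init=False)
    cam: list = field(default_factory=list)

    def __post_init__(self):
        self.gct = [0] * self.g
        self.bt = [None] * self.num_bins
        self.groups = [[] for _ in range(self.g)]

    def bin_of(self, key) -> int:
        return hash(key) % self.num_bins

    def segment_groups(self, bin_idx: int) -> range:
        seg = bin_idx // self.s
        return range(seg * self.Gs, (seg + 1) * self.Gs)
```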

  7. Three operations in the co-processor • Insertion (includes three algorithms): (1) Non-Search Algorithm (NSA), (2) Single-Search Algorithm (SSA), (3) Double-Search Algorithm (DSA) • Query • Deletion

  8. Insertion - NSA (diagram: a key K1 is hashed to its bin's group, the group counter is checked against GCTi < Gmax, and on overflow the key is stored in the CAM)
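Based only on the decision visible on this slide (check GCTi < Gmax, otherwise overflow to the CAM), and building on the HashCoProcessor sketch above, an NSA-style insert might look like the following; the function name and the default bin-to-group mapping are assumptions for illustration.

```python
# Sketch of a non-search (NSA-style) insert: no group search is performed; the
# bin's current (or an assumed fixed default) group is used, and if that group
# is already full the key overflows to the CAM.

def insert_nsa(hcp: HashCoProcessor, key) -> bool:
    b = hcp.bin_of(key)
    if hcp.bt[b] is None:
        seg_groups = hcp.segment_groups(b)
        hcp.bt[b] = seg_groups[(b % hcp.s) % hcp.Gs]  # assumed fixed mapping
    grp = hcp.bt[b]
    if hcp.gct[grp] < hcp.Gmax:        # GCTi < Gmax ?
        hcp.groups[grp].append(key)
        hcp.gct[grp] += 1
        return True
    hcp.cam.append(key)                # overflow -> CAM
    return False
```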

  9. Insertion - SSA (diagram: a key K2 is hashed to a segment, the group counts of that segment are compared, and the least-loaded group gmin is selected)

  10. Insertion - DSA (diagram: insertion of key K3 into bin b39; the group counts of the candidate groups are compared and the least-loaded group gmin is selected)
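The SSA and DSA diagrams are not fully recoverable from this transcript, but both revolve around selecting the least-loaded group gmin. The sketch below gives one plausible reading of the single-segment group search (SSA), again building on the HashCoProcessor sketch above; it is an interpretation, not the paper's exact algorithm, and a DSA-style variant would add a second search (the performance slide counts both group and bin searches per insertion).

```python
# Interpretation only: an SSA-style insert that, for a bin without a group
# assignment, searches the Gs groups of the bin's segment for the minimally
# loaded group gmin, then inserts as in the NSA sketch.

def insert_ssa(hcp: HashCoProcessor, key) -> bool:
    b = hcp.bin_of(key)
    if hcp.bt[b] is None:
        # Single search over the segment's group counters for gmin.
        hcp.bt[b] = min(hcp.segment_groups(b), key=lambda i: hcp.gct[i])
    grp = hcp.bt[b]
    if hcp.gct[grp] < hcp.Gmax:
        hcp.groups[grp].append(key)
        hcp.gct[grp] += 1
        return True
    hcp.cam.append(key)                # overflows to the CAM remain possible
    return False
```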

  11. Query If the queried key is in the system, it is either in the hash table or in the CAMs. If the key is found in the CAMs, it is returned immediately. Otherwise, the group of the bin to which the key hashes is read as a single burst from the hash table. So, the query operation takes at most one burst access.
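Following the description on this slide (check the CAMs first, then read the bin's whole group in one burst), a query over the same sketch could look like this; as before, the group store is an assumed stand-in for the off-chip hash table.

```python
# Query: check the CAMs first; otherwise read the entire group of the key's
# bin, which corresponds to a single off-chip burst access.

def query(hcp: HashCoProcessor, key) -> bool:
    if key in hcp.cam:
        return True
    grp = hcp.bt[hcp.bin_of(key)]
    if grp is None:
        return False
    return key in hcp.groups[grp]      # one burst: the whole group is read
```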

  12. Deletion To delete a key, the key is first queried. If the key is in the CAMs, it is removed from the CAMs and deletion is completed. If this key is not in the CAMs, the group to which this key belongs is read as a single burst from the off-chip structure. The key is deleted and the remaining keys are written back to the off-chip structure in another burst. So, the delete operation takes at most two burst accesses.
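Likewise, a deletion in the sketch mirrors the two-burst bound stated here: one burst to read the group and one to write it back, modeled below simply as a read-modify-write of the group list.

```python
# Deletion: remove from the CAMs if present; otherwise read the key's group
# (first burst), delete the key, and write the group back (second burst).

def delete(hcp: HashCoProcessor, key) -> bool:
    if key in hcp.cam:
        hcp.cam.remove(key)
        return True
    grp = hcp.bt[hcp.bin_of(key)]
    if grp is None or key not in hcp.groups[grp]:
        return False
    hcp.groups[grp].remove(key)        # read-modify-write of one group
    hcp.gct[grp] -= 1
    return True
```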

  13. ANALYSIS (1/3) • On-chip Memory Consumption: the analysis gives the memory required by one bin, by each entry in the GCT, by the CAM storage, and the resulting on-chip memory requirement per key (the equations themselves are given in the original slide). • n is the number of keys stored in the CAMs and Ks denotes the key size. • C is the hash table capacity. • l is the ratio between the keys stored in the system (i.e., in the hash table and CAMs) at a given time and the hash table capacity.

  14. ANALYSIS (2/3) • Time Complexity for On-chip Search: The main contributor to the on-chip time complexity is the search time of the SSA and DSA. The group search used by both algorithms takes time proportional to Gs, since the search is limited to a single segment and Gs − 1 groups need to be searched. Note that for a hardware implementation, since Gs is small, the group search can be done in parallel using a simple priority encoder.
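As a software stand-in for the hardware search described here, the helper below walks a segment's Gs group counters sequentially (Gs − 1 comparisons); in hardware the same comparison can be applied to all Gs counters at once, with a priority encoder selecting the winner. The function name is an illustrative assumption.

```python
# Software emulation of the segment-wide group search: Gs - 1 sequential
# comparisons here, done in parallel by a priority encoder in hardware.

def find_gmin(gct: list, seg: int, Gs: int) -> int:
    best = seg * Gs
    for i in range(seg * Gs + 1, (seg + 1) * Gs):
        if gct[i] < gct[best]:
            best = i
    return best
```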

  15. ANALYSIS (3/3) • External Memory Accesses (per key, on average): expressions are given for (1) the NSA and (2) the SSA and DSA (the equations themselves are given in the original slide), where W is the number of overflows in a given time period and N is the total number of keys inserted into the system.

  16. Performance (1/4) (figure; parameters: C = 65,536 keys, Gmax = 8, l = 0.8)

  17. Performance (2/4) Performance comparison for the NSA, SSA, and DSA, where GSI and BSI stand for group and bin searches per insertion, respectively, and AI is the number of external memory accesses per insertion.

  18. Performance (3/4)

  19. Performance (4/4)
