Cuckoo the Kicking Bird. Presented By: Ilya Nelkenbaum, Keren Armon. Supervisor: Mr. Yossi Kanizo. 09/03/2011. Motivation. Modern networking systems: Increasing traffic rates. Packet processing at the switching level is essential and in some cases crucial.
Cuckoo the Kicking Bird • Presented By: Ilya Nelkenbaum, Keren Armon • Supervisor: Mr. Yossi Kanizo • 09/03/2011
Motivation • Modern networking systems: • Increasing traffic rates. • Packet processing at the switching level is essential and in some cases crucial. • Memory access time becomes more critical. • Fast memory is very expensive and limited in size. • All of this requires faster and more efficient data structures.
Motivation (2) • Applications can be found in wire-speed communication, high-speed packet processing, large data centers, etc. • Hash-based data structures, and hash tables in particular, are an extremely useful technique for dealing with this type of problem. • Traditional data structures are not efficient enough.
Cuckoo Hashing A new approach for handling collisions.
Cuckoo Hashing • Insert X: H_1(X)=1, H_2(X)=4.
Cuckoo Hashing • Insert Y: H_1(Y)=1, H_2(Y)=7.
Cuckoo Hashing • Find(Y): H_1(Y)=1, H_2(Y)=7. Found!
Cuckoo Hashing – Description • Basic scheme: each element gets d possible locations. • To insert x, check all of x's locations. If one is empty, insert x there. • If all are full, x kicks out an old element y. Then y moves to one of its other locations. • If all of y's locations are full, y kicks out z, and so on, until an empty slot is found.
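The insertion rule above can be sketched in a few lines of Python. This is only an illustration, not the project's code (the project was written in C#). The class name CuckooTable, the salted hash family, the random choice of which element to kick, and the max_kicks cut-off are all assumptions made for the example.

```python
import random


class CuckooTable:
    """Toy cuckoo hash table: one slot per bucket, d candidate buckets per key."""

    def __init__(self, m, d=2, max_kicks=100, seed=0):
        self.m = m                    # number of buckets
        self.d = d                    # number of hash functions
        self.max_kicks = max_kicks    # assumed limit on the kick chain
        self.slots = [None] * m
        self.rng = random.Random(seed)
        # Hypothetical hash family: one random salt per hash function.
        self.salts = [self.rng.getrandbits(32) for _ in range(d)]

    def locations(self, key):
        """The d candidate buckets of a key."""
        return [hash((salt, key)) % self.m for salt in self.salts]

    def find(self, key):
        return any(self.slots[i] == key for i in self.locations(key))

    def insert(self, key):
        """Return (kicks, homeless); homeless is None when every element found a slot."""
        if self.find(key):
            return 0, None
        current, kicks = key, 0
        while True:
            locs = self.locations(current)
            for i in locs:
                if self.slots[i] is None:      # an empty candidate: done
                    self.slots[i] = current
                    return kicks, None
            if kicks == self.max_kicks:        # give up; the caller keeps the element
                return kicks, current
            victim = self.rng.choice(locs)     # all full: kick out a resident
            current, self.slots[victim] = self.slots[victim], current
            kicks += 1
```

Note that the element returned as homeless is not necessarily the one that was passed to insert; it is whichever element was left without a slot when the kick chain stopped.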
Hash Basics • Hash memory includes: • Basic hash parameters: • m – number of buckets. • h – bucket height. • D – number of memory segments. • n – number of elements. • d – number of hash functions. • b – maximum number of kicks.
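To make the parameter list concrete, here is a small Python sketch that groups the six parameters into one record. The class name HashParams and the derived quantities are assumptions: the example assumes that the m buckets are split evenly across the D segments and that each bucket holds h elements, which the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HashParams:
    m: int  # number of buckets
    h: int  # bucket height (elements per bucket)
    D: int  # number of memory segments
    n: int  # number of elements
    d: int  # number of hash functions
    b: int  # maximum number of kicks

    def __post_init__(self):
        if min(self.m, self.h, self.D, self.d) < 1 or self.n < 0 or self.b < 0:
            raise ValueError("m, h, D, d must be >= 1; n and b must be >= 0")
        if self.m % self.D != 0:
            # Assumed layout: buckets are divided evenly among the segments.
            raise ValueError("m must be a multiple of D")

    @property
    def capacity(self):
        """Total number of element cells in the table (assumed m * h)."""
        return self.m * self.h

    @property
    def buckets_per_segment(self):
        return self.m // self.D

    @property
    def load(self):
        """Fraction of cells that would be used if all n elements fit."""
        return self.n / self.capacity
```

For example, HashParams(m=1000, h=1, D=2, n=500, d=2, b=100) describes a table twice the size of the number of elements, with a load of 0.5.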
Objectives • Cuckoo hashing: • Reduce the number of memory accesses. • The number of accesses translates to the number of kicks. • Better memory utilization. • According to mathematical analysis, for a table twice the size of the number of elements, there will be zero elements in the CAM. • Project: • Test the performance of a parallel Cuckoo implementation against a sequential one, in terms of memory accesses, in several system configurations.
Implementation Platform • OOP language: C# (using Microsoft Visual Studio) • OOP • Generic data structures (Queue). • Garbage collector • GUI • Unfamiliar language. • Version control system: • Using the lab facilities (SVN).
Memory Structures • Hash Table: • Memory segment • API to memory • Segments: • For each segment: operation queue • CAM: • Content Addressable Memory
Cuckoo Logic • Implements an abstract Cuckoo scheme. • Is the parent class of: • Naive Cuckoo • NaiveParallel Cuckoo • Parallel Cuckoo • Contains properties: • CAM • Operation queue (filled by simulations) • Hash set – a collection of randomized hash functions • Statistics • Methods: • doQueue (virtual). • Get-statistics methods.
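The class layout above might look roughly like the following Python sketch. It is not the project's C# code. The method names do_queue and get_statistics mirror the slide; the statistics keys and the CountingLogic subclass are hypothetical and exist only to show how a concrete logic plugs into the abstract one.

```python
from abc import ABC, abstractmethod
from collections import deque


class CuckooLogic(ABC):
    """Abstract scheme: shared state plus one overridable execution method."""

    def __init__(self):
        self.cam = []                  # overflow elements (content addressable memory)
        self.queue = deque()           # operation queue, filled by the simulation
        self.stats = {"memory_accesses": 0, "kicks": 0}  # assumed statistic names

    def enqueue(self, op):
        self.queue.append(op)

    @abstractmethod
    def do_queue(self):
        """Execute all queued operations; each concrete logic defines how."""

    def get_statistics(self):
        return dict(self.stats)        # copy, so callers cannot mutate the counters


class CountingLogic(CuckooLogic):
    """Hypothetical placeholder subclass: charges one access per operation."""

    def do_queue(self):
        while self.queue:
            self.queue.popleft()
            self.stats["memory_accesses"] += 1
```

Because do_queue is abstract, CuckooLogic itself cannot be instantiated; only subclasses that supply an execution strategy can.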
Simulation Flow • Input of hash table parameters (including Cuckoo constants). • Generating operations and inserting them into the hash table operations queue. • Executing the operations with the selected Cuckoo Logic. • Extracting flow data and processing it according to the simulation type.
Naive Cuckoo Logic • Implements naive execution of operations: • Get first operation. • Execute sequentially for each hash function of element. • When finished, get next operation. • Methods: • Enqueue • doQueue • addElement • Was implemented first.
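A sequential insertion with access counting could be sketched as follows in Python. This is an illustration under stated assumptions, not the project's implementation: the function name naive_insert is made up, every bucket read and every bucket write is charged as one memory access, and the element to kick is chosen at random.

```python
import random


def naive_insert(slots, key, hash_funcs, max_kicks, rng):
    """Insert key by probing its candidate buckets one after another.

    Returns (accesses, homeless). homeless is None on success, otherwise
    the element left without a slot once max_kicks kicks were used.
    """
    accesses = 0
    current, kicks = key, 0
    while True:
        locs = [f(current) for f in hash_funcs]
        for i in locs:                 # one hash function at a time
            accesses += 1              # assumed cost: one read per bucket examined
            if slots[i] is None:
                slots[i] = current
                accesses += 1          # assumed cost: one write to store the element
                return accesses, None
        if kicks == max_kicks:
            return accesses, current
        victim = rng.choice(locs)      # every candidate is full: kick a resident
        current, slots[victim] = slots[victim], current
        accesses += 1                  # the write that replaces the victim
        kicks += 1
```

With d hash functions, an insertion that finds its last candidate empty costs d reads plus one write, which is the overhead the parallel variants try to reduce.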
Naive Parallel Cuckoo Logic • Implements parallel examination of the different hash functions for each element: • Get first operation. • Request execution of all hash functions simultaneously. • Save the first success and drop all others. • When finished, get next operation. • Methods: • Enqueue • doQueue • addElement • Was implemented second.
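The difference from the sequential logic is in how cost is counted, which the Python sketch below tries to show. It is a cost model, not real concurrency and not the project's code. It assumes that hash function s addresses segment s only (so D = d), that reading all segments together costs one cycle, and that each write costs one cycle; the function name parallel_probe_insert is hypothetical.

```python
import random


def parallel_probe_insert(segments, key, hash_funcs, max_kicks, rng):
    """Probe all candidate buckets of an element in the same cycle.

    segments[s] is the memory addressed by hash_funcs[s] (assumed layout).
    Returns (cycles, homeless); homeless is None on success.
    """
    cycles = 0
    current, kicks = key, 0
    while True:
        locs = [f(current) for f in hash_funcs]
        cycles += 1                               # all segments are read together
        empty = [s for s, i in enumerate(locs) if segments[s][i] is None]
        if empty:
            s = empty[0]                          # keep the first success, drop the rest
            segments[s][locs[s]] = current
            cycles += 1                           # assumed cost: one write
            return cycles, None
        if kicks == max_kicks:
            return cycles, current
        s = rng.randrange(len(locs))              # all full: kick in a random segment
        current, segments[s][locs[s]] = segments[s][locs[s]], current
        cycles += 1                               # the write that replaces the victim
        kicks += 1
```

Under this model a successful insertion without kicks always costs two cycles, regardless of d, whereas the sequential logic pays up to d reads.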
Parallel Cuckoo Logic • Implements parallel execution of different operations: • Treat all segments as a pipelined system. • In each cycle (one memory access), all segments execute an operation. • In case of failure, the operation is transferred to the next segment. • In case of success, a new operation is pulled from the operations queue. • Methods: • AddNewOper • CheckResult • DoQueue • Was implemented last.
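One possible reading of the pipelined logic is sketched below in Python. Several details are assumptions that the slide does not specify: each segment keeps its own small queue of waiting operations, an element kicks only after it has failed in all D segments, the kicked-out element continues in the next segment, and an element whose chain reaches max_kicks goes to the CAM. The names run_pipeline and hash_fn are hypothetical.

```python
from collections import deque


def run_pipeline(keys, D, buckets, max_kicks, hash_fn):
    """Insert keys through D segments that each handle one operation per cycle.

    hash_fn(s, key) gives the bucket of key inside segment s.
    Returns (segments, cam, cycles).
    """
    segments = [[None] * buckets for _ in range(D)]
    seg_queues = [deque() for _ in range(D)]
    pending = deque(keys)
    cam = []
    for q in seg_queues:                       # prime the pipeline: one op per segment
        if pending:
            q.append((pending.popleft(), 0, 0))   # (key, failed tries, kicks so far)
    cycles = 0
    while any(seg_queues):
        cycles += 1
        moves = []                             # transfers take effect next cycle
        for s in range(D):
            if not seg_queues[s]:
                continue
            key, tries, kicks = seg_queues[s].popleft()
            i = hash_fn(s, key)
            if segments[s][i] is None:         # success: store, pull a new operation
                segments[s][i] = key
                if pending:
                    moves.append((s, (pending.popleft(), 0, 0)))
            elif tries + 1 < D:                # failure: hand over to the next segment
                moves.append(((s + 1) % D, (key, tries + 1, kicks)))
            elif kicks < max_kicks:            # failed everywhere: kick the resident
                evicted, segments[s][i] = segments[s][i], key
                moves.append(((s + 1) % D, (evicted, 1, kicks + 1)))
            else:                              # kick limit reached: overflow to CAM
                cam.append(key)
                if pending:
                    moves.append((s, (pending.popleft(), 0, 0)))
        for s, op in moves:
            seg_queues[s].append(op)
    return segments, cam, cycles
```

Since up to D operations make progress in every cycle, the cycle count can approach n / D for a lightly loaded table, which is the kind of saving the pipelined design aims for.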
API • Console API • GUI API
Simulation Classes • Main Class • Simulations • Two main simulation classes: • Fast_simulation – samples the stats data after each operation • Regular_Simulation – samples the stats after all operations were executed. • GUI uses only fast simulation • Constants • Define all constants – m, n, d, D, b, h. • Can be modified.
CAM Load by number of elements • In this type of simulation an insertion scenario is run in which the final number of inserted elements equals the number of buckets (m = 1000). • Panels: D = d = 1; D = d = 2; D = d = 3.
CAM Load by number of kicks In this type of simulation we sweep the limit of the number of kicks allowed and each time insert 1000 elements into the hash table.
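A sweep of this kind can be sketched in Python as follows. This is a simplified stand-in for the project's simulator, with assumptions stated plainly: buckets have height 1, the keys are the integers 0..n-1, hash functions are built from random salts, and the element to kick is picked at random. The function names cam_load and sweep_kick_limit are hypothetical, and no results from the original project are reproduced here.

```python
import random


def cam_load(n, m, d, max_kicks, seed=0):
    """Insert keys 0..n-1 into m single-slot buckets; return the CAM occupancy."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(32) for _ in range(d)]   # assumed hash family
    slots = [None] * m
    cam = 0
    for key in range(n):
        current, kicks = key, 0
        while True:
            locs = [hash((salt, current)) % m for salt in salts]
            free = [i for i in locs if slots[i] is None]
            if free:
                slots[free[0]] = current
                break
            if kicks == max_kicks:        # kick limit reached: element goes to the CAM
                cam += 1
                break
            victim = rng.choice(locs)
            current, slots[victim] = slots[victim], current
            kicks += 1
    return cam


def sweep_kick_limit(limits, n=1000, m=1000, d=2):
    """Map each kick limit b to the CAM occupancy after n insertions."""
    return {b: cam_load(n, m, d, b) for b in limits}
```

Calling sweep_kick_limit(range(0, 101, 10)) yields one CAM occupancy per kick limit, which can then be plotted against b.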
Memory Access by number of elements • In this type of simulation an insertion scenario is executed according to the given parameters. The number of memory accesses is shown as a function of the number of inserted elements. • Panels: D = d = 1; D = d = 2; D = d = 10.
Number of kicks by number of elements • This type of simulation executes the insertion scenario according to the given parameters and reports the number of kicks performed as a function of the number of inserted elements.
Achievements • Reducing the number of memory accesses by ~ x (b=100): • Naive ~= 36944 • NaiveParallel ~= 27556 • Parallel ~= 9330 • By increasing the number of hash functions per element, nearly 98% of the memory is used. Main memory utilization is therefore significantly improved. • Providing a platform for finding the minimum number of kicks that limits the CAM load. This limit can further reduce memory accesses.
Future Development • One of the main goals of the project was to provide modular code for implementing and running different Cuckoo Logics over the same hash table. • Thanks to this modularity, the following features can be added in the future: • Additional Cuckoo Logic approaches. • Additional hash table operations (find and delete). • Mixed-operation scenarios.
Gantt • Ramp up on the problem, algorithm, terminology and theory. • Designing the project and getting to know C#. • Getting the green light from the supervisor. • Implementation of ‘naive’ Cuckoo. • Running simulations and analyzing results. • Iterative improvements. • Implementation of naive parallel Cuckoo. • Running simulations and analyzing results. • Iterative improvements. • Implementation of parallel Cuckoo. • Creating a GUI for configuring simulations. • Creating a GUI for reviewing results. • Project summary.