ADAPTING TREE STRUCTURES FOR PROCESSING WITH SIMD INSTRUCTIONS
Authors: Steffen Zeuch, Frank Huber, Johann-Christoph Freytag
Humboldt-Universität zu Berlin
{zeuchste,huber,freytag}@informatik.hu-berlin.de

Motivation
• B+-Tree: common index structure
• Common node-internal search algorithm:
  • Binary search in O(log₂ n)
• Can we do better? Yes, with SIMD!

Outline
• Background
• Binary Search and SIMD
• Segmented Tree
• Segmented Trie
• Evaluation
• Conclusion

SIMD
• Single Instruction Multiple Data:
  • Available on CPU and GPU
  • Arithmetical, comparison, conversion, logical
[Figure: three example operations: add a constant to a vector, compare two vectors (≥), add two vectors (+)]
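
The three operations named on this slide can be sketched with SSE2 intrinsics. This is a minimal sketch, assuming an x86 CPU and 4 x 32-bit lanes; the `Vec4` type and the helper names are hypothetical and not taken from the slides. SSE2 has no integer "≥" instruction, so the sketch combines "greater than" and "equal".

```cpp
#include <emmintrin.h>  // SSE2 intrinsics; assumes an x86 CPU
#include <array>
#include <cstdint>

using Vec4 = std::array<std::int32_t, 4>;  // hypothetical helper: four 32-bit lanes

// Copy four 32-bit integers from memory into a 128-bit SIMD register.
static __m128i load(const Vec4& v) {
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(v.data()));
}

// Copy a 128-bit SIMD register back to memory.
static Vec4 store(__m128i r) {
    Vec4 out{};
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), r);
    return out;
}

// Add a constant to every lane with one instruction.
Vec4 add_const(const Vec4& v, std::int32_t c) {
    return store(_mm_add_epi32(load(v), _mm_set1_epi32(c)));
}

// Add two vectors lane by lane.
Vec4 add_vectors(const Vec4& a, const Vec4& b) {
    return store(_mm_add_epi32(load(a), load(b)));
}

// Compare two vectors lane by lane: a lane is -1 (all bits set) where
// a >= b and 0 otherwise.
Vec4 compare_ge(const Vec4& a, const Vec4& b) {
    const __m128i va = load(a);
    const __m128i vb = load(b);
    return store(_mm_or_si128(_mm_cmpgt_epi32(va, vb), _mm_cmpeq_epi32(va, vb)));
}
```

The all-ones/all-zeros lane encoding is what the later slides show as -1 and 0.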

Binary Search
[Figure: binary search for search key = 9, shown iteration by iteration; legend: search space, search key, separator, excluded]

Binary Search - Two Separators
[Figure: binary search with two separators for search key = 9, iterations 1 to 3; legend: search space, search key, separator, excluded]
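
Binary Search + SIMD
[Figure: the separators (8, 17) and the replicated search key (9, 9) are held in SIMD registers A and B; the ≥ comparison writes (0, -1) to SIMD register C; legend: search space, search key, separator, excluded]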
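
The comparison step in the figure might look as follows with SSE2. This is a sketch under assumptions: it uses four 32-bit lanes instead of the two shown on the slide, and the function name `compare_with_key` is hypothetical.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics; assumes an x86 CPU
#include <array>
#include <cstdint>

// Compare four separators against one search key in a single step.
// Lane i of the result is -1 where separators[i] >= search_key, else 0.
std::array<std::int32_t, 4> compare_with_key(
        const std::array<std::int32_t, 4>& separators, std::int32_t search_key) {
    // One register holds the separators, the other the replicated key.
    const __m128i seps =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(separators.data()));
    const __m128i keys = _mm_set1_epi32(search_key);

    // SSE2 offers only > and == for integers, so >= is built from both.
    const __m128i ge = _mm_or_si128(_mm_cmpgt_epi32(seps, keys),
                                    _mm_cmpeq_epi32(seps, keys));

    std::array<std::int32_t, 4> lanes{};
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes.data()), ge);
    return lanes;
}
```

With the illustrative separators {8, 17, 25, 40} and search key 9, the first lane is 0 and the remaining lanes are -1, matching the (0, -1) pattern in the figure.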

Problem: SIMD on CPU
• SIMD on the CPU does not support scatter and gather functionality.
[Figure: SIMD load(start position) fills a 4 x 32-bit SIMD register with the consecutive values 8, 9, 10, 11]

Solution: K-ary Search by Schlegel et al.
[Figure: 3-ary search tree (k = 3) and its linearized order for search key = 9; legend: search space, search key, separator, excluded]

Applied K-ary Search
[Figure: k-ary search for search key = 9 on the 3-ary search tree and on its linearized order, iterations 1 to 3; legend: search space, search key, separator, excluded]
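
The linearized order and the search over it can be sketched in scalar C++. This is a sketch under stated assumptions, not the authors' implementation: it uses a breadth-first layout, requires a complete tree with k^h - 1 keys, and the names `linearize` and `kary_search` are hypothetical. A SIMD version would replace the inner comparison loop with one comparison instruction over the k - 1 adjacent separators of a node.

```cpp
#include <cstddef>
#include <vector>

// Place the k-1 separators of each node at its breadth-first slot.
// [lo, lo + size) is the sorted range covered by the subtree rooted at `node`.
static void linearize_subtree(const std::vector<int>& sorted, std::size_t lo,
                              std::size_t size, std::size_t k, std::size_t node,
                              std::vector<int>& out) {
    if (size == 0) return;
    const std::size_t child = (size - (k - 1)) / k;  // keys per child subtree
    for (std::size_t j = 0; j + 1 < k; ++j)
        out[node * (k - 1) + j] = sorted[lo + (j + 1) * (child + 1) - 1];
    for (std::size_t j = 0; j < k; ++j)
        linearize_subtree(sorted, lo + j * (child + 1), child, k,
                          node * k + 1 + j, out);
}

// Reorder a sorted key list (size k^h - 1) into breadth-first k-ary order.
std::vector<int> linearize(const std::vector<int>& sorted, std::size_t k) {
    std::vector<int> out(sorted.size());
    linearize_subtree(sorted, 0, sorted.size(), k, 0, out);
    return out;
}

// Return how many keys are <= key, visiting one node per iteration.
std::size_t kary_search(const std::vector<int>& lin, std::size_t k, int key) {
    std::size_t rank = 0;
    std::size_t node = 0;
    std::size_t subtree = lin.size();  // keys in the subtree rooted at `node`
    while (subtree > 0) {
        const std::size_t child = (subtree - (k - 1)) / k;
        std::size_t c = 0;  // separators of this node that are <= key
        for (std::size_t j = 0; j + 1 < k; ++j)
            if (lin[node * (k - 1) + j] <= key) ++c;
        rank += c * (child + 1);   // skip c child subtrees and c separators
        node = node * k + 1 + c;   // descend into child c
        subtree = child;
    }
    return rank;
}
```

For the sorted keys 1 to 8 and k = 3, `linearize` puts the root separators 3 and 6 first, followed by the three leaf nodes, so each node's separators are adjacent in memory and can be fetched with a single SIMD load.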

Segmented Tree
• Change the inner-node search algorithm from the commonly used binary search to k-ary search.

Problem: Unfilled Nodes
• K-ary requirement: multiple of k-1 keys
[Figure: 3-ary search tree and its linearized order, with Smax+1 in the unfilled positions]
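
One way to meet the "multiple of k-1 keys" requirement is to pad a node with a filler value. The sketch below assumes that the filler is Smax+1, the largest existing key plus one, as the figure label suggests. The function name `pad_keys` is hypothetical, and the sketch does not handle the case where Smax is already the largest representable value.

```cpp
#include <cstddef>
#include <vector>

// Pad a sorted key list until its size is a multiple of k-1.
// Assumption: the filler is Smax+1; overflow of Smax+1 is not handled.
std::vector<int> pad_keys(std::vector<int> sorted, std::size_t k) {
    if (sorted.empty() || k < 2) return sorted;  // nothing to pad against
    const int filler = sorted.back() + 1;        // Smax + 1
    while (sorted.size() % (k - 1) != 0)
        sorted.push_back(filler);
    return sorted;
}
```

Because the filler is larger than every real key, the padded list stays sorted and a search for any existing key is unaffected.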

Reordering
• New keys require reordering:
  • Sorting → Inserting → Linearizing
• Exceptions:
  • Empty node
  • Key is greater than the largest existing key

Segmented Tree
Advantages:
• High resource utilization
• Fewer iterations required
  • Binary search: log₂ n vs. k-ary search: log_k n
Disadvantages:
• Reordering overhead
• Large data types decrease performance
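
The iteration counts can be compared with a small helper. This is a sketch: it counts the levels of a complete k-ary search tree that holds at least n keys, and the name `search_iterations` is hypothetical. The choice k = 5 in the usage note assumes four 32-bit keys per 128-bit SIMD register, which is an assumption and not stated on the slide.

```cpp
#include <cstddef>

// Smallest number of levels h such that a complete k-ary search tree,
// which holds k^h - 1 keys, can store n keys. One level is one iteration.
std::size_t search_iterations(std::size_t n, std::size_t k) {
    std::size_t levels = 0;
    std::size_t capacity = 0;  // k^levels - 1
    while (capacity < n) {
        capacity = capacity * k + (k - 1);  // k^(h+1) - 1 = k*(k^h - 1) + (k-1)
        ++levels;
    }
    return levels;
}
```

For n = 624 keys, `search_iterations(624, 2)` gives 10 levels for binary search, while `search_iterations(624, 5)` gives 4 levels for k = 5.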

Segmented Trie
[Figure: keys shown in decimal and hexadecimal, split into hexadecimal partial keys that are stored on level 1 and level 2 of the trie]
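
Splitting a key into per-level partial keys can be sketched as follows. The 8-bit segment width and the function name `partial_keys` are assumptions made for illustration; the slide only shows that a key is split into hexadecimal partial keys across two levels.

```cpp
#include <cstdint>
#include <vector>

// Split the lowest `key_bits` bits of a key into 8-bit partial keys,
// most significant segment first (level 1, level 2, ...).
// Assumption: key_bits is a multiple of 8 and at most 64.
std::vector<std::uint8_t> partial_keys(std::uint64_t key, unsigned key_bits) {
    std::vector<std::uint8_t> parts;
    for (unsigned shift = key_bits; shift >= 8; shift -= 8)
        parts.push_back(static_cast<std::uint8_t>((key >> (shift - 8)) & 0xFF));
    return parts;
}
```

A 16-bit key such as 0x12AB yields the partial key 0x12 for level 1 and 0xAB for level 2, so keys that share the prefix 0x12 share the same level 1 entry.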

Segmented Trie
Advantages:
• High SIMD search performance
• Prefix compression
• Early termination
Disadvantages:
• Fixed level count
• Reordering overhead

Test Setup
HW/SW Configuration:
• CPU: Intel Xeon 5520, 4 x 2.26 GHz
• L1: 32 KB, L2: 256 KB, L3: 8 MB, main memory: 8 GB
• Cacheline: 128 Byte, SIMD bandwidth: 128 Bit
• Windows 7 64-bit Professional
Test Dataset:
• Synthetically generated, ascending, starting at 0

Evaluation: Bitmask
• Three algorithms:
  • Bit Shifting
  • Case-Switch
  • PopCnt
[Figure: the separators (8, 17) and the replicated search key (9, 9) in SIMD registers A and B; the ≥ comparison writes (0, -1) to SIMD register C]
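
The three ways of interpreting the comparison result might look as follows. This is a sketch under assumptions, not the authors' code: the comparison result is assumed to be already condensed into a 4-bit mask (for example with `_mm_movemask_ps` on x86), where bit i is set when separator i ≥ search key and the separators are sorted. All function names are hypothetical, and `std::bitset::count` serves as a portable stand-in for a hardware population count.

```cpp
#include <bitset>

constexpr int kLanes = 4;  // assumed: four 32-bit lanes per 128-bit register

// Each function returns the index of the first separator >= search key,
// or kLanes if no separator qualifies.

// Bit shifting: shift the mask right until the lowest bit is set.
int position_by_shifting(unsigned mask) {
    int pos = 0;
    while (pos < kLanes && (mask & 1u) == 0) {
        mask >>= 1;
        ++pos;
    }
    return pos;
}

// Case-switch: sorted separators allow only five different masks.
int position_by_switch(unsigned mask) {
    switch (mask) {
        case 0xF: return 0;
        case 0xE: return 1;
        case 0xC: return 2;
        case 0x8: return 3;
        case 0x0: return 4;
        default:  return -1;  // not a valid mask for sorted separators
    }
}

// PopCnt: the number of set bits equals the number of separators >= key.
int position_by_popcount(unsigned mask) {
    return kLanes - static_cast<int>(std::bitset<kLanes>(mask).count());
}
```

All three variants return the same position for a valid mask; they differ only in how the position is derived.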

Our Contributions
• B+-Tree and prefix B-Tree using SIMD
• Transformation and search algorithm for breadth-first and depth-first data layout
• Three algorithms for interpreting a SIMD comparison result
• Solution for an arbitrary key count

Thanks

Backup