
Optimizing Sorting With Genetic Algorithms

Xiaoming Li, María Jesús Garzarán, and David Padua, University of Illinois at Urbana-Champaign.


Presentation Transcript


  1. Optimizing Sorting With Genetic Algorithms • Xiaoming Li, María Jesús Garzarán, and David Padua • University of Illinois at Urbana-Champaign

  2. ESSL on Power3

  3. ESSL on Power4

  4. Outline • Our Solution • Primitives & Selection mechanisms • Genetic Algorithm • Performance results • Classifier System • Conclusion

  5. Motivation • There is no universally best sorting algorithm. • Can we automatically GENERATE and tune sorting algorithms for each platform (as FFTW and Spiral do)? • The performance of sorting depends on the platform and on the input characteristics. • Algorithm selection alone may not be enough.

  6. Algorithm Selection (CGO’04) • Select the best algorithm from Quicksort, Multiway Merge Sort and CC-radix. • Relevant input characteristics: number of keys, entropy vector.

  7. Algorithm Selection (CGO’0

  8. Proposed Solution • We need different algorithms for different partitions • The best sorting algorithm should be the composition of these different best algorithms. • Build Composite Sorting algorithms • Identify primitives from the sorting algorithms • Design a general method to select an appropriate sorting primitive at runtime • Design a mechanism to combine the primitives and the selection methods to generate the composite sorting algorithm

  9. Outline • Our Solution • Primitives & Selection mechanisms • Genetic Algorithm • Performance results • Classifier System • Conclusion

  10. Sorting Primitives • Divide-by-Value (DV) • A step in Quicksort • Select one or multiple pivots and partition the input array around these pivots • Parameter: number of pivots • Divide-by-Position (DP) • Divide the input into same-size sub-partitions • Use a heap to merge the multiple sorted sub-partitions • Parameters: size of sub-partitions, fan-out and size of the heap
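The two primitives on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the pivots are passed in rather than selected from the input, and the heap merge uses the standard library's `heapq.merge`, so the fan-out equals the number of sub-partitions and the heap size is not a separate parameter.

```python
import bisect
import heapq

def divide_by_value(keys, pivots):
    """Partition keys around the pivots; returns len(pivots) + 1 buckets."""
    pivots = sorted(pivots)
    buckets = [[] for _ in range(len(pivots) + 1)]
    for k in keys:
        # bisect_right counts the pivots that are <= k, which is the bucket index
        buckets[bisect.bisect_right(pivots, k)].append(k)
    return buckets

def divide_by_position(keys, sub_size):
    """Split keys into same-size chunks, sort each chunk, then heap-merge them."""
    chunks = [sorted(keys[i:i + sub_size]) for i in range(0, len(keys), sub_size)]
    # heapq.merge keeps one candidate per chunk in a heap (fan-out = len(chunks))
    return list(heapq.merge(*chunks))
```

In a composite algorithm the buckets returned by `divide_by_value` would each be handed to another primitive; here they are simply returned unsorted.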

  11. Sorting Primitives • Divide-by-Radix (DR) • Non-comparison-based sorting algorithm • Parameter: radix (r bits) • Step 1: Scan the input to build the distribution array, which records how many elements fall in each of the 2^r sub-partitions. • Step 2: Compute the cumulative distribution array, which is used as the index when copying the input to the destination array. • Step 3: Copy the input to the 2^r sub-partitions. • Example (figure): src. = 11 23 30 12; counter (buckets 0 to 3) = 1 1 1 1; accum. = 0 1 2 3, then 1 2 3 4; dest. = 30 11 12 23
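The three steps of Divide-by-Radix map directly onto a counting pass, a prefix sum, and a copy pass. The sketch below assumes non-negative integer keys; the function name and the `shift` parameter (which digit of the key to use) are illustrative and not taken from the slides.

```python
def divide_by_radix(src, r, shift=0):
    """Partition src into 2**r buckets on the r-bit digit starting at bit `shift`."""
    nbuckets = 1 << r
    mask = nbuckets - 1

    # Step 1: distribution array, one counter per bucket
    counter = [0] * nbuckets
    for k in src:
        counter[(k >> shift) & mask] += 1

    # Step 2: cumulative distribution (exclusive prefix sum) = start of each bucket
    accum = [0] * nbuckets
    for b in range(1, nbuckets):
        accum[b] = accum[b - 1] + counter[b - 1]
    starts = list(accum)

    # Step 3: copy each key to the next free slot of its bucket
    dest = [None] * len(src)
    for k in src:
        b = (k >> shift) & mask
        dest[accum[b]] = k
        accum[b] += 1
    return dest, starts
```

The copy pass advances each bucket's index as it goes, and keys with equal digits keep their input order.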

  12. Sorting Primitives • Divide-by-radix-assuming-uniform-distribution (DU) • Step 1 and Step 2 in DR are expensive. • If the input elements are distributed nearly evenly among the 2^r sub-partitions, the input can be copied into the destination array directly, assuming every partition has the same number of elements. • Overhead: partition overflow • Parameter: radix (r bits) • Example (figure): src. = 11 23 30 12; accum. (buckets 0 to 3) = 0 1 2 3, then 1 2 3 4; dest. = 30 11 12 23
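A rough sketch of the DU idea follows. The slides do not say how partition overflow is handled, so the approach here is purely an assumption for illustration: every bucket gets the same fixed capacity, and keys that do not fit are collected in a separate overflow list. The function name and parameters are hypothetical.

```python
def divide_uniform(src, r, shift=0):
    """Copy keys straight into equal-size buckets, skipping the counting passes."""
    nbuckets = 1 << r
    mask = nbuckets - 1
    cap = -(-len(src) // nbuckets)      # ceil(n / 2**r): assumed bucket capacity
    dest = [None] * (cap * nbuckets)    # bucket b occupies dest[b*cap : (b+1)*cap]
    fill = [0] * nbuckets               # number of keys placed in each bucket
    overflow = []                       # assumed overflow handling: spill list
    for k in src:
        b = (k >> shift) & mask
        if fill[b] < cap:
            dest[b * cap + fill[b]] = k
            fill[b] += 1
        else:
            overflow.append(k)
    return dest, fill, overflow
```

When the digit values are close to uniform the overflow list stays empty or short; a skewed input fills it quickly, which is the overhead the slide refers to.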

  13. Selection Primitives • Branch-by-Size • Branch-by-Entropy • Parameters: number of branches, threshold vector of the branches

  14. Leaf Primitives • When the size of a partition is small, we stick to one algorithm to sort the partition fully. • Two methods are used in the cleanup operation • Quicksort • CC-Radix

  15. Composite Sorting Algorithms • The composite sorting algorithms are built from these primitives. • The algorithms are tree-shaped.
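To make the tree shape concrete, here is a minimal interpreter for a composite algorithm written as nested tuples. It is a sketch under assumptions: only three node types are modeled, the labels `"lq"` (leaf quicksort, here just `sorted`) and `"bs"` (branch-by-size with a single threshold) are hypothetical, and pivots are sampled at evenly spaced positions.

```python
import bisect

def run(genome, keys):
    """Sort keys by interpreting a tree of primitives written as nested tuples."""
    op = genome[0]
    if op == "lq":                        # leaf: sort the partition fully
        return sorted(keys)
    if op == "bs":                        # branch-by-size: pick a subtree by input size
        _, threshold, small, large = genome
        return run(small if len(keys) <= threshold else large, keys)
    if op == "dv":                        # divide-by-value, then recurse on each bucket
        _, npivots, child = genome
        if len(keys) <= npivots:          # too few keys to sample pivots from
            return sorted(keys)
        step = len(keys) // (npivots + 1)
        pivots = sorted(keys[step * (i + 1)] for i in range(npivots))
        buckets = [[] for _ in range(npivots + 1)]
        for k in keys:
            buckets[bisect.bisect_right(pivots, k)].append(k)
        out = []
        for bucket in buckets:            # buckets are already ordered by value range
            out.extend(run(child, bucket))
        return out
    raise ValueError(f"unknown primitive: {op}")

# Small inputs go straight to the leaf; larger ones are split around two pivots first.
genome = ("bs", 4, ("lq",), ("dv", 2, ("lq",)))
result = run(genome, [9, 3, 7, 1, 8, 2, 5])
```

Because each node only calls its own subtrees, the recursion depth is bounded by the depth of the genome tree.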

  16. Outline • Our Solution • Primitives & Selection mechanisms • Genetic Algorithm • Performance results • Classifier System • Conclusion

  17. Search Strategy • Search for the best tree • Search for the best parameter values of the primitives • Good solutions for small problems should be retained for use in solutions to larger problems. • Genetic algorithms are a natural solution that satisfies these requirements: • Preserve good sub-trees • Give good sub-trees more chances to propagate

  18. Composite Sorting Algorithms • Search for the best parameter values to adapt • To the architectural features • To the input characteristics

  19. Search Strategy • Search for the best tree • Search for the best parameter values of the primitives • Good solutions for small problems should be retained for use in solutions to larger problems. • Genetic algorithms are a natural solution that satisfies these requirements: • Preserve good sub-trees • Give good sub-trees more chances to propagate

  20. Genetic Algorithm • Mutation • Mutate the structure of the algorithm. • Change the parameter values of primitives.

  21. Crossover • Propagate good sub-trees
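Mutation and crossover on tree-shaped genomes can be sketched as follows, assuming genomes are nested tuples such as `("dv", 2, ("lq",))`. All names here are hypothetical, and the mutation shown only changes a parameter value; structural mutation (replacing or inserting a primitive) is not modeled.

```python
import random

def paths(genome, prefix=()):
    """Yield the index path to every node (tuple) in the genome tree."""
    yield prefix
    for i, part in enumerate(genome):
        if isinstance(part, tuple):
            yield from paths(part, prefix + (i,))

def get(genome, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        genome = genome[i]
    return genome

def put(genome, path, new):
    """Return a copy of genome with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return genome[:i] + (put(genome[i], path[1:], new),) + genome[i + 1:]

def crossover(a, b, rng):
    """Swap one randomly chosen subtree of a with one of b."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return put(a, pa, get(b, pb)), put(b, pb, get(a, pa))

def mutate(genome, rng):
    """Nudge one integer parameter of one randomly chosen primitive by +/-1."""
    slots = [(p, i) for p in paths(genome)
             for i, part in enumerate(get(genome, p)) if isinstance(part, int)]
    if not slots:
        return genome
    p, i = rng.choice(slots)
    node = get(genome, p)
    value = max(1, node[i] + rng.choice([-1, 1]))   # keep parameters positive
    return put(genome, p, node[:i] + (value,) + node[i + 1:])
```

Crossover moves whole subtrees between parents, so a sub-tree that already sorts small partitions well survives intact in the offspring.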

  22. Fitness Function • A fitness function measures the relative performance of the genomes in a population. • The average performance of a genome on the training inputs is the basis for the fitness of the genome. • A genome that performs well across inputs is preferred • Fitness is penalized when performance varies across the test inputs
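The slide gives no formula, so the following is only one plausible form of such a fitness function: the mean of per-input performance scores (higher is better) minus a penalty proportional to their standard deviation. The function name, the `penalty` weight, and the choice of standard deviation are all assumptions.

```python
import statistics

def fitness(scores, penalty=1.0):
    """Average performance, reduced when performance varies across inputs."""
    mean = statistics.fmean(scores)
    spread = statistics.pstdev(scores)   # 0.0 when every input performs the same
    return mean - penalty * spread

stable = fitness([10, 10, 10])     # same average, no variation
unstable = fitness([5, 15])        # same average, large variation
```

With this form, two genomes with the same average are ranked by stability, which matches the preference stated on the slide.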

  23. Library Generation • Installation phase: Use the genetic algorithm to search for the sorting genome. • Start from a set of genomes in the initial population • Test the genomes on a set of inputs with different characteristics

  24. Outline • Our Solution • Primitives & Selection mechanisms • Genetic Algorithm • Performance results • Classifier System • Conclusion

  25. Platforms • AMD Athlon MP • Sun UltraSparcIII • SGI R12000 • IBM Power3 • IBM Power4 • Intel Itanium2 • Intel Xeon

  26. AMD Athlon MP

  27. Power3

  28. Multiple-peak Performance

  29. Outline • Our Solution • Primitives & Selection mechanisms • Genetic Algorithm • Performance results • Classifier System • Conclusion

  30. The best genomes in different regions

  31. Problems of Genetic Adaptation • The fitness function is the average performance of the genome on the test inputs. • The fitness function in our genetic algorithm prefers genomes with stable performance • The genetic algorithm is not powerful enough to evolve a complex genome that chooses the best genome in each small region

  32. Using Classifier System • Search for the best genomes for different regions of the input characteristics. • Selects the regions • Selects the best algorithm for each region • Nice feature: The fitness of a genome in a region is not affected by its fitness in other regions

  33. Map sorting composition into a classifier system • The input characteristics (number of keys and entropy vector) are encoded into bit strings. • A rule in the classifier system has two parts • Condition: A string consisting of ‘0’, ‘1’, and ‘*’. Condition string will be used to match the encoded input characteristics. • Action: Sorting genomes without branch primitives

  34. Example for Classifier Sorting • Example: • For inputs of up to 16M keys • Encode the number of keys with 4 bits. • 0000: 0~1M, 0001: 1~2M, … • Number of keys = 10.5M. Encoded into “1100” • Rule matching (figure): the encoded input 1100 is compared with the conditions 01**, 1010, and 110*; it matches 110*, whose action is (dv 2 ( lr 6 16))
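The encoding and rule matching can be sketched as below. Assumptions: "M" is taken to mean 2^20 keys, rules are tried in order and the first match wins, and the actions attached to the conditions `01**` and `1010` are placeholders because the slide only shows the action for `110*`. All function names are hypothetical.

```python
def encode_keys(n, bits=4, max_keys=16 * 2**20):
    """Encode a key count as a bit string, one code per equal-width size range."""
    bucket = min(n * (1 << bits) // max_keys, (1 << bits) - 1)
    return format(bucket, f"0{bits}b")

def matches(condition, encoded):
    """A condition matches when every position is '*' or equals the input bit."""
    return len(condition) == len(encoded) and all(
        c in ("*", e) for c, e in zip(condition, encoded))

def select(rules, encoded):
    """Return the action of the first rule whose condition matches, else None."""
    for condition, action in rules:
        if matches(condition, encoded):
            return action
    return None

rules = [
    ("01**", "genome A (placeholder)"),
    ("1010", "genome B (placeholder)"),
    ("110*", "(dv 2 ( lr 6 16))"),       # action shown on the slide
]
chosen = select(rules, "1100")
code_for_1_5m = encode_keys(3 * 2**19)   # 1.5M keys falls in the 1~2M range
```

The `*` wildcard lets one rule cover a whole range of sizes: `110*` covers both `1100` and `1101`.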

  35. Performance of Classifier Sorting • Power3

  36. Power4

  37. Conclusions • Replace the complexity of finding an efficient algorithm with the task of defining a set of generic primitives. • Design methods to search in the space of the composition of the primitives. • Genetic algorithms • Classifier system
