Hardware Acceleration of Parallel Prefix Algorithms
Peter Scott (Project leader), Avinash Srinivasa, Vaibhav Sundriyal
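What is parallel prefix?
• Finding parallelism in serial-looking problems.
• Take an array, like [1, 3, 2, 1].
• Find the partial sums: [1, 1+3, 1+3+2, 1+3+2+1] = [1, 4, 6, 7].
• We can use any associative operation, not just addition.
• Matrix multiplication is okay: it is associative, even though it is not commutative.
• Vector dot product doesn’t work: a · b is a scalar, not a vector, so the results cannot be chained and the operation is not associative.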
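As a minimal sequential sketch of the definition above, a prefix computation (scan) can be written once and parameterized by the operation. Python is used here purely for illustration; the project itself targets VHDL. The names `prefix_scan` and `matmul2` are hypothetical and not taken from the project.

```python
from itertools import accumulate
import operator


def prefix_scan(items, op):
    """Return [x0, x0 op x1, x0 op x1 op x2, ...] for a binary op."""
    return list(accumulate(items, op))


def matmul2(a, b):
    """Multiply two 2x2 matrices given as ((p, q), (r, s)) tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


# The array from the slide: partial sums of [1, 3, 2, 1].
sums = prefix_scan([1, 3, 2, 1], operator.add)  # [1, 4, 6, 7]

# Matrix multiplication is associative, so it is a valid scan operation.
step = ((1, 1), (0, 1))
powers = prefix_scan([step, step, step], matmul2)  # step, step^2, step^3
```

Associativity is what later allows the elements to be regrouped across processors without changing the result.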
Applications
• DNA sequence alignment
• Large tree data structure acceleration
• Incremental regular expression matching
• Many others, parameterizable by kernel.
Parallel version of this
• Distribute data to several processors.
• Do redundant computations to get parallelism.
Image taken from Hillis & Steele, 1986.
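The data-parallel scan described by Hillis and Steele can be simulated in software as a rough sketch. Each pass of the loop below stands for one parallel step in which every position combines its value with the one `stride` places to its left. This performs more operations in total than the serial loop (the redundant computation mentioned above) but finishes in about log2(n) steps. The function name `hillis_steele_scan` is illustrative only.

```python
import operator


def hillis_steele_scan(items, op):
    """Inclusive scan in ceil(log2(n)) simulated data-parallel steps."""
    values = list(items)
    stride = 1
    while stride < len(values):
        previous = values  # snapshot that every "processor" reads this step
        values = [
            # Left operand first, so non-commutative ops keep their order.
            op(previous[i - stride], previous[i]) if i >= stride else previous[i]
            for i in range(len(previous))
        ]
        stride *= 2
    return values


result = hillis_steele_scan([1, 2, 3, 4, 5, 6, 7, 8], operator.add)
words = hillis_steele_scan(list("abcd"), operator.add)
```

String concatenation is used in the second call because it is associative but not commutative, which checks that operand order is preserved.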
Architecture
• Several processors, shared multi-channel bus
Step          P1     P2     P3      P4
(initial)     1,2    3,4    5,6     7,8
OPERATE       1,3    3,7    5,11    7,15
COMMUNICATE   1,3    3,7    5,11    7,15
UPDATE        1,3    6,10   5,11    18,26
COMMUNICATE   1,3    6,10   5,11    18,26
UPDATE        1,3    6,10   15,21   28,36
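The OPERATE / COMMUNICATE / UPDATE trace above can be reproduced with a short simulation. This is a sketch under assumptions: the communication pattern (in each round, the upper half of every group of processors receives the running total of the lower half) is inferred from the numbers on the slide, not from the project's VHDL, and the name `blocked_scan` is hypothetical. Blocks are assumed to be non-empty.

```python
import operator
from itertools import accumulate


def blocked_scan(blocks, op):
    """Scan data split across processors; return a trace of every step."""
    # OPERATE: each processor scans its own block locally.
    blocks = [list(accumulate(block, op)) for block in blocks]
    trace = [("OPERATE", [list(b) for b in blocks])]
    stride = 1
    while stride < len(blocks):
        # COMMUNICATE: processors in the upper half of each group of
        # 2 * stride receive the last (total) value of the lower half.
        received = {}
        for i in range(len(blocks)):
            if (i // stride) % 2 == 1:
                source = (i // stride) * stride - 1
                received[i] = blocks[source][-1]
        trace.append(("COMMUNICATE", [list(b) for b in blocks]))
        # UPDATE: fold the received total into every local element.
        for i, total in received.items():
            blocks[i] = [op(total, x) for x in blocks[i]]
        trace.append(("UPDATE", [list(b) for b in blocks]))
        stride *= 2
    return trace


trace = blocked_scan([[1, 2], [3, 4], [5, 6], [7, 8]], operator.add)
for step, state in trace:
    print(f"{step:<12}", state)
```

With four processors the loop runs two rounds, matching the two COMMUNICATE / UPDATE pairs in the table.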
Bus contention
• There are often more processors than bus channels.
• How to deal with contention?
• Answer: pre-computed static scheduling.
• Store schedule as sequence of instructions:
    • Write <channel>
    • Load <channel>
    • No_op
    • Comm_step_complete
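One way such a static schedule could be pre-computed is sketched below. Everything beyond the four instruction names is an assumption made for illustration: the function `build_schedule`, the greedy packing strategy, the rule that a processor does at most one bus action per cycle, and the choice to emit one instruction list per processor (the project may instead store a single bus program).

```python
def build_schedule(transfers, n_processors, n_channels):
    """Pack one communication step's transfers into bus cycles.

    transfers: list of (writer, readers) pairs of processor indices.
    Returns one instruction list per processor, all of equal length.
    """
    if n_channels < 1:
        raise ValueError("need at least one bus channel")
    programs = [[] for _ in range(n_processors)]
    pending = list(transfers)
    while pending:
        cycle, busy, deferred = {}, set(), []
        for writer, readers in pending:
            involved = {writer, *readers}
            # Take the transfer if a channel is free and no processor
            # involved already has an action in this cycle.
            if len(cycle) < n_channels and not (involved & busy):
                cycle[len(cycle)] = (writer, readers)
                busy |= involved
            else:
                deferred.append((writer, readers))
        for p in range(n_processors):
            instr = "No_op"
            for channel, (writer, readers) in cycle.items():
                if p == writer:
                    instr = f"Write {channel}"
                elif p in readers:
                    instr = f"Load {channel}"
            programs[p].append(instr)
        pending = deferred
    for program in programs:
        program.append("Comm_step_complete")
    return programs


# First communication round of a four-processor scan (0-based indices):
# processor 0 sends to 1, and processor 2 sends to 3.
round_one = [(0, [1]), (2, [3])]
one_channel = build_schedule(round_one, n_processors=4, n_channels=1)
two_channels = build_schedule(round_one, n_processors=4, n_channels=2)
```

With a single channel the two transfers are serialized into two bus cycles; with two channels they share one cycle. Because the schedule is fixed ahead of time, no arbitration is needed at run time.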
How to use the final product
• Write VHDL for an associative binary operation, like addition or multiplication.
• Say how many processors you want, how wide your data are, how many bus channels, etc.
• A wizard generates all the VHDL.
• Just customize it and go.
…and various supporting files
• Bus program memory holds bus instructions
• Prefix accelerator instantiates processors and bus
• Etc.
Related Papers
• Explanation of parallel prefix and DNA sequence alignment (Aluru): http://class.ece.iastate.edu/cpre526/basics.pdf
• Data parallel algorithms (Hillis & Steele): http://cva.stanford.edu/classes/cs99s/papers/hillis-steele-data-parallel-algorithms.pdf
• Prefix sums and their applications (Blelloch): http://www.cs.cmu.edu/~guyb/papers/Ble93.pdf
• Finger trees (Hinze & Paterson): http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf