Parallel Hardware for Sequence Comparison and Alignment By Richard Hughey University of California, Santa Cruz Presented by Travis Brown
Introduction • Use dynamic programming • Compare shorter sequences first • Sequence comparison is O(n^2) • An O(n^2/log n) version is available, but it is not practical for parallelization • Use the standard local alignment algorithm to find the best score
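The O(n^2) dynamic-programming comparison can be illustrated with a minimal sketch of local alignment scoring. The function name `local_alignment_score`, the linear gap penalty, and the score values (match 2, mismatch -1, gap -1) are assumptions chosen for illustration; the slides do not specify a scoring scheme.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative scoring only (assumed values): linear gap penalty,
    fixed match/mismatch scores. Runs in O(len(a) * len(b)) time.
    """
    prev = [0] * (len(b) + 1)          # row i-1 of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)      # row i of the DP matrix
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                     # start a new local alignment
                prev[j - 1] + s,       # align a[i-1] with b[j-1]
                prev[j] + gap,         # gap in b
                curr[j - 1] + gap,     # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(local_alignment_score("GGACGTCC", "TTACGTAA"))
```

Only two rows are kept in memory because the sketch returns the score alone; recovering the alignment itself would require storing the full matrix or traceback pointers.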
Coarse-Grain Approach • Suitable for large database searches • Data is divided evenly among all PEs • Multiple independent analyses are performed • Not suitable for small databases because there is not enough data for all the PEs • Method of choice for non-specialized processors
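The data-partitioning idea above can be sketched as follows. This is a structural illustration only: the names `partition`, `search_chunk`, and `coarse_grain_search` are hypothetical, the scoring scheme is assumed, and Python threads stand in for PEs without giving a real speedup on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def score(query, target, match=2, mismatch=-1, gap=-1):
    """Local alignment score with assumed, illustrative parameters."""
    prev, best = [0] * (len(target) + 1), 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            s = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def partition(database, num_pes):
    """Deal sequences round-robin so each PE gets a near-equal share."""
    return [database[k::num_pes] for k in range(num_pes)]


def search_chunk(query, chunk):
    """Work done by one PE: score the query against its own sequences."""
    return [(score(query, seq), seq) for seq in chunk]


def coarse_grain_search(query, database, num_pes=4):
    """Run independent comparisons per chunk, then merge and rank the hits."""
    chunks = partition(database, num_pes)
    with ThreadPoolExecutor(max_workers=num_pes) as pool:
        results = pool.map(lambda c: search_chunk(query, c), chunks)
    hits = [hit for chunk_hits in results for hit in chunk_hits]
    return sorted(hits, key=lambda h: h[0], reverse=True)


db = ["ACGT", "TTTT", "ACGA", "GGGG", "CGT"]
print(coarse_grain_search("ACGT", db, num_pes=3)[0])
# With more PEs than sequences, some PEs receive no work at all:
print(sum(1 for c in partition(db, 8) if not c), "idle PEs")
```

The last line shows why the approach suits large databases: with five sequences and eight PEs, several chunks are empty and those PEs sit idle.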
Fine-Grain Approach • O(n) processing elements are used to compare two sequences in O(n + N) time • Calculations along a single diagonal can all be performed at once • Assign one PE to each character of the query string • Shift the database through the linear array of PEs
Fine-Grain Approach (cont.) • Each c(i,j) depends only on c(i-1,j-1), c(i-1,j), and c(i,j-1), so all cells on the same anti-diagonal of the matrix can be computed simultaneously • Entire calculation can be completed in O(n + N) time steps on n PEs • Method of choice for special-purpose processors
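The diagonal-at-a-time schedule can be simulated sequentially. The sketch below assumes the standard recurrence, in which a cell reads its upper, left, and upper-left neighbours, so every cell with the same i + j is independent of the others. The name `wavefront_score` and the scoring values are assumptions for illustration. Each row index `i` plays the role of one PE holding one query character.

```python
def wavefront_score(query, db, match=2, mismatch=-1, gap=-1):
    """Local alignment score computed one anti-diagonal per time step.

    Returns (best_score, time_steps). Scoring values are assumed for
    illustration. The inner loop is what n PEs would do in parallel.
    """
    n, N = len(query), len(db)
    if n == 0 or N == 0:
        return 0, 0
    H = [[0] * (N + 1) for _ in range(n + 1)]
    best, steps = 0, 0
    for d in range(2, n + N + 1):            # d = i + j indexes the diagonal
        lo, hi = max(1, d - N), min(n, d - 1)
        for i in range(lo, hi + 1):          # independent cells: one per PE
            j = d - i
            s = match if query[i - 1] == db[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + s,         # from diagonal d - 2
                H[i - 1][j] + gap,           # from diagonal d - 1
                H[i][j - 1] + gap,           # from diagonal d - 1
            )
            best = max(best, H[i][j])
        steps += 1                           # one parallel time step
    return best, steps


print(wavefront_score("ACG", "TTACGTT"))
```

The step count is n + N - 1, which matches the O(n + N) running time quoted for a linear array of n PEs, compared with n × N cell updates on a single processor.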
Architectures • There are 5 major architectures that can be used for sequence analysis • Workstations • Supercomputers • Single-purpose VLSI • Reconfigurable Hardware • Programmable co-processors
Workstation • Used together as a Network of Workstations • Best used for coarse-grain problems • Fairly inexpensive and can be used for many other tasks
Supercomputer • Most flexible means of fast sequence analysis • Very costly • SIMD works well • MIMD performs only slightly better than the 5 Alphas
Single-Purpose VLSI • Highest performance for a single algorithm • Inexpensive (~$12,000) • This is the method of choice for • BioScan (812 PEs per chip) • Fast Data Finder (5 boards, 3360 PEs)
Reconfigurable Hardware • Based on FPGAs (Field Programmable Gate Arrays) • Generally have a higher cost than Single Purpose VLSI machines
Programmable co-processors • Cost of these systems is more than Single Purpose VLSI, but less than Reconfigurable Hardware • Have hardware dedicated to performing simple tasks (e.g., adding two numbers)
Cost of Systems • Several of these systems have not been built, or not completed, so costs are estimates only • Commercial and research machines are expandable • Faster systems come at a higher cost
Discussion • Some systems are faster under different conditions • This makes evaluating systems difficult • Different algorithms produce different results • The algorithms change over time, and so do the system requirements
Conclusion • There is no “best” solution for any problem • Cost/Performance is important, but difficult to measure • Specialized, yet programmable hardware seems to be the best solution
Biography • Richard Hughey • Associate Professor and Chair of Computer Engineering at University of California, Santa Cruz • Email: rph@cse.ecsc.edu