This research explores the use of network processors in genomics, specifically in the context of the BLAST algorithm. By offloading certain phases of the algorithm onto NPUs, significant improvements in performance and scalability can be achieved.
H. Bos – Leiden University 13/02/2004
Using Network Processors in Genomics
Herbert Bos*† and Kaiming Huang*
{herbertb,khuang}@liacs.nl
* Leiden Universiteit, Netherlands
† Vrije Universiteit, Netherlands
http://www.liacs.nl/~herbertb/projects/biocomp/
Case study: BLAST
• search a nucleotide/protein database for a query
• BLAST discovers similarity rather than exact matches
• two main phases:
  • scoring (registering where the query and the DNA database match)
  • alignment (dynamic programming)
• only the first phase runs on NPUs
Window matching
• naïve approach: roughly W*N*M comparisons
  • does not scale
• string search algorithms: Aho-Corasick
  • all windows matched at the same time
  • shifting the genome one nucleotide at a time
  • matching algorithm transformed into a DFA
  • DFA may be quite large
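The naïve approach above can be sketched as follows. This is only an illustration, not the authors' code: the slide does not define W, N and M, so the sketch assumes W is the window size, N the genome length and M the query length, and the function names are hypothetical.

```python
def windows(query, w):
    """Return every length-w substring (window) of the query, in order."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def naive_matches(genome, query, w):
    """Compare every window against every genome position, symbol by symbol.

    Returns (hits, comparisons): hits is a list of (genome_pos, window)
    pairs and comparisons counts the individual symbol comparisons made.
    """
    wins = windows(query, w)
    hits = []
    comparisons = 0
    for pos in range(len(genome) - w + 1):      # about N positions
        for win in wins:                        # about M windows
            matched = True
            for k in range(w):                  # at most W symbols
                comparisons += 1
                if genome[pos + k] != win[k]:
                    matched = False
                    break
            if matched:
                hits.append((pos, win))
    return hits, comparisons


if __name__ == "__main__":
    hits, comparisons = naive_matches("tacgcga", "acgccga", 3)
    print(hits, comparisons)
```

The three nested loops are where the W*N*M bound comes from; the early `break` lowers the count in practice but not the worst case.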
Aho-Corasick
• Alphabet: acgt
• Window size: 3
• Query: acgccga
• Windows: {acg,cgc,gcc,ccg,cga}
[Figure: Aho-Corasick automaton for the windows {acg,cgc,gcc,ccg,cga}, states 0 to 12, edges labeled a, c, g, t; example input: tacgcga]
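The Aho-Corasick example can be reproduced with a minimal sketch of the textbook construction (a trie plus failure links). It is not the IXPBlast implementation; the function names are hypothetical and state numbers may differ from those in the figure.

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie with failure links (classic Aho-Corasick construction)."""
    goto = [{}]   # goto[state][symbol] -> next state
    fail = [0]    # fail[state] -> state of the longest proper suffix in the trie
    out = [[]]    # out[state] -> patterns recognized on reaching this state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)

    queue = deque(goto[0].values())   # depth-1 states fail to the root
    while queue:                      # breadth-first, so shallower states come first
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Feed the text through the automaton one symbol at a time."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(["acg", "cgc", "gcc", "ccg", "cga"])
    print(len(automaton[0]))              # number of states
    print(search("tacgcga", automaton))   # (position, window) pairs
```

For the five windows this yields 13 states, consistent with the states 0 to 12 in the figure, and the input tacgcga is scanned in a single pass with all windows matched at the same time.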
IXPBlast Architecture
[Figure: block diagram. Labels: Gbps ports; NPU (IXP1200) with six microengines (ME), StrongARM and scratch memory; SRAM; DRAM; control processor; Pentium; PCI bus]
[Figure: IXPBlast architecture as above, with the Aho-Corasick automaton (states 0 to 12) held in SRAM]
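The figures place the automaton in SRAM. The slides do not give the memory layout, so the following is only one plausible arrangement, stated as an assumption: a flat table with one row per state and one column per nucleotide, so that each input symbol costs a single table lookup. The sketch also assumes all windows have the same length; the names are hypothetical.

```python
ALPHABET = "acgt"


def build_dfa(patterns):
    """Build a dense DFA whose states are the prefixes of the patterns.

    Returns (table, accepting): table[state * 4 + column] is the next state,
    and accepting[state] is the window recognized on entering that state
    (None otherwise). Assumes equal-length patterns.
    """
    prefixes = sorted({p[:i] for p in patterns for i in range(len(p) + 1)},
                      key=lambda s: (len(s), s))
    index = {p: n for n, p in enumerate(prefixes)}   # "" becomes state 0
    table = [0] * (len(prefixes) * len(ALPHABET))
    for p in prefixes:
        for col, ch in enumerate(ALPHABET):
            s = p + ch
            while s not in index:    # drop leading symbols until a prefix remains
                s = s[1:]
            table[index[p] * len(ALPHABET) + col] = index[s]
    accepting = [p if p in patterns else None for p in prefixes]
    return table, accepting


def scan(text, table, accepting):
    """One table lookup per nucleotide; raises ValueError on symbols outside acgt."""
    state = 0
    hits = []
    for i, ch in enumerate(text):
        state = table[state * len(ALPHABET) + ALPHABET.index(ch)]
        win = accepting[state]
        if win is not None:
            hits.append((i - len(win) + 1, win))
    return hits


if __name__ == "__main__":
    table, accepting = build_dfa(["acg", "cgc", "gcc", "ccg", "cga"])
    print(len(table))                          # 13 states x 4 symbols
    print(scan("tacgcga", table, accepting))
```

Compared with a trie that follows failure links at run time, the dense table trades memory for a fixed amount of work per symbol.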
IXPBlast: packet handling
[Figure: row of slots numbered 0 to 31]
• packets read and processed in batches of 100,000
• “spilling” must be taken into account
• currently no feedback
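The slide does not define “spilling”. One reading, which is an assumption here, is that a window can straddle the boundary between two packets. Under that assumption, a minimal sketch carries the last w-1 nucleotides of each packet into the next so that no window is missed or counted twice. The function name and the packet representation (plain strings) are hypothetical.

```python
def scan_packets(packets, windows, w):
    """Scan a genome delivered as a sequence of packet payloads (strings).

    The last w-1 nucleotides of the data seen so far are carried into the
    next packet, so a window that straddles a packet boundary is still
    reported exactly once. Returns (global_position, window) pairs.
    """
    window_set = set(windows)
    hits = []
    carry = ""       # tail of the data seen so far, at most w-1 symbols
    consumed = 0     # number of nucleotides in all earlier packets
    for payload in packets:
        data = carry + payload
        base = consumed - len(carry)     # global position of data[0]
        for i in range(len(data) - w + 1):
            if data[i:i + w] in window_set:
                hits.append((base + i, data[i:i + w]))
        consumed += len(payload)
        carry = data[-(w - 1):] if w > 1 else ""
    return hits


if __name__ == "__main__":
    wins = ["acg", "cgc", "gcc", "ccg", "cga"]
    print(scan_packets(["tac", "gc", "ga"], wins, 3))
```

With a DFA-based matcher the same effect can be had by keeping the current automaton state across packets instead of the carried nucleotides.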
Results
• 232 MHz IXP1200 ~ 1.8 GHz Pentium 4
• query: 1611 nucleotides (MyD88)
• genome: 1.4 GB (zebrafish)
• IXP1200: 90 sec with DFA
• IXP1200: 129 sec with “trie”
• P4: 132 sec with “trie”
• number of matches: 524856
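The timings above can be turned into throughput and relative speed with a few lines of arithmetic. The sketch assumes that 1.4 GB means 1.4 × 10^9 bytes; the variable and function names are hypothetical.

```python
GENOME_BYTES = 1.4e9   # assumption: decimal gigabytes

# Run times in seconds, taken from the results slide.
runs = {"IXP1200 DFA": 90, "IXP1200 trie": 129, "P4 trie": 132}


def throughput_mb_per_s(seconds):
    """Genome bytes scanned per second, in MB/s (10^6 bytes)."""
    return GENOME_BYTES / seconds / 1e6


def speedup_over_p4(seconds):
    """How many times faster a run is than the Pentium 4 trie run."""
    return runs["P4 trie"] / seconds


if __name__ == "__main__":
    for name, secs in runs.items():
        print(f"{name}: {throughput_mb_per_s(secs):.1f} MB/s, "
              f"{speedup_over_p4(secs):.2f}x relative to the P4")
```

Under that assumption the DFA run on the IXP1200 scans roughly 15.6 MB/s and is about 1.47 times as fast as the Pentium 4 run, while the two “trie” runs are nearly equal.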
Conclusions
• NPUs are useful in application domains other than networking
• Newer hardware is expected to perform much better
• “Throughput processors”
• Adapting our current approach to use BLAST tricks/heuristics
Network processors
• geared for high throughput
• used exclusively in network systems
• example: intrusion detection
  • similar to looking for genes in genomes
  • differences
[Figure: Radisys IXP1200 board]
Application domain: “Genomics”
• example: search a genome for occurrences of “patterns”
• similar problems as IDS
  • poor performance on GPPs, which cannot exploit the parallelism
  • throughput-driven
• how about FPGAs?
• how about clusters?
• NPU
  • easier to program than FPGAs
  • cheaper than cluster computing
  • “on the desktop”: IP never leaves the room