200 likes | 387 Views
FPGA Implementation of Lookup Algorithms. Presenter : Shi- qu Yu Email : P76001158@mail.ncku.edu.tw Date : 2011/08/31. Outline. POLP[2] and BPFL algorithm BPFL Search Engine POLP Search Engine Performance. POLP algorithm. POLP:Parallel optimized linear pipeline algorithm
E N D
FPGA Implementation of Lookup Algorithms Presenter : Shi-qu Yu Email : P76001158@mail.ncku.edu.tw Date : 2011/08/31
Outline • POLP[2] and BPFL algorithm • BPFL Search Engine • POLP Search Engine • Performance
POLP algorithm • POLP:Parallel optimized linear pipeline algorithm • Main idea: Split the original binary tree into non-overlapping subtrees that are distributed across P pipelines. • Chose Pipeline:Baseon the first I bits of the IP address.
BPFL Algorithm • An extension of the PFL algorithm • Advantage of BPFL: Frugally uses the memory resources so the large lookup tables can fit the on-chip memory. • Next-hop information->External memory • Lookup table->On-chip memory • The subtree prefixes are stored in the corresponding balanced trees(PFL:register)
BPFL Algorithm(cont.) • External memory is accessed only once at the end of the lookup when the next-hop information is retrieved. • The on-chip lookup table is organized as a binary tree divided into levels that are searched in parallel. • If the substree is sparsly->Only indices of existing nodes are kept.
BPFL Search Engine BPFL search engine top-level
BPFL Search Engine(cont.) Find the subtree at this level. Process prefix search.
BPFL Search Engine(cont.) Subtree search engine at level i.
BPFL Search Engine(cont.) Subtree search engine at level i.
BPFL Search Engine(cont.) Prefix search engine at level i.
POLP Search Engine(cont.) Pipeline structure.
POLP Search Engine(cont.) Stage i structure.
Performance • Altera’sStratix II EP2S180F1020C5 chip [10]. • The SRAM memory is used as the external memory.
Performance(cont.) BPFL(DS=8)
Performance(cont.) POLP(I=16)
Performance(cont.) BPFL(DS=8)
Performance(cont.) POLP(I=16)