Design and Implementation of FPGA-based systolic array for LZ Data Compression. By Mohamed Ahmed Abd El Ghany Ahmed. 2006.
Overview • Introduction to Data Compression • Data Compression Methods • Systolic Array Operation in LZ • Proposed design (Design-P) • FPGA Implementation • Testing Application • Software simulation • Conclusions
Introduction to Data Compression • Data compression is the process of converting an input data stream into another data stream with a reduced size. • Benefits of data compression • Reduction of data storage requirements • Reduction of data transfer cost
Data Compression Methods • Lossless data compression: the decompressed data must always be identical to the original data. Methods: run-length encoding, statistical methods, dictionary methods. • Lossy data compression: the decompressed data are some approximation of the original data. Methods: transform coding schemes, vector quantization schemes, sub-band coding schemes.
Lempel-Ziv Algorithms • LZ77 family: LZSS, LZH • LZ78 family: LZW, LZMW
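LZSS Idea • The window is split into a dictionary and a lookahead buffer. • Input stream: a b c b b a c d e b b a d e a a . . . • When the start of the lookahead buffer matches a string in the dictionary, the output codeword is (1, Ip, Lmax), where Ip is the pointer to the match and Lmax is its length; here the codeword is (1, 2, 3). • The window then shifts by Lmax (3): b b a c d e b b a d e a a e f g . . .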
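Codeword length Lc • $L_c = \lceil \log_2(\text{dictionary length}) \rceil + \lceil \log_2(\text{lookahead buffer length}) \rceil + 1$ bits • In the example, $L_c = \lceil \log_2 7 \rceil + \lceil \log_2 5 \rceil + 1 = 3 + 3 + 1 = 7$ bits • The string b b a is encoded as (1, 2, 3): 3 bytes = 24 bits are compressed to 7 bits.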
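The codeword-length arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function names `ceil_log2` and `codeword_bits` are ours, and rounding each logarithm up to a whole number of bits is an assumption, made because it is the reading that yields the 7 bits quoted on the slide.

```python
def ceil_log2(x: int) -> int:
    """Smallest b such that 2**b >= x, for x >= 1 (integer-only, no float rounding)."""
    return (x - 1).bit_length()


def codeword_bits(dict_len: int, look_len: int) -> int:
    """Pointer bits + length bits + 1 flag bit (assumed reading of the slide formula)."""
    return ceil_log2(dict_len) + ceil_log2(look_len) + 1


bits = codeword_bits(dict_len=7, look_len=5)  # sizes taken from the slide example
raw_bits = 3 * 8                              # "b b a" stored as three 8-bit symbols
print(f"{raw_bits} bits -> {bits} bits")
```

With the slide's sizes, the three matched symbols shrink from 24 bits to a single 7-bit codeword.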
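Non-Match Case • Input stream: a b c b b a c d e f a c d e a a . . . • When the first symbol of the lookahead buffer has no match in the dictionary, the output codeword is (0, S), where S is the first symbol of the lookahead buffer; here the codeword is (0, f). • The window then shifts by 1: b c b b a c d e f a c d e a a . . .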
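The match and non-match cases above can be sketched in software as follows. This is an illustrative model, not the hardware design: the names `find_match` and `lzss_encode` are ours, the pointer is assumed to be a 1-based index counted from the oldest dictionary symbol (the convention that reproduces (1, 2, 3) for a 7-symbol dictionary `c b b a c d e` and lookahead `b b a d e`), and any match of length 1 or more is assumed to be emitted as a match codeword.

```python
def find_match(dictionary, lookahead):
    """Return (length, pointer) of the longest dictionary match for the lookahead prefix.

    pointer is 1-based from the oldest dictionary symbol (an assumed convention);
    (0, 0) means no match. Ties keep the earliest position.
    """
    best_len, best_ptr = 0, 0
    for i in range(len(dictionary)):
        length = 0
        while (length < len(lookahead)
               and i + length < len(dictionary)
               and dictionary[i + length] == lookahead[length]):
            length += 1
        if length > best_len:
            best_len, best_ptr = length, i + 1
    return best_len, best_ptr


def lzss_encode(data, dict_len=7, look_len=5):
    """Encode data into (1, Ip, Lmax) match codewords and (0, S) literal codewords."""
    codewords = []
    pos = 0  # index of the first lookahead symbol
    while pos < len(data):
        dictionary = data[max(0, pos - dict_len):pos]
        lookahead = data[pos:pos + look_len]
        length, pointer = find_match(dictionary, lookahead)
        if length > 0:
            codewords.append((1, pointer, length))
            pos += length  # shift by Lmax
        else:
            codewords.append((0, data[pos]))
            pos += 1       # shift by 1
    return codewords


print(find_match("cbbacde", "bbade"))  # match case from the slides
print(find_match("cbbacde", "facde"))  # non-match case from the slides
```

The search here is sequential; the systolic array described in the following slides performs the same comparisons in parallel hardware.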
Systolic Array Operation in LZ [Figure: the lookahead-buffer symbols Y0..Y2 (buffer length Ls, index j) are compared against the dictionary symbols X0..X7 (dictionary length n - Ls, index i), producing the match signals E0..E5 and the match lengths L0..L5.]
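Interleaved Design (Design-i) [Figure: processing elements PE0, PE1 and PE2 connected through delay elements D; the input sequence feeds the X symbols (X0..X7) in interleaved order, Y0..Y2 are held in the PEs, and the array outputs Li.]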
The Match Results Block [Figure: Li feeds an Lmax Mux and Lmax Reg through an a > b comparator; an Ip Mux and Ip Reg store the position p from the counter for the Xi position; a second counter tracks the Xi + [n - Ls/2] position; a = b comparators against n - Ls raise "Code word ready".]
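A behavioural sketch of what the match results block appears to do, based only on the labels in the figure: keep the largest Li seen so far together with its position, and flag the codeword as ready once all n - Ls positions have been scanned. The function name `match_results`, the 0-based position counter, and the exact ready condition are assumptions, not details confirmed by the slides.

```python
def match_results(li_values, n_minus_ls):
    """Track the maximum match length (Lmax) and its position (Ip).

    li_values: match lengths Li arriving one per step (assumed order: position 0 first).
    n_minus_ls: number of dictionary positions to scan before the codeword is ready.
    """
    lmax, ip = 0, 0
    ready = False
    for p, li in enumerate(li_values):  # p models the position counter
        if li > lmax:                   # the "a > b" comparator: strict, so ties keep the first
            lmax, ip = li, p
        if p == n_minus_ls - 1:         # the "a = b" comparator against n - Ls
            ready = True
            break
    return lmax, ip, ready


print(match_results([0, 3, 1, 3, 0, 0, 0], n_minus_ls=7))
```

Because the comparison is strict, a later position with the same length does not replace the stored pointer.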
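Proposed Design (Design-P) [Figure: processing elements PE0, PE1 and PE2 connected through delay elements D; the X symbols (X0, X1, X2..X7) enter the array, Y0..Y2 are held in the PEs, and the match signals E0, E1, E2 feed an L-encoder that outputs Li.]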
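Design-P PE and Design-i PE [Figure: side-by-side processing elements. Each holds Yj in a w-bit register, receives Xi on a w-bit input, and uses an a = b comparator to produce Ei; the labels also show delay elements D and an accumulator producing Li.]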
L-Encoder [Figure: logic with inputs E0, E1, E2 and outputs Li0, Li1.]
MRB of Design-P and MRB of Design-2i [Figure: side-by-side match results blocks. Both contain an Lmax Mux and Lmax Reg driven by an a > b comparator on Li, an Ip Mux and Ip Reg fed by a position counter p, and an a = b comparator against n - Ls that raises "Code word ready"; the labels also show Ls, a "done" signal, and a second counter for the Xi + [n - Ls/2] position with its own a = b comparator.]
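Parallel Compression [Figure: two Design-P arrays operating in parallel, each with PE0, PE1, PE2, delay elements D, stored symbols Y0..Y2 and an L-encoder fed by E0, E1, E2; the X symbols X0..X8 are distributed between the arrays, which output LI and LII respectively.]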
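LZ Compression Chip [Figure: block diagram. The input sequence enters a FIFO governed by Control_FIFO; the FIFO supplies Yi and Xi to the SALZC component, which outputs Li; a Control block and the Host controller produce the code word.]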
First-in-First-out (FIFO) [Figure: a Block RAM addressed by Write_address from a Write_counter and read_address from a Read_counter; inputs are Input_sequence and the control signals.]
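The FIFO structure named in the figure (a RAM plus separate write and read counters) can be modelled in software as a ring buffer. This is a minimal sketch: the class name `Fifo`, the occupancy counter `count`, and the full/empty error behaviour are assumptions for illustration and are not taken from the slides.

```python
class Fifo:
    """Ring-buffer FIFO: a fixed RAM addressed by a write counter and a read counter."""

    def __init__(self, depth):
        self.ram = [None] * depth  # stands in for the block RAM
        self.write_counter = 0     # next write address
        self.read_counter = 0      # next read address
        self.count = 0             # symbols currently stored (assumed bookkeeping)

    def write(self, symbol):
        if self.count == len(self.ram):
            raise OverflowError("FIFO full")
        self.ram[self.write_counter] = symbol
        self.write_counter = (self.write_counter + 1) % len(self.ram)
        self.count += 1

    def read(self):
        if self.count == 0:
            raise IndexError("FIFO empty")
        symbol = self.ram[self.read_counter]
        self.read_counter = (self.read_counter + 1) % len(self.ram)
        self.count -= 1
        return symbol


fifo = Fifo(depth=4)
for s in "abc":
    fifo.write(s)
print(fifo.read(), fifo.read())
```

Both counters wrap around at the RAM depth, so symbols leave in the order they arrived.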
I/O Interface of LZ Compression Chip [Figure: the LZ compression chip with an 8-bit data input, 6 control signals, a 16-bit codeword output, and the "Codeword ready" and "end" signals.]
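Testing Application [Figure: a parallel port interface connected to the LZ compression chip through a latch of the input stream, a latch of the control signals and a Mux with select lines S1 and S2 and enable En; the buses are labelled with widths of 8, 6, 5 and 3 bits.]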
Data Flow of Testing Application [Figure: the PC sends the data stream to the LZ compression chip, which returns the compressed data to the PC.]
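Decompression Architecture [Figure: the input codeword enters a code checker, which routes a direct symbol, or a pointer and a length, onward; a shift control with enable (En) drives a register chain R1, R2, ..., R(n-Ls-1), R(n-Ls); selector logic uses the pointer and length to drive the select input of an output MUX.]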
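The decompression path can be sketched in software as follows: the code checker corresponds to testing the flag, and the register chain corresponds to the most recent decoded symbols. This is an illustrative model under stated assumptions: the name `lzss_decode` is ours, symbols are single characters, the dictionary length of 7 is borrowed from the earlier slide example, and the pointer is assumed to be a 1-based index from the oldest dictionary symbol.

```python
def lzss_decode(codewords, dict_len=7):
    """Rebuild the symbol stream from (0, S) and (1, pointer, length) codewords."""
    out = []
    for cw in codewords:
        if cw[0] == 0:
            # Direct symbol: copy it straight to the output.
            out.append(cw[1])
        else:
            # Match: copy `length` symbols starting at `pointer` in the dictionary,
            # i.e. the last dict_len decoded symbols (models the register chain).
            _, pointer, length = cw
            dictionary = out[max(0, len(out) - dict_len):]
            start = pointer - 1  # assumed 1-based pointer
            out.extend(dictionary[start:start + length])
    return "".join(out)


codewords = [(0, s) for s in "cbbacde"] + [(1, 2, 3), (0, "d")]
print(lzss_decode(codewords))
```

The decoder needs no search, only a copy from a known position.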
The Compression Rate (Rc) • $R_c = \dfrac{L_s \cdot w \cdot \mathrm{clk}}{n - L_s + 1}$ • Example: the dictionary size (n) = 1k, Ls = 16, w = 8, clk = 104.308 MHz, which gives Rc = 13 Mbit per second
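The example figure can be reproduced numerically. This sketch assumes the rate formula reads Rc = Ls · w · clk / (n - Ls + 1), which is the arrangement of the slide's symbols that yields the quoted 13 Mbit per second, and it assumes "1k" means 1024; the function name `compression_rate` is ours.

```python
def compression_rate(n, ls, w, clk_hz):
    """Compression rate in bits per second: Ls*w bits every (n - Ls + 1) clock cycles."""
    return ls * w * clk_hz / (n - ls + 1)


rate = compression_rate(n=1024, ls=16, w=8, clk_hz=104.308e6)  # n = 1k taken as 1024
print(f"Rc = {rate / 1e6:.1f} Mbit/s")
```

Under these assumptions the result is about 13.2 Mbit per second, consistent with the rounded value on the slide.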
Software Simulation • Data sets: Silesia corpus, Calgary corpus
Conclusions • The proposed implementation is area and speed efficient: the compression rate is increased by more than 40% and the design area is decreased by more than 30%. • The prototype is implemented on a Xilinx Spartan-II FPGA. • The chip can be incorporated into real-time systems so that data can be compressed and decompressed on the fly.
Future Work • Studying the effect of combining the proposed architecture for LZ data compression and elliptic curve cryptography in a single chip. • Studying fast string-matching techniques to accelerate the compression process. • By modifying the host controller and including, e.g., dictionaries, the chip can be used for other string-matching-based LZ algorithms, such as LZ78 and LZW.