
Tinoosh Mohsenin and Bevan M. Baas VLSI Computation Lab, ECE Department

Split-Row: A Reduced Complexity, High Throughput Low Density Parity Check (LDPC) Decoder Architecture. Tinoosh Mohsenin and Bevan M. Baas, VLSI Computation Lab, ECE Department, University of California, Davis.


Presentation Transcript


  1. Split-Row: A Reduced Complexity, High Throughput Low Density Parity Check (LDPC) Decoder Architecture Tinoosh Mohsenin and Bevan M. Baas VLSI Computation Lab, ECE Department University of California, Davis

  2. Outline • Introduction to LDPC Codes • Split-Row Decoder Algorithm • Error Performance Comparison • Decoder Implementation Results • Conclusion

  3. Error Correction in Communication Systems Error correction is widely used in most communication systems.

  4. LDPC Code Applications • Standards: • 10 Gigabit Ethernet (10GBASE-T): 2006 • Digital Video Broadcasting (DVB-S2): 2005 • Next generation of WiFi and WiMAX • Problems with current LDPC decoders: • Insufficient memory bandwidth • High interconnect complexity [www.ieee802.org/3/an/]

  5. LDPC Coding • [Figure: an encoded image at the transmitter passes through a noisy channel; the receiver shows the received image and the decoded image at iteration 1 and iteration 14. Images modified from [MacKay 2001].]

  6. LDPC Decoding: Message Passing Algorithm • Performs row and column operations iteratively. • [Figure: information received from the channel feeds row processing and column processing, which exchange α and β messages and perform parity check and error correction.] • Parity check matrix shown on the slide:

  $$\begin{bmatrix}
  0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
  1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
  0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
  0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
  1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
  0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1
  \end{bmatrix}$$

  7. Serial Decoders • One or a few row and column processing units. • Features • Simple • Small area • Small number of memories • Disadvantages • Low memory bandwidth • Low throughput: 100 Kbps to 10 Mbps

  8. Full Parallel Decoders • Row and column processors are directly mapped according to the parity check matrix • High throughput • Disadvantages • Large circuit area • High interconnect complexity • Example: 2048-bit, 10GBASE-T • Row weight = 32, col weight = 6, quantization bits = 5 • 139 mm² in 0.18 µm CMOS • 122,000 long inter-processor wires • 1.3 Gbps

  9. Outline • Introduction to LDPC Codes • Split-Row Decoder Algorithm • Error Rate Comparison • Decoder Implementation Results • Conclusion

  10. Key Features of Split-Row Decoder • Row processing (dominates decoder complexity) • Increased parallelism • Reduced number of memory accesses • Reduced processor complexity • Results: • Smaller decoder area and higher utilization • Lower interconnect complexity • Higher throughput • Simpler hardware implementation

  11. Standard vs. Split-Row Decoder • Standard decoder: N columns, row weight = Wr • Split-Row decoder: two halves of N/2 columns each, row weight = Wr/2 in each half

  12. Split-Row Algorithm: Mathematical View • The magnitude of the row processor output α is larger for the Split-Row decoder, since each half takes its minimum over only its own Wr/2 inputs. • Normalizing the α values with a scale factor S < 1 improves the error performance of the Split-Row decoder.
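The claim on slide 12 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a MinSum row update in which each half of the row takes the minimum magnitude over its own inputs only, while the sign is still taken over the full row (suggested by the Q&A notes on slide 30). The function names, the input values, and the scale factor 0.5 are hypothetical.

```python
import math

def minsum_row(beta):
    """Standard MinSum row update: for each position j, combine the signs
    and take the minimum magnitude of all *other* inputs in the row."""
    out = []
    for j in range(len(beta)):
        others = beta[:j] + beta[j + 1:]
        sign = math.prod(1 if b >= 0 else -1 for b in others)
        out.append(sign * min(abs(b) for b in others))
    return out

def split_row(beta, scale):
    """Split-Row sketch (assumed form): the minimum is taken within the
    same half only, the sign over the full row, then scaled by S < 1."""
    n = len(beta)
    half = n // 2
    out = []
    for j in range(n):
        lo, hi = (0, half) if j < half else (half, n)
        same_half = [abs(beta[k]) for k in range(lo, hi) if k != j]
        sign = math.prod(1 if beta[k] >= 0 else -1 for k in range(n) if k != j)
        out.append(scale * sign * min(same_half))
    return out

# Hypothetical column-to-row messages for one row of weight 6.
beta = [0.5, -2.0, 3.0, 1.5, -4.0, 2.5]
standard = minsum_row(beta)          # [1.5, -0.5, 0.5, 0.5, -0.5, 0.5]
unscaled = split_row(beta, 1.0)      # [2.0, -0.5, 0.5, 2.5, -1.5, 1.5]
scaled = split_row(beta, 0.5)        # [1.0, -0.25, 0.25, 1.25, -0.75, 0.75]
```

Without scaling, every Split-Row magnitude is at least as large as the standard MinSum magnitude, because a minimum over a subset can never be smaller than the minimum over the full set. The scale factor pulls the magnitudes back down.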

  13. Outline • Introduction to LDPC Codes • Split-Row Decoder Algorithm • Error Performance Comparison • Decoder Implementation Results • Conclusion

  14. Bit Error Rate Performance Comparison • Code length: 1536 bits • Message length: 1155 bits • Row weight: 16 • Column weight: 4 • No. of iterations: 15 • Legend: MS = MinSum; MS Split-Row = MinSum Split-Row; S = scale factor • Plot annotation: 0.6 dB • [Figure: bit error rate plot]

  15. Bit Error Rate Performance Comparison • Code length: 2048 bits • Message length: 1723 bits • Row weight: 32 • Column weight: 6 • No. of iterations: 15 • Legend: MS = MinSum; MS Split-Row = MinSum Split-Row; S = scale factor • Plot annotation: 0.3 dB • [Figure: bit error rate plot]

  16. Outline • Introduction to LDPC Codes • Split-Row Decoder Algorithm • Error Rate Comparison • Decoder Implementation Results • Conclusion

  17. A Full-Parallel Decoder Implementation • LDPC code example: • Code length = 1536 bits • Message length = 770 bits • Row weight = 6 • Col weight = 3 • In the Split-Row decoder: • The wires between the two halves make up 3% of the total wires. • Row processors in each half are 2.7 times smaller. • Each row processor in each half is connected to only 3 column processors.

  18. Full Parallel Decoder Architecture • 0.18 µm CMOS technology, 6 metal layers • Split-Row, each half includes: • 768 row processors • 768 column processors • [Figure: layouts of the Standard MinSum and Split-Row decoders]

  19. Split-Row vs. Standard Decoder • [Table: values not preserved in this transcript; the surviving column units are mm, mm², MHz, and Gbps] • 1536-bit (3,6) quasi-cyclic LDPC code • Messages are quantized to 5 bits. • Throughput is computed with 15 decoding iterations. • Reported numbers are based on chip implementation results in 0.18 µm CMOS.

  20. Conclusion • The Split-Row decoding method provides a significant reduction in circuit area • Results in: • Reduced wire interconnect complexity • Increased circuit area utilization • Increased speed • Simpler implementation • A good tradeoff between hardware complexity and error performance

  21. Acknowledgments • Intel Corporation • UC Micro • NSF Grant No. 0430090 • UCD Faculty Research Grant

  22. MinSum: Message Passing (Row processing)

  23. Message Passing (Column processing) • λj is the received information.
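The update equations for slides 22 and 23 did not survive in this transcript. For reference, a commonly used form of the MinSum updates is sketched below, with α as the row processor output, β as the column processor output, and λj as the received information. The index sets V(i) (columns participating in row i) and C(j) (rows participating in column j) are notation assumed here and may differ from the slides.

```latex
% Row processing (MinSum): sign product and minimum magnitude over the
% other column messages in row i
\alpha_{ij} = \prod_{j' \in V(i) \setminus j} \operatorname{sign}(\beta_{ij'})
              \times \min_{j' \in V(i) \setminus j} \lvert \beta_{ij'} \rvert

% Column processing: received information plus the other row messages
% in column j
\beta_{ij} = \lambda_j + \sum_{i' \in C(j) \setminus i} \alpha_{i'j}
```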

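  24. [Figure: α messages, received value y1, and λ1 annotated on the parity check matrix H]

  $$H = \begin{bmatrix}
  0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\
  1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
  0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
  0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
  1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
  0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1
  \end{bmatrix}$$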

  25. H·Vᵗ = 0 (stop decoding) • H·Vᵗ ≠ 0 (repeat decoding)

  26. LDPC Codes • An LDPC code is defined by a binary matrix called the parity check matrix H. • Rows define parity check equations (constraints) between encoded symbols in a code word, and the number of columns defines the length of the code. • V is a valid code word if H·Vᵗ = 0. • The decoder in the receiver checks whether the condition H·Vᵗ = 0 holds. • Example: parity check matrix for a (9, 5) LDPC code, row weight = 4, column weight = 2:
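The validity condition H·Vᵗ = 0 can be checked with a few lines of Python. This is a minimal sketch: the matrix is the 6 × 9 matrix that appears on slides 6 and 24 of this transcript (its rows have weight 3, so it may not be the example matrix this slide refers to), and the function names and test vectors are hypothetical.

```python
# 6 x 9 binary matrix as it appears on slides 6 and 24 of this transcript.
H = [
    [0, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 1],
]

def syndrome(H, v):
    """Compute H * v^T over GF(2): one parity bit per row of H."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]

def is_codeword(H, v):
    """v is a valid code word when every parity check is satisfied."""
    return not any(syndrome(H, v))

valid = [1, 0, 1, 0, 1, 0, 0, 1, 0]      # satisfies all six checks
corrupted = [0, 0, 1, 0, 1, 0, 0, 1, 0]  # same word with the first bit flipped

print(is_codeword(H, valid))      # True: decoding can stop
print(syndrome(H, corrupted))     # [0, 1, 0, 0, 1, 0]: decoding repeats
```

The flipped bit sits in a column with ones in the second and fifth rows, so exactly those two parity checks fail.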

  27. Row and Column Processor Architecture • [Figure: row processor (Row Proc.) and column processor (Col. Proc.)]

  28. [Figure: left Row+Col processors and right Row+Col processors]

  29. Throughput = Clk × Code length / Imax • P = c·f·V²
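The throughput formula on slide 29 can be evaluated directly. The sketch below uses the 1536-bit code length and 15 iterations mentioned on slide 19; the 100 MHz clock is a hypothetical value chosen only for illustration, since the measured clock rates are not preserved in this transcript. The function name is also an assumption.

```python
def throughput_bps(clock_hz, code_length_bits, max_iterations):
    """Throughput = Clk * Code length / Imax, in bits per second."""
    return clock_hz * code_length_bits / max_iterations

# Hypothetical 100 MHz clock with the 1536-bit code and 15 iterations.
rate = throughput_bps(100e6, 1536, 15)
print(f"{rate / 1e9:.2f} Gbps")  # 10.24 Gbps
```

The formula implies that one block of Code length bits is decoded every Imax clock cycles, so a higher clock rate or fewer iterations raises throughput proportionally.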

  30. What is the critical path, and how do you make sure that the sign is computed correctly? • Answer: The critical path is the sign computation, which depends on the other half. The statistical timing analysis in place and route reports the slowest path delay, so it ensures that the circuit works correctly. • Why does the decoder chip become smaller when you split it into halves? • Answer: First, the size and total number of column processors don't change. The main benefit comes from the row processors, which shrink by more than a factor of two. The reason is that a row processor contains several stages of comparators, and these shrink by more than a factor of two when the number of inputs is halved. • You mentioned the design is power efficient, but you didn't report any power numbers. • Answer: For this paper we didn't obtain power numbers, but power can be estimated from the fact that most of the energy is consumed in the wires (P = ½·c·f·V²); we can say it scales down linearly, so the reduction is about 58%. • Are there other works close to your design?

  31. Which applications can tolerate this error performance loss? • This is a very broad question. It really depends on the power budget and how low you want to go on BER. • What is the difference between Viterbi and LDPC codes? • What is the difference between turbo and LDPC codes? • If you don't know the answer: • I was not involved in that part of the project, but from what I know … • Review the previous works • If asked why the chip figure is not square: • If somebody asks: the method you proposed didn't decrease the number of wires, so how can you say that it decreases the interconnect complexity? • You should notice that we are talking about long wires. Because when there is a large number of wires connecting one

  32. Hard decision vs. soft decision: • In hard decision decoding, each received symbol is thresholded to yield a single received bit as input to the decoding algorithm, and the messages passed between variable and check nodes are single bits only. • In soft decision decoding, multiple bits are used to represent each received symbol and the messages passed between variable and check nodes. • How did you compute
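The hard versus soft distinction on slide 32 can be sketched in Python. This is only an illustration under stated assumptions: a BPSK-style mapping (bit 0 sent as a positive value, bit 1 as a negative value), a uniform quantizer, and a hypothetical step size of 0.25. The 5-bit width matches the quantization mentioned on slide 19; the function names are made up for this example.

```python
def hard_decision(y):
    """Hard decision: threshold the received value to a single bit.
    Assumes bit 0 is sent as a positive value and bit 1 as a negative one."""
    return 0 if y >= 0 else 1

def soft_decision(y, bits=5, step=0.25):
    """Soft decision: keep a multi-bit value by uniform quantization,
    clipped to the signed range of `bits` bits (step size is hypothetical)."""
    lo = -(1 << (bits - 1))       # -16 for 5 bits
    hi = (1 << (bits - 1)) - 1    # +15 for 5 bits
    return max(lo, min(hi, round(y / step)))

for y in (0.9, -0.1, -10.0):
    print(y, hard_decision(y), soft_decision(y))
```

A received value of 0.9 and one of -0.1 become plain bits 0 and 1 under hard decision, while the soft values 4 and 0 retain how reliable each sample is. Strongly negative samples such as -10.0 saturate at -16.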
