High Throughput LDPC Decoders Using a Multiple Split-Row Method

This paper discusses the implementation of high throughput LDPC decoders using a multiple split-row method. The proposed method reduces interconnect complexity and processor complexity while increasing throughput. It is well-suited for long-length LDPC codes and hardware implementations.


Presentation Transcript


  1. High Throughput LDPC Decoders Using a Multiple Split-Row Method • Tinoosh Mohsenin and Bevan M. Baas • VLSI Computation Lab, ECE Department, University of California, Davis

  2. Outline • Introduction to LDPC Codes and Decoders • Multi-Split-Row Decoding Method • Implementing Multi-Split-Row Decoders • Conclusion

  3. Error Correction in Communication Systems Error correction is widely used in communication systems

  4. LDPC Codes Applications • Standards • Digital Video Broadcasting (DVB-S2): 2005 • 10 Gigabit Ethernet (10GBASE-T): 2006 • Next generation of WiMAX • Challenges with LDPC decoders • High memory bandwidth requirement • High interconnect complexity • Many target applications are power and cost constrained

  5. LDPC Decoding: Message Passing Algorithm • Performs row and column operations iteratively • Example (9,5) LDPC Code • Code length (N) = 9 • Information length = 5 • Row weight (Wr) = 3 • Column weight (Wc) = 2 [Figure: row processing produces the α messages, column processing produces the β messages]

     $$H = \begin{bmatrix} 0&0&1&1&0&0&0&1&0 \\ 1&0&0&0&1&0&0&0&1 \\ 0&1&0&0&0&1&1&0&0 \\ 0&0&1&0&1&0&1&0&0 \\ 1&0&0&0&0&1&0&1&0 \\ 0&1&0&1&0&0&0&0&1 \end{bmatrix}$$
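The parameters listed on this slide can be checked directly against the example matrix. The sketch below copies H from the slide and counts the ones per row and per column; the helper `is_codeword` is a hypothetical name added here for illustration, not part of the presentation.

```python
# Parity-check matrix H of the (9,5) example, copied from slide 5.
H = [
    [0, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 1],
]

# Row weight Wr: number of ones in each row (bits taking part in each check).
row_weights = [sum(row) for row in H]

# Column weight Wc: number of ones in each column (checks each bit takes part in).
col_weights = [sum(col) for col in zip(*H)]


def is_codeword(H, x):
    """Return True if every parity check of H is satisfied by bit vector x (mod 2)."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)


print(row_weights)  # every entry is 3
print(col_weights)  # every entry is 2
```

Every row has weight 3 and every column has weight 2, so the code is regular, which matches Wr = 3 and Wc = 2 on the slide.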

  6. Message Passing (Row Processing) • Uses the same example matrix H as slide 5 • Row processing variants: SPA and MinSum
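The two row-processing rules named on this slide are usually written as shown below. This is the common textbook form, not a transcription of the slide. The set $V(i)$ (columns holding a 1 in row $i$ of H) and the function $\phi$ are notation assumed here; α is the row-processor output and β its input, as in the rest of the deck.

```latex
% Sum Product Algorithm (SPA) row update
\alpha_{ij} = \prod_{j' \in V(i) \setminus j} \operatorname{sign}(\beta_{ij'})
              \times \phi\!\left( \sum_{j' \in V(i) \setminus j} \phi\bigl(|\beta_{ij'}|\bigr) \right),
\qquad \phi(x) = -\log \tanh\frac{x}{2}

% MinSum row update: the phi-sum is approximated by the smallest magnitude
\alpha_{ij} = \prod_{j' \in V(i) \setminus j} \operatorname{sign}(\beta_{ij'})
              \times \min_{j' \in V(i) \setminus j} |\beta_{ij'}|
```

Both rules share the same sign term and differ only in the magnitude term, which is why MinSum is cheaper in hardware.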

  7. Message Passing (Column Processing) • Uses the same example matrix H as slide 5 • The column update includes the received information from the channel
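A minimal sketch of the column step, paired with a MinSum row step so that a full iteration can run on the example matrix. Several things here are assumptions rather than content from the slides: the channel information is called `lam` (one log-likelihood ratio per bit), a non-negative value means bit 0, the column rule is the usual "channel value plus the α messages from the other rows", and all function names are hypothetical.

```python
# Example parity-check matrix from slide 5 (6 checks, 9 bits).
H = [
    [0, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 1],
]


def row_process_minsum(H, beta):
    """MinSum row step: alpha[i][j] is built from the other beta values in row i."""
    alpha = [[0.0] * len(row) for row in H]
    for i, row in enumerate(H):
        cols = [j for j, h in enumerate(row) if h]
        for j in cols:
            others = [beta[i][k] for k in cols if k != j]
            # Product of signs: negative when an odd number of inputs is negative.
            sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
            alpha[i][j] = sign * min(abs(v) for v in others)
    return alpha


def column_process(H, lam, alpha):
    """Column step: beta[i][j] = lam[j] + alpha messages from the other rows of column j."""
    beta = [[0.0] * len(row) for row in H]
    for j in range(len(lam)):
        rows = [i for i in range(len(H)) if H[i][j]]
        for i in rows:
            beta[i][j] = lam[j] + sum(alpha[k][j] for k in rows if k != i)
    return beta


def decode(H, lam, iterations=15):
    """Run a fixed number of iterations and return hard decisions (value >= 0 means bit 0)."""
    # Initialise beta with the channel values at the positions where H has a 1.
    beta = [[lam[j] if h else 0.0 for j, h in enumerate(row)] for row in H]
    alpha = [[0.0] * len(row) for row in H]
    for _ in range(iterations):
        alpha = row_process_minsum(H, beta)
        beta = column_process(H, lam, alpha)
    posterior = [
        lam[j] + sum(alpha[i][j] for i in range(len(H)) if H[i][j])
        for j in range(len(lam))
    ]
    return [0 if z >= 0 else 1 for z in posterior]


# All-zero codeword with one unreliable, wrongly signed bit at position 3.
lam = [2.0] * 9
lam[3] = -1.0
decoded = decode(H, lam, iterations=5)
print(decoded)
```

In this toy case a hard decision on the channel values alone would get bit 3 wrong, while the two checks that involve bit 3 pull it back to 0.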

  8. Decoder Architectures • Serial decoders • Single row processor, column processor, shared memory • Simple and small area • Disadvantages • Low throughput: 100 Kbps to 10 Mbps • Semi-parallel decoders • Multiple row and column processors, multiple memory banks • Higher throughput • Example: 2048-bit, rate-1/2, (3,6) programmable decoder [Mansour 2006] • 14.3 mm², 0.18 μm CMOS • 125 MHz, 640 Mbps

  9. Full Parallel Decoders • Row and column processors are directly mapped according to the parity check matrix • Highest throughput • Major challenges • Routing congestion due to extrinsic information passed between row and column processors • Large delay, area, and power caused by long wires • Example: 1024-bit, irregular code, 4 bits per symbol [Blanksby 2002] • 52.5 mm², 0.16 μm CMOS • 64 MHz, 1 Gbit/sec [Figure: M = 384 row processors (Row 1 ... Row 384) and N = 2048 column processors (Col 1 ... Col 2048), with wire counts 5×384×32 = 61440 on the row side and 5×2048×6 = 61440 on the column side]
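The two wire counts in the figure come out equal because a regular H matrix holds the same number of ones whether they are counted by rows or by columns. A short check, using the figure's numbers and assuming (as elsewhere in the deck) that the factor 5 is the number of bits per message:

```python
# Values read from the figure; b = 5 bits per message is taken from later slides.
b = 5        # bits per message
M = 384      # number of rows (row processors)
N = 2048     # number of columns (column processors)
Wr = 32      # row weight
Wc = 6       # column weight

ones_by_rows = M * Wr       # ones in H, counted row by row
ones_by_cols = N * Wc       # ones in H, counted column by column

row_side_wires = b * ones_by_rows   # 5 x 384 x 32
col_side_wires = b * ones_by_cols   # 5 x 2048 x 6

print(ones_by_rows, ones_by_cols)      # 12288 12288
print(row_side_wires, col_side_wires)  # 61440 61440
```

Each one in H corresponds to one row-column connection, so a full-parallel decoder needs b wires per one in each direction.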

  10. Outline • Introduction to LDPC Codes • Split-Row Decoder Algorithm • Multi-Split-Row Decoding Method • Implementing Multi-Split-Row Decoders • Conclusion

  11. Goals • Very high throughputs • Area efficient (small circuit area) • Therefore more energy efficient • Well suited for long-length LDPC codes • Well suited for hardware implementations

  12. The Multi-Split-Row Decoder • Key ideas • H matrix is split into multiple blocks • Each block is processed almost independently • Minimal information is shared between blocks • Results • Lower interconnect complexity • Reduced processor complexity • Hardware results • Higher throughput • Smaller decoder area and higher area utilization • Slightly increased error rate

  13. Standard vs. Multi-Split-Row Decoder [Figure panels: Standard, Multi-Split-Row]

  14. Multi-Split-Row Algorithm • The magnitude portion of the row processor output α is larger for the Multi-Split-Row decoder than for the standard decoder • Normalizing the α values with a scale factor S < 1 improves the error performance of the Multi-Split-Row decoder [Figure: row processor output formed from a sign part and a magnitude part, with the magnitude scaled by S]
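A sketch of why the split magnitudes come out larger, under one reading of the slides: each partition takes the minimum only over its own inputs, while the sign is still the product over the whole row (the information carried by the sign wires). That reading, the function names, and the sample input values are assumptions made for illustration.

```python
def sign_product(values):
    """Product of the signs of values: -1 if an odd number are negative, else +1."""
    return -1 if sum(v < 0 for v in values) % 2 else 1


def minsum_row(beta):
    """Standard MinSum row output: sign and minimum taken over all other inputs."""
    out = []
    for j in range(len(beta)):
        others = beta[:j] + beta[j + 1:]
        out.append(sign_product(others) * min(abs(v) for v in others))
    return out


def multi_split_row(beta, splits, scale):
    """Split-row output: full-row sign, minimum only within the local partition, scaled by S."""
    width = len(beta) // splits          # assumes len(beta) divisible by splits, width >= 2
    total_sign = sign_product(beta)      # what the sign wires make available to every block
    out = []
    for p in range(splits):
        block = beta[p * width:(p + 1) * width]
        for j, v in enumerate(block):
            local_others = block[:j] + block[j + 1:]
            own_sign = -1 if v < 0 else 1
            # total_sign * own_sign equals the sign product of all other inputs in the row.
            sign = total_sign * own_sign
            out.append(scale * sign * min(abs(x) for x in local_others))
    return out


beta = [0.5, -2.0, 3.0, 1.5, -0.25, 4.0, 2.5, -1.0]   # made-up inputs for one row
standard = minsum_row(beta)
split_unscaled = multi_split_row(beta, splits=2, scale=1.0)
split_scaled = multi_split_row(beta, splits=2, scale=0.3)
print(standard[0], split_unscaled[0])
```

A minimum over a subset can never be smaller than the minimum over the whole row, so the unscaled split magnitudes are at least as large as the standard ones; the scale factor S < 1 compensates for that overestimate.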

  15. Optimum Scale Factor • (2048,1723) RS-based LDPC code used by the 10 Gbit Ethernet standard • Row weight: 32 • Column weight: 6 • No. of iterations: 15 [Figure: two plots of bit error probability, Multi-Split-4 (scale factor = 0.2) and Multi-Split-2 (scale factor = 0.3)]

  16. Bit Error Rate Performance Comparison • Code length: 2048 bits • Message length: 1723 bits • Row weight: 32 • Column weight: 6 • No. of iterations: 15 • SPA: Sum Product Algorithm [Mackay 1999] • MinSum: [Fossorier 2002] • WBF: Weighted Bit Flipping [Kou, Lin 2001] • Improved WBF: [Fossorier 2004] • BF: Bit Flipping [Gallager 1963] [Figure: bit error rate curves with gaps of 0.35 dB and 0.25 dB marked]

  17. Bit Error Rate Performance Comparison • Code length: 5256 bits • Message length: 4823 bits • Row weight: 72 • Column weight: 6 • No. of iterations: 15 [Figure: bit error rate curves with gaps of 0.25 dB and 0.3 dB marked]
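The code parameters quoted on the two comparison slides can be turned into a code rate and a number of parity-check rows. This is plain arithmetic on the slide values; `regular_code_stats` is a hypothetical helper, and the row count assumes a regular code, where the total number of ones in H is N × Wc = M × Wr.

```python
def regular_code_stats(n, k, wr, wc):
    """Return code rate k/n and number of rows M = n*wc/wr for a regular LDPC code."""
    ones = n * wc                      # total ones in H, counted by columns
    if ones % wr != 0:
        raise ValueError("n * wc must be divisible by wr for a regular code")
    return {"rate": k / n, "rows": ones // wr}


code_2048 = regular_code_stats(n=2048, k=1723, wr=32, wc=6)
code_5256 = regular_code_stats(n=5256, k=4823, wr=72, wc=6)
toy_code = regular_code_stats(n=9, k=5, wr=3, wc=2)

print(round(code_2048["rate"], 3), code_2048["rows"])  # 0.841 384
print(round(code_5256["rate"], 3), code_5256["rows"])  # 0.918 438
```

The 384 rows for the 2048-bit code agree with the value of M given on the wire-count slide, and the toy (9,5) example gives the 6 rows of its H matrix.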

  18. Optimum Scale Factors for Different Codes • Multi-split row works best for: • Regular codes • High row-weight codes • The optimum scale factor decreases as the partitioning of the H matrix increases

  19. Outline • Introduction to LDPC Codes and Decoder Arch • Multi-Split-Row Decoding Method • Implementing Multi-Split-Row Decoders • Conclusion

  20. Sign-wire implementation

  21. Full-Parallel Decoder Implementations • (2048,1723) RS-based (6,32) LDPC code [Figure panels: Standard, Multi-Split-Row-2, Multi-Split-Row-4]

  22. A Full-Parallel Decoder Implementation • Number of sign-passing wires is negligible compared to the total number of wires • Total number of wires = 2·b·M·Wr + 2·(Spn − 1)·M • (2048,1723) LDPC code with • N = 2048 • M (number of rows) = 384 • b (bits per symbol) = 5 • Wr = 32
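Plugging the slide's values into the wire formula shows how small the sign-wire term is. The slide does not define Spn; the sketch below reads it as the number of partitions (2 for Multi-Split-Row-2, 4 for Multi-Split-Row-4), which is an assumption, and `wire_counts` is a hypothetical helper name.

```python
def wire_counts(b, M, Wr, spn):
    """Split the slide's formula 2*b*M*Wr + 2*(Spn-1)*M into its two terms."""
    message_wires = 2 * b * M * Wr     # message wires between row and column processors
    sign_wires = 2 * (spn - 1) * M     # sign-passing wires between partitions
    return message_wires, sign_wires


b, M, Wr = 5, 384, 32                  # values from the slide

for spn in (1, 2, 4):                  # 1 = unsplit decoder (no sign wires)
    message_wires, sign_wires = wire_counts(b, M, Wr, spn)
    share = sign_wires / (message_wires + sign_wires)
    print(spn, message_wires, sign_wires, f"{share:.2%}")
```

Under this reading the sign wires stay below 2% of the total for both split variants, which is consistent with the slide calling them negligible.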

  23. Full Parallel Decoder Chips 0.18 µm CMOS Technology, 6M layer

  24. Three Full Parallel MinSum Decoders • (6,32) (2048,1723) RS-based LDPC code • Resolution of 5 bits per message • Throughputs calculated at 15 decoding iterations • Results based on 0.18 µm CMOS, 1.8 V at 85 °C

  25. Conclusion • Multi-Split-Row decoder method provides a significant reduction in circuit area • Results in: • Reduced wire interconnect complexity • Increased circuit area utilization • Increased speed • Simpler implementation • A good tradeoff between hardware complexity and error performance

  26. Acknowledgments • Support • Intel Corporation • UC MICRO • NSF Grant No. 0430090 • NSF CAREER Award No. 0546907 • UCD Faculty Research Grant • Thanks • Prof. Shu Lin • Lan Lan • Eric Work • Zhiyi Yu
