A Scalable Architecture for LDPC Decoding

A Scalable Architecture for LDPC Decoding. Cocco, M.; Dielissen, J.; Heijligers, M.; Hekstra, A.; Huisken, J. Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings, Volume 3, Feb. 16-20, 2004, pages 88-93.

Outline
• Introduction
• Serial approach
• UMP algorithm
• Dataset in check nodes
• Check operation
• Computation skill
• Memory reduction
• Computation for Iteration

Introduction
• High-rate (R = 0.9) LDPC code
• Row weight K (average 30)
• High code rate, long codeword length, and high SNR
• Memory reduction (1/10)

Serial Approach
• Storage media applications (optical or magnetic)
• Relaxed delay requirement
• Processing runs from the first bit node to the last bit node
• Memory storage for the messages

UMP Algorithm
FOR 40 ITERATIONS DO
  FOR ALL BIT NODES DO
    FOR EACH INCOMING ARC X
      SUM ALL INCOMING LLRs EXCEPT OVER X
      SEND THE RESULT BACK OVER X
    NEXT ARC
  NEXT BIT NODE
  FOR ALL CHECK NODES DO
    FOR EACH INCOMING ARC X
      TAKE THE ABS MINIMUM OF THE INCOMING LLRs EXCEPT OVER X
      TAKE THE XOR OF THE INCOMING LLRs EXCEPT OVER X
      SEND THE RESULT BACK OVER X
    NEXT ARC
  NEXT CHECK NODE
NEXT ITERATION
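The two node updates in the pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the channel LLR is assumed to be one of the inputs summed at a bit node, and the "XOR" is assumed to act on the hard bits (sign bits) of the LLRs.

```python
def bit_node_update(channel_llr, incoming):
    """Messages a bit node sends back, one per arc.

    incoming: check-to-bit LLRs, one per arc.
    Assumption: the channel LLR is included in every sum.
    """
    total = channel_llr + sum(incoming)
    # Sum of all inputs except the one that arrived over arc x.
    return [total - m for m in incoming]


def check_node_update(incoming):
    """Messages a check node sends back, one per arc.

    Magnitude: minimum |LLR| over all arcs except x.
    Sign: XOR of the hard bits (1 for a negative LLR) over all arcs except x.
    """
    out = []
    for x in range(len(incoming)):
        others = incoming[:x] + incoming[x + 1:]
        magnitude = min(abs(m) for m in others)
        parity = 0
        for m in others:
            parity ^= 1 if m < 0 else 0
        out.append(-magnitude if parity else magnitude)
    return out
```

For example, `check_node_update([2.0, -0.5, 3.0])` excludes each arc in turn, so the arc that carried the smallest magnitude receives the second-smallest magnitude back.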

UMP algorithm
• No knowledge of the channel SNR is needed: robust performance
• No complex mathematical function (tanh x) is needed: area saving

Dataset in check nodes (figure: check node 4)
• Minimum: overall minimum value
• One-but-minimum: second-smallest value
• Index: position of the overall minimum

Check operation
• Compute the exclusive OR of the hard bits output by all connected bit nodes except the jth.
• Compute the minimum of the absolute values of the LLRs of the K bit nodes to which the check node is connected, except the jth.

Computation skill
• Minimum: if LLRj is not the overall minimum, the minimum sent to bit node j is the overall minimum. Otherwise, it is the one-but-minimum (second-smallest value).
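A small Python sketch of this shortcut is given below. The class and function names are hypothetical, and storing the XOR of all hard bits inside the dataset is an assumption made here so that the sign can be recovered the same way as the magnitude; the slides list only the minimum, the one-but-minimum, and the index.

```python
from dataclasses import dataclass


@dataclass
class CheckDataset:
    minimum: float  # overall minimum |LLR|
    second: float   # one-but-minimum |LLR|
    index: int      # arc that supplied the overall minimum
    parity: int     # XOR of all hard bits (assumed to be stored as well)


def build_dataset(llrs):
    """Scan the K incoming LLRs once and keep only the compact dataset."""
    minimum, second, index, parity = float("inf"), float("inf"), -1, 0
    for j, llr in enumerate(llrs):
        mag = abs(llr)
        if mag < minimum:
            minimum, second, index = mag, minimum, j
        elif mag < second:
            second = mag
        parity ^= 1 if llr < 0 else 0
    return CheckDataset(minimum, second, index, parity)


def outgoing(ds, j, llr_j):
    """Message to arc j: excludes arc j without rescanning the other arcs."""
    magnitude = ds.second if j == ds.index else ds.minimum
    # XOR-ing arc j's own hard bit out of the total parity excludes it.
    parity = ds.parity ^ (1 if llr_j < 0 else 0)
    return -magnitude if parity else magnitude


llrs = [2.0, -0.5, 3.0]
ds = build_dataset(llrs)
messages = [outgoing(ds, j, llr) for j, llr in enumerate(llrs)]
```

With this layout a check node needs a fixed number of stored values regardless of its row weight K, which is where the memory saving comes from.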

Memory reduction • Original size • Reduced size

Memory unit inside Check node

Computation for Iteration
FOR 40 ITERATIONS DO
  FOR ALL BIT NODES DO
    CALCULATE THE OUTPUT MESSAGES FROM THE 3 CONNECTED CHECK NODES
    DO RUNNING CHECK NODE UPDATES ON THE 3 CHECK NODES
  NEXT BIT NODES
NEXT ITERATION
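One way to read this serial schedule is sketched below in Python. It is an illustration under several assumptions, not the paper's design: each check node keeps an OLD dataset (read during the iteration) and a NEW dataset (built by running updates), the hard bit sent over each arc in the previous iteration is kept in a hypothetical `signs` table, and the toy graph connects each bit node to one or two check nodes rather than three.

```python
INF = float("inf")


def empty_dataset():
    # [minimum, one-but-minimum, index of the minimum, XOR of hard bits]
    return [INF, INF, -1, 0]


def running_update(ds, bit, llr):
    """Fold one bit-to-check message into a dataset."""
    mag = abs(llr)
    if mag < ds[0]:
        ds[0], ds[1], ds[2] = mag, ds[0], bit
    elif mag < ds[1]:
        ds[1] = mag
    ds[3] ^= 1 if llr < 0 else 0


def check_message(ds, bit, old_sign):
    """Check-to-bit message read from an OLD dataset."""
    mag = ds[1] if ds[2] == bit else ds[0]
    return -mag if ds[3] ^ old_sign else mag


def initialise(channel, checks_of_bit, n_checks):
    """Build the first OLD datasets from the channel LLRs alone."""
    old = [empty_dataset() for _ in range(n_checks)]
    signs = {}
    for bit, checks in enumerate(checks_of_bit):
        for c in checks:
            running_update(old[c], bit, channel[bit])
            signs[(bit, c)] = 1 if channel[bit] < 0 else 0
    return old, signs


def iterate(channel, checks_of_bit, old, signs):
    """One pass over all bit nodes; returns the NEW datasets and bit totals."""
    new = [empty_dataset() for _ in old]
    totals = []
    for bit, checks in enumerate(checks_of_bit):
        # Output messages from the connected check nodes (OLD datasets).
        incoming = [check_message(old[c], bit, signs[(bit, c)]) for c in checks]
        total = channel[bit] + sum(incoming)
        totals.append(total)
        # Running check node updates (NEW datasets).
        for c, m in zip(checks, incoming):
            out = total - m
            running_update(new[c], bit, out)
            signs[(bit, c)] = 1 if out < 0 else 0
    return new, totals


# Toy graph: check 0 covers bits 0 and 1, check 1 covers bits 1 and 2.
channel = [2.0, -0.5, 3.0]
checks_of_bit = [[0], [0, 1], [1]]
old_datasets, signs = initialise(channel, checks_of_bit, n_checks=2)
new_datasets, totals = iterate(channel, checks_of_bit, old_datasets, signs)
```

In this toy run the weak negative LLR of bit 1 is pulled positive by its two checks. For the next iteration the NEW datasets would take the role of the OLD ones.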

Computation for Iteration
(Figure: check node datasets, each shown as a NEW | OLD pair.)

Time folded architecture
(Block diagram. Blocks: FSM & PC, μROM, Computational Kernel, Prefetcher, Memory. Signals: Control, R/W & address, Serial input, Serial output.)

Prefetch
• Every dataset is statically used for 30 consecutive cycles.
• On average, 2 read and 2 write operations are required every clock cycle.
• Delayed writeback
• Dataset caching

Tiled architecture
(Block diagram. Blocks: FSM & PC, μROM, Computational Kernel, Prefetcher, Memory.)

Result and area distribution
• N = 1020, R = 0.5, 57 tiles
• 36 mm² in 0.13 μm at 1 GHz, 300 Mb/s

Conclusion
• Speedup and simultaneous multiple accesses: prefetch
• Reduced memory access latency: memory hierarchy
• Increased performance: N-tiled architecture
• A modified version can be pipelined
