
Software Defined Radio – A High Performance Embedded Challenge



Presentation Transcript


  1. Software Defined Radio – A High Performance Embedded Challenge • Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and Krisztian Flautner¹ • University of Michigan • ¹ARM Ltd.

  2. Contents • Software defined radio • Categories of wireless networks • Core technologies for future networks • Case study : W-CDMA Network • Major algorithms • Workload characterization • Architectural implications Advanced Computer Architecture Laboratory University of Michigan

  3. Software Defined Radio

  4. Wireless Communication System [Figure: protocol stack. Upper protocol layers: Application, Transport (TCP/UDP), Network (IP), Link (PPP, MAC), exchanging packets. Physical layer (PHY): baseband processing and analog front-end, carrying bits over the "air".]

  5. Anatomy of Cellular Phone

  6. Protocol on Wireless Platform [Figure: mapping of protocol layers to hardware. Labels: Application Processor; GPP (software); DSP/accelerator for source coding (audio: AMR/QCELP, video: MPEG); Baseband Processor; GPP (software) for the upper layers (transport, network, link, MAC); ASIC (hardware) for the physical layer (PHY).]

  7. Software Defined Radio (SDR) • Use software routines instead of ASICs for the physical-layer operations of a wireless communication system [Figure: ASICs (PHY) replaced by software routines on programmable hardware] • Both the analog front-end and the digital baseband are within the scope of SDR

  8. Levels of SDR <source: http://www.sdrforum.org>

  9. Why do we need SDR? • Seamless wireless connection – End User • Widely different wireless protocols • TDMA: GSM, AMPS • CDMA: IS-95, cdma2000, W-CDMA, IEEE 802.11b • OFDM: IEEE 802.11a/g/n, WiMAX • Needs a terminal that can support multiple wireless protocols • Easy infrastructure upgrade – Service Provider • Wireless protocols evolve continuously • Ex) W-CDMA → W-CDMA + HSDPA • Time to market – Manufacturer • Reduce hardware development time and cost

  10. Where can we use SDR? • Basestations • Weak constraints on power and area • Support several hundred subscribers • Will be commercialized first • Wireless terminals • Tight constraints on power and area • Will be commercialized next

  11. Why is SDR challenging? • Analog front-end • Must be tunable across a range of carrier frequencies and bandwidths • Digital baseband • Supercomputer-level computation power • > 50 Gops per subscriber • Tight power budget • 200 ~ 300 mW (at the terminal) • High level of programmability • Combination of heterogeneous signal processing algorithms

  12. Our Strategy • Performance • Exploit the parallelism in signal processing and forward error correction (FEC) algorithms • Power • Limit the programmability to minimize power consumption • Minimize both active-mode and idle-mode power consumption • There is a trade-off between power efficiency and programmability

  13. Categories of Wireless Networks

  14. Categories of Wireless Networks <source: Wireless communication technology landscape, DELL>

  15. WWAN (Wireless Wide Area Network)

  16. WLAN / WMAN • WLAN: Wireless Local Area Network • High data rate • Poor mobility support • WMAN: Wireless Metropolitan Area Network • Addresses the last-mile problem • 802.16d: Fixed WiMAX • 802.16e: Mobile WiMAX

  17. WPAN (Wireless Personal Area Network) • Interconnecting personal devices

  18. Core technologies of future networks

  19. OFDM (Orthogonal Frequency Division Multiplexing) • Transmits the signal over several sub-carriers • The frequency spectra of the sub-carriers overlap (high spectral efficiency) • Highly susceptible to frequency error in the receiver

  20. Major Computation in OFDM System • FFT / IFFT • N = 64: IEEE 802.11a • N = 256~2048: IEEE 802.16 WiMAX • Data precision: 12~16 bits • Amount of computation for OFDM operation • ~10^8 complex multiplications / sec
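
The order of magnitude of the FFT workload can be sanity-checked with a minimal sketch. It assumes a radix-2 FFT costing (N/2)·log2(N) complex multiplications and, for IEEE 802.11a, one 64-point FFT per 4 µs OFDM symbol. The cost model and the function names are illustrative assumptions, not taken from the slides.

```python
def fft_complex_mults(n):
    """Textbook radix-2 count: (n/2) * log2(n) complex multiplications."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return (n // 2) * (n.bit_length() - 1)


def fft_mults_per_second(n, symbol_period_s):
    """Complex multiplications per second when one FFT runs per OFDM symbol."""
    return fft_complex_mults(n) / symbol_period_s


# Assumed 802.11a numbers: N = 64, one OFDM symbol every 4 microseconds.
rate_80211a = fft_mults_per_second(64, 4e-6)
print(f"802.11a: {rate_80211a:.2e} complex multiplications/s")
```

Under these assumptions a single 802.11a receive FFT needs roughly 5 × 10^7 complex multiplications per second, within a small factor of the slide's ~10^8 figure; the larger WiMAX FFT sizes raise the per-symbol count.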

  21. MIMO (Multiple Input Multiple Output) • Uses multiple antennas for signal transmission and reception • In the ideal case, channel capacity increases linearly with the number of antennas • Can effectively compensate for multipath fading • Significantly increases receiver complexity • <Single Input Single Output (SISO)> Channel capacity C = W log2(1 + SNR) • <Multiple Input Multiple Output (MIMO)> Channel capacity C = min(n, m) * W log2(1 + SNR)
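
The two capacity formulas can be compared numerically. This is a minimal sketch that assumes n and m are the numbers of transmit and receive antennas, W is the bandwidth in Hz, and SNR is a linear ratio (not dB); the 20 MHz bandwidth and the SNR of 15 are made-up example values.

```python
import math


def siso_capacity(bandwidth_hz, snr_linear):
    """C = W * log2(1 + SNR), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def mimo_capacity(n_tx, n_rx, bandwidth_hz, snr_linear):
    """Ideal-case C = min(n, m) * W * log2(1 + SNR), in bits per second."""
    return min(n_tx, n_rx) * siso_capacity(bandwidth_hz, snr_linear)


# Hypothetical example: 20 MHz channel, linear SNR of 15, 4x4 antennas.
print(siso_capacity(20e6, 15) / 1e6, "Mbps")
print(mimo_capacity(4, 4, 20e6, 15) / 1e6, "Mbps")
```

In this idealized model, going from one antenna pair to a 4x4 configuration multiplies the capacity by four; real channels fall short of this bound.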

  22. Computation in MIMO Receiver • Amount of computation in a MIMO receiver • M: number of Tx/Rx antennas • LT: length of preamble • LP: length of payload • 4 Tx/Rx antennas, 100 Mbps, 64 QAM, ½ coding rate • ~6 × 10^8 computations / sec <source: B. Hassibi, An Efficient Square-Root Algorithm for BLAST>

  23. LDPC Code • Low Density Parity Check (LDPC) code • Turbo-code-like coding gain with lower implementation cost • Encoding • Matrix multiplication, c = xG • G (generator matrix) is a large matrix (e.g. a 4K × 4K matrix) • Decoding • Equivalent to finding the most probable vector x such that Hx mod 2 = 0 • H (parity check matrix) is a large sparse matrix • Implementation • There is a trade-off between coding gain and implementation complexity
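
The two matrix relations, encoding as c = xG and the parity condition Hc mod 2 = 0, can be illustrated on a toy code. The sketch below uses a (7,4) Hamming code as a stand-in: it is not an LDPC code (its H is small and dense), and the matrices and function names are assumptions chosen only to keep the example short.

```python
# Toy systematic (7,4) code: G = [I | P], H = [P^T | I]. Not an LDPC code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def encode(x):
    """Codeword c = xG over GF(2)."""
    return [sum(x[i] * G[i][j] for i in range(len(G))) % 2 for j in range(len(G[0]))]


def syndrome(c):
    """H c mod 2; all zeros exactly when c satisfies every parity check."""
    return [sum(h * bit for h, bit in zip(row, c)) % 2 for row in H]


codeword = encode([1, 0, 1, 1])
print(codeword, syndrome(codeword))

corrupted = codeword[:]
corrupted[2] ^= 1  # flip one bit to mimic a channel error
print(corrupted, syndrome(corrupted))
```

A practical LDPC decoder does not search codewords directly; it passes messages iteratively along the sparse rows and columns of H, which is where the trade-off between coding gain and complexity arises.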

  24. Hybrid ARQ • Reuses erroneous frames when decoding the retransmitted frame • Requires a large buffer space

  25. Case Study : W-CDMA system

  26. Major Algorithms

  27. Physical Layer of W-CDMA [Figure: block diagram of the W-CDMA physical layer, annotated with block functions: error correction; suppressing signal terms beyond the stop-band frequency; overcoming severe errors within a short time interval; assigning a signal waveform optimal for data transmission]

  28. Channel Encoder/Decoder • Encoder • Adds systematic redundancy to the source data • Decoder • Corrects errors in the received data using the redundancy added by the encoder • The W-CDMA system uses • Convolutional code (for short voice and control messages) • Turbo code (for video streams and high-speed packet data)

  29. Channel Encoder • Consists of flip-flops and exclusive-OR gates • Has negligible impact on workload <convolutional encoder of W-CDMA system> [Figure: the input feeds eight delay elements (D); Output 0 uses G0 = 561 (octal); Output 1 uses G1 = 753 (octal)]
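
The shift-register-and-XOR structure can be written down in a few lines. This is a minimal sketch of a rate-1/2 encoder using the two generator polynomials from the slide (561 and 753 octal, eight delay elements). The tap ordering is an assumption: the most significant bit of each polynomial is taken to multiply the current input bit.

```python
K = 9            # constraint length: current input plus eight delay elements
G0 = 0o561       # generator for Output 0 (from the slide)
G1 = 0o753       # generator for Output 1 (from the slide)


def parity(value):
    """XOR of all bits in value."""
    return bin(value).count("1") & 1


def conv_encode(bits):
    """Emit two output bits per input bit (rate 1/2)."""
    state = 0    # contents of the eight delay elements, newest bit on top
    out = []
    for b in bits:
        reg = (b << (K - 1)) | state   # assumed ordering: MSB = current input
        out.append(parity(reg & G0))
        out.append(parity(reg & G1))
        state = reg >> 1               # shift the register by one position
    return out


# Impulse response: a single 1 followed by eight 0s reads out the generators.
impulse = conv_encode([1] + [0] * 8)
print(impulse[0::2], impulse[1::2])
```

Each output bit costs only a mask and a parity computation, which is consistent with the slide's remark that the encoder has negligible impact on workload.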

  30. Channel Decoder • Determine the most probable code sequence from the received sequence • Select the code C with minimum distance to the received sequence r • One of the dominant workloads [Figure: received signal r at distances d1, d2, ..., dN from the codes C1, C2, ..., CN of the code set {ci}]

  31. Channel Decoder – Viterbi Algorithm • Most popular decoding algorithm for convolutional codes • Consists of three steps: • Branch metric calculation (BMC) • abs(a-b), parallelizable • Add compare select (ACS) • min(a+b, c+d), parallelizable • Trace back (TB) • Recursive pointer tracing, sequential • Amount of operation in W-CDMA • 16 Kbps voice: ~2 Gops
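
The three steps can be seen together in a small hard-decision decoder. This is a minimal sketch, not the authors' implementation: it assumes the 561/753 (octal) generators with the most significant bit applied to the current input, a Hamming-distance branch metric, and a frame terminated with eight zero tail bits so that trace back can start from state 0.

```python
K = 9
GENERATORS = (0o561, 0o753)


def parity(value):
    return bin(value).count("1") & 1


def conv_encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(parity(reg & g) for g in GENERATORS)
        state = reg >> 1
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a zero-terminated frame."""
    n_states = 1 << (K - 1)
    n_out = len(GENERATORS)
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)   # encoder starts in state 0
    history = []
    for t in range(0, len(received), n_out):
        symbol = received[t:t + n_out]
        new_metrics = [inf] * n_states
        survivors = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                # BMC: distance between expected and received bits, abs(a-b).
                bm = sum(abs(parity(reg & g) - r)
                         for g, r in zip(GENERATORS, symbol))
                nxt = reg >> 1
                # ACS: add the branch metric, keep the smaller path metric.
                candidate = metrics[state] + bm
                if candidate < new_metrics[nxt]:
                    new_metrics[nxt] = candidate
                    survivors[nxt] = (state, b)
        metrics = new_metrics
        history.append(survivors)
    # TB: follow survivor pointers backwards from the terminating state 0.
    state, decoded = 0, []
    for survivors in reversed(history):
        state, b = survivors[state]
        decoded.append(b)
    decoded.reverse()
    return decoded


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
frame = message + [0] * (K - 1)          # tail bits drive the encoder to 0
noisy = conv_encode(frame)
noisy[5] ^= 1                            # two injected channel errors
noisy[20] ^= 1
print(viterbi_decode(noisy) == frame)
```

The inner loops over states are independent of one another, which matches the slide's note that BMC and ACS are parallelizable, while the trace back loop depends on the previous pointer at every step.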

  32. Channel Decoder – Turbo Decoder • Two algorithms are widely used • SOVA (Soft Output Viterbi Algorithm) • Less computation intensive • Lower error correction performance • Max-Log-MAP algorithm • More computation required • Higher error correction performance • Amount of operation in W-CDMA • For 128 Kbps streaming data: ~18 Gops

  33. Turbo Decoder • Based on multiple iterations of SOVA / Max-Log-MAP blocks • More iterations give better error correction performance <High-level block diagram of turbo decoder>

  34. Block Interleaver/Deinterleaver • Overcomes the severe signal attenuation within short time intervals that frequently appears on wireless channels • Interleaver (at the transmitter): • Randomize the order of the source data • Deinterleaver (at the receiver): • Recover the original sequence by reordering • Amount of operation: < 10 Mops <example of signal strength variation> • Interleaving: 123456789 → 147258369 • Deinterleaving: 147258369 → 123456789
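
The 123456789 → 147258369 example corresponds to writing the data into a 3 × 3 block row by row and reading it out column by column. A minimal sketch follows; the function names and the row/column parameters are illustrative assumptions.

```python
def interleave(seq, rows, cols):
    """Write seq row by row into a rows x cols block, read column by column."""
    if len(seq) != rows * cols:
        raise ValueError("sequence length must equal rows * cols")
    return [seq[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(seq, rows, cols):
    """Inverse permutation: the same operation with rows and cols swapped."""
    return interleave(seq, cols, rows)


sent = interleave(list("123456789"), 3, 3)
print("".join(sent))
print("".join(deinterleave(sent, 3, 3)))
```

After interleaving, symbols that were adjacent in the source are three positions apart on the channel, so a short fade damages symbols that end up far apart after deinterleaving.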

  35. Spreader/Despreader • Allows several signals to be transmitted at the same time (x[n] and y[n] in the diagram) • Based on the orthogonality between spreading codes <orthogonality between codes>

  36. Spreader/Despreader • The spreader / despreader also suppresses noise • Amount of operation: ~4 Gops
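
How orthogonal codes let two signals share the channel can be shown with a short sketch. The length-4 codes below are hypothetical Walsh-style examples (W-CDMA uses its own code set and much longer spreading factors), and x and y stand for the two signals x[n] and y[n] mentioned on the previous slide.

```python
CODE_X = [1, -1, 1, -1]   # hypothetical spreading code for x[n]
CODE_Y = [1, 1, -1, -1]   # hypothetical spreading code for y[n], orthogonal to CODE_X


def spread(symbols, code):
    """Replace every symbol by symbol * code (one chip per code element)."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code-length block with the code and normalize."""
    n = len(code)
    return [sum(chips[i + j] * code[j] for j in range(n)) / n
            for i in range(0, len(chips), n)]


x = [1, -1, 1]
y = [-1, -1, 1]
# Both spread signals are added together on the shared channel.
channel = [a + b for a, b in zip(spread(x, CODE_X), spread(y, CODE_Y))]
print(despread(channel, CODE_X))
print(despread(channel, CODE_Y))
```

Because the dot product of the two codes is zero, each despreader recovers its own symbols and the other signal cancels out.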

  37. Scrambler/Descrambler • Randomizes the output signal by multiplying it by a pseudo-random sequence called the scrambling code • Allows multiple terminals to communicate at the same time • Amount of operation: ~3 Gops [Figure: Terminal 1 with scrambling code n; Terminal 2 with scrambling code m]

  38. Low Pass Filter • Suppresses the signal terms beyond the stop-band frequency [Figure: filtering of the input signal (an impulse in the time domain, band-unlimited in the frequency domain) into the output signal (a sinc function in the time domain, band-limited in the frequency domain)]

  39. Low Pass Filter • Uses a conventional FIR filter • Number of filter taps (N) = 32 ~ 64 • Amount of operation: ~12 Gops
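
A direct-form FIR filter computes each output sample as a weighted sum of the most recent N input samples, which is why the cost grows with the number of taps. The sketch below is a minimal illustration; the 4-tap moving-average coefficients are a made-up stand-in for the 32 to 64 taps mentioned on the slide.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:          # samples before the start count as zero
                acc += h * samples[n - k]
        output.append(acc)
    return output


taps = [0.25, 0.25, 0.25, 0.25]     # hypothetical 4-tap moving average
print(fir_filter([1, 0, 0, 0, 0, 0], taps))   # impulse response
print(fir_filter([4, 4, 4, 4, 4, 4], taps))   # settles at the input level
```

Each output sample needs one multiply-accumulate per tap, and every output sample is independent of the others, so the inner products map naturally onto vector hardware.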

  40. Rake Receiver – Multipath Fading • The rake receiver mitigates the effect of multipath fading • Multipath fading is a major cause of the unreliability of wireless channels [Figure: transmitted signal x(t); received signal with one path: y(t) = a0 x(t); with two paths: y(t) = a0 x(t) + a1 x(t - d1); with three paths: y(t) = a0 x(t) + a1 x(t - d1) + a2 x(t - d2)]

  41. Rake Receiver – Functions • Ideally, the function of the rake receiver is to aggregate the signal terms with proper delay compensation • Input: y(t) = a0 x(t) + a1 x(t - d1) + a2 x(t - d2) • Output of the rake receiver: r(t) = a0 x(t - t_delay) + a1 x(t - d1 - d_est1) + a2 x(t - d2 - d_est2) = (a0 + a1 + a2) * x(t - t_delay) • We need to know the delay spread of the received signal, which varies randomly

  42. Rake Receiver – Detect Delay Spread • Scan the received signal in the frame buffer while computing its correlation with the scrambling code sequence [Figure: a correlation window sliding over the received signal; the correlation result shows peaks a0, a1, a2 at delays 0, d1, d2]

  43. Computation of Rake Receiver • Correlation computation: L_W × L_B × F • L_W: correlation window = 320 • L_B: frame buffer size = 5120 • F: operation frequency = 50 per second • ~80 mega multiplications / sec • Multiplications can be converted into subtractions • Amount of operation in W-CDMA: ~25 Gops • Most dominant workload
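
The delay search and its cost can be sketched together. The example below is a minimal illustration under stated assumptions: a 13-chip Barker sequence stands in for the scrambling code, the three path delays and amplitudes are made up, and the detection threshold of a quarter of the window length is an arbitrary choice. The last function simply evaluates the product L_W × L_B × F from the slide.

```python
# Stand-in code with low autocorrelation sidelobes (not a W-CDMA scrambling code).
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def multipath(code, paths, length):
    """Received buffer: sum of delayed, scaled copies of the code."""
    rx = [0.0] * length
    for delay, amplitude in paths:
        for i, chip in enumerate(code):
            rx[delay + i] += amplitude * chip
    return rx


def correlate_scan(rx, code):
    """Correlation of the code with the buffer at every candidate delay."""
    w = len(code)
    return [sum(rx[d + i] * code[i] for i in range(w))
            for d in range(len(rx) - w + 1)]


def detect_delays(rx, code, threshold):
    """Delays whose correlation exceeds the threshold."""
    return [d for d, c in enumerate(correlate_scan(rx, code)) if c > threshold]


def searcher_mults_per_second(window, buffer_size, frequency_hz):
    """L_W * L_B * F from the slide."""
    return window * buffer_size * frequency_hz


paths = [(0, 1.0), (4, 0.7), (9, 0.5)]     # hypothetical (delay, amplitude) pairs
rx = multipath(BARKER13, paths, length=25)
print(detect_delays(rx, BARKER13, threshold=0.25 * len(BARKER13)))
print(searcher_mults_per_second(320, 5120, 50))
```

With the slide's parameters the product is 81,920,000, which matches the quoted figure of about 80 mega multiplications per second.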

  44. Rake Receiver – Overall Architecture [Figure: block diagram with three stages: detect the delay spread; compensate the propagation delay; recombine the signal terms without delay]

  45. Power Control • The receiver controls the transmission power of the transmitter in order to minimize interference to other users • Required computation is negligible [Figure: the basestation's pilot signal is compared with a reference level at the terminal. When the pilot strength is below the reference level, the terminal sends an UP command (u); when it is above the reference level, the terminal sends a DOWN command (d). u, d: power control commands]
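
The up/down feedback loop can be sketched as follows. This is a minimal illustration, not the W-CDMA procedure: the 1 dB step size, the fixed path loss, and all names are assumptions, and the rule "below the reference level means UP, otherwise DOWN" is the reading of the figure used here.

```python
def power_control_command(pilot_strength_db, reference_db):
    """UP when the pilot is below the reference level, DOWN otherwise."""
    return "UP" if pilot_strength_db < reference_db else "DOWN"


def run_power_control(tx_power_db, path_loss_db, reference_db, steps, step_db=1):
    """Closed loop: the receiver measures, commands, and the transmitter adjusts."""
    commands = []
    for _ in range(steps):
        received_db = tx_power_db - path_loss_db
        command = power_control_command(received_db, reference_db)
        commands.append(command)
        tx_power_db += step_db if command == "UP" else -step_db
    return commands, tx_power_db


# Hypothetical numbers: 0 dB start, 10 dB path loss, -4 dB reference level.
commands, final_power = run_power_control(0, 10, -4, steps=8)
print(commands, final_power)
```

Once the received level reaches the reference, the loop alternates between UP and DOWN around it; each step costs a single comparison, in line with the slide's note that the required computation is negligible.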

  46. H/W Operation States • Idle: for long idle periods between sessions; periodic wake-up for control message reception; minimum workload, but dominates terminal standby time • Control Hold: for short idle periods between packet bursts; holds a narrow control channel for fast transition to Active; intermediate workload • Active: for packet burst transmission periods; uses high-speed packet channels up to 2 Mbps; most heavily loaded state [Figure labels: radio resource control states defined in the W-CDMA specification; operation states defined according to H/W activity]

  47. Workload Characterization

  48. Workload Profile • Searcher, turbo decoder, and LPF are the dominant workloads • The workload profile varies according to operation state • One operation is equivalent to one RISC instruction

  49. Processing Time Requirement • Mixture of algorithms with various processing time requirements • Classified into two categories • Heavy workload with long processing time (turbo decoder, searcher) • Light workload with short processing time (scrambler, spreader, LPF, power control)

  50. Parallelism • Most heavy-workload algorithms have significant vector parallelism • Data width of most operations is 8 bits
