Software Defined Radio – A High Performance Embedded Challenge
Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, and ¹Krisztian Flautner
University of Michigan, ¹ARM Ltd.
Contents • Software defined radio • Categories of wireless networks • Core technologies for future networks • Case study : W-CDMA Network • Major algorithms • Workload characterization • Architectural implications Advanced Computer Architecture Laboratory University of Michigan
Wireless Communication System
[Figure: protocol stack of a wireless communication system. Upper protocol layers: Application, Transport (TCP/UDP), Network (IP), Link (PPP, MAC). Physical layer (PHY): baseband processing and analog front-end. Data labels: packets, bits, "air".]
Anatomy of a Cellular Phone
Protocol on Wireless Platform
[Figure: protocol layers mapped onto platform hardware. Labels: application processor; GPP (software); DSP/accelerator; source coding (audio: AMR/QCELP, video: MPEG); baseband processor; upper layers (transport, network, link, MAC) on GPP (software); physical layer (PHY) on ASIC (hardware).]
Software Defined Radio (SDR)
• Uses software routines on programmable hardware, instead of ASICs, for the physical layer operations of a wireless communication system
• [Figure: ASICs (PHY) replaced by software routines on programmable hardware]
• SDR covers both the analog front-end and the digital baseband
Levels of SDR <source: http://www.sdrforum.org>
Why do we need SDR?
• Seamless wireless connection (end user)
  • Wireless protocols differ widely
    • TDMA: GSM, AMPS
    • CDMA: IS-95, cdma2000, W-CDMA, IEEE 802.11b
    • OFDM: IEEE 802.11a/g/n, WiMAX
  • A terminal is needed that can support multiple wireless protocols
• Easy infrastructure upgrade (service provider)
  • Wireless protocols evolve continuously
  • Ex) W-CDMA → W-CDMA + HSDPA
• Time to market (manufacturer)
  • Reduces hardware development time and cost
Where can we use SDR?
• Basestations
  • Weak constraints on power and area
  • Support several hundred subscribers
  • Will be commercialized first
• Wireless terminals
  • Tight constraints on power and area
  • Will be commercialized next
Why is SDR challenging?
• Analog front-end
  • Must be tunable across a range of carrier frequencies and bandwidths
• Digital baseband
  • Supercomputer-level computation power
    • > 50 Gops per subscriber
  • Tight power budget
    • 200 to 300 mW (at the terminal)
  • High level of programmability
    • Combination of heterogeneous signal processing algorithms
Our Strategy
• Performance
  • Exploit the parallelism in signal processing and forward error correction (FEC) algorithms
• Power
  • Limit the programmability to minimize power consumption
  • Minimize both active-mode and idle-mode power consumption
• There is a trade-off between power efficiency and programmability
Categories of Wireless Networks <source: Wireless communication technology landscape, DELL>
WWAN (Wireless Wide Area Network)
WLAN / WMAN
• WLAN: Wireless Local Area Network
  • High data rate
  • Poor mobility support
• WMAN: Wireless Metropolitan Area Network
  • For the last-mile problem
  • 802.16d: Fixed WiMAX
  • 802.16e: Mobile WiMAX
WPAN (Wireless Personal Area Network)
• Interconnects personal devices
OFDM (Orthogonal Frequency Division Multiplexing)
• Transmits the signal over several sub-carriers.
• The frequency spectra of the sub-carriers overlap, which gives high spectral efficiency.
• Highly susceptible to frequency error in the receiver.
Major Computation in OFDM System
• FFT / IFFT
  • N = 64: IEEE 802.11a
  • N = 256 to 2048: IEEE 802.16 WiMAX
  • Data precision: 12 to 16 bits
• Amount of computation for OFDM operation
  • ~10^8 complex multiplications / sec
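As a rough sanity check on the order of magnitude above, the sketch below counts the complex multiplications of a radix-2 FFT, (N/2)·log2(N), and scales the count by a symbol rate. The radix-2 operation count and the 4 µs symbol period used for IEEE 802.11a are assumptions brought in for illustration; the slides do not state how their figure was derived.

```python
def fft_complex_mults(n: int) -> int:
    """Complex multiplications in a radix-2 FFT of size n: (n/2) * log2(n)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two")
    log2_n = n.bit_length() - 1  # exact log2 for a power of two
    return (n // 2) * log2_n


def mults_per_second(n: int, symbols_per_second: int) -> int:
    """One FFT per OFDM symbol, so the rate scales with the symbol rate."""
    return fft_complex_mults(n) * symbols_per_second


# Assumed 802.11a figures: N = 64, one OFDM symbol every 4 microseconds.
rate_11a = mults_per_second(64, 250_000)
print(rate_11a)
```

Under these assumptions the FFT alone lands in the 10^7 to 10^8 range for 802.11a; larger WiMAX FFT sizes and any additional per-sub-carrier processing push the total upward.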
MIMO (Multiple Input Multiple Output)
• Uses multiple antennas for signal transmission and reception
• In the ideal case, channel capacity increases linearly
• Can effectively compensate for the multipath fading effect
• Significantly increases receiver complexity
Single Input Single Output (SISO): channel capacity C = W log2(1 + SNR)
Multiple Input Multiple Output (MIMO): channel capacity C = min(n, m) * W log2(1 + SNR)
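The two capacity formulas can be evaluated directly. This is a minimal sketch that assumes n and m are the numbers of transmit and receive antennas, W is the bandwidth in Hz, and SNR is a linear ratio (not dB); the 20 MHz and SNR = 15 values are hypothetical example inputs.

```python
import math


def siso_capacity(bandwidth_hz: float, snr: float) -> float:
    """C = W * log2(1 + SNR), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr)


def mimo_capacity(n_tx: int, n_rx: int, bandwidth_hz: float, snr: float) -> float:
    """Ideal-case MIMO capacity: C = min(n, m) * W * log2(1 + SNR)."""
    return min(n_tx, n_rx) * siso_capacity(bandwidth_hz, snr)


# Hypothetical example: 20 MHz channel, linear SNR of 15.
print(siso_capacity(20e6, 15))        # single antenna pair
print(mimo_capacity(4, 4, 20e6, 15))  # 4 x 4 antennas, ideal case
```

The ideal-case gain is set by the smaller of the two antenna counts, so adding antennas on only one side does not raise the multiplier.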
Computation in MIMO Receiver
• Amount of computation in MIMO receiver
  • M: number of Tx/Rx antennas
  • LT: length of preamble
  • LP: length of payload
• 4 Tx/Rx antennas, 100 Mbps, 64 QAM, 1/2 coding rate
  • ~6 × 10^8 computations / sec
<source: B. Hassibi, An Efficient Square-Root Algorithm for BLAST>
LDPC Code
• Low Density Parity Check (LDPC) code
  • Coding gain comparable to turbo codes at lower implementation cost
• Encoding
  • Matrix multiplication, c = xG
  • G (generator matrix) is a large matrix (e.g. a 4K × 4K matrix)
• Decoding
  • Equivalent to finding the most probable vector x such that Hx mod 2 = 0
  • H (parity check matrix) is a large sparse matrix
• Implementation
  • There is a trade-off between coding gain and implementation complexity
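The parity condition Hx mod 2 = 0 can be illustrated with a small syndrome check. The matrix below is the parity check matrix of the (7,4) Hamming code, used only as a toy stand-in: a real LDPC matrix is far larger and sparse, and actual LDPC decoding is iterative rather than a single check.

```python
def syndrome(H, x):
    """Compute Hx mod 2 for a parity check matrix H and a bit vector x."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]


def is_codeword(H, x):
    """x satisfies every parity check exactly when the syndrome is all zero."""
    return not any(syndrome(H, x))


# Toy parity check matrix (Hamming (7,4)), standing in for a large sparse H.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

valid = [1, 1, 1, 0, 0, 0, 0]
corrupted = [1, 1, 1, 0, 1, 0, 0]  # fifth bit flipped

print(is_codeword(H, valid))
print(syndrome(H, corrupted))
```

A non-zero syndrome signals that at least one parity check fails, which is the information a decoder works from when it searches for the most probable codeword.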
Hybrid ARQ
• Reuses erroneous frames when decoding the retransmitted frame
• Requires a large buffer space
Physical Layer of W-CDMA
[Figure: block diagram of the W-CDMA physical layer. Block annotations: error correction; suppress signal terms outside the pass band; overcome severe errors within a short time interval; assign the signal waveform best suited to data transmission.]
Channel Encoder/Decoder
• Encoder
  • Adds systematic redundancy to the source data
• Decoder
  • Corrects errors in the received data using the redundancy added by the encoder
• The W-CDMA system uses
  • Convolutional code (for short voice and control messages)
  • Turbo code (for video streams and high-speed packet data)
Channel Encoder
• Consists of flip-flops and exclusive-OR gates
• Has negligible impact on workload
[Figure: convolutional encoder of the W-CDMA system. The input feeds a chain of eight delay elements (D); Output 0 uses generator G0 = 561 (octal) and Output 1 uses generator G1 = 753 (octal).]
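The encoder above can be sketched as a shift register with two XOR (parity) outputs. The generators 561 and 753 (octal) and the eight delay elements come from the slide; the bit ordering (newest input bit at the most significant tap) and the function names are assumptions made for this illustration.

```python
K = 9                  # constraint length: the input bit plus 8 delay elements
G0, G1 = 0o561, 0o753  # generator polynomials from the slide (octal)


def parity(x: int) -> int:
    """XOR of all bits of x, which is what a chain of XOR gates computes."""
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate-1/2 convolutional encoder: two output bits per input bit."""
    state = 0  # contents of the 8 flip-flops
    out = []
    for b in bits:
        reg = (b << (K - 1)) | state  # assumed ordering: newest bit on top
        out.append((parity(reg & G0), parity(reg & G1)))
        state = reg >> 1              # shift the register by one position
    return out


# A single 1 followed by zeros reads out the generator taps one by one.
impulse = conv_encode([1] + [0] * 8)
print(impulse)
```

Each input bit costs only a shift and two parity computations, which is consistent with the encoder having negligible impact on the workload.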
Channel Decoder
• Determines the most probable code sequence from the received sequence.
• Selects the codeword C that has the minimum distance to the received sequence r
• One of the dominant workloads
[Figure: received signal r and its distances d1, d2, ..., dN to the codewords C1, C2, ..., CN. {ci}: code set; r: received signal.]
Channel Decoder – Viterbi Algorithm
• Most popular decoding algorithm for convolutional codes
• Consists of three steps:
  • Branch metric calculation (BMC)
    • abs(a-b), parallelizable
  • Add compare select (ACS)
    • min(a+b, c+d), parallelizable
  • Trace back (TB)
    • Recursive pointer tracing, sequential
• Amount of operation in W-CDMA
  • 16 Kbps voice: ~2 Gops
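The three steps can be seen in a compact hard-decision decoder for the rate-1/2, K = 9 code with generators 561 and 753 (octal). This is a sketch under stated assumptions, not a description of any production decoder: it uses Hamming distance as the branch metric, assumes the message ends with K-1 zero tail bits so that trace back can start from state 0, and uses a bit ordering chosen for illustration.

```python
K = 9
GENERATORS = (0o561, 0o753)
N_STATES = 1 << (K - 1)
INF = float("inf")


def parity(x):
    return bin(x).count("1") & 1


def branch(state, bit):
    """Output symbol and next state for one trellis branch."""
    reg = (bit << (K - 1)) | state
    return tuple(parity(reg & g) for g in GENERATORS), reg >> 1


def encode(bits):
    state, out = 0, []
    for b in bits:
        sym, state = branch(state, b)
        out.append(sym)
    return out


def viterbi_decode(received):
    metrics = [0] + [INF] * (N_STATES - 1)  # encoder starts in state 0
    history = []
    for r in received:
        new_metrics = [INF] * N_STATES
        prev = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == INF:
                continue
            for bit in (0, 1):
                sym, nxt = branch(state, bit)
                # BMC: Hamming distance, built from abs(a - b) terms
                bm = abs(sym[0] - r[0]) + abs(sym[1] - r[1])
                # ACS: add the branch metric, compare, keep the smaller path
                cand = metrics[state] + bm
                if cand < new_metrics[nxt]:
                    new_metrics[nxt] = cand
                    prev[nxt] = (state, bit)
        metrics = new_metrics
        history.append(prev)
    # TB: follow the stored pointers backwards from the known final state 0
    state, bits = 0, []
    for prev in reversed(history):
        state, bit = prev[state]
        bits.append(bit)
    return bits[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0] + [0] * (K - 1)  # data plus zero tail
noisy = [list(s) for s in encode(message)]
noisy[1][0] ^= 1  # inject two channel bit errors
noisy[5][1] ^= 1
decoded = viterbi_decode([tuple(s) for s in noisy])
print(decoded == message)
```

The BMC and ACS loops over the 256 states are independent of one another within a time step, which is where the parallelism noted above comes from; the trace back loop depends on the previous pointer at every step and stays sequential.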
Channel Decoder – Turbo Decoder
• Two algorithms are widely used
  • SOVA (Soft Output Viterbi Algorithm)
    • Less computation intensive
    • Lower error correction performance
  • Max-Log-MAP algorithm
    • More computation required
    • Higher error correction performance
• Amount of operation in W-CDMA
  • For 128 Kbps streaming data: ~18 Gops
Turbo Decoder
• Based on multiple iterations of SOVA / Max-Log-MAP blocks.
• More iterations give better error correction performance.
[Figure: high-level block diagram of turbo decoder]
Block Interleaver/Deinterleaver
• Overcomes severe signal attenuation within a short time interval, which occurs frequently on wireless channels.
• Interleaver (at the transmitter):
  • Randomizes the sequence of source data.
• Deinterleaver (at the receiver):
  • Recovers the original sequence by reordering.
• Amount of operation: < 10 Mops
[Figure: example of signal strength variation]
Interleaving: 123456789 → 147258369
Deinterleaving: 147258369 → 123456789
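The 123456789 → 147258369 example matches a block interleaver that writes the data row by row into a 3 × 3 array and reads it out column by column. The sketch below assumes that reading of the example; the function names are illustrative.

```python
def interleave(seq, rows, cols):
    """Write seq row by row into a rows x cols block, read it column by column."""
    assert len(seq) == rows * cols
    return [seq[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(seq, rows, cols):
    """Undo interleave(): the inverse is the same operation on the transposed block."""
    return interleave(seq, cols, rows)


sent = "".join(interleave("123456789", 3, 3))
restored = "".join(deinterleave(sent, 3, 3))
print(sent, restored)
```

After interleaving, symbols that were adjacent in the source are three positions apart on the channel, so a short burst of errors is spread across different parts of the original sequence once the receiver reorders it.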
Spreader/Despreader
• Allows several signals to be transmitted at the same time (x[n] and y[n] in the diagram).
• It is based on the orthogonality between spreading codes.
[Figure: orthogonality between codes]
Spreader/Despreader
• The spreader / despreader also suppresses noise
• Amount of operation: ~4 Gops
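The role of code orthogonality can be shown with two signals x[n] and y[n] that share the channel. In this sketch the length-4 codes, the symbol values, and the noise-free channel are all assumptions chosen to keep the arithmetic exact; W-CDMA uses its own family of orthogonal spreading codes.

```python
def spread(symbols, code):
    """Replace each symbol by the symbol multiplied with every chip of the code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each block of chips with the code and normalize by its length."""
    n = len(code)
    # Integer division is exact here only because the example is noise free.
    return [sum(chips[i + k] * code[k] for k in range(n)) // n
            for i in range(0, len(chips), n)]


# Two orthogonal codes: their element-wise product sums to zero.
CODE_X = [1, 1, 1, 1]
CODE_Y = [1, -1, 1, -1]

x = [1, -1, 1]
y = [-1, -1, 1]

# Both spread signals are transmitted at the same time and add up on the channel.
channel = [a + b for a, b in zip(spread(x, CODE_X), spread(y, CODE_Y))]

print(despread(channel, CODE_X))
print(despread(channel, CODE_Y))
```

Correlating with one code cancels the contribution of the other because the codes are orthogonal, so each receiver recovers only its own symbols from the shared signal.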
Scrambler/Descrambler
• Randomizes the output signal by multiplying it by a pseudo-random sequence called the scrambling code.
• Allows multiple terminals to communicate at the same time.
• Amount of operation: ~3 Gops
[Figure: Terminal 1 with scrambling code n; Terminal 2 with scrambling code m]
Low Pass Filter
• Suppresses the signal terms outside the pass band.
[Figure: filtering of an input signal into an output signal. Time domain: impulse signal → sinc function. Frequency domain: band-unlimited signal → band-limited signal.]
Low Pass Filter
• Uses a conventional FIR filter
• Number of filter taps (N) = 32 to 64
• Amount of operation: ~12 Gops
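A conventional FIR filter computes each output sample as a weighted sum of the most recent N input samples, so every output costs N multiply-accumulate operations. The direct-form sketch below uses a short, hypothetical tap set for readability; it is not the W-CDMA filter, which has 32 to 64 taps.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += h * samples[n - k]
        out.append(acc)
    return out


# Hypothetical 2-tap averaging filter, a very crude low pass filter.
print(fir_filter([1, 2, 3, 4], [0.5, 0.5]))

# Feeding an impulse returns the taps themselves (the impulse response).
print(fir_filter([1, 0, 0], [0.25, 0.5, 0.25]))
```

The inner loop over the taps has no dependence between iterations apart from the accumulation, which makes the filter a natural fit for vector hardware.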
Rake Receiver – Multipath Fading
• The rake receiver mitigates the multipath fading effect
• Multipath fading is a major cause of unreliable wireless channel characteristics
[Figure: transmitted signal x(t) and the received signal as paths are added]
  • One path: y(t) = a0x(t)
  • Two paths: y(t) = a0x(t) + a1x(t-d1)
  • Three paths: y(t) = a0x(t) + a1x(t-d1) + a2x(t-d2)
Rake Receiver - Functions
• Ideally, the function of the rake receiver is to aggregate the signal terms with proper delay compensation
  • Input: y(t) = a0x(t) + a1x(t-d1) + a2x(t-d2)
  • Output: r(t) = a0x(t-tdelay) + a1x(t-d1-dest1) + a2x(t-d2-dest2) = (a0+a1+a2) * x(t-tdelay)
• We need to know the delay spread of the received signal, which varies randomly
Rake Receiver – Detect Delay Spread
• Scan the received signal in the frame buffer while computing its correlation with the scrambling code sequence.
[Figure: a correlation window slides over the received signal; the correlation result shows peaks a0, a1, a2 at delays 0, d1, d2]
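The delay search can be sketched as a sliding correlation: the known code is compared with the received buffer at every offset, and the offsets with the largest correlation are taken as path delays. In this illustration a 13-chip Barker sequence stands in for the scrambling code because its autocorrelation side lobes are small; the path delays and gains are hypothetical and the channel is noise free.

```python
# Stand-in for the scrambling code (an assumption; not a W-CDMA code).
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def multipath(code, paths, length):
    """Build y = sum of gain * code delayed by `delay`, for each (delay, gain)."""
    rx = [0] * length
    for delay, gain in paths:
        for k, c in enumerate(code):
            rx[delay + k] += gain * c
    return rx


def correlate(rx, code):
    """Correlation of the code with the buffer at every window offset."""
    n = len(code)
    return [sum(rx[t + k] * code[k] for k in range(n))
            for t in range(len(rx) - n + 1)]


def detect_delays(rx, code, n_paths):
    """Return the offsets of the n_paths strongest correlation peaks."""
    corr = correlate(rx, code)
    strongest = sorted(range(len(corr)), key=lambda t: corr[t], reverse=True)
    return sorted(strongest[:n_paths])


# Hypothetical channel: three paths with delays 0, 5, 9 and gains 4, 3, 2.
rx = multipath(BARKER13, [(0, 4), (5, 3), (9, 2)], length=32)
print(detect_delays(rx, BARKER13, 3))
```

Every offset needs one multiplication per chip in the window, which is why the cost grows with the product of the window length and the buffer size.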
Computation of Rake Receiver
• Correlation computation: LW × LB × F
  • LW: correlation window = 320
  • LB: frame buffer size = 5120
  • F: operation frequency = 50
  • 320 × 5120 × 50 = 81.92 × 10^6, i.e. ~80 mega multiplications / sec
  • Multiplications can be converted into subtractions
• Amount of operation in W-CDMA: ~25 Gops
• Most dominant workload
Rake Receiver – Overall Architecture
[Figure: block diagram with three stages: detect the delay spread; compensate the propagation delay; recombine the signal terms without delay]
Power Control
• The receiver controls the transmission power of the transmitter in order to minimize interference to other users.
• The required computation is negligible.
[Figure: terminal and basestation exchanging a pilot signal and power control commands (u: power control command; example sequence: u d u u d d u). When the strength of the pilot signal is below the reference level, the terminal sends an UP command; when it is above the reference level, the terminal sends a DOWN command.]
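The command rule in the figure amounts to one comparison per measurement, which is consistent with the negligible computation noted above. In the sketch below the 1 dB step size, the handling of a measurement exactly equal to the reference, and the function names are assumptions made for illustration.

```python
def power_control_command(pilot_strength, reference_level):
    """'u' (UP) if the pilot is below the reference level, otherwise 'd' (DOWN)."""
    # Treating equality as DOWN is an arbitrary choice for this sketch.
    return "u" if pilot_strength < reference_level else "d"


def apply_commands(tx_power_db, commands, step_db=1.0):
    """Adjust the transmit power by one step per command (step size assumed)."""
    for cmd in commands:
        tx_power_db += step_db if cmd == "u" else -step_db
    return tx_power_db


# Hypothetical pilot measurements relative to a reference level of 1.0.
commands = [power_control_command(s, 1.0) for s in [0.7, 0.9, 1.2, 0.8]]
print(commands)
print(apply_commands(10.0, commands))
```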
H/W Operation States
• Idle
  • For long idle periods between sessions
  • Periodic wake-up for control message reception
  • Minimum workload, but dominates terminal standby time
• Control Hold
  • For short idle periods between packet bursts
  • Holds a narrow control channel for fast transition to Active
  • Intermediate workload
• Active
  • For packet burst transmission periods
  • Uses high-speed packet channels up to 2 Mbps
  • Most heavily loaded state
[Figure: radio resource control states defined in the W-CDMA specification, and operation states defined according to H/W activity]
Workload Profile
• Searcher, turbo decoder, and LPF are the dominant workloads
• The workload profile varies with the operation state
• One operation is equivalent to one RISC instruction
Processing Time Requirement
• Mixture of algorithms with various processing time requirements
• Classified into two categories
  • Heavy workload with long processing time (turbo decoder, searcher)
  • Light workload with short processing time (scrambler, spreader, LPF, power control)
Parallelism
• Most heavy-workload algorithms have significant vector parallelism
• The data width of most operations is 8 bits