DSPs Vs General Purpose Microprocessors AOE 5984 – Real Time Systems
Common DSP Applications… • Communications • Audio, Video processing • Graphics, 3-D rendering • Navigation, radars, GPS • Controls – Robotics, guidance, Machine Vision • Filtering • Frequency-Time transformations (FFT-IFFT)
Common DSP Tasks… • Modulation-Demodulation, Error correction • Noise reduction, equalization, echo cancellation • Audio compression • Vector and Matrix calculations • Control algorithms
DSPs Need to Do… • Efficient repetitive numerical calculations • Maintain numeric fidelity • Provide high memory bandwidth • Streaming data • Real Time processing
DSPs Need to Minimize… • Real Time execution unpredictability • Memory use • Power consumption • Cost • Development time
What Do DSPs Have? • Specialized memory architecture (Harvard) • Specialized parallel execution units • Specialized addressing modes • Specialized instruction sets for parallel execution • Specialized peripherals
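FIR Filtering… [Figure: tapped delay line; input x passes through delay elements D, the taps are weighted by coefficients h0, h1, …, hn and summed into output y] • Two data fetches • Multiply operation • Accumulate operation • Input vector shifting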
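The four per-tap operations listed above can be sketched in plain Python. This is only an illustration of the data flow, not DSP code: the function name `fir_step` and the 3-tap coefficient set are hypothetical, chosen for the example.

```python
def fir_step(delay_line, coeffs, x_new):
    """Push one input sample into the delay line and return one output sample."""
    # Input vector shifting: the newest sample enters at index 0,
    # the oldest sample falls off the end.
    delay_line.insert(0, x_new)
    delay_line.pop()

    acc = 0.0
    for k in range(len(coeffs)):
        x = delay_line[k]   # data fetch 1: delayed input sample
        h = coeffs[k]       # data fetch 2: filter coefficient
        acc += x * h        # multiply, then accumulate
    return acc


# Hypothetical 3-tap filter fed with a unit impulse.
coeffs = [0.25, 0.5, 0.25]
delay_line = [0.0] * len(coeffs)
outputs = [fir_step(delay_line, coeffs, x) for x in [1.0, 0.0, 0.0, 0.0]]
print(outputs)  # the impulse response reproduces the coefficients
```

On a general purpose processor each of these steps costs at least one instruction per tap; the hardware features on the following slides exist to collapse them into a single cycle.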
Multiply-Accumulate (MAC) • Multiplication in single cycle • Execution time ~ 200 ns [Figure: MAC datapath with blocks labeled Register, Multiplier, ALU, Accumulator]
Special Hardware Units… • Hardware shifter. • Hardware circular buffers. • Special h/w for zero overhead looping. • Special address generation units.
Address Generation Units… • Work in parallel with DSP core execution unit. • Access new addresses without pausing to calculate new addresses. • Take advantage of predictability in the pattern of data access in DSP algorithms, using special addressing modes. • e.g. register-indirect with post increment addressing, circular (modulo) addressing, bit-reverse addressing in hardware.
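Two of the addressing modes named above, circular (modulo) and bit-reversed addressing, can be modeled in a few lines of Python. An address generation unit does this in hardware alongside the data path; the helper names `circular_next` and `bit_reverse` are hypothetical and only show what the address sequences look like.

```python
def circular_next(index, step, length):
    """Post-increment an index with modulo wrap, as a circular buffer pointer does."""
    return (index + step) % length


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index (the access order used by radix-2 FFTs)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Walk a 4-entry circular buffer for six accesses, starting at entry 2.
idx = 2
visited = []
for _ in range(6):
    visited.append(idx)
    idx = circular_next(idx, 1, 4)
print(visited)   # wraps from 3 back to 0 without any explicit bounds check

# Bit-reversed access order for an 8-point transform (3 address bits).
order = [bit_reverse(i, 3) for i in range(8)]
print(order)
```

In software each wrap or bit reversal costs extra instructions per access; doing it in the address generation unit keeps those cycles out of the inner loop.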
Von Neumann Architecture… [Figure: Processor Core connected to a single Memory (Code+Data) by one address bus and one data bus] • Fetch MAC instruction • Read value of ‘x’ • Read value of ‘h’ • Multiply x, h and accumulate • Write result to memory Total per MAC: • 4 memory access operations • One multiplication
Harvard Architecture… [Figure: Processor Core connected to Memory A and Memory B through separate address and data buses AB1, DB1, AB2, DB2] • Data and Code in separate memory segments • Multiple address and data buses • Double memory bandwidth • Simultaneous code and data fetch
Caches in DSP and GPP… • GPPs normally contain two on-chip caches, one for data and the other for instructions. • This allows full-speed retrieval of instructions and data without accessing slower off-chip memory. • DSPs typically contain a very small instruction cache and no data cache. • GPPs use control logic to determine what code and data go into the cache, while in DSPs that decision is the programmer’s job.
Fixed-Point Arithmetic… • Most DSPs use fixed-point rather than floating-point arithmetic. • Faster. • Cheaper. • Hardware support for saturation arithmetic, rounding and shifting.
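What saturation, rounding and shifting mean in practice can be shown with a small Python model of 16-bit fractional arithmetic. The Q15 format (1 sign bit, 15 fractional bits) is an assumption made for this sketch, since the slides do not name a format, and the helper names are hypothetical.

```python
Q15_MAX = 0x7FFF    # largest Q15 value, just under +1.0
Q15_MIN = -0x8000   # smallest Q15 value, exactly -1.0


def saturate(value):
    """Clamp to the 16-bit range instead of wrapping around on overflow."""
    return max(Q15_MIN, min(Q15_MAX, value))


def to_q15(x):
    """Convert a float in [-1.0, 1.0) to a Q15 integer (saturating at the edges)."""
    return saturate(int(round(x * 32768)))


def q15_mul(a, b):
    """Q15 x Q15 -> Q15: the Q30 product is rounded, shifted right by 15, saturated."""
    product = a * b                          # Q30 intermediate
    rounded = (product + (1 << 14)) >> 15    # add half an LSB, then shift
    return saturate(rounded)


def q15_add(a, b):
    """Saturating addition."""
    return saturate(a + b)


half = to_q15(0.5)
print(q15_mul(half, half))          # 0.5 * 0.5 = 0.25 in Q15
print(q15_add(30000, 10000))        # overflow clamps to Q15_MAX
print(q15_mul(Q15_MIN, Q15_MIN))    # (-1.0) * (-1.0) cannot be represented, clamps
```

Without saturation the overflowing sum would wrap to a large negative value, which in an audio or control signal is far more damaging than clipping.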
Special Instructions • Why special instructions? • Multiple operations per instruction cycle. • Minimize program memory space. • Specify several parallel operations in a single instruction. • These instructions permit restricted access to registers and do not allow arbitrary operation combinations.
Special Instructions… MAC X0,Y0,A X:(R0)+,X0 Y:(R4)+N4,Y0 • Multiply contents of X0 and Y0 • Add result to accumulator A • Load register X0 from X memory location pointed to by R0 • Load register Y0 from Y memory location pointed to by R4 • Post-increment R0 by 1 • Post-increment R4 by the contents of register N4 • This instruction calculates one tap of the FIR filter in one clock cycle
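The effect of this single instruction can be modeled in Python as one function call that performs the multiply-accumulate, both loads and both pointer updates together. This is a simplified sketch: the function name `mac_parallel_move`, the dictionary of registers and the list-based X and Y memories are hypothetical, and word widths, saturation and modulo addressing are ignored.

```python
def mac_parallel_move(regs, xmem, ymem):
    """Model of: MAC X0,Y0,A with parallel loads X:(R0)+,X0 and Y:(R4)+N4,Y0."""
    # Sources are read before any register is updated, so the multiply
    # uses the values X0 and Y0 held when the instruction started.
    x0, y0 = regs["X0"], regs["Y0"]
    regs["A"] += x0 * y0               # multiply and accumulate
    regs["X0"] = xmem[regs["R0"]]      # load next sample from X memory
    regs["Y0"] = ymem[regs["R4"]]      # load next coefficient from Y memory
    regs["R0"] += 1                    # post-increment R0 by 1
    regs["R4"] += regs["N4"]           # post-increment R4 by N4


# Three real taps plus one padding word, because the loads run one step ahead.
xmem = [1, 2, 3, 0]
ymem = [10, 20, 30, 0]
regs = {"X0": 0, "Y0": 0, "A": 0, "R0": 0, "R4": 0, "N4": 1}

# The first pass only primes X0 and Y0; the next three accumulate one tap each.
for _ in range(4):
    mac_parallel_move(regs, xmem, ymem)
print(regs["A"])  # 1*10 + 2*20 + 3*30
```

Because the loads fetch operands for the next multiply while the current one executes, the loop is software-pipelined: N taps take N + 1 passes, one of which only primes the registers.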
Execution Time Predictability… • Non-DSP applications have a maximum average response time (firm real time). • DSP applications are hard real time. • It is important to be able to calculate exactly the processing time required, or at least the worst-case time. • GPPs do not have good execution-time predictability. • Lack of execution-time predictability affects code optimization.
Execution Time Predictability… • GPPs use complicated algorithms for branch prediction and caching. • Speculative code execution depends on branch prediction. • The programmer does not know which instructions and data will go into the cache, or when. • Worst-case execution time may be an order of magnitude greater than the actual execution time.
Execution Time Predictability… • DSPs do not use branch prediction algorithms. • The programmer decides which instructions go into the cache. • No data cache in most DSPs.
Other features of DSPs and GPPs • VLIW (Very Long Instruction Word). • Combines a number of different instructions in one long instruction word. • e.g. a 256-bit word holding 8 instructions. • More MACs, ALUs and other execution units. • GPPs use SIMD.