• 230 likes • 443 Views
DSP Architectures Additional Slides. Professor S. Srinivasan Electrical Engineering Department I.I.T.-Madras, Chennai –600 036 srini@ee.iitm.ac.in. Figure 4.3(a) Block diagram of a barrel shifter. Figure 4.3(b) Implementation of a 4-bit, shift-right barrel shifter.
E N D
DSP ArchitecturesAdditional Slides Professor S. Srinivasan Electrical Engineering Department I.I.T.-Madras, Chennai –600 036 srini@ee.iitm.ac.in
Figure 4.3(b) Implementation of a 4-bit, shift-right barrel shifter
Figure 4.9 Register pointer updating algorithm for circular buffer addressing mode: SAR = start address register contents, EAR = end address register contents, PNTR = pointer
Figure 4.10 Different cases that arise in updating the pointer in circular buffer addressing mode
Instruction Level Parallelism • VLIW architecture • Each instruction specifies several operations to be done in parallel • Advantages : Simple hardware • compilers can spot ILP easily • Disadvantages : Little compatibilty between generations • Explicit NOPs bloat code size
Super scalar architecture • Hardware responsible for finding ILP in a sequential program • Advantage : Compatibility between generations • Disadvantage : Very complex hardware
Explicitly Parallel Instruction Computing (EPIC) • Combines VLIW and super scalar architectures • Instructions are grouped into 3 operating blocks and a template block • Template block tells hardware if instructions can be executed in parallel • Also gives information whether the block can be executed in parallel
ILP versus Power • Increasing instructions / cycle • Requires fewer cycles to execute a task • Uses longer clock for same performance • Uses lower supply voltage • And hence uses less power • However, too many functional units and too many transitions per clock cycle increase power consumption.
Low Power architecture • Power consumed by additional circuits vs. ability to lower clock rate while maintaining performance • Circuits must be highly used • Move complexity into software • Voltage scaling : Reduce Vdd • Clock gating : Turn off clock when chip is not in use ( applies to sub-modules of chip also)
VLIW is more suitable than super scalar for low power • - VLIW is smaller for same number of functional units • - Compiler is better at finding parallelism than hardware • Put multiple processors on chip rather than lots of functional units in one processor • Helps in running independent tasks
General Purpose Microprocessor 2000 • GHz clock speed • 32-bit address or more • 32-bit bus, 128-bit instructions • Complex MMU • Super scalar CPU • MMX instructions • On chip cache • Single cycle execution • 32-bit floating point ALU on board • Very expensive • 10s of watts of power
DSP in 2000 • Clock 100 ~ 200 MHz • 16-bit floating point or 32-bit floating point • 16-24 bits address space • Large on-chip and off-chip memories • Single cycle execution of most instructions • Harvard architecture • Lots of special DSP instructions • 50 mw to 2w power • Cheap
Future of DSP Microprocessor • Sufficiently unique for an independent class of applications (HDD, cell phone) • Low power consumption, low cost • High performance within power, cost constraints (MIPS/mw, MIPS/$) • Fixed point & floating point • Better compilers - but users must be informed • Hybrid DSP/ GP systems