Explore the fundamental architecture and features of Digital Signal Processors (DSPs) and their relevance in real-time processing. Learn about DSP hardware, common features like SIMD and VLIW, Harvard architecture, pipelining, cache, and DMA. Understand the differences between DSPs and microcontrollers, and review a high-performance DSP model, TMS320C6713, tailored for audio applications.
Introduction to Digital Signal Processors (DSPs) Dr. Konstantinos Tatas
Outline/objectives • Identify the most important DSP processor architecture features and how they relate to DSP applications. ACOE343 - Embedded Real-Time Processor Systems - Frederick University
What is a DSP? • A specialized microprocessor for real-time digital signal processing applications, such as: • Digital filtering (FIR and IIR) • FFT • Convolution, matrix multiplication, etc.
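As an illustration of the kind of kernel a DSP is built to run, the following is a minimal C sketch of a direct-form FIR filter. The function name `fir_filter` and its parameters are illustrative choices, not taken from any vendor library, and samples before the start of the input are assumed to be zero.

```c
#include <stddef.h>

/* Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].
 * Assumption: input samples before x[0] are treated as zero. */
void fir_filter(const float *h, size_t ntaps,
                const float *x, float *y, size_t nsamples)
{
    for (size_t n = 0; n < nsamples; n++) {
        float acc = 0.0f;
        /* Stop at k == n so that x[n - k] never indexes before x[0]. */
        for (size_t k = 0; k < ntaps && k <= n; k++) {
            acc += h[k] * x[n - k];   /* one multiply-accumulate per tap */
        }
        y[n] = acc;
    }
}
```

The inner loop is a chain of multiply-accumulate operations, which is why the architectural features listed on the following slides matter for this workload.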
Hardware used in DSP
Common DSP features • Harvard architecture • Dedicated single-cycle Multiply-Accumulate (MAC) instruction (hardware MAC units) • Single Instruction Multiple Data (SIMD) • Very Long Instruction Word (VLIW) architecture • Pipelining • Cache • DMA
Harvard Architecture • Physically separate memories and paths for instruction and data
Single-Cycle MAC unit • Can compute a sum of n products in n cycles (one multiply-accumulate per cycle)
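The sum of n products can be sketched in C as below. This is only an illustration: `mac_sum` is a hypothetical name, the 16-bit operands and 64-bit accumulator are assumed widths, and whether each loop iteration really maps to a single-cycle MAC depends on the target processor and compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of products of two 16-bit vectors.
 * Each iteration is one multiply-accumulate step; the accumulator is
 * wider than the operands so intermediate sums do not overflow. */
int64_t mac_sum(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```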
Single Instruction - Multiple Data (SIMD) • A technique for data-level parallelism: one instruction is applied to several data elements at once by a number of processing elements working in parallel
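The idea can be imitated in portable C by packing two 16-bit values into one 32-bit word and adding both lanes in a single call. This is a software sketch only; `add2x16` is a hypothetical helper, and real SIMD hardware would perform the lane-wise addition as one instruction.

```c
#include <stdint.h>

/* Add two pairs of 16-bit lanes packed into 32-bit words.
 * Each lane wraps around independently: a carry out of the low lane
 * is masked off and never reaches the high lane. */
uint32_t add2x16(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0x0000FFFFu;          /* low lane, carry discarded  */
    uint32_t hi = ((a >> 16) + (b >> 16)) << 16;  /* high lane, overflow shifted out */
    return hi | lo;
}
```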
Very Long Instruction Word (VLIW) • A technique for instruction-level parallelism: instructions that the compiler has determined to be independent of each other are packed into one long instruction word and executed in parallel • Example of a single VLIW instruction: F=a+b; c=e/g; d=x&y; w=z*h;
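The four operations in the example above can be written in C to make the absence of dependencies visible. This is a sketch under stated assumptions: the struct and function names are hypothetical, integer operands are assumed, and whether a given compiler actually schedules the four statements into one instruction word depends on the target's functional units.

```c
/* Results of the four operations from the slide's VLIW example. */
typedef struct {
    int f, c, d, w;
} bundle_result;

/* No statement reads a value written by another, so a VLIW compiler
 * is free to issue all four to separate functional units at once. */
bundle_result vliw_bundle(int a, int b, int e, int g,
                          int x, int y, int z, int h)
{
    bundle_result r;
    r.f = a + b;   /* adder       */
    r.c = e / g;   /* divider (g must be non-zero) */
    r.d = x & y;   /* logic unit  */
    r.w = z * h;   /* multiplier  */
    return r;
}
```

By contrast, a sequence such as `f = a + b; c = f / g;` could not be packed this way, because the second operation needs the result of the first.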
CISC vs. RISC vs. VLIW
Pipelining • DSPs commonly feature deep pipelines • TMS320C6x processors have 3 pipeline stages, each with a number of phases (cycles): • Fetch: Program Address Generate (PG), Program Address Send (PS), Program Ready Wait (PW), Program Receive (PR) • Decode: Dispatch (DP), Decode (DC) • Execute: 6 to 10 phases
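To see why a deep pipeline pays off, the cycle count of an idealized pipeline can be estimated as below. This is a simplified model, not a description of the actual TMS320C6x timing: it assumes one instruction enters the pipeline per cycle, no stalls, and the same number of phases for every instruction. The function name `pipeline_cycles` is hypothetical.

```c
/* Cycles needed to complete n_instr instructions on an idealized
 * pipeline with `depth` phases: the first instruction takes `depth`
 * cycles, and each later one completes one cycle after the previous. */
unsigned long pipeline_cycles(unsigned long n_instr, unsigned long depth)
{
    if (n_instr == 0) {
        return 0;
    }
    return depth + n_instr - 1;
}
```

For example, with 4 fetch, 2 decode and 6 execute phases (12 in total), 100 instructions complete in 111 cycles under this model, compared with 1200 cycles if each instruction had to finish before the next one started.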
Direct Memory Access (DMA) • The feature that allows peripherals to access main memory without the intervention of the CPU • Typically, the CPU initiates DMA transfer, does other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation is complete. • Can create cache coherency problems (the data in the cache may be different from the data in the external memory after DMA) • Requires a DMA controller
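The initiate, continue, interrupt sequence described above can be mimicked with a small software simulation. Everything here is hypothetical: `sim_dma`, `sim_dma_start` and `sim_dma_complete` are invented for illustration and do not correspond to any real DMA controller's registers or driver API. The copy is done by `memcpy` to stand in for the hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simulated DMA controller state (hypothetical, for illustration only). */
typedef struct {
    const void    *src;
    void          *dst;
    size_t         len;
    bool           busy;
    volatile bool *done_flag;  /* set when the simulated interrupt fires */
} sim_dma;

/* CPU side: program the transfer and return immediately. */
void sim_dma_start(sim_dma *d, const void *src, void *dst,
                   size_t len, volatile bool *done_flag)
{
    d->src = src;
    d->dst = dst;
    d->len = len;
    d->done_flag = done_flag;
    d->busy = true;
    *done_flag = false;
}

/* Stands in for the hardware finishing the transfer and for the
 * interrupt handler that notifies the CPU. */
void sim_dma_complete(sim_dma *d)
{
    if (!d->busy) {
        return;
    }
    memcpy(d->dst, d->src, d->len);
    d->busy = false;
    *d->done_flag = true;
}
```

Between the call to `sim_dma_start` and the completion flag being set, the CPU is free to do unrelated work; it should not read the destination buffer until the flag is set.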
Cache memory • Separate instruction and data L1 caches (Harvard architecture) • Most systems use DMA
DSP vs. Microcontroller • DSP: Harvard architecture; VLIW/SIMD (parallel execution units); no bit-level operations; hardware MACs; DSP applications • Microcontroller: mostly von Neumann architecture; single execution unit; flexible bit-level operations; no hardware MACs; control applications
The TMS320C6713’s high-performance CPU and rich peripheral set are tailored for multichannel audio applications such as broadcast and recording mixing, home and large-venue audio decoders, and multi-zone audio distribution. The TMS320C6713 device is based on the high-performance advanced VelociTI very-long-instruction-word (VLIW) architecture developed by Texas Instruments (TI). The VelociTI architecture provides ample performance to decode a variety of existing digital audio formats and the flexibility to add future formats.
Architecture of TMS320C67xx
TMS320C6713 DSP Starter Kit (DSK) Block Diagram
• A TMS320C6713 DSP operating at 225 MHz • 16 Mbytes of synchronous DRAM • 512 Kbytes of non-volatile Flash memory (256 Kbytes usable in default configuration) • 4 user-accessible LEDs and DIP switches • Software board configuration through registers implemented in CPLD • JTAG emulation through on-board JTAG emulator with USB host interface or external emulator
Review Questions • Which of the following is not a typical DSP feature? (a) Dedicated multiplier/MAC (b) Von Neumann memory architecture (c) Pipelining (d) Saturation arithmetic • Which implementation would you choose for lowest power consumption? (a) ASIC (b) FPGA (c) General-Purpose Processor (d) DSP
References • R. Chassaing, “DSP Applications Using C and the TMS320C6x DSK”, Wiley, 2002 • Texas Instruments, TMS320C64x datasheets • Analog Devices, ADSP-21xx Processors