Matlab Extensions for the Development, Testing and Verification of Real-Time DSP Software
David P. Magee, Communication Systems Engineer
Texas Instruments, Dallas, TX
Presentation Outline • DSP Software Development • DSP Simulator • Introduction to Intrinsics • FFT Example • Algorithm Optimization Results • Other Matlab and Simulink Extensions • Closing Remarks • Q & A
DSP Software Development
• Common steps for DSP software development
• Step 1: Develop Understanding (develop floating point simulation, debug simulation)
• Step 2: Address Scaling Issues (develop fixed point simulation, debug simulation)
• Step 3: Optimize for Performance (develop assembly code, debug assembly code)
Issues with the 3 Step Approach • Each step takes time and resources • Algorithm testing at each stage • Multiple versions of the algorithm – version control headaches • Evaluation of processor instruction set compatibility and MIPS requirements often occurs late in the software development cycle • Debugging algorithms on a pipelined and/or parallel processor can be very difficult (the problem is getting more difficult as processors become more complicated) Can the development cycle be improved ? Yes !
Improved Software Development Cycle
• Merge Steps 2 and 3
• Step 1: Develop Understanding (develop floating point simulation, debug simulation)
• Step 2: Address Scaling Issues and Optimize for Performance (simultaneously develop the fixed point simulation and the assembly code, simultaneously debug both)
Question: How can these steps be combined?
Matlab + DSP Simulator
[Diagram: Floating Point Simulation and System Simulation (Matlab Simulation Environment); Fixed Point Simulation and System Simulation (Host Environment, DSP Simulator)]
• Develop Floating Point and Fixed Point Simulations in a single development environment: Matlab
• Develop and test C/C++ code for Fixed Point Simulation in cooperation with the DSP Simulator
• Migrate the C/C++ code directly to the target DSP
DSP Simulator in Matlab
Develop and Debug Fixed Point C/C++ Code in Matlab
[Diagram: DSP Simulator, C/C++ code, MEX-file, Matlab]
Benefits:
• Accelerate the development and analysis of DSP code
• A mechanism to implement your IP blocks in efficient DSP code
• Process large amounts of data
• Compare fixed point and floating point algorithm implementations
• Provide mixed simulation environment with fixed point and floating point algorithm implementations
• Advanced graphing capabilities
What is a MEX-file?
• A file containing one function that interfaces C/C++ code to the Matlab shell
• MathWorks specifies the syntax for this function:
    void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
• See http://www.mathworks.com and enter "mex files" into the search engine
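As an illustration of the gateway idea, here is a minimal sketch of a MEX-file wrapped around a fixed point routine. The routine `scale_q15` and its Q15 gain argument are hypothetical and are not taken from the presentation. The gateway is compiled only when `MATLAB_MEX_FILE` is defined (the `mex` build defines it), so the core routine can also be built and tested without Matlab.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed point routine (not from the presentation):
   multiply each Q15 sample by a Q15 gain, round, and saturate.
   Assumes >> on a negative value is an arithmetic shift, which is
   the usual behavior but is implementation-defined in C. */
void scale_q15(const int16_t *in, int16_t *out, size_t n, int16_t gain)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int32_t prod = (int32_t)in[i] * gain;        /* Q30 product */
        int32_t r = (prod + (1 << 14)) >> 15;        /* round, back to Q15 */
        if (r > 32767) r = 32767;                    /* only -1 * -1 reaches this */
        out[i] = (int16_t)r;
    }
}

#ifdef MATLAB_MEX_FILE
#include "mex.h"

/* Gateway: y = f(x, gain) with x an int16 array and gain an int16 scalar. */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    (void)nlhs;
    if (nrhs != 2 || !mxIsInt16(prhs[0]) || !mxIsInt16(prhs[1]) ||
        mxGetNumberOfElements(prhs[1]) != 1) {
        mexErrMsgIdAndTxt("sketch:args", "Usage: y = f(int16 x, int16 gain)");
    }
    plhs[0] = mxCreateNumericMatrix(mxGetM(prhs[0]), mxGetN(prhs[0]),
                                    mxINT16_CLASS, mxREAL);
    scale_q15((const int16_t *)mxGetData(prhs[0]),
              (int16_t *)mxGetData(plhs[0]),
              mxGetNumberOfElements(prhs[0]),
              *(const int16_t *)mxGetData(prhs[1]));
}
#endif
```

Keeping the numerical routine separate from the gateway means the same function body can later be handed to the DSP compiler unchanged.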
What is a DSP Simulator ? • A library of functions that simulate the mathematical operations of DSP assembly instructions. • For TI DSPs, the compiler recognizes special functions called Intrinsics and maps them directly into inline assembly instructions • In the DSP Simulator, make each function represent a supported compiler Intrinsic
Intrinsic Example
• ADD2: adds the upper 16-bit halves and the lower 16-bit halves of two 32-bit registers independently
• Intrinsic: dst = _add2(src1,src2)
• Assembly Instruction: ADD2 (.unit) src1,src2,dst
C code:
    Function Example()
    {
        .
        y = _add2(a,b);
        .
    }
Compiles to C6x Assembly Code:
    Example:
        .
        ADD2 .S1 A1,A2,A3
        .
DSP Simulator Example
• C Code with _add2() Intrinsic
C code:
    Function Example()
    {
        .
        y = _add2(a,b);
        .
    }
DSP Simulator:
    // view of a 32-bit word as two 16-bit halves
    // (this member order matches a little-endian host)
    typedef struct _REG32X2
    {
        short lo;
        short hi;
    } reg32x2;

    int32 _add2(int32 a,int32 b)
    {
        int32 y;
        reg32x2 *pa,*pb,*py;

        pa = (reg32x2 *)&a;
        pb = (reg32x2 *)&b;
        py = (reg32x2 *)&y;

        // add each half separately; no carry passes from lo to hi
        py->lo = pa->lo+pb->lo;
        py->hi = pa->hi+pb->hi;

        return(y);
    } // end of _add2() function
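The same packed addition can also be written with shifts and masks, which avoids any dependence on host byte order. The following is a sketch only: `sim_add2` is a hypothetical name, and this is an alternative formulation, not the code in the DSP Simulator.

```c
#include <stdint.h>

/* Hypothetical, endian-independent model of a packed 16-bit add.
   Each half wraps on overflow; nothing carries from the lower half
   into the upper half. */
int32_t sim_add2(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t lo = (ua + ub) & 0x0000FFFFu;                  /* carry out of bit 15 dropped */
    uint32_t hi = (ua & 0xFFFF0000u) + (ub & 0xFFFF0000u);  /* wraps modulo 2^32 */
    return (int32_t)(hi | lo);
}
```

For example, `sim_add2(0x0001FFFF, 0x00010001)` gives `0x00020000`: the lower halves wrap to zero and the upper halves sum to 2, with no carry between them.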
DSP Simulator
• How many Intrinsics exist for each DSP family?
    TMS320C54x: 36
    TMS320C55x: 42
    TMS320C62x: 59
    TMS320C64x: 135
    TMS320C64+: 162
    TMS320C67x: 68
Most algorithms previously written in assembly code can now be expressed in C/C++ code with Intrinsic function calls
DSP Simulator • Consists of two files • C6xSimulator.c • C6xSimulator.h • Contains C functions for representing the numerical operations of 158 DSP assembly instructions • Can control endianness with a symbolic constant
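One way a symbolic constant can control endianness is to let it choose the member order of the struct that overlays a 32-bit word. The sketch below assumes a hypothetical constant named `SIM_LITTLE_ENDIAN`; the actual name used in C6xSimulator.h is not given in the presentation.

```c
#include <stdint.h>
#include <string.h>

#ifndef SIM_LITTLE_ENDIAN
#define SIM_LITTLE_ENDIAN 1   /* hypothetical name; set to 0 on a big-endian host */
#endif

/* Two 16-bit halves laid out to match the host byte order. */
typedef struct {
#if SIM_LITTLE_ENDIAN
    int16_t lo;
    int16_t hi;
#else
    int16_t hi;
    int16_t lo;
#endif
} halves32;

/* Return the upper 16 bits of x through the overlay struct. */
int16_t upper_half(int32_t x)
{
    halves32 h;
    memcpy(&h, &x, sizeof h);   /* copy instead of pointer casting */
    return h.hi;
}

/* Runtime check that the constant matches the machine running the code. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

With the constant set correctly for the host, `upper_half(0x12345678)` returns `0x1234` on either byte order.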
DSP Simulator and C++ • DSP Simulator works in C++ programming environments • Partition data into appropriate types (real, complex) and bit widths (8/16/32 bits) • Write functions in C++ • Use operator overloading for required data types to map operators to the desired Intrinsic functions Benefit: Operator overloading allows for easy migration to next generation DSP instruction sets
Using the DSP Simulator • Develop C/C++ code with Intrinsic function calls • Compile and link the C/C++ code and the DSP Simulator to form a Matlab executable file • Debug and evaluate the performance of the fixed point algorithms in Matlab • Rely on TI tools to generate an optimized assembly version of the C/C++ code for the target DSP Benefit: One version of C/C++ code runs in Matlab and in the target DSP !
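A sketch of how one source file might serve both builds: on the target, the TI compiler supplies `_add2()` as a built-in, while on the host a simulator function with the same name takes its place. The host stand-in below is a hypothetical substitute for what C6xSimulator.h would provide, `sum_packed_pairs` is an invented example function, and the use of the TI predefined macro `_TMS320C6X` to tell the builds apart is an assumption about project setup.

```c
#ifdef _TMS320C6X
/* Target build: _add2() is a compiler intrinsic, nothing to declare. */
#else
/* Host build: hypothetical stand-in for the simulator's _add2().
   Adds the two 16-bit halves independently, wrapping on overflow. */
static int _add2(int a, int b)
{
    unsigned int ua = (unsigned int)a;
    unsigned int ub = (unsigned int)b;
    unsigned int lo = (ua + ub) & 0x0000FFFFu;
    unsigned int hi = ((ua & 0xFFFF0000u) + (ub & 0xFFFF0000u)) & 0xFFFF0000u;
    return (int)(hi | lo);
}
#endif

/* Algorithm code, identical for host and target: accumulate n packed
   words, keeping a separate 16-bit running sum in each half. */
int sum_packed_pairs(const int *x, int n)
{
    int acc = 0;
    int i;
    for (i = 0; i < n; i++) {
        acc = _add2(acc, x[i]);
    }
    return acc;
}
```

Only the small conditional section differs between the two environments; the algorithm itself is written once.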
Migrating C/C++ Code to the DSP • How does it work ? C/C++ code can directly access DSP assembly instructions without actually writing assembly code Benefit: Eliminate headaches associated with assembly programming • Pipeline scheduling • Register allocation • Unit allocation • Stack manipulation • Parallel instruction debug Conclusion: Make the compiler do the hard work !
When is the C/C++ Code Optimized ? • Look at compiler report in the assembly file to determine unit loading. • Look at the assembly code. Are all the units being used each cycle ? • Try to balance loading by using different sequence of Intrinsics to perform the same overall mathematical operation. • e.g. X * 4 => X << 2 • May require manual unrolling of loops. • Determine the ideal number of MAC operations for an algorithm and compare it to the compiler report
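The last point can be made concrete with a small calculation. The sketch below assumes a plain radix-2 FFT in which every butterfly performs one complex multiply (four real multiplies), and it takes the number of multiplies the processor can issue per cycle as a parameter because that figure depends on the target. All function names are hypothetical.

```c
/* Integer log2 for a power-of-two n. */
unsigned int ilog2_pow2(unsigned int n)
{
    unsigned int k = 0;
    while (n > 1) {
        n >>= 1;
        k++;
    }
    return k;
}

/* Real multiplies in an n-point radix-2 FFT, counting every butterfly
   (including those with trivial twiddle factors):
   (n/2) butterflies per stage * log2(n) stages * 4 multiplies each. */
unsigned long fft_radix2_real_multiplies(unsigned int n)
{
    return 4UL * (n / 2) * ilog2_pow2(n);
}

/* Lower bound on cycles if ops_per_cycle multiplies issue every cycle. */
unsigned long ideal_cycles(unsigned long ops, unsigned int ops_per_cycle)
{
    return (ops + ops_per_cycle - 1) / ops_per_cycle;   /* ceiling division */
}
```

For n = 128 this gives 1792 real multiplies, so a machine assumed to issue four per cycle could do no better than 448 cycles; the gap between that bound and the compiler report shows how much room is left.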
Limitations • DSP software engineer must perform algorithm mapping from floating point to fixed point manually • ranges for floating point values • fixed point scaling issues • saturation issues • DSP software architecture is limited to the creativity of the software engineer Recommendation: Develop an automated tool that converts Matlab/Simulink floating point files to fixed point DSP C/C++ code using the programming guidelines discussed in the paper.
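To show what the manual mapping involves, here is a minimal sketch of the two most common chores: converting a floating point value in [-1, 1) to Q15 and adding two Q15 values with saturation. The function names and the choice of Q15 are illustrative assumptions, not part of the DSP Simulator.

```c
#include <stdint.h>

/* Convert a floating point value to Q15 (1 sign bit, 15 fraction bits).
   Values outside [-1, 1) are saturated to the nearest representable value. */
int16_t float_to_q15(double x)
{
    double scaled = x * 32768.0;
    if (scaled >= 32767.0) {
        return 32767;            /* saturate high */
    }
    if (scaled <= -32768.0) {
        return -32768;           /* saturate low */
    }
    /* round to nearest, halves away from zero */
    return (int16_t)(scaled + (scaled >= 0.0 ? 0.5 : -0.5));
}

/* Add two Q15 values, clamping instead of wrapping on overflow. */
int16_t sat_add_q15(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > 32767) {
        return 32767;
    }
    if (sum < -32768) {
        return -32768;
    }
    return (int16_t)sum;
}
```

Note that +1.0 is not representable in Q15, so `float_to_q15(1.0)` returns 32767; range analysis of the floating point simulation decides whether that loss matters or whether a different scaling is needed.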
FFT Example Developed an FFT for the C64x DSP architecture Briefly discuss • FFT Functions • FFT Simulation File • Development time between hand coded assembly and C code with Intrinsics • Software development time • Software performance
FFT Functions
The FFT functions
• Main FFT function
• First FFT stage
• Radix-2 stage
• Radix-4 stage
• Last FFT stage
Example: Radix-2 stage
• Uses _mpyhir() and _mpylir() Intrinsics
    // inside the Radix-2 stage
    for(k=Nover2;k>0;k--)
    {
        .
        // compute the real part
        // (x0.real-x1.real)*w1.real
        reg2 = _mpyhir(w1,reg1real);
        // (x0.imag-x1.imag)*w1.imag
        reg3 = _mpylir(w1,reg1imag);
        reg2 -= reg3;
        // compute the imag part
        // (x0.imag-x1.imag)*w1.real
        reg4 = _mpyhir(w1,reg1imag);
        // (x0.real-x1.real)*w1.imag
        reg5 = _mpylir(w1,reg1real);
        reg4 += reg5;
        .
    }
Note: Twiddle factor indexing not shown in this Example
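For readers without the instruction set manual at hand, the following sketch models the two multiplies used in the Radix-2 stage. It assumes that each instruction multiplies one signed 16-bit half of the first operand by the full 32-bit second operand, adds 2^14 for rounding, and shifts right by 15; that is my reading of the C64x instruction descriptions and should be checked against TI's documentation. The names `sim_mpyhir`, `sim_mpylir` and `twiddle_mul` are hypothetical, and the twiddle factor is assumed to be packed with its real part in the upper half and its imaginary part in the lower half, both in Q15.

```c
#include <stdint.h>

/* Assumed model: (upper 16 bits of src1) * src2, rounded, >> 15.
   Relies on >> of a negative value being an arithmetic shift. */
int32_t sim_mpyhir(int32_t src1, int32_t src2)
{
    int64_t prod = (int64_t)(int16_t)(src1 >> 16) * (int64_t)src2;
    return (int32_t)((prod + (1 << 14)) >> 15);
}

/* Assumed model: (lower 16 bits of src1) * src2, rounded, >> 15. */
int32_t sim_mpylir(int32_t src1, int32_t src2)
{
    int64_t prod = (int64_t)(int16_t)(src1 & 0xFFFF) * (int64_t)src2;
    return (int32_t)((prod + (1 << 14)) >> 15);
}

/* Complex multiply (dr + j*di) * (w.real + j*w.imag), following the
   same sequence of operations as the Radix-2 excerpt. */
void twiddle_mul(int32_t w, int32_t dr, int32_t di, int32_t *re, int32_t *im)
{
    *re = sim_mpyhir(w, dr) - sim_mpylir(w, di);
    *im = sim_mpyhir(w, di) + sim_mpylir(w, dr);
}
```

With w = 0x40000000 (0.5 + j0 in Q15), an input of 1024 - j2048 comes out as 512 - j1024, as expected for a multiplication by one half.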
FFT Simulation File
The simulation file is a Matlab script file
• Performs the simulation
• Calls the floating point Matlab FFT function fft()
• Calls the fixed point FFT function ti_fft()
• Compares the frequency responses of fixed point and floating point FFTs in Matlab
• Computes the SNR, NSR, etc. using Matlab
    % test_fft.m
    % initialize some parameters
    Nin = 64;
    N = 128;
    NumFFTs = 1000;
    % create a random input
    h = rand(NumFFTs,Nin);
    % zero-pad each row from Nin to N samples
    h = [h,zeros(NumFFTs,N-Nin)];
    % compute FFT using Matlab function
    Hd = fft(h,[],2);
    % call the fixed point function
    [H] = ti_fft(h1dfilt,Nin,N);
    % compute the NSR in dB scale
    e = Hd-H;
    NSR = 10*log10(sum(abs(e).^2,2)...
        ./sum(abs(Hd).^2,2));
FFT Development Time
Software Development Time Comparison
• Time required to develop hand-coded assembly functions: 2-3 person-months
• Time required to develop C code with Intrinsic function calls: 2-3 person-weeks
Development time is reduced by a factor of 4 to 5!
FFT Performance Comparison
Metric: Kernel sizes and cycle counts
• Kernel sizes for hand-coded assembly functions
    FirstFFTStage: 18*(N/16)
    R2Stage: 7*(N/8)
    R4Stage: 12*(N/8)
    LastFFTStage: 24*(N/16)
• Kernel sizes for C code with Intrinsic function calls
    FirstFFTStage: 19*(N/16)
    R2Stage: 8*(N/8)
    R4Stage: 14*(N/8)
    LastFFTStage: 27*(N/16)
Intrinsics kernel sizes are within about 17% of hand-coded assembly (largest gap: R4Stage, 14 vs. 12)!
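The per-stage gap follows directly from the kernel sizes, since the N/8 or N/16 factor is the same for both versions of a stage and cancels. A minimal sketch of the arithmetic, with a hypothetical helper name:

```c
/* Percentage by which the C-with-Intrinsics kernel is larger than the
   hand-coded assembly kernel for the same stage. */
double kernel_overhead_percent(unsigned int asm_size, unsigned int c_size)
{
    return 100.0 * ((double)c_size - (double)asm_size) / (double)asm_size;
}
```

Applied to the four stages: FirstFFTStage (18 vs. 19) is about 5.6% larger, R2Stage (7 vs. 8) about 14.3%, R4Stage (12 vs. 14) about 16.7%, and LastFFTStage (24 vs. 27) exactly 12.5%.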
Algorithm Optimization Results
In most cases, Intrinsics performance is within 10% of hand-coded assembly!
Matlab Function Libraries
[Diagram: Library (Function 1, Function 2, ..., Function N), DSP Simulator, C/C++ code, MEX-file, Matlab]
For a particular DSP application
• The DSP Simulator emulates the numerical behavior of the DSP instructions
• Power User develops a library of optimized algorithms that contain Intrinsic function calls
• General user writes C/C++ code that calls the optimized functions in the library
• The user's C/C++ code is compiled with the DSP Simulator, the library and the MEX-file
• User tests the algorithms for performance, evaluates cycle counts, etc. in Matlab
• The same C/C++ code is migrated directly to the target DSP
Matlab Function Library Examples
[Diagram: Math Library, Communications Library, Controls Library; function blocks shown: NoiseEst, ChanEst, ResEqu, SlidingMode, Hinf, OuterProduct, InnerProduct, PID, FIR, RS, BF, IC, VectorSum, Viterbi]
Benefit: Ability to share fixed-point DSP C/C++ code and test vectors between multiple users
Closing Remarks DSP Simulator Benefits • Develop fixed point DSP code in Matlab • Easily compare floating point and fixed point algorithm implementations in Matlab • Bit-true, fixed point simulations • Reduce software development time by a factor of 4 to 5 • Incorporate DSP code into higher level system simulations • Debugging code in Matlab is easier than in a real-time system • Easily evaluate/predict MIPS requirements • Run the same C/C++ source code in Matlab and in the DSP • Easily migrate algorithms to new DSP instruction sets • Develop software before next generation DSPs are available
Q & A • Thanks for attending my presentation !