Introduction to Discrete Cosine Transform and Quantization. DSP TA: Wei-Nien Chen 2007/05/09. Typical transform coding scheme. Discrete cosine transform. Any signal = Σ cosines of various frequencies. Decompose signals into combinations of DCT basis vectors. 1-D DCT. 2-D DCT (8x8). Why DCT.
Introduction to Discrete Cosine Transform and Quantization DSP TA: Wei-Nien Chen 2007/05/09
Discrete cosine transform • Any signal = Σ cosines of various frequencies. • Decompose signals into combinations of DCT basis vectors.
Why DCT • Energy compaction. (for further quantization)
(Figure: original signal and its reconstructions from 1, 3, 5, 10, and 30 DCT coefficients.)
Quantization • Represent coefficients using fewer steps (fewer bits) to obtain compression.
Weighted Quantization • Quantization table: each DCT coefficient uses a different quantization step size, obtained from psychophysical studies.
Formula Quantization Inverse Quantization
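A minimal sketch of weighted quantization, assuming the common uniform form level = round(coef / step) with reconstruction coef' = level × step. The function names and the rounding rule (half away from zero) are illustrative assumptions, not taken from the slides.

```c
/* Quantize one coefficient with a positive step size.
 * Integer rounding, half away from zero (an assumed convention). */
int quantize(int coef, int step)
{
    if (coef >= 0)
        return (coef + step / 2) / step;
    return -((-coef + step / 2) / step);
}

/* Inverse quantization: map the level back to a coefficient value. */
int dequantize(int level, int step)
{
    return level * step;
}

/* Weighted quantization of an 8x8 block stored as 64 values:
 * every coefficient uses its own step from the quantization table. */
void quantize_block(const int coef[64], const int table[64], int level[64])
{
    int i;
    for (i = 0; i < 64; i++)
        level[i] = quantize(coef[i], table[i]);
}
```

With this rounding, the reconstruction error of a single coefficient is at most half its step, so larger table entries trade accuracy for fewer distinct levels.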
Optimization Tips DSP TA: Wei-Nien Chen 2007/05/09
Code Efficiency vs. Coding Effort
Source | Optimizer | Efficiency | Coding Effort
C source file | Compiler Optimizer | 50-80% | Low
Linear Assembly | Assembly Optimizer | 90-100% | Med
Assembly | Hand Optimizer | 100% | High
Software Development Tool (CCS) • Code Composer Studio (CCS) • Software pipeline is very important • Build option setting • Optimization level : File level (-o3) • Configurations • Release mode is faster than Debug mode (profile)
Floating Point vs. Fixed Point • Fixed Point Operation • Fixed point:char, short, int, long • Floating point:float, double • Computation cycles with different data types
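A minimal sketch of replacing a float multiply with a fixed-point one, assuming the Q15 format (a 16-bit integer whose value is raw / 32768). The type and function names are illustrative, not from the slides, and the actual cycle savings depend on the target and compiler.

```c
#include <stdint.h>

typedef int16_t q15_t;  /* value = raw / 32768, range [-1, 1) */

/* Convert a float in [-1, 1) to Q15, saturating out-of-range input. */
q15_t float_to_q15(float f)
{
    float scaled = f * 32768.0f;
    if (scaled >= 32767.0f)
        return 32767;
    if (scaled <= -32768.0f)
        return -32768;
    return (q15_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}

/* Multiply two Q15 values: 16x16 -> 32-bit product (Q30), then
 * round and shift back to Q15. Assumes >> on a negative value is an
 * arithmetic shift, which holds on mainstream compilers. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    p = (p + (1 << 14)) >> 15;
    if (p > 32767)  /* only (-1) * (-1) can overflow */
        p = 32767;
    return (q15_t)p;
}
```

For example, 0.5 × 0.5 becomes 16384 × 16384, which `q15_mul` returns as 8192, the Q15 encoding of 0.25.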
Intrinsic Functions • Use intrinsic functions (spru198i) • Packed data processing • Change int type to char or short type • Put two 16-bit or four 8-bit values in a 32-bit space • Single instruction multiple data (SIMD) (intrinsic)
(Figure: SIMD on a 32-bit word. One int holds two shorts or four chars; adding the packed words (A1, A2) and (B1, B2) yields (A1+B1, A2+B2).)
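A portable sketch of the packed 16-bit add, written in plain C so it runs on any host. On the C6000 compiler the `_add2` intrinsic performs this kind of two-lane add in a single instruction; the helper names below (`pack2`, `add2_emulated`, `unpack_hi`, `unpack_lo`) are hypothetical stand-ins used only to show the data layout.

```c
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit word:
 * hi goes to bits 31..16, lo to bits 15..0. */
uint32_t pack2(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* Add the two 16-bit lanes independently. The carry out of the low
 * lane is masked off so it never leaks into the high lane. */
uint32_t add2_emulated(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0x0000FFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) << 16;
    return hi | lo;
}

/* Extract the lanes again as signed 16-bit values. */
int16_t unpack_hi(uint32_t w) { return (int16_t)(uint16_t)(w >> 16); }
int16_t unpack_lo(uint32_t w) { return (int16_t)(uint16_t)(w & 0xFFFFu); }
```

Each lane wraps around on overflow like an ordinary 16-bit add, so the data must be scaled to fit in a short before packing.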
Loop Unrolling/Loop order • Loop unrolling • Breaks the branch barrier • Trade-off between performance and code size • #pragma MUST_ITERATE(min, max, multiple) • #pragma UNROLL(n) • Loop order: #pragma MUST_ITERATE(10) for (i=0;i<N;i++) { …….. } (Figure: faster vs. slower loop order)
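A minimal sketch of both approaches on a dot product: hinting the compiler with the pragmas, and unrolling by hand. The function names are illustrative; the pragmas are guarded so that only the TI compiler sees them, and both versions assume n is a positive multiple of 4.

```c
/* Rolled loop with trip-count hints. MUST_ITERATE(4, , 4) promises at
 * least 4 iterations and a count that is a multiple of 4; UNROLL(4)
 * asks the compiler to unroll by 4. Other compilers skip the pragmas. */
int dot_hinted(const short *a, const short *b, int n)
{
    int i, sum = 0;
#ifdef __TI_COMPILER_VERSION__
#pragma MUST_ITERATE(4, , 4)
#pragma UNROLL(4)
#endif
    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Manual unrolling by 4: one branch per four multiply-adds, with
 * separate accumulators so the four products are independent. */
int dot_unrolled4(const short *a, const short *b, int n)
{
    int i, s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (i = 0; i < n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    return s0 + s1 + s2 + s3;
}
```

The unrolled body is four times larger, which is the code-size side of the trade-off; the pragma version leaves that decision to the compiler.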
Intrinsic Library • TI provides several optimized functions for programmers, such as FIR, IIR, FFT, DCT, etc. (spru023b) • Include the libraries and use them! • Beware of data types! • Library location: \CCStudio3\c6400\dsplib\lib\dsp64x.lib \CCStudio3\c6400\imglib\lib\img64x.lib
Memory Bottleneck • Memory management is important • Designer's responsibility • Memory load/store is critical • 80% of the time goes to load/store • Linker command file (*.cmd) • Allocate memory
Command File
MEMORY
{
    ISRAM: o = 0x00000000 l = 0x00040000
    SDRAM: o = 0x80000000 l = 0x08000000
}
SECTIONS
{
    .text   > ISRAM //Code
    .cinit  > ISRAM //Initial values for global/static variables
    .stack  > ISRAM //Stack (local variables)
    .const  > ISRAM //Global and static string literals
    .switch > ISRAM //Tables for switch instructions
    .cio    > ISRAM //Buffers for stdio functions
    .bss    > ISRAM //Global and static variables
    .far    > ISRAM //Global and static declared far
    .sysmem > SDRAM //Memory for malloc functions (heap)
    .mycode > ISRAM
    .mydata > ISRAM
}
-stack 0x1F74
-heap 0x500000
In the C source, the section name must match the name used in SECTIONS:
#pragma CODE_SECTION(function_name, ".mycode")
#pragma DATA_SECTION(array_name, ".mydata")
Easier said than done • Enhanced Direct Memory Access (EDMA) • DSP/BIOS • Cache • Tools that help you: Compiler Consultant; optimization tools: cache tune / code size tune