
Implementation of G.729 algorithm on TI’s TMS320C62xx



Presentation Transcript


  1. Implementation of the G.729 algorithm on TI’s TMS320C62xx • Zoha Pajouhi • May 2005

  2. Outline • What is G.729? • How does it work? • Acquaintance with C62xx processors • Steps to be taken for implementation • Optimizations required for implementation • Implementation of the algorithm • Summary & Conclusions

  3. What is G.729? • G.729 is a speech coding technique • Its most important use is in VoIP • It compresses a speech signal from 64 kbps to 8 kbps • It uses the CS-ACELP algorithm • CS-ACELP stands for conjugate-structure algebraic-code-excited linear prediction • It has several annexes: G.729 Annex A, Annex B, Annex D and Annex E • The annexes differ in algorithmic complexity, in whether they use voice activity detection, and in bit rate (lower bit rates at the cost of lower quality, and vice versa) • This presentation discusses implementing G.729 Annex A

  4. How Does G.729 Work? (1) • It uses input frames of 10 ms, which is equal to 80 samples • Each frame has two subframes of 5 ms • The idea of the algorithm is to predict the upcoming signal samples by means of linear prediction • It also uses statistical data to measure how closely the signal resembles the special signals in its codebook • The exact signal variation is irrelevant to our discussion
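
The frame figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted in the slides (10 ms frames, 80 samples, two subframes, 64 kbps in, 8 kbps out); the function names are made up for illustration.

```c
#include <assert.h>

/* Figures quoted in the slides. */
#define FRAME_MS         10     /* frame length in milliseconds */
#define FRAME_SAMPLES    80     /* samples per frame            */
#define SUBFRAMES         2     /* subframes per frame          */
#define INPUT_BPS     64000     /* uncompressed input bit rate  */
#define CODED_BPS      8000     /* G.729 output bit rate        */

/* Sampling rate implied by 80 samples every 10 ms. */
int sampling_rate_hz(void)
{
    return FRAME_SAMPLES * 1000 / FRAME_MS;
}

/* Number of samples in one 5 ms subframe. */
int subframe_samples(void)
{
    assert(FRAME_SAMPLES % SUBFRAMES == 0);
    return FRAME_SAMPLES / SUBFRAMES;
}

/* Bits per input sample implied by the 64 kbps input rate. */
int input_bits_per_sample(void)
{
    return INPUT_BPS / sampling_rate_hz();
}

/* Coded bits available for each 10 ms frame at 8 kbps. */
int coded_bits_per_frame(void)
{
    return CODED_BPS * FRAME_MS / 1000;
}
```

So each frame of 80 input samples (8 bits each at 8 kHz) is squeezed into 80 coded bits, which is the 8:1 compression quoted on slide 3.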

  5. How Does G.729 Work? (2) • These operations are performed once per frame • Pre-processing: scales the signal down by a factor of 2 and passes it through a high-pass filter • LP (linear prediction) analysis: uses linear prediction to model the signal; the LP coefficients are converted to LSP (line spectral pair) coefficients, which are less sensitive to quantization noise
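
To make the LP analysis step concrete, here is a minimal floating-point sketch of autocorrelation followed by the Levinson-Durbin recursion for a 10th-order predictor. It is an illustration only: the G.729 reference code works in fixed point and also applies a window and a lag window, which are left out here, and the function names and signatures below are assumptions rather than those of the reference routines.

```c
#include <assert.h>

#define LP_ORDER 10   /* G.729 uses a 10th-order LP filter */

/* r[k] = sum over n of x[n] * x[n-k], for k = 0..LP_ORDER. */
void autocorr(const double *x, int len, double r[LP_ORDER + 1])
{
    assert(len > 0);
    for (int k = 0; k <= LP_ORDER; k++) {
        double acc = 0.0;
        for (int n = k; n < len; n++)
            acc += x[n] * x[n - k];
        r[k] = acc;
    }
}

/* Levinson-Durbin recursion.
 * Fills a[0..LP_ORDER] with the coefficients of
 * A(z) = 1 + a[1] z^-1 + ... + a[LP_ORDER] z^-LP_ORDER
 * and returns the final prediction error energy,
 * or -1.0 if the autocorrelation is not usable. */
double levinson(const double r[LP_ORDER + 1], double a[LP_ORDER + 1])
{
    double err = r[0];

    a[0] = 1.0;
    for (int i = 1; i <= LP_ORDER; i++)
        a[i] = 0.0;
    if (err <= 0.0)
        return -1.0;

    for (int i = 1; i <= LP_ORDER; i++) {
        /* Reflection coefficient for order i. */
        double acc = r[i];
        for (int j = 1; j < i; j++)
            acc += a[j] * r[i - j];
        double k = -acc / err;

        /* Update a[1..i-1] in place, working from both ends. */
        for (int j = 1; j <= i / 2; j++) {
            double tmp = a[j] + k * a[i - j];
            a[i - j] += k * a[j];
            a[j] = tmp;
        }
        a[i] = k;

        err *= (1.0 - k * k);
        if (err <= 0.0)
            return -1.0;
    }
    return err;
}
```

For an autocorrelation of the form r[k] = 0.5^k (a first-order process), the recursion returns a[1] = -0.5 with all higher coefficients zero, which is a quick sanity check of the update step.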

  6. How Does G.729 Work? (3) • Quantization: the LSPs are quantized and used throughout the rest of the algorithm • Open-loop pitch analysis: an exact pitch search is too expensive, so this part gives a rough estimate of the pitch
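
A rough open-loop pitch estimate can be sketched as a search for the lag that maximizes a normalized correlation. The version below is a simplified floating-point illustration, not the Pitch_ol() routine of the reference code: it scans every lag in one pass, whereas the standard searches several lag ranges and favors shorter lags. The lag limits of 20 and 143 samples are the G.729 pitch range and are not stated in the slides; the function name and signature are assumptions.

```c
#include <assert.h>

#define PIT_MIN  20    /* shortest pitch lag searched (assumed from G.729) */
#define PIT_MAX 143    /* longest pitch lag searched (assumed from G.729)  */

/* Returns the lag in [PIT_MIN, PIT_MAX] that maximizes
 * corr(lag)^2 / energy(lag) over the len samples starting at sig.
 * The caller must make sig[-PIT_MAX] .. sig[len-1] valid. */
int open_loop_pitch(const double *sig, int len)
{
    int best_lag = PIT_MIN;
    double best_score = -1.0;

    assert(len > 0);
    for (int lag = PIT_MIN; lag <= PIT_MAX; lag++) {
        double corr = 0.0, energy = 0.0;
        for (int n = 0; n < len; n++) {
            corr   += sig[n] * sig[n - lag];
            energy += sig[n - lag] * sig[n - lag];
        }
        if (corr <= 0.0 || energy <= 0.0)
            continue;                     /* no positive match at this lag */

        double score = corr * corr / energy;
        if (score > best_score) {         /* strict '>' keeps the shortest lag on ties */
            best_score = score;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

The closed-loop search described on the next slide would then only need to examine lags near this estimate.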

  7. How Does G.729 Work? (4) • These operations are performed twice per frame, i.e. once per subframe • Closed-loop pitch analysis: determines the exact pitch delay through a closed-loop search • Fixed codebook search: computes the resemblance of the signal to the different codes in the codebook; it is used for consonant sounds • Adaptive codebook search: works like the fixed codebook search, except that it is used for non-consonant sounds

  8. How Does G.729 Work? (5) • The bit allocation is as follows:
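
The table from this slide is not part of the transcript. As a stand-in, the sketch below lists the per-frame bit allocation as given in ITU-T Recommendation G.729 (an assumption about what the slide showed, not a copy of it) and checks that it adds up to the 8 kbps rate quoted earlier. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *parameter;
    int bits_per_frame;       /* total over both subframes */
} BitField;

/* Bit allocation per 10 ms frame, per ITU-T G.729 (assumed, see lead-in). */
static const BitField g729_bits[] = {
    { "Line spectrum pairs",       18 },
    { "Adaptive-codebook delay",   13 },  /* 8 + 5 over the two subframes */
    { "Pitch-delay parity",         1 },
    { "Fixed-codebook index",      26 },  /* 13 + 13 */
    { "Fixed-codebook sign",        8 },  /* 4 + 4   */
    { "Codebook gains (stage 1)",   6 },  /* 3 + 3   */
    { "Codebook gains (stage 2)",   8 },  /* 4 + 4   */
};

/* Total number of coded bits in one frame. */
int total_bits_per_frame(void)
{
    int total = 0;
    for (size_t i = 0; i < sizeof g729_bits / sizeof g729_bits[0]; i++)
        total += g729_bits[i].bits_per_frame;
    return total;
}

/* Bit rate in bits per second for a given frame length in milliseconds. */
int bit_rate_bps(int frame_ms)
{
    assert(frame_ms > 0);
    return total_bits_per_frame() * 1000 / frame_ms;
}
```

With 80 bits every 10 ms, the rate comes out at 8000 bits per second.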

  9. Why use DSP processors? • Speech coding algorithms involve long and complex calculations • Design periods are short • Both lead to using DSP processors instead of custom hardware design • Different DSP processors are available: floating-point and fixed-point processors usually trade off price against precision

  10. DSP processor

  11. Acquaintance with DSP C62xx processors • 150-250 MHz clock • 8 instructions per clock cycle • 1200-2000 million instructions per second (MIPS) • 8-, 16- and 32-bit data manipulation for better memory access • Low power • VLIW code structure, 6 ALUs, 2 multipliers • 40-bit arithmetic • Saturation and normalization blocks • On-chip RAM, etc.

  12. Steps to be taken for implementation • System simulation and analysis: MATLAB, C or C++ is usually used • Simulation of the implementation on the processor: different optimization choices can be made according to the processor being used • Optimization: will be discussed later • Conversion to processor assembly

  13. Optimization • Different issues should be considered • Processor-independent issues • Placing independent instructions between two dependent ones to exploit parallelism • Optimized usage of registers • Loop folding/unfolding: folding reduces code memory, unfolding reduces loop overhead • Using pointers instead of arrays
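
Several of these processor-independent ideas can be seen in one small example. The sketch below compares a plain dot product with a version that is unfolded by two, walks the data with pointers, and keeps two independent accumulators so that the two multiply-accumulates do not depend on each other. The function names are hypothetical, and whether this beats what the compiler already generates depends on the tool chain.

```c
#include <assert.h>
#include <stdint.h>

/* Baseline: array indexing, one multiply-accumulate per iteration. */
int32_t dot_indexed(const int16_t x[], const int16_t y[], int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)x[i] * y[i];
    return acc;
}

/* Unfolded by two with pointers; n must be even.
 * acc0 and acc1 are independent, so the second multiply-accumulate
 * can be scheduled without waiting for the first. */
int32_t dot_unrolled(const int16_t *x, const int16_t *y, int n)
{
    int32_t acc0 = 0, acc1 = 0;

    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
        acc0 += (int32_t)x[0] * y[0];
        acc1 += (int32_t)x[1] * y[1];
        x += 2;
        y += 2;
    }
    return acc0 + acc1;
}
```

The unfolded loop runs half as many iterations, which halves the loop overhead at the cost of a larger loop body.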

  14. Optimization • Processor-dependent issues • Substituting special instructions for hand-written routines: e.g. saturation has a dedicated instruction and doesn’t need to be rewritten • Reading 32-bit data instead of 16-bit data in 16-bit operations to reduce memory access time • Pipelining the algorithm: sometimes done by the design software, sometimes not; e.g. the inner loops are pipelined but not the outer ones • Using a hardware overflow bit, which is not supported in the C6000 series
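
The first two processor-dependent points can be illustrated in portable C. The sketch below shows a saturating 32-bit add written out by hand, which is the kind of routine a dedicated instruction replaces (on the C6000 tool chain the `_sadd` compiler intrinsic plays this role, as far as I know), and a dot product that fetches two 16-bit samples with each 32-bit read. The function names are hypothetical and the code is meant to run on a host, not to be tuned C62x code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable saturating 32-bit add: clamps instead of wrapping around. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Dot product that reads two 16-bit samples per 32-bit load; n must be even.
 * Which half is "low" depends on endianness, but both inputs are split
 * the same way, so the products pair up correctly either way. */
int32_t dot_packed(const int16_t *x, const int16_t *y, int n)
{
    int32_t acc = 0;

    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
        uint32_t xw, yw;
        memcpy(&xw, x + i, sizeof xw);    /* one 32-bit read = two samples */
        memcpy(&yw, y + i, sizeof yw);

        /* Narrowing casts recover the signed 16-bit halves
         * (two's complement behaviour assumed). */
        int16_t x_lo = (int16_t)(xw & 0xFFFFu), x_hi = (int16_t)(xw >> 16);
        int16_t y_lo = (int16_t)(yw & 0xFFFFu), y_hi = (int16_t)(yw >> 16);

        acc = sat_add32(acc, (int32_t)x_lo * y_lo);
        acc = sat_add32(acc, (int32_t)x_hi * y_hi);
    }
    return acc;
}
```

On the target, the two halves of a packed word would typically be multiplied directly with instructions that select the low or high 16 bits, so no explicit unpacking is needed.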

  15. Implementation of the algorithm • The multichannel system can run more than one algorithm at the same time • Any algorithm compliant with the eXpressDSP Algorithm Standard (xDAIS) is capable of multichannel processing • An xDAIS-compliant algorithm requires three functional modules: initialization, freeing and kernel • The kernel module performs the algorithm processing, while the initialization and freeing modules initialize and free the algorithm context data

  16. Framework initialization • Initialize the context data and store it at the desired memory location • The system repeatedly calls the algorithm or algorithms until all the frames have been processed
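
The split into initialization, kernel and freeing modules, with one context per channel, can be sketched as follows. This is a simplified, hypothetical interface for illustration only: it is not the real xDAIS function table, and every type and function name below is made up.

```c
#include <assert.h>
#include <stdlib.h>

#define FRAME_SAMPLES 80
#define MAX_CHANNELS   8

/* Hypothetical per-channel context: data kept from frame to frame. */
typedef struct {
    int   frames_done;
    short prev_sample;        /* stand-in for real filter memories */
} ChannelContext;

/* Hypothetical three-module interface (NOT the actual xDAIS table). */
typedef struct {
    ChannelContext *(*init)(void);
    void (*kernel)(ChannelContext *ctx, const short *in);
    void (*release)(ChannelContext *ctx);
} AlgModules;

static ChannelContext *demo_init(void)
{
    return calloc(1, sizeof(ChannelContext));
}

static void demo_kernel(ChannelContext *ctx, const short *in)
{
    ctx->prev_sample = in[FRAME_SAMPLES - 1];   /* remember state for next frame */
    ctx->frames_done++;
}

static void demo_free(ChannelContext *ctx)
{
    free(ctx);
}

const AlgModules demo_alg = { demo_init, demo_kernel, demo_free };

/* Framework loop: initialize every channel, call the kernel once per
 * channel per frame, then free. Returns the number of kernel calls,
 * or -1 on bad arguments or allocation failure. */
int run_channels(const AlgModules *alg, int nch, int nframes, const short *frame)
{
    ChannelContext *ctx[MAX_CHANNELS] = { 0 };
    int calls = 0;

    if (nch < 1 || nch > MAX_CHANNELS)
        return -1;

    for (int c = 0; c < nch; c++) {
        ctx[c] = alg->init();
        if (!ctx[c]) {
            while (c-- > 0)
                alg->release(ctx[c]);
            return -1;
        }
    }
    for (int f = 0; f < nframes; f++)
        for (int c = 0; c < nch; c++) {
            alg->kernel(ctx[c], frame);
            calls++;
        }
    for (int c = 0; c < nch; c++)
        alg->release(ctx[c]);
    return calls;
}
```

Because all state lives in the context rather than in globals, the same kernel code can serve any number of channels.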

  17. Algorithm implementation • The G.729 speech encoder is divided into five submodules • Submodule 1 contains pre-processing, the LP analysis and the LPC-to-LSP conversion, including Pre_Process(), Autocorr(), Lag_window(), Levinson() and Az_lsp() • Submodule 2 calls Qua_lsp() to conduct LSP quantization • Submodule 3 generates the interpolated LPC parameters, computes the weighted speech and finds the open-loop pitch, including Int_qlpc(), Int_lpc(), Weight_Az(), Residu(), Syn_filt() and Pitch_ol() • Submodules 1 through 3 are for frame processing and should be done once per frame

  18. Algorithm Implementation • Submodule 4 performs the closed-loop fractional pitch search and the adaptive codebook search, calling Pitch_fr3(), Enc_lag3(), Pred_lt_3(), Convolve() and G_pitch() • Submodule 5 performs the innovative codebook search and the filter memory update, calling ACELP_Codebook(), Corr_xy2(), Qua_gain() and Syn_filt() • Submodules 4 and 5 are for subframe processing and should be repeated twice per frame
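
The once-per-frame and twice-per-frame schedule of the five submodules can be sketched as a skeleton. The stubs below only count calls; they stand in for the real routines, whose signatures are not given in the slides, and all names here are hypothetical.

```c
#include <assert.h>

#define SUBFRAMES_PER_FRAME 2

typedef struct {
    int frame_calls;       /* calls made once per frame    */
    int subframe_calls;    /* calls made once per subframe */
} CallCounts;

/* Stubs standing in for the five submodules. */
static void submodule1_lp_analysis(CallCounts *c)      { c->frame_calls++; }
static void submodule2_lsp_quantization(CallCounts *c) { c->frame_calls++; }
static void submodule3_open_loop_pitch(CallCounts *c)  { c->frame_calls++; }
static void submodule4_adaptive_search(CallCounts *c)  { c->subframe_calls++; }
static void submodule5_fixed_search(CallCounts *c)     { c->subframe_calls++; }

/* Encode one 10 ms frame: frame-level work first, then both subframes. */
void encode_frame(CallCounts *c)
{
    assert(c != 0);
    submodule1_lp_analysis(c);
    submodule2_lsp_quantization(c);
    submodule3_open_loop_pitch(c);

    for (int sf = 0; sf < SUBFRAMES_PER_FRAME; sf++) {
        submodule4_adaptive_search(c);
        submodule5_fixed_search(c);
    }
}
```

One call to encode_frame() therefore makes three frame-level calls and four subframe-level calls.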

  19. Data memory requirements • The data memory is divided into three groups • Context data: the static variables and arrays whose values must be kept from one frame to the next • Tables: the constant tables are sorted into G729_TABLES; these tables contain the constants needed by the algorithm • Local variables and arrays: these are stored on the stack for use while the algorithm runs

  20. Summary & Conclusion • The G.729 standard is a popular choice for applications, such as VoIP, that require efficient use of bandwidth and good speech quality • The standard strikes a good balance between bit rate and frame size while producing acceptable speech quality • The TI DSP processors are capable of performing the algorithm • Although the implementation seems simple, there are various issues to be considered

  21. References • Redwan Salami, Claude Laflamme, Bruno Bessette and Jean-Pierre Adoul (University of Sherbrooke), "ITU-T G.729 Annex A: Reduced Complexity 8 kb/s CS-ACELP Codec for Digital Simultaneous Voice and Data", IEEE Communications Magazine, September 1997 • ITU-T Recommendation G.729, "Coding of Speech at 8 kbit/s Using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP)", 03/96 • Chiouguey Chen and Xiangdong Fu, "G.729/A Speech Coder: Multichannel TMS320C62x Implementation", TI Application Report SPRA564B, February 2000 • "TMS320C6211, TMS320C6211B Fixed-Point Digital Signal Processors", TI datasheet, August 1998, revised March 2004
