
MELP Vocoder

The MELP Vocoder is a speech coding algorithm that produces natural-sounding speech at very low bit rates. It uses mixed-excitation and adaptive spectral enhancement techniques to reduce mechanical or buzzy sounds and minimize tonal noise. This vocoder is robust in background noise environments and includes features such as mixed excitation, aperiodic pulses, pulse dispersion, and adaptive spectral enhancement.

gloriao

Presentation Transcript


  1. MELP Vocoder

  2. Outline • Introduction • MELP Vocoder Features • Algorithm Description • Parameters & Comparison

  3. Introduction • Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise → intelligible speech at very low bit rates • But this sometimes results in a mechanical or buzzy sound and is prone to tonal noise

  4. Introduction • These problems arise from: • Inability of a simple pulse train to reproduce all kinds of voiced speech • The MELP vocoder uses a mixed-excitation model that represents a richer ensemble of speech characteristics → produces more natural-sounding speech

  5. MELP vocoder • Robust in background noise environments • Based on the traditional LPC model, with additional features: • Mixed excitation • Aperiodic pulses • Pulse dispersion • Adaptive spectral enhancement

  6. MELP vocoder encoder (block diagram labels) • Input speech • Hamming windowing • LPC calculation • LPC bandwidth expansion • Prediction error filter • Prediction error • Pitch calculation • Final pitch calculation • Voicing strength and peakiness calculation • Fourier magnitude calculation • Fourier magnitude quantization • LPC to LSF conversion • LSF sorting • Minimum 50 Hz separation algorithm • LSF vector • MSVQ • Array of vectors • Quantized vector

  7. MELP vocoder • Positions of the analysis windows

  8. MELP vocoder: Fourier magnitude calculation • The pulse generation filter is responsible for producing the pulse train. This is done using an FFT over 200 samples of the signal and extracting the envelope of the impulse response.

  9. MELP vocoder: Computing the voicing strengths and setting the aperiodic flag • 1. First estimation stage (L = 40, 41, …, 160) • 2. Determine the voicing strength of the low band • 3. Determine the voicing strength of the other 4 bands

  10. MELP vocoder: Peakiness • Example values: P = 12.64, P = 6.77, P = 1.16, P = 1.1

  11. MELP vocoder: Peakiness

  12. MELP vocoder: Bit allocation table

  13. Mixed Excitation • Mixed excitation is implemented using a multi-band mixing model • This model can simulate frequency-dependent voicing strength • The excitation is a mixture of periodic or aperiodic pulses and white noise • The primary effect of this unit is to reduce the buzz in broadband acoustic noise
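The multi-band mixing on slide 13 can be illustrated with a minimal sketch. It assumes a five-band split at 0, 500, 1000, 2000, 3000 and 4000 Hz with 8 kHz sampling, and per-band weights of v for the pulse train and 1 - v for the noise; the band edges, the weights and all function names are assumptions for illustration and are not taken from the slides. A plain DFT is used for clarity, not speed.

```python
import cmath

# Assumed five-band split (Hz) at an assumed 8 kHz sampling rate.
BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Naive inverse DFT, returning only the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def band_index(freq_hz):
    """Index of the band containing freq_hz (the top edge maps to the last band)."""
    for b in range(len(BAND_EDGES_HZ) - 1):
        if freq_hz < BAND_EDGES_HZ[b + 1]:
            return b
    return len(BAND_EDGES_HZ) - 2


def mixed_excitation(pulses, noise, voicing, fs=8000):
    """Mix a pulse train and noise band by band.

    voicing[b] in [0, 1] is the voicing strength of band b: the pulse
    spectrum is weighted by v and the noise spectrum by (1 - v).
    """
    n = len(pulses)
    pulse_spec, noise_spec = dft(pulses), dft(noise)
    mixed = []
    for k in range(n):
        # Bins above Nyquist mirror the bins below it; using the mirrored
        # frequency keeps the spectrum conjugate-symmetric (real output).
        freq = min(k, n - k) * fs / n
        v = voicing[band_index(freq)]
        mixed.append(v * pulse_spec[k] + (1.0 - v) * noise_spec[k])
    return idft_real(mixed)
```

With all voicing strengths at 1 the output is the pulse train itself, and with all at 0 it is the noise; intermediate settings such as `[1, 1, 0.5, 0, 0]` give a pulse-like low band and a noisy high band.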

  14. Aperiodic pulses • When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses • Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal → produces erratic glottal pulses without tonal noise
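The difference between periodic and aperiodic pulse trains can be sketched as follows. This is an illustration only: the uniform jitter of up to ±25% of the pitch period, the `jitter` parameter and the function name `pulse_positions` are assumptions, not details given on the slide.

```python
import random


def pulse_positions(pitch_period, n_samples, aperiodic, jitter=0.25, rng=None):
    """Return sample indices of excitation pulses.

    In periodic mode the pulses are exactly pitch_period samples apart.
    In aperiodic mode each period is scaled by (1 + jitter * x), with x
    drawn uniformly from [-1, 1] (an assumed jitter model).
    """
    if rng is None:
        rng = random.Random()
    positions = []
    t = 0.0
    while t < n_samples:
        positions.append(int(t))
        period = float(pitch_period)
        if aperiodic:
            period *= 1.0 + jitter * rng.uniform(-1.0, 1.0)
        t += period
    return positions
```

For example, `pulse_positions(40, 180, aperiodic=False)` places pulses every 40 samples, while `aperiodic=True` spreads the spacing over roughly 30 to 50 samples, which breaks up the strict periodicity that causes tonal artifacts.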

  15. Pulse Dispersion • Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse • The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance → spreads the excitation energy within a pitch period → reduces the harsh quality of the synthetic speech

  16. Adaptive spectral enhancement filter • Based on the poles of the vocal tract filter • Used to enhance the formant structure in the synthetic speech • This filter improves the match between synthetic and natural bandpass waveforms → more natural speech output

  17. MELP Algorithm Description (Encoder) • 1. Filter out any low-frequency noise • 2. The filtered speech is filtered again to perform the initial pitch search for the pitch estimation • 3. Perform the bandpass voicing analysis: this step decides whether to use a periodic/aperiodic pulse train or a white noise model

  18. MELP Algorithm Description (Encoder) cont’d • In this stage a voicing strength parameter is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC bands • Let s_k(n) denote the speech signal in band k and u_k(n) the DC-removed, smoothed, rectified version of s_k(n). The correlation function R(P) is evaluated with: • P – the pitch of the current frame • N – the frame length • The voicing strength for band k is defined as max(R_sk(P), R_uk(P))
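The correlation formula itself did not survive in this transcript. A minimal sketch of the band voicing computation, assuming the common normalized form R_x(P) = Σ x(n)·x(n+P) / sqrt(Σ x(n)² · Σ x(n+P)²) with n = 0 … N-1, might look like this. The moving-average smoother, its length and all function names are hypothetical choices made for the example.

```python
import math


def normalized_correlation(x, lag, frame_len):
    """Normalized correlation of x with itself shifted by `lag` samples.

    Assumed form (not taken from the slides):
        R(P) = sum x(n) x(n+P) / sqrt(sum x(n)^2 * sum x(n+P)^2), n = 0..N-1
    """
    if len(x) < frame_len + lag:
        raise ValueError("need at least frame_len + lag samples")
    num = sum(x[n] * x[n + lag] for n in range(frame_len))
    energy_0 = sum(x[n] ** 2 for n in range(frame_len))
    energy_lag = sum(x[n + lag] ** 2 for n in range(frame_len))
    denom = math.sqrt(energy_0 * energy_lag)
    return num / denom if denom > 0.0 else 0.0


def smoothed_rectified(s, smooth_len=8):
    """Full-wave rectify, smooth with a moving average, then remove DC.

    The moving-average smoother and its length are assumptions.
    """
    rect = [abs(v) for v in s]
    smooth = []
    acc = 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= smooth_len:
            acc -= rect[i - smooth_len]
        smooth.append(acc / min(i + 1, smooth_len))
    mean = sum(smooth) / len(smooth)
    return [v - mean for v in smooth]


def band_voicing_strength(s_k, pitch, frame_len):
    """Voicing strength of one band: max(R_sk(P), R_uk(P))."""
    u_k = smoothed_rectified(s_k)
    return max(normalized_correlation(s_k, pitch, frame_len),
               normalized_correlation(u_k, pitch, frame_len))
```

A band signal that repeats exactly every P samples yields a strength close to 1, while an uncorrelated, noise-like band yields a value near 0. Taking the maximum of the two correlations lets the envelope u_k(n) indicate voicing even when the waveform s_k(n) itself correlates poorly at the pitch lag.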

  19. MELP Algorithm Description (Encoder) cont’d • The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n) • If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
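The peakiness formula is missing from this transcript. A minimal sketch, assuming the usual definition as the ratio of the RMS value to the mean absolute value of the residual, is given below; the function names are hypothetical and the threshold is left as a parameter because the slide does not state it.

```python
import math


def peakiness(residual):
    """Ratio of RMS value to mean absolute value of the LP residual.

    Assumed definition: sqrt(mean(e^2)) / mean(|e|). The value is about 1
    for a constant-magnitude signal and grows as the energy concentrates
    in a few isolated pulses.
    """
    n = len(residual)
    if n == 0:
        raise ValueError("residual must not be empty")
    rms = math.sqrt(sum(e * e for e in residual) / n)
    mean_abs = sum(abs(e) for e in residual) / n
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittered(residual, threshold):
    """Flag the frame as jittered when peakiness exceeds the threshold.

    The threshold is a caller-supplied placeholder, not a value from the slides.
    """
    return peakiness(residual) > threshold
```

A single unit pulse in a frame of 100 samples gives a peakiness of about 10, while a signal of constant magnitude gives 1, which matches the range of the example values shown on slide 10.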

  20. MELP Algorithm Description (Encoder) cont’d • 4. Apply LPC analysis • 5. Calculate the final pitch estimate • 6. Calculate the gain estimate • 7. Quantize the LPC coefficients, pitch, gain and bandpass voicing • Fourier magnitudes are determined and quantized • The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies

  21. MELP Encoder (block diagram labels) • Input signal • Pre-filter • Pitch search • Bandpass voicing decision • Gain calculator • LPC analysis filter • Final pitch and voicing decision • LSF quantization • Fourier magnitude calculation • Quantize gain, pitch, voicing, jitter • Apply forward error correction • Transmitted bitstream

  22. MELP Algorithm (Decoder) • Decoding the pitch • Applying gain attenuation • Linearly interpolating all of the synthesis parameters pitch-synchronously • Generating the mixed excitation

  23. MELP Algorithm (Decoder) cont’d • Applying an adaptive spectral enhancement filter • LPC synthesis and applying the gain factor • Dispersion filtering

  24. MELP Decoder (block diagram labels) • Received bitstream • Decode parameters • Pulse generator • Pulse position jitter • Pulse shaping filter • Noise generator • Noise shaping filter • Summation (+) • Adaptive spectral enhancement • LPC synthesis filter • Gain • Pulse dispersion filter • Synthesized speech

  25. Parameter Quantization

  26. Bit transmission order

  27. Comparison of the 2400 BPS MELP with other Standard Coders • Diagnostic Acceptability Measure • Two conditions: Quiet, Office • Continuously Variable Slope Delta Modulation (CVSD): 16,000 bps • Code Excited Linear Prediction (CELP): 4800 bps, FS1016 • Mixed Excitation Linear Prediction (MELP): 2400 bps, FIPS Publication 137 • Linear Predictive Coding (LPC): 2400 bps

  28. Comparison of the 2400 BPS MELP with other Standard Coders (cont’d) • Mean Opinion Score in six conditions: • Quiet: anechoic sound chamber, dynamic microphone • Quiet - H250: anechoic sound chamber, H250 microphone • 1% random bit errors: anechoic sound chamber, dynamic microphone • 0.5% random block errors: anechoic sound chamber, dynamic microphone, 50% errors within a 35 ms block • Office: modern office environment, dynamic microphone • Mobile Command Environment: field shelter, EV M87 microphone

  29. Comparison of the 2400 BPS MELP with other Standard Coders (cont’d) • Complexity with three Measurements • RAM • ROM • MIPS

  30. Voice samples LPC 10

  31. Voice samples Original Sound MELP 1800 MELP 2000 MELP 2200
