III Digital Audio III.9 (We Oct 29) Phase vocoder for tempo and pitch changes
Phase Vocoder This algorithm is designed to enable tempo and pitch changes of a digital sound file. Tempo change: same music, but played at a different tempo. Pitch change: same tempo, but transposed pitch. The algorithm was first described by James L. Flanagan and R. M. Golden in the paper “Phase Vocoder”, published in the Bell System Technical Journal, Vol. 45, No. 9, p. 1493, November 1966. (Photo: James L. Flanagan)
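Phase Vocoder The basic technique starts with so-called resampling: the same samples are played back with a new sample period Δ’ instead of the original Δ. (Figure: amplitude over time, sampled with period Δ and with period Δ’.) All frequencies are scaled by the same factor: frequencies’ = frequencies · Δ/Δ’. Problem: the sound changes dramatically! Why?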
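The resampling relation above can be checked numerically. The following is a minimal sketch, assuming a pure 440 Hz sine tone and an 8 kHz sample rate; both values and all function names are illustrative assumptions and do not come from the slides. It shows that playing the same samples with half the sample period doubles the frequency and halves the duration, so pitch and tempo cannot be changed independently by resampling alone.

```python
import math

def sample_sine(freq_hz, sample_period, n_samples):
    """Return n_samples values of sin(2*pi*f*t) taken at t = k * sample_period."""
    return [math.sin(2 * math.pi * freq_hz * k * sample_period)
            for k in range(n_samples)]

def resampled_frequency(freq_hz, delta, delta_new):
    """frequencies' = frequencies * delta / delta_new (relation from the slide)."""
    return freq_hz * (delta / delta_new)

def duration(n_samples, sample_period):
    """Playback duration in seconds."""
    return n_samples * sample_period

delta = 1 / 8000        # original sample period (assumed 8 kHz sample rate)
delta_new = delta / 2   # play the same samples twice as fast
n_samples = 8000

recorded = sample_sine(440.0, delta, n_samples)
new_freq = resampled_frequency(440.0, delta, delta_new)

# The recorded samples are identical to a tone of the new frequency
# sampled with the new period: same numbers, different pitch.
as_heard = sample_sine(new_freq, delta_new, n_samples)
max_diff = max(abs(a - b) for a, b in zip(recorded, as_heard))

old_duration = duration(n_samples, delta)
new_duration = duration(n_samples, delta_new)
```

Here the frequency ratio is Δ/Δ’ = 2 while the duration ratio is Δ’/Δ = 1/2, which is the coupling the following slides set out to break.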
Phase Vocoder The fundamental idea is this: To raise the pitch, construct a new sample with longer duration and the same pitch, then resample it back to the original duration, which gives a higher pitch. To lower the pitch, construct a new sample with shorter duration and the same pitch, then resample it back to the original duration, which gives a lower pitch. So basically we are dealing with the time change problem! The procedure is that we first cover the original signal by a sequence of sound frames of equal length, and in order to capture what neighboring frames have in common, we choose overlapping frames. Typically the overlap is 75%, and the frame duration is typically 1/20 sec (corresponding to a fundamental frequency of 20 Hz in the finite Fourier analysis). (Figure: amplitude over time, covered by overlapping frames.)
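The framing step can be sketched as follows. This is an illustration under stated assumptions: the 8 kHz sample rate (so that a 1/20 sec frame has 400 samples) and the function name split_into_frames are hypothetical and not taken from the slides.

```python
def split_into_frames(signal, frame_len, overlap=0.75):
    """Cover the signal with equal-length frames that overlap by the given fraction.

    Returns the list of frames and the hop (distance between frame starts).
    """
    hop = int(round(frame_len * (1.0 - overlap)))
    if hop < 1:
        raise ValueError("overlap too large for this frame length")
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [signal[s:s + frame_len] for s in starts], hop

# Assumed 8 kHz sample rate: a 1/20 sec frame then holds 400 samples.
signal = list(range(1000))          # placeholder signal, values equal to indices
frames, hop = split_into_frames(signal, frame_len=400)
```

With 75% overlap the hop is one quarter of the frame length, so every sample away from the edges is covered by four frames.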
Phase Vocoder The idea is to work on these frames, process them in frequency space, and then generate a synthesized sound by adding the new frames with a different overlap, thereby changing the tempo of the overall signal.
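To see how a different overlap changes the duration, here is a minimal overlap-add sketch. The frame counts, hop sizes, and the function name overlap_add are illustrative assumptions. The frames are added unchanged, without any frequency-space processing, so this only demonstrates the change in length, not a usable time stretch.

```python
def overlap_add(frames, hop):
    """Add equal-length frames into one signal, placing frame i at offset i * hop."""
    frame_len = len(frames[0])
    out = [0.0] * ((len(frames) - 1) * hop + frame_len)
    for i, frame in enumerate(frames):
        start = i * hop
        for m, x in enumerate(frame):
            out[start + m] += x
    return out

# Seven placeholder frames of 400 samples each (assumed values).
frames = [[1.0] * 400 for _ in range(7)]

same_tempo = overlap_add(frames, hop=100)   # hop used when cutting the frames
slower = overlap_add(frames, hop=200)       # larger hop: longer output
```

Doubling the hop from 100 to 200 samples lengthens the result from 1000 to 1600 samples for these seven frames; for long signals the ratio approaches the hop ratio of 2.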
Phase Vocoder A frame is generated from the original sample by multiplying it with a Hanning window function H(t). (Figure: three amplitude-over-time plots.)
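The windowing step might look like the sketch below. The slide does not give a formula for H(t); the code assumes the standard Hann (Hanning) definition in its periodic form, and the frame length of 400 samples and the function names are illustrative assumptions.

```python
import math

def hann(n):
    """Periodic Hann window: H[k] = 0.5 * (1 - cos(2*pi*k/n))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / n)) for k in range(n)]

def windowed_frame(signal, start, window):
    """Cut a frame out of the signal at 'start' and multiply it by the window."""
    return [signal[start + k] * w for k, w in enumerate(window)]

frame_len = 400
hop = frame_len // 4                # 75% overlap
window = hann(frame_len)

signal = [1.0] * 1000               # placeholder constant signal
frame = windowed_frame(signal, 100, window)

# With 75% overlap, the four windows covering any one sample add up to
# the same constant, so the overlapping frames weight the signal evenly.
overlap_sums = [sum(window[(k + j * hop) % frame_len] for j in range(4))
                for k in range(hop)]
```

The window is zero at the frame start and one at the frame center, which removes the hard edges a plain cut would introduce.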
Phase Vocoder First step of the algorithm: Analysis. The frame is then transformed to its frequency representation via FFT. The fundamental frequency is of course f = 1/frame duration = 1/D. The highest frequency is f_s = n·f, so that we have n frequency intervals, from 0 to f_s·(n-1)/n Hz. Attention: n has nothing to do with the original sample frequency of the signal! Since a frame of duration D holds 2n temporal samples, the sample rate is 2n/D = 2·f_s. A temporal delay of ¼ frame then has 2n/4 (temporal) samples; this number is called the analytical hop size hop_a. In other words, we have the equation hop_a/(2·f_s) = (2n/4)/(2·n·f) = D/4 = D_a = analytical hop time between successive frames. (Figure: a frame of duration D, amplitude over time.)
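The quantities of the analysis step can be worked through numerically. In this sketch the choice n = 1024 and all function names are assumptions made for illustration, and a naive DFT stands in for the FFT to keep the example dependency-free (it computes the same coefficients, only more slowly).

```python
import cmath
import math

def analysis_parameters(frame_duration, n):
    """Derive the analysis quantities from the frame duration D and bin count n."""
    f = 1.0 / frame_duration          # fundamental frequency f = 1/D
    f_s = n * f                       # highest frequency f_s = n*f
    samples_per_frame = 2 * n         # 2n temporal samples per frame
    sample_rate = 2 * f_s             # 2n samples in D seconds
    hop_a = samples_per_frame // 4    # quarter-frame delay in samples
    d_a = hop_a / sample_rate         # analytical hop time D_a = D/4
    return {"f": f, "f_s": f_s, "sample_rate": sample_rate,
            "hop_a": hop_a, "d_a": d_a}

def dft_magnitudes(frame, n):
    """Magnitudes of bins 0..n-1 (frequencies 0, f, ..., (n-1)*f), naive DFT."""
    size = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * m / size)
                    for m, x in enumerate(frame)))
            for k in range(n)]

params = analysis_parameters(frame_duration=1 / 20, n=1024)

# Tiny example: n = 8 bins, a frame of 2n = 16 samples holding 3 full cycles.
n_small = 8
frame = [math.sin(2 * math.pi * 3 * m / (2 * n_small)) for m in range(2 * n_small)]
mags = dft_magnitudes(frame, n_small)
peak_bin = max(range(n_small), key=lambda k: mags[k])
```

For D = 1/20 sec the hop time comes out as D/4 = 0.0125 sec regardless of n, and the tiny example puts its energy in bin 3, that is, at frequency 3·f.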
Phase Vocoder What is the problem now? Reproducing the frames at different distances causes phase problems. (Figure: frames spaced by the analytical hop time D_a versus the synthetic hop time D_s.)
Phase Vocoder Second step (processing): We look at the phase problem. Take a sinusoidal signal component sin(2πft) and compare frame i-1 with frame i: sin(2πf(t+D_a)) = sin(2πft+ΔΦ). ΔΦ can be calculated (we omit the details here), and ΔΦ = 2πf·D_a, so f = ΔΦ/(2π·D_a) = “true frequency”. Third step (synthesis): Replace D_a by the synthetic frame distance D_s and then set the new phase of frame i to ΔΦ_{s,i} = ΔΦ_{s,i-1} + 2πf·D_s, where the (i-1)th phase has been calculated by recursion. Correct the complex coefficients of the FFT accordingly. Transform back with the inverse FFT, multiply each frame by a Hamming window, and add them all up.
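The processing and synthesis steps can be illustrated for a single frequency bin. The slides leave out how ΔΦ is computed; this sketch assumes the commonly used approach of unwrapping the measured phase difference around the value expected for the bin's center frequency, which is not stated in the source. The 8 kHz sample rate, the 447 Hz test tone, the Hann analysis window, the choice to start the recursion from the analysis phase, and all function names are likewise illustrative assumptions. A single-bin naive DFT replaces the FFT, and the inverse transform and overlap-add are not shown.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * k / n)) for k in range(n)]

def bin_coefficient(frame, k):
    """Complex DFT coefficient of bin k (naive single-bin DFT)."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(frame))

def principal(angle):
    """Wrap an angle into the range [-pi, pi]."""
    return angle - 2.0 * math.pi * round(angle / (2.0 * math.pi))

def true_frequency(phase_prev, phase_cur, bin_freq, d_a):
    """f = delta_phi / (2*pi*D_a), with delta_phi unwrapped near the bin frequency."""
    expected = 2.0 * math.pi * bin_freq * d_a
    delta_phi = expected + principal(phase_cur - phase_prev - expected)
    return delta_phi / (2.0 * math.pi * d_a)

def rephase(coeff, new_phase):
    """Keep the magnitude of a coefficient but replace its phase."""
    return cmath.rect(abs(coeff), new_phase)

sample_rate = 8000.0                    # assumed
frame_len, hop_a, hop_s = 400, 100, 200
d_a, d_s = hop_a / sample_rate, hop_s / sample_rate
tone_hz = 447.0                         # lies between bin 22 and bin 23
k = 22
bin_freq = k * sample_rate / frame_len  # center frequency of bin 22

signal = [math.sin(2.0 * math.pi * tone_hz * m / sample_rate)
          for m in range(frame_len + hop_a)]
window = hann(frame_len)
frame_prev = [signal[m] * window[m] for m in range(frame_len)]          # frame i-1
frame_cur = [signal[hop_a + m] * window[m] for m in range(frame_len)]   # frame i

c_prev = bin_coefficient(frame_prev, k)
c_cur = bin_coefficient(frame_cur, k)
f_true = true_frequency(cmath.phase(c_prev), cmath.phase(c_cur), bin_freq, d_a)

# Synthesis recursion: advance the previous synthesis phase by 2*pi*f*D_s.
synth_phase_prev = cmath.phase(c_prev)  # assumed starting value of the recursion
synth_phase_cur = synth_phase_prev + 2.0 * math.pi * f_true * d_s
c_cur_synth = rephase(c_cur, synth_phase_cur)
```

The estimate f_true lands close to the 447 Hz of the test tone rather than on the bin center, which is the point of calling it the “true frequency”. The corrected coefficient keeps the original magnitude and only its phase is replaced, so the frames line up again when spaced D_s apart.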