ELEC484 Phase Vocoder
Kelley Fea
Overview
• Analysis Phase
• Synthesis Phase
• Transformation Phase
  • Time Stretching
  • Pitch Shifting
  • Robotization
  • Whisperization
• To Do
  • Denoising
  • Stable/Transient Components Separation
Analysis Phase
• Based on Bernardini’s document
• pv_analyze.m
• Inputs: inx, w, Ra
• Uses hanningz.m to create the window
• Multiplies each frame of the signal by the window
• Applies fftshift to the windowed frame, then takes the FFT
• Outputs: Mod_y, Ph_y (moduli and phase)
pv_analyze.m
function [Mod_y, Ph_y] = pv_analyze(inx, w, Ra)
% pv_analyze.m for ELEC484 Project Phase 1
% Analysis phase... based on Bernardini
% inx = original signal (column vector)
% w   = desired window size
% Ra  = analysis hop size

% Force a column vector and get the signal length
inx = inx(:);
xrow = size(inx, 1);

% Create Hanning window
% using the hanningz code found in Bernardini
win = hanningz(w);

% Figure out the number of windows required
num_win = ceil( (xrow - w + Ra) / Ra );

% Matrix for storing the spectrum of each frame
Y = zeros(w, num_win);

% Windowing of the signal happens here
count = 1;
for i = 0 : num_win - 1
    % Nominal frame end; shortened if the frame would run past
    % the end of the signal
    frame_end = w - 1;
    if ( count + frame_end > xrow )
        frame_end = xrow - count;
    end
    % determine where the end of the window is
    win_end = frame_end + 1;

    % Time slice: windowed segment, zero-padded to w samples
    ts = zeros(w, 1);
    ts(1 : win_end) = inx( count : count + frame_end ) .* win( 1 : win_end );

    % fftshift swaps the two halves of the frame so the window centre
    % sits at sample 1, then the FFT is taken
    Y(:, i+1) = fft( fftshift(ts) );

    % Increment count by hop size
    count = count + Ra;
end % End for loop

% Set output values for Moduli and Phase and return the matrices
Mod_y = abs(Y);
Ph_y = angle(Y);
end % End pv_analyze.m
Synthesis Phase
• Also based on Bernardini’s document
• pv_synthesize.m
• Inputs: Mod_y, Ph_y, w, Rs, Ra
• Uses hanningz.m to create the window
• Calculates the difference between the actual and target phases (delta phi)
• Recombines moduli and phase into an array of complex numbers
• Performs the IFFT and overlap-add
• Scales the sum by the overlap-add gain of the tapering window
• Normalizes the final result by its peak value
• Output: outx
pv_synthesize.m
function outx = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra )
% pv_synthesize.m for ELEC484 Project Phase 1

% Set number of bins and frames based on the size of the phase matrix
[ num_bins, num_frames ] = size(Ph_y);

% delta_phi and phase_inc hold one column per pair of adjacent frames
delta_phi = zeros( num_bins, num_frames-1 );
phase_inc = zeros( num_bins, num_frames-1 );

% Ph_x (synthesis phase) is the same size as Ph_y
Ph_x = zeros( num_bins, num_frames );

% Create tapering window
win = hanningz(w);

% Phase unwrapping to recover precise phase value of each bin
% omega is the nominal phase increment over Ra samples for each bin
omega = 2 * pi * Ra * (0 : num_bins - 1)' / num_bins;
for idx = 2 : num_frames
    ddx = idx - 1;
    % delta_phi is the difference between the actual and target phases
    % princarg is a separate function
    delta_phi(:,ddx) = princarg( Ph_y(:,idx) - Ph_y(:,ddx) - omega );
    % phase_inc = the phase increment per sample for each bin
    phase_inc(:,ddx) = ( omega + delta_phi(:,ddx) ) / Ra;
end % End for loop

% Recombining the moduli and phase...
% the initial phase is the same
Ph_x(:,1) = Ph_y(:,1);
for idx = 2 : num_frames
    ddx = idx - 1;
    Ph_x(:,idx) = Ph_x(:,ddx) + Rs * phase_inc(:,ddx);
end

% Recombine into array of complex numbers
Z = Mod_y .* exp( 1i * Ph_x );

% IFFT and overlap add
% Create X of specified size
X = zeros( ( num_frames * Rs ) + w, 1 );
count = 1;
for idx = 1 : num_frames
    endx = count + w - 1;
    real_ifft = fftshift( real( ifft( Z(:,idx) ) ) );
    X(count:endx) = X(count:endx) + real_ifft .* win;
    count = count + Rs;
end

% Overlap-add gain: sum of the squared tapering window divided by Rs
k = sum( win .* win ) / Rs;
X = X / k;

% Dividing by the peak absolute value keeps the output within [-1, 1]
outx = X / max(abs(X));
end % End pv_synthesize.m
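The phase-unwrapping step can be checked by hand on a single bin. The sketch below is illustrative only: the frequency, bin index, and hop sizes are assumed values, the two phases are idealized (a pure sinusoid with no window leakage), and wrap is a local stand-in for princarg.m.

```matlab
% Sketch only: frequency, bin index, and hop sizes are assumed values.
w  = 64;            % window size (number of bins)
Ra = 16;            % analysis hop size
Rs = 32;            % synthesis hop size (time stretch by 2)
k  = 10;            % bin under inspection (0-based index)
f  = 2*pi*10.3/w;   % true frequency in radians per sample, between bins 10 and 11

% Local stand-in for princarg.m
wrap = @(p) p - 2*pi*round(p/(2*pi));

% Idealized wrapped phases reported by two successive analysis frames
phi1 = wrap(f*Ra*1);
phi2 = wrap(f*Ra*2);

omega      = 2*pi*Ra*k/w;                 % nominal phase advance over Ra samples
delta_phi  = wrap(phi2 - phi1 - omega);   % deviation from the nominal advance
phase_inc  = (omega + delta_phi)/Ra;      % recovered frequency, radians per sample
synth_step = Rs*phase_inc;                % phase advance between synthesis frames
```

Under these assumptions phase_inc recovers f even though only wrapped phases were available, which is what lets the synthesis loop advance the phase by Rs samples instead of Ra.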
hanningz.m
• Used because hann() returns a symmetric window by default, which does not have the periodicity needed for overlap-add:
w = .5*(1 - cos(2*pi*(0:n-1)'/(n)));
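The difference in periodicity shows up when shifted copies of the window are added together. This is a small sketch with an assumed window length of 8 and a hop of half the window; the symmetric window is written out from its formula so that no toolbox is required.

```matlab
% Sketch only: n = 8 and hop = n/2 are assumed values for illustration.
n = 8;
w_periodic  = 0.5 * (1 - cos(2*pi*(0:n-1)'/n));       % hanningz form
w_symmetric = 0.5 * (1 - cos(2*pi*(0:n-1)'/(n-1)));   % symmetric Hann form

% Overlap two copies of each window, shifted by half the window length
hop = n/2;
ola_periodic  = w_periodic(1:hop)  + w_periodic(hop+1:n);
ola_symmetric = w_symmetric(1:hop) + w_symmetric(hop+1:n);

disp(ola_periodic');    % constant across the overlap
disp(ola_symmetric');   % varies across the overlap
```

The periodic form sums to a constant, so overlap-add does not introduce amplitude ripple; the symmetric form does not.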
princarg.m
• Returns the principal argument of the nominal initial phase of each frame (the phase wrapped into the range -pi to pi)
a = Phasein/(2*pi);
k = round(a);
Phase = Phasein - k*2*pi;
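A quick check of the wrapping on a few hand-picked values. The anonymous function princarg_sketch is a hypothetical stand-in that repeats the three lines above in one expression.

```matlab
% Sketch only: princarg_sketch mirrors the princarg.m arithmetic above.
princarg_sketch = @(p) p - 2*pi*round(p/(2*pi));

phase_in  = [0; 3*pi/2; -3*pi/2; 4*pi + 0.3];
phase_out = princarg_sketch(phase_in);

% Expected values: 0, -pi/2, pi/2, 0.3
disp([phase_in phase_out]);
```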
Time Stretching
• Modify hop size ratio between analysis (Ra) and synthesis (Rs)
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Time Stretching here
% Modify hop size ratio (hop_ratio = Rs / Ra)
hop_ratio = 2;
Rs = hop_ratio * Ra;
% Synthesis function
X2 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
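The effect of the hop ratio on the output length can be worked out from the frame count and buffer size used in pv_analyze.m and pv_synthesize.m. The signal length, window size, and hop size below are assumed values chosen only for the arithmetic.

```matlab
% Sketch only: N, w and Ra are assumed values.
N  = 44100;                 % input length in samples
w  = 1024;                  % window size
Ra = 256;                   % analysis hop size
hop_ratio = 2;
Rs = hop_ratio * Ra;        % synthesis hop size

num_win = ceil((N - w + Ra) / Ra);   % frame count, as in pv_analyze.m
out_len = num_win * Rs + w;          % output buffer length, as in pv_synthesize.m
stretch = out_len / N;               % close to hop_ratio

fprintf('%d frames, %d output samples, stretch %.3f\n', num_win, out_len, stretch);
```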
Pitch Shifting
• Attempted by multiplying the phase by a constant factor
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Pitch Shifting here
Ph_y = princarg(Ph_y*1.5);
% Synthesis function
X4 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
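A commonly used alternative to scaling the phase is to time-stretch by the desired ratio and then resample the result back to the original duration. The sketch below covers only the resampling step and is not the method in these slides: a plain sinusoid stands in for the time-stretched output, and the sample rate, frequency, and ratio are assumed values.

```matlab
% Sketch only: fs, f0 and ratio are assumed; x_stretched is a stand-in
% for the output of pv_synthesize called with Rs = ratio * Ra.
fs = 8000; f0 = 200; ratio = 1.5;
n = (0:fs-1)';
x_stretched = sin(2*pi*f0*n/fs);

% Read the stretched signal back at 'ratio' times the original rate
t_old = (1:numel(x_stretched))';
t_new = (1:ratio:numel(x_stretched))';
x_shifted = interp1(t_old, x_stretched, t_new, 'linear');

% Estimate the new frequency from the FFT peak (positive frequencies only)
N = numel(x_shifted);
spec = abs(fft(x_shifted));
[~, kmax] = max(spec(1:floor(N/2)));
f_est = (kmax - 1) * fs / N;     % close to ratio * f0
```

The resampled signal is shorter by the factor ratio and its pitch is raised by the same factor, so combining it with a time stretch of ratio leaves the duration roughly unchanged.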
Robotization
• Set phase (Ph_y) to zero
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Robotization here
Ph_y = zeros(size(Ph_y));
% Synthesis function
xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
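What zeroing the phase does to a single frame can be seen without the full vocoder. In this sketch the 16-sample test frame is an assumed input; the result is a real pulse that is symmetric about its first sample, and repeating such a pulse once per hop is what gives the effect its fixed pitch.

```matlab
% Sketch only: the test frame is an assumed input.
frame = sin(2*pi*3*(0:15)'/16 + 0.7);

% Keep the moduli, discard the phase, and transform back
robot = ifft(abs(fft(frame)));

imag_residue = max(abs(imag(robot)));      % zero up to rounding
tail = real(robot(2:end));
asymmetry = max(abs(tail - flipud(tail))); % zero up to rounding
```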
Whisperization
• Deliberately impose a random phase on the time-frequency representation
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Whisperization here
Ph_y = ( 2*pi * rand(size(Ph_y, 1), size(Ph_y, 2)) );
% Synthesis function
xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
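A single-frame sketch of the same idea, assuming a 16-sample test frame: the random phase replaces the analysed phase, while the modulus of every bin is left untouched before resynthesis.

```matlab
% Sketch only: the test frame is an assumed input.
frame = sin(2*pi*3*(0:15)'/16 + 0.7);
Mod = abs(fft(frame));

% Uniform random phase in [0, 2*pi) for every bin
Ph_rand = 2*pi * rand(size(Mod));
Z = Mod .* exp(1i * Ph_rand);

% The moduli are unchanged; only the phase information is destroyed
mod_error = max(abs(abs(Z) - Mod));
whisper_frame = real(ifft(Z));
```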
Denoising
• Emphasize specific areas of the spectrum
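One way this could be approached is a soft gain applied to the moduli, so that strong bins pass almost unchanged and weak bins are attenuated. This is a hypothetical sketch, not an implementation from this project: the gain rule, the threshold coef, and the example moduli are all assumptions.

```matlab
% Sketch only: the gain rule, coef, and the moduli are assumed values.
Mod  = [0.01; 0.02; 1.0; 0.015];   % one frame of moduli: one strong bin, three weak
coef = 0.05;                       % larger values suppress more

gain   = Mod ./ (Mod + coef);      % near 1 for strong bins, small for weak bins
Mod_dn = Mod .* gain;              % denoised moduli, to be passed to synthesis

disp([Mod gain Mod_dn]);
```

In the full chain, Mod_dn would take the place of Mod_y in the call to pv_synthesize, with Ph_y left unchanged.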
Stable Components Separation
• Calculate the instantaneous frequency by taking the derivative of the phase along the time axis.
• Check whether this frequency is within its “stable range”.
• Keep or discard the frequency bin for the reconstruction accordingly.
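The three steps above can be sketched on a small synthetic phase matrix. Everything here is an assumption made for illustration: the matrix sizes, the threshold, the drifting bin, and the local wrap function standing in for princarg.m.

```matlab
% Sketch only: sizes, threshold, and the synthetic phases are assumed.
w = 8; Ra = 2; T = 5;
omega = 2*pi*Ra*(0:w-1)'/w;            % nominal phase advance per hop
wrap  = @(p) p - 2*pi*round(p/(2*pi)); % stand-in for princarg.m

% Synthetic phases: every bin advances by omega each frame (stable),
% except bin 4, which drifts by an extra 1 radian per frame
Ph = zeros(w, T);
for t = 2:T
    Ph(:,t) = Ph(:,t-1) + omega;
end
Ph(4,:) = Ph(4,:) + 1.0*(0:T-1);

% Step 1: phase derivative along time, relative to the nominal advance
dev = abs(wrap(diff(Ph, 1, 2) - repmat(omega, 1, T-1)));

% Step 2: a bin is stable when the deviation stays inside the range
threshold = 0.2;
stable = dev < threshold;

% Step 3: mask the moduli (frames 2..T) to keep only stable bins
Mod = ones(w, T);
Mod_stable = Mod(:,2:end) .* stable;
```

Using ~stable as the mask instead would keep the transient part.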
Conclusion
• The remaining effects still need to be properly implemented:
  • Stable/Transient Components Separation
  • Denoising
Questions? Thank you!