Time-scale and pitch modification
Algorithms review
Alexey Lukin
The problem
• Goal: change the duration or tonality of a musical piece
• Naïve approach:
  • (analog) record on tape and change the playback speed
  • (digital) resample the waveform
Alas: pitch and duration change synchronously!
Audio example: Celine Dion, sped up by 20%
“Time-scale and pitch modification algorithms”
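A quick way to see the coupling is to resample a test tone. The sketch below speeds a tone up by 20% with linear-interpolation resampling; the helper names (`resample_linear`, `estimate_frequency`), the 8 kHz sample rate and the 440 Hz tone are illustrative assumptions, not taken from the slides.

```python
import math

def resample_linear(x, speed):
    """Read x with a step of `speed` input samples per output sample."""
    n_out = int((len(x) - 1) / speed) + 1
    out = []
    for i in range(n_out):
        pos = i * speed
        j = int(pos)
        frac = pos - j
        nxt = x[min(j + 1, len(x) - 1)]
        out.append((1.0 - frac) * x[j] + frac * nxt)
    return out

def estimate_frequency(x, sample_rate):
    """Rough estimate: rising zero crossings per second."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if a < 0.0 <= b)
    return crossings * sample_rate / len(x)

SAMPLE_RATE = 8000  # assumed for the demo
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
fast = resample_linear(tone, 1.2)  # "speed up by 20%"

print(len(fast) / len(tone))                  # duration ratio
print(estimate_frequency(tone, SAMPLE_RATE))  # pitch before
print(estimate_frequency(fast, SAMPLE_RATE))  # pitch after
```

The output is about one sixth shorter and its pitch rises by the same factor of 1.2 (from 440 Hz to roughly 528 Hz), which is exactly the coupling the slide complains about.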
The problem
• Goal: independent control of time-scale and pitch; timbre should stay natural!
• Applications:
  • Samplers and virtual instruments
  • Production: synchronization of audio and video
  • Post-production: pull-up, pull-down
  • Entertainment: karaoke (changing key)
  • Education: sonic microscope
  • More?
Time domain
• Time-domain algorithms operate on the waveform, not the spectrum
• Break the signal into short granules
• Repeat or discard (or shift) some granules to change duration
• Resample to change pitch
Some pictures in this presentation are taken from the Ph.D. thesis of J. Bonada
Time domain
• Problems:
  • Granules can add in phase (good) or out of phase (bad)
  • Transients are duplicated or discarded
Audio example: guitar + castanets, slowed down to 220% length
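The out-of-phase problem is easy to reproduce. The following is a minimal sketch of fixed-granule overlap-add, assuming Hann-windowed granules with 50% overlap; the function names, the granule size of 256 samples and the 440 Hz test tone are illustrative choices, not details from the slides.

```python
import math

def hann(size):
    """Periodic Hann window; copies shifted by size/2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]

def ola_stretch(x, ratio, granule=256):
    """Fixed-granule overlap-add: read a granule every hop_out/ratio samples,
    write one every hop_out samples (ratio > 1 lengthens the signal)."""
    hop_out = granule // 2
    hop_in = hop_out / ratio
    win = hann(granule)
    n_granules = int((len(x) - granule) / hop_in) + 1
    out = [0.0] * ((n_granules - 1) * hop_out + granule)
    for g in range(n_granules):
        src = int(g * hop_in)   # where the granule is read
        dst = g * hop_out       # where it is pasted
        for i in range(granule):
            out[dst + i] += win[i] * x[src + i]
    return out

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

SAMPLE_RATE = 8000  # assumed for the demo
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
stretched = ola_stretch(tone, 2.2)   # 220% length, as in the audio example
interior = stretched[256:-256]       # skip the fade-in and fade-out
print(len(stretched) / len(tone), rms(tone), rms(interior))
```

Because neighbouring granules are pasted with a shift that is not a whole number of pitch periods, they partially cancel where they overlap. The stretched tone therefore has a lower RMS level than the original, and the loss varies over time, which is heard as amplitude modulation.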
Time domain
• Solutions:
  • Ensure that pasted granules are in phase by selecting the granule size to be a multiple of the pitch period (requires autocorrelation or pitch analysis)
  • Prohibit duplicating and skipping of transient granules (requires detection of transients and advanced scheduling of granule duplication)
Audio examples: fixed granule size; pitch-synchronous granule size (“PSOLA”); pitch-synchronous granule size with transient detection
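The first solution can be sketched as follows. The pitch period is estimated by normalized autocorrelation, and each granule is then pasted with a shift that is a whole number of periods. This is one simple variant under stated assumptions (a stationary, strictly periodic test tone, an integer period in samples, hypothetical function names), not the scheme of any particular product.

```python
import math

def pitch_period(x, min_lag, max_lag):
    """Shortest lag whose normalized autocorrelation is (nearly) maximal."""
    scores = []
    for lag in range(min_lag, max_lag + 1):
        a, b = x[:len(x) - lag], x[lag:]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b)) or 1.0
        scores.append((lag, num / den))
    best = max(score for _, score in scores)
    # Prefer the shortest lag among near-ties so 2x or 3x the period is not picked.
    return next(lag for lag, score in scores if score >= best - 1e-3)

def stretch_pitch_sync(x, ratio, period, periods_per_granule=12):
    """Overlap-add where every granule shift is a multiple of `period`."""
    granule = periods_per_granule * period   # even count keeps hop_out an integer
    hop_out = granule // 2
    hop_in = hop_out / ratio
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / granule) for i in range(granule)]
    n_granules = int((len(x) - granule - period) / hop_in) + 1
    out = [0.0] * ((n_granules - 1) * hop_out + granule)
    for g in range(n_granules):
        dst = g * hop_out
        nominal = g * hop_in
        # Snap the shift (dst - src) to a whole number of pitch periods.
        src = dst - period * round((dst - nominal) / period)
        while src < 0:
            src += period
        for i in range(granule):
            out[dst + i] += win[i] * x[src + i]
    return out

SAMPLE_RATE = 8000  # assumed; a 400 Hz tone then has a period of exactly 20 samples
tone = [math.sin(2 * math.pi * 400 * n / SAMPLE_RATE) for n in range(4000)]
period = pitch_period(tone[:800], 10, 60)
aligned = stretch_pitch_sync(tone, 2.2, period)
print(period, len(aligned) / len(tone))
```

For this idealized tone the overlapping granules coincide exactly, so the stretched output is the same sinusoid without amplitude modulation. Real signals have a drifting pitch, so the period would have to be re-estimated along the signal.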
Time domain
• Pitch-synchronous overlap-add (PSOLA)
  • Granules are 2 pitch periods long
  • Granules are repeated or discarded
  • Requires pitch detection → unstable results for non-pitched or polyphonic material
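A toy version of PSOLA time stretching, assuming the pitch period is constant and already known (in practice it comes from a pitch detector and varies over time). Granules are two periods long, centred on pitch marks, and each output slot reuses the input granule nearest in time. The function name and the returned `picked` list are illustrative.

```python
import math

def psola_stretch(x, period, ratio):
    """Constant-period PSOLA sketch. Returns the output and, for inspection,
    the index of the input granule used for each output granule."""
    size = 2 * period                                    # two pitch periods
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]
    marks = list(range(period, len(x) - period, period)) # pitch marks (assumed known)
    n_out = int(len(marks) * ratio)
    out = [0.0] * ((n_out + 1) * period)
    picked = []
    for k in range(n_out):
        # ratio > 1 repeats some granules, ratio < 1 skips some.
        m = min(int(k / ratio), len(marks) - 1)
        picked.append(m)
        start_in = marks[m] - period
        start_out = k * period                           # output marks one period apart
        for i in range(size):
            out[start_out + i] += win[i] * x[start_in + i]
    return out, picked

SAMPLE_RATE = 8000  # assumed; 400 Hz gives a period of exactly 20 samples
tone = [math.sin(2 * math.pi * 400 * n / SAMPLE_RATE) for n in range(2000)]
out, picked = psola_stretch(tone, period=20, ratio=2.2)
print(len(out) / len(tone), picked[:8])
```

Printing `picked` shows the same granule index appearing two or three times in a row, which is the "repeated" case on the slide. With `ratio` below 1 some indices are missing instead.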
Time domain
• Summary
  + Very fast (1 to 5% CPU)
  + Good quality for pitched signals (solo instruments, vocals)
  – Poor quality for non-pitched and polyphonic material:
    • Amplitude modulation (out-of-phase overlapping of granules for some parts/instruments)
    • Repeated or discarded transients (unless special care is taken)
• Implementations
  • Editors, samplers: Audition, Cubase, Logic, Ableton, ACID
  • Vocal correctors: Melodyne, Autotune
Vocoders
• Frequency-domain algorithms operate on a short-time spectrum of the signal
• Idea: build a spectrogram of the signal (using a short-time Fourier transform) and re-synthesize a signal from the spectrogram with a different time stride (hop)
• Problem: during synthesis, signal granules can overlap out of phase
• Solution: phase modification at each frequency channel, called phase unwrapping
Vocoders
• Traditional vocoder algorithm:
  1. Calculate the short-time Fourier transform (STFT) of the signal
  2. Unwrap the phases of each frequency channel (to compensate for the change of synthesis stride at step 3); don’t modify magnitudes
  3. Synthesize a signal using the inverse STFT with a different time stride
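The three steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production implementation: a small recursive FFT stands in for a real FFT library, the window is a periodic Hann of 256 samples, and `hop_a` / `hop_s` (analysis and synthesis strides) are names chosen here.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    return [v.conjugate() / len(a) for v in fft([u.conjugate() for u in a])]

def princarg(phase):
    """Wrap a phase to [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def phase_vocoder(x, hop_a, hop_s, n_fft=256):
    """Time-stretch x by hop_s / hop_a; magnitudes are left untouched."""
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / n_fft) for i in range(n_fft)]
    n_frames = (len(x) - n_fft) // hop_a + 1
    out = [0.0] * ((n_frames - 1) * hop_s + n_fft)
    norm = [0.0] * len(out)
    half = n_fft // 2
    prev_phase = [0.0] * (half + 1)
    synth_phase = [0.0] * (half + 1)
    for u in range(n_frames):
        frame = [win[i] * x[u * hop_a + i] for i in range(n_fft)]
        spec = fft(frame)                              # step 1: one STFT frame
        new_spec = [0j] * n_fft
        for k in range(half + 1):
            mag, phase = abs(spec[k]), cmath.phase(spec[k])
            omega = 2 * math.pi * k / n_fft            # bin centre, rad/sample
            if u == 0:
                synth_phase[k] = phase
            else:                                      # step 2: unwrap the phase
                deviation = princarg(phase - prev_phase[k] - hop_a * omega)
                synth_phase[k] += hop_s * (omega + deviation / hop_a)
            prev_phase[k] = phase
            new_spec[k] = cmath.rect(mag, synth_phase[k])
            if 0 < k < half:                           # mirror so the output is real
                new_spec[n_fft - k] = new_spec[k].conjugate()
        grain = ifft(new_spec)                         # step 3: resynthesis at hop_s
        for i in range(n_fft):
            out[u * hop_s + i] += win[i] * grain[i].real
            norm[u * hop_s + i] += win[i] * win[i]
    # Divide by the summed squared window; mute edge samples with almost no coverage.
    return [o / w if w > 1e-3 else 0.0 for o, w in zip(out, norm)]

SAMPLE_RATE = 8000  # assumed for the demo
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(2048)]
slow = phase_vocoder(tone, hop_a=32, hop_s=64)   # roughly twice as long
print(len(tone), len(slow))
```

For a single steady sinusoid the output is close to the same sinusoid at the same pitch, only longer. The artifacts discussed on the following slides appear with richer material, where many bins and transients interact.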
Vocoders
• Magnitudes do not change
• Phase unwrapping equations should provide in-phase overlapping of shifted granules at each frequency channel (“horizontal phase coherence”)
Equations: (phase increment), (phase unwrapping), (synthesis phase)
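The three steps named on this slide (phase increment, phase unwrapping, synthesis phase) are commonly written as below in the phase-vocoder literature. This is the standard textbook form with assumed notation, not necessarily the notation of the original slide: $X$ and $Y$ are the analysis and synthesis STFTs, $t_a^u$ and $t_s^u$ the analysis and synthesis times of frame $u$, $R_a$ and $R_s$ the analysis and synthesis strides, $\Omega_k$ the centre frequency of channel $k$, and $\operatorname{princarg}$ maps a phase to $[-\pi, \pi)$.

```latex
\begin{aligned}
\Delta\varphi_k^u &= \angle X(t_a^u,\Omega_k) - \angle X(t_a^{u-1},\Omega_k) - R_a\,\Omega_k
  && \text{(phase increment)} \\
\hat{\omega}_k^u &= \Omega_k + \frac{1}{R_a}\,\operatorname{princarg}\!\left(\Delta\varphi_k^u\right)
  && \text{(phase unwrapping)} \\
\angle Y(t_s^u,\Omega_k) &= \angle Y(t_s^{u-1},\Omega_k) + R_s\,\hat{\omega}_k^u
  && \text{(synthesis phase)}
\end{aligned}
```

The last line is what gives horizontal coherence: each channel's phase advances by exactly the amount its estimated frequency $\hat{\omega}_k^u$ would accumulate over the new stride $R_s$.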
Vocoders
• Phase coherence problem
  • Horizontal phase coherence is ensured by phase unwrapping
  • How about vertical phase coherence (coherence of phases between different frequency bins)? It is lost (except for integer stretching ratios)!
• This leads to:
  • “Phasiness” due to out-of-phase signals in the frequency bins within every signal harmonic
  • Transients are time-smeared along the whole granule
Audio example: guitar + castanets, vocoder, 220% length
Vocoders
• Vertical phase coherence improvement: the “phase locking” algorithm locks phases within each spectrum peak
  • Divide the frequency spectrum into intervals of harmonics
  • Unwrap the phase of the central (peak) frequency channel
  • Modify the phases of the other bins according to the phase of the central channel
• This reduces phasiness, but still doesn’t help transients
Audio examples: no phase locking; phase locking
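One common way to realize these three steps is sketched below: peaks are local maxima of the magnitude spectrum, each bin belongs to its nearest peak, and every bin in a region receives the same phase rotation as its peak, so the phase differences inside a harmonic are preserved. The function names and the nearest-peak region rule are assumptions made for this illustration.

```python
def find_peaks(mags):
    """Bins strictly larger than their two neighbours on each side."""
    return [k for k in range(2, len(mags) - 2)
            if all(mags[k] > mags[j] for j in (k - 2, k - 1, k + 1, k + 2))]

def lock_phases(analysis_phase, peak_synth_phase):
    """analysis_phase: phase of every bin in the current analysis frame.
    peak_synth_phase: {peak bin: its unwrapped synthesis phase}.
    Returns synthesis phases for all bins."""
    peaks = sorted(peak_synth_phase)
    if not peaks:
        return list(analysis_phase)
    locked = []
    for k, phase in enumerate(analysis_phase):
        p = min(peaks, key=lambda q: abs(q - k))        # nearest peak owns bin k
        rotation = peak_synth_phase[p] - analysis_phase[p]
        locked.append(phase + rotation)                 # same rotation as the peak
    return locked

# Toy spectrum with two harmonics (made-up numbers).
mags = [0.0, 0.1, 0.5, 1.0, 0.5, 0.1, 0.0, 0.1, 0.4, 0.8, 0.4, 0.1, 0.0]
analysis_phase = [0.1 * k for k in range(len(mags))]
peaks = find_peaks(mags)
locked = lock_phases(analysis_phase, {peaks[0]: 2.0, peaks[1]: -1.0})
print(peaks, locked)
```

Only the peak channels need the unwrapping step; the neighbouring bins simply follow, which is why the method reduces phasiness within each harmonic.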
Vocoders
• How to improve the sharpness of transients?
  • The frequency resolution of human hearing is not uniform: it is better at low frequencies and worse at high frequencies
  • So, we can use longer STFT windows for bass (to get better frequency resolution) and shorter windows for treble
Audio examples: just phase locking; phase locking and multiple window sizes
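The trade-off behind this idea is simple arithmetic: a window of N samples resolves frequencies about sample_rate / N apart but spans N / sample_rate seconds, which bounds how far a transient can be smeared. The band names and window sizes below are hypothetical values picked for the illustration.

```python
def resolution(window_size, sample_rate):
    """Return (frequency resolution in Hz per bin, window duration in ms)."""
    return sample_rate / window_size, 1000.0 * window_size / sample_rate

SAMPLE_RATE = 44100
BANDS = [("bass", 4096), ("mid", 1024), ("treble", 256)]  # hypothetical split

for name, size in BANDS:
    hz, ms = resolution(size, SAMPLE_RATE)
    print(f"{name:6s} window={size:5d}  {hz:7.2f} Hz/bin  {ms:6.2f} ms")
```

With these numbers the bass band resolves about 11 Hz but spans roughly 93 ms, while the treble band resolves only about 172 Hz but spans under 6 ms.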
Vocoders
• How to improve the sharpness of transients?
  • We can paste transients directly into the output without stretching (and without phase modification)
  • Phase unwrapping of steady harmonics continues through the transients
Audio example: phase locking and multiple window sizes + transients pasted
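Pasting transients first requires finding them. The slides do not say which detector is used, so the sketch below assumes a very simple one: a frame is flagged when its energy exceeds a multiple of the previous frame's energy. The function names, frame size, threshold and synthetic test signal are all illustrative.

```python
import math

def frame_energies(x, frame):
    return [sum(v * v for v in x[i:i + frame])
            for i in range(0, len(x) - frame + 1, frame)]

def detect_transients(x, frame=64, ratio=4.0, floor=1e-6):
    """Start indices of frames whose energy jumps by more than `ratio`."""
    e = frame_energies(x, frame)
    return [k * frame for k in range(1, len(e))
            if e[k] > ratio * (e[k - 1] + floor)]

SAMPLE_RATE = 8000  # assumed for the demo
# Quiet steady tone with a decaying click starting at sample 1024.
signal = [0.05 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(4096)]
for m in range(128):
    signal[1024 + m] += math.exp(-m / 16) * (1 if m % 2 == 0 else -1)

onsets = detect_transients(signal)
print(onsets)
```

A stretcher could then copy the flagged frames to the output unchanged and apply the vocoder only to the steady material in between.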
Vocoders
• Summary
  + Good quality for complex, polyphonic signals
  – Some phasiness (even with phase locking)
  – Smearing of transients (unless special care is taken)
  – Noise sometimes sounds unnatural
  – CPU-intensive (but still faster than realtime)
• Implementations
  • Specialized software: SlowGold, Serato Time’n’Pitch, iZotope Radius