Structure Discovery of Pop Music Using HHMM
E6820 Project, Jessie Hsu, 03/09/05
Problem Description
• Given
  • WAV signal of a pop song
• Discover the structure of the song
  • Intro
  • Verse
  • Chorus
  • Bridge
  • Outro
HMM Framework
• Model the music signal as a series of hidden state transitions
[Diagram: a chain of hidden states, each emitting one observation]
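As a point of reference (not part of the original slides), a flat single-level HMM over the beat-frame features can be fit and decoded in a few lines. The sketch below assumes hmmlearn is available and that the per-beat features are already stacked into an array X; the state count of 5 is only an illustrative guess.

```python
# Minimal flat-HMM baseline sketch (not the HHMM itself), assuming per-beat
# feature vectors are already stacked into a (n_frames, n_features) array X.
import numpy as np
from hmmlearn import hmm

def fit_flat_hmm(X: np.ndarray, n_states: int = 5, seed: int = 0):
    """Fit a single-level Gaussian HMM and decode one hidden state per frame."""
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag",
                            n_iter=100,
                            random_state=seed)
    model.fit(X)               # EM (Baum-Welch) on the frame features
    states = model.predict(X)  # Viterbi decoding of the state sequence
    return model, states
```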
HMM Framework: Hierarchical HMM
• Each observation is an audio frame of one beat length
[Diagram: structure-level hidden states (Intro, Verse, …, Outro) above frame-level hidden states, which emit the beat-level observations]
Representing an HHMM
• HHMM parameters
  • Prior of each state at the structure level and the frame level: π
  • State transition probabilities at the structure level and the frame level: α
  • Emission parameters for each state at both levels
    • Each state is modeled as a mixture of Gaussians
    • Mean μ and covariance matrix Σ of each Gaussian
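One possible in-memory layout for these parameters is sketched below; all field names and array shapes are illustrative assumptions, not taken from the project code.

```python
# Sketch of a container for the HHMM parameters listed above.
# S = number of structure-level states, Q = frame-level states per structure
# state, M = Gaussians per mixture, D = feature dimension.
from dataclasses import dataclass
import numpy as np

@dataclass
class HHMMParams:
    # Structure level (e.g. intro / verse / chorus / bridge / outro)
    pi_structure: np.ndarray   # (S,)    prior over structure states
    A_structure: np.ndarray    # (S, S)  structure-level transition matrix
    # Frame level: each structure state owns its own sub-HMM
    pi_frame: np.ndarray       # (S, Q)     per-structure prior over frame states
    A_frame: np.ndarray        # (S, Q, Q)  per-structure frame transitions
    # Emissions: mixture of Gaussians per frame-level state
    mix_weights: np.ndarray    # (S, Q, M)     mixture weights
    means: np.ndarray          # (S, Q, M, D)  Gaussian means
    covs: np.ndarray           # (S, Q, M, D)  diagonal covariances
```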
Training an HHMM
• EM for the HHMM
  • Search for the maximum-likelihood state sequence and model parameters
  • E-step: state inference / best state sequence
    • Forward-backward algorithm
    • Viterbi algorithm
  • M-step: parameter re-estimation
    • Priors at both levels: π
    • State transition probabilities: α
    • Emission parameters: Gaussian mixture means μ and covariance matrices Σ
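For the decoding step, a generic log-domain Viterbi pass over the frame-level states can be written as below. This is a flat-HMM sketch; the hierarchical forward-backward recursions needed for the full HHMM E-step are omitted. log_emit[t, q] is assumed to already hold log p(x_t | frame state q).

```python
import numpy as np

def viterbi(log_pi: np.ndarray, log_A: np.ndarray, log_emit: np.ndarray) -> np.ndarray:
    """Most likely state path given log-priors, log-transitions, log-emissions."""
    T, Q = log_emit.shape
    delta = np.full((T, Q), -np.inf)   # best log-prob of a path ending in state q at time t
    psi = np.zeros((T, Q), dtype=int)  # backpointers
    delta[0] = log_pi + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (prev state, next state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[t]
    # Backtrack the most likely state sequence
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path
```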
Preprocessing
• Beat detection
  • Segment the music into beat-length frames
• Feature extraction
  • Repetition-related feature (chorus vs. non-chorus): chroma vector
  • Intensity-related feature (vocal vs. non-vocal): subband-based log-frequency power coefficients
  • Pitch-related features: narrowband spectrogram features (Hann-windowed FFT coefficients)
  • Possibly more, still under investigation
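A preprocessing sketch of the first two steps is shown below, using librosa as an assumed toolkit (the slides do not name one): beat tracking followed by beat-synchronous chroma. The other features would be aggregated over the same beat frames.

```python
# Beat detection + beat-synchronous chroma vectors (one 12-d vector per beat).
import numpy as np
import librosa

def beat_synchronous_chroma(wav_path: str):
    y, sr = librosa.load(wav_path, sr=22050)
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)   # (12, n_stft_frames)
    # Aggregate STFT-rate chroma into one vector per beat-length frame
    beat_chroma = librosa.util.sync(chroma, beat_frames, aggregate=np.median)
    beat_times = librosa.frames_to_time(beat_frames, sr=sr)
    return beat_chroma.T, beat_times
```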
Tasks
• HHMM on a test song
  • Songs with an I-V1-C1-V2-C2-(V3-C3)-B-O structure
  • Manually label structures as ground truth
  • Predefine the number of states at both the structure and frame levels
  • Preprocessing
  • Model fitting
• Evaluation
  • Accuracy of structure identification
  • Accuracy of structure timing
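The two evaluation measures could be computed as in the sketch below, under the assumption (not stated on the slide) that both the prediction and the manual ground truth are sequences of one section label per beat frame.

```python
import numpy as np

def label_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    """Fraction of beat frames whose predicted section label matches ground truth."""
    return float(np.mean(pred == truth))

def boundary_times(labels: np.ndarray, frame_times: np.ndarray) -> np.ndarray:
    """Times (in seconds) at which the section label changes."""
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    return frame_times[change]

def mean_boundary_error(pred, truth, frame_times) -> float:
    """Mean absolute gap between each true boundary and its nearest predicted one."""
    pb, tb = boundary_times(pred, frame_times), boundary_times(truth, frame_times)
    if len(pb) == 0 or len(tb) == 0:
        return float("nan")
    return float(np.mean([np.min(np.abs(pb - t)) for t in tb]))
```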
References
• Y. Wang, M.-Y. Kan, T. L. Nwe, A. Shenoy, J. Yin, "LyricAlly: Automatic Synchronization of Acoustic Musical Signals and Textual Lyrics", ACM Multimedia 2004
• C. Raphael, "A Hybrid Graphical Model for Aligning Polyphonic Audio with Musical Scores", ISMIR 2004
• C. Raphael, "Automatic Segmentation of Acoustic Musical Signals Using Hidden Markov Models", IEEE Trans. on PAMI, 1999
• P. J. Walmsley, S. J. Godsill, P. J. W. Rayner, "Polyphonic Pitch Tracking Using Joint Bayesian Estimation of Multiple Frame Parameters", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1999
• L. Xie, S.-F. Chang, A. Divakaran, H. Sun, "Learning Hierarchical Hidden Markov Models for Video Structure Discovery", Tech. Report, Columbia Univ., 2002
• L. Xie, S.-F. Chang, A. Divakaran, H. Sun, "Unsupervised Mining of Statistical Temporal Structures in Video", Video Mining, Ch. 10, Kluwer Academic Publishers, 2003
• R. J. Turetsky, D. P. W. Ellis, "Ground-Truth Transcriptions of Real Music from Force-Aligned MIDI Synthesis", ISMIR 2003