A Probabilistic Model for Melody Segmentation By Miguel Ferrand, Peter Nelson, and Geraint Wiggins
Outline • Overview of the model • N-gram models and entropy • A case study • Comparison with an experiment on real listeners • Discussion
Overview • A probabilistic approach to predicting segmentation boundaries in melodies • The model uses no knowledge of music theory; it is a purely mathematical method • Entropy is used as a measure of the unpredictability of musical features • Hypothesis: segmentation boundaries appear where the entropy changes
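N-gram Models (1) • N-gram grammar (a Markov model of order n-1): the probability of a symbol occurring depends on the n-1 symbols that precede it. • The probability of a sequence $s = w_1 \dots w_l$ of length $l$ is $P(s) = \prod_{i=1}^{l} P(w_i \mid w_{i-n+1}^{i-1})$, where $w_i^j$ denotes $w_i \dots w_j$ and $n$ is the order of the n-gram.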
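The product of conditional probabilities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probabilities are plain maximum-likelihood estimates from raw counts, and the function names (`ngram_counts`, `mle_prob`, `sequence_prob`) and the toy training sequence are hypothetical, not taken from the slides.

```python
from collections import Counter

def ngram_counts(seq, n):
    """Count all n-grams (as tuples) in a symbol sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

def mle_prob(train, context, symbol):
    """Maximum-likelihood estimate of P(symbol | context) from raw counts."""
    context = tuple(context)
    grams = ngram_counts(train, len(context) + 1)
    # Denominator: occurrences of the context that are followed by some symbol.
    den = sum(c for g, c in grams.items() if g[:-1] == context)
    return grams[context + (symbol,)] / den if den else 0.0

def sequence_prob(train, s, n):
    """P(s) as a product of conditionals, each using at most n-1 symbols of context."""
    p = 1.0
    for i, w in enumerate(s):
        context = s[max(0, i - n + 1):i]  # shorter context at the start of s
        p *= mle_prob(train, context, w)
    return p

# Toy pitch-contour sequence (hypothetical data).
train = [1, 1, -1, 1, 1, -1, 0]
bigram_p = sequence_prob(train, [1, 1], 2)  # P(1) * P(1 | 1)
```

Any context that never occurs in the training data yields a probability of zero here, which is the data sparseness problem that smoothing is meant to address.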
N-gram Model (2) • Problems: • Data sparseness: some $P(w_i \mid \dots) = 0$ • Longer sequences will have lower counts if the training corpus is small • Solution: linear interpolation smoothing. For a tri-gram, for example, the smoothed estimate is $\hat{P}(w_k \mid w_{k-2}, w_{k-1}) = \lambda_1 P(w_k) + \lambda_2 P(w_k \mid w_{k-1}) + \lambda_3 P(w_k \mid w_{k-2}, w_{k-1})$, where $\lambda_1 + \lambda_2 + \lambda_3 = 1$ and $\lambda_1 < \lambda_2 < \lambda_3$
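A minimal sketch of linear interpolation for the tri-gram case follows. The weights `(0.1, 0.3, 0.6)` are hypothetical values chosen only to satisfy the two constraints on the λs; the slides do not say how the weights were set. The helper `cond_prob` and the toy sequence are likewise assumptions made for illustration.

```python
from collections import Counter

def cond_prob(seq, context, symbol):
    """Maximum-likelihood P(symbol | context) from raw n-gram counts."""
    context = tuple(context)
    n = len(context) + 1
    grams = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    den = sum(c for g, c in grams.items() if g[:-1] == context)
    return grams[context + (symbol,)] / den if den else 0.0

def interpolated_trigram(seq, w2, w1, w, lambdas=(0.1, 0.3, 0.6)):
    """Mix uni-, bi- and tri-gram estimates of P(w | w2, w1).

    lambdas are assumed weights: they sum to 1 and increase with model order.
    """
    l1, l2, l3 = lambdas
    return (l1 * cond_prob(seq, (), w)
            + l2 * cond_prob(seq, (w1,), w)
            + l3 * cond_prob(seq, (w2, w1), w))

# Toy sequence (hypothetical data). The context (0, 1) never occurs in it,
# so the raw tri-gram estimate is 0, but the interpolated one is not.
seq = [1, 1, -1, 1, 1, -1, 0]
raw = cond_prob(seq, (0, 1), 1)
smoothed = interpolated_trigram(seq, 0, 1, 1)
```

The lower-order terms act as a fallback: when the tri-gram count is missing, the bi-gram and uni-gram estimates still contribute a non-zero probability.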
Entropy • For an N-gram model M, the entropy $H_c(M)$ associated with a context $c$ is $H_c(M) = -\sum_{e} P(e \mid c) \log P(e \mid c)$, where $e$ ranges over all possible successor symbols of $c$ and $P(e \mid c)$ is calculated with the linear interpolation smoothing method. • Low entropy usually means high predictability.
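As a small illustration of the entropy of a context, the sketch below computes Shannon entropy from a successor distribution. The base-2 logarithm (entropy in bits) is an assumption, since the slides do not state the base, and the function name `context_entropy` and the example distributions are hypothetical.

```python
import math

def context_entropy(successor_probs):
    """Entropy of one context, given {successor symbol: P(e | c)}.

    Assumes base-2 logarithms (bits); zero-probability successors
    contribute nothing to the sum.
    """
    return sum(-p * math.log2(p) for p in successor_probs.values() if p > 0)

# A context with one certain successor is fully predictable.
certain = context_entropy({"up": 1.0})
# A context with four equally likely successors is maximally unpredictable.
uniform = context_entropy({"up": 0.25, "down": 0.25, "same": 0.25, "rest": 0.25})
```

The first distribution gives zero entropy and the second gives two bits, matching the reading that low entropy means high predictability.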
A case study (1) • Deliège’s experiment • Subjects listened to a melody and had to identify segmentation points in real time. (The melody was the solo for English horn from Tristan and Isolde by Wagner.) • Subjects included both musically trained and untrained listeners. • The experiment found 8 main segment boundaries
A case study (2) • Translate the melody information into an event-based representation • Pitch Step (PS): interval distance to the following event, in semitones • Pitch Contour (PC): the sign of PS, {-1, +1, 0} • Duration Ratio (DR): ratio between the durations of the present and following events • Duration Contour (DC): the direction of the duration change; -1 if DR > 1; 1 if DR < 1; 0 if DR = 1
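The four features can be sketched as follows. Two assumptions are made that the slides do not confirm: pitches are given as MIDI note numbers, and DR is taken as the present duration divided by the following duration. The function name `extract_features` and the sample melody are hypothetical.

```python
from fractions import Fraction

def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def extract_features(pitches, durations):
    """Build one feature record per event that has a following event.

    pitches: MIDI note numbers (assumption); durations: any consistent unit.
    """
    features = []
    for i in range(len(pitches) - 1):
        ps = pitches[i + 1] - pitches[i]                          # Pitch Step, semitones
        pc = sign(ps)                                             # Pitch Contour
        dr = Fraction(durations[i]) / Fraction(durations[i + 1])  # assumed: present / following
        dc = -1 if dr > 1 else (1 if dr < 1 else 0)               # Duration Contour
        features.append({"PS": ps, "PC": pc, "DR": dr, "DC": dc})
    return features

# Hypothetical four-note melody: C4, D4, D4, B3 with durations 1, 1, 2, 1.
events = extract_features([60, 62, 62, 59], [1, 1, 2, 1])
```

Each feature yields its own symbol sequence (for example, the list of all PS values), and a separate set of n-gram models can then be trained on each sequence.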
A case study (4) • Tri-gram, bi-gram and uni-gram models were generated for PS, PC, DR and DC. • The standard deviation of the entropy is calculated over a sliding window (size = 10) • Results
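The sliding-window step might look like the sketch below. The slides give only the window size, so the use of the population standard deviation, the restriction to full windows, and the function name `sliding_std` are all assumptions; the entropy values are made up for illustration.

```python
import statistics

def sliding_std(values, window=10):
    """Population standard deviation of each full window of `window` values.

    Assumption: windows slide one event at a time and partial windows
    at the end of the sequence are skipped.
    """
    return [statistics.pstdev(values[i:i + window])
            for i in range(len(values) - window + 1)]

# Hypothetical entropy profile: flat, then an abrupt jump to a higher level.
entropies = [1.0] * 10 + [3.0] * 10
profile = sliding_std(entropies)
peak_index = max(range(len(profile)), key=profile.__getitem__)
```

In this toy profile the deviation is zero while the window lies entirely on one side of the jump and peaks when the window straddles it, which is the kind of entropy change the model looks for.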
Result • Duration-based features have a much higher entropy variance than pitch-based features. Time-based features are therefore likely to convey more information for segmentation. • Distinct changes in entropy coincided with the melody segment boundaries indicated by listeners.
Discussion • The N-gram model might be over-simplified for music sequences. • A state depends only on the previous n-1 states. • However, human memory is not unlimited either. • The ability to establish large-span temporal relations is limited.