Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm
Research by Elaine Chew and Ching-Hua Chuan, University of Southern California
Presentation by Sean Sweeney, DigiPen Institute of Technology
CS 582 / April 17, 2011 / Dr. Dimitri Volper
Presentation Flow
• Musical Pitch and Key
• Human Perception of Pitch
• The Spiral Array Model
  • Pitches
  • Chords
  • Keys
• The CEG Algorithm
  • Algorithm
  • Visualization
Musical Pitch and Key
• Pitch
  • The perceived value of a tone, from “low” to “high”
  • The psychoacoustic (subjective) perception of frequency
  • Frequency (Hz) is the physical measurement: cycles per second, the reciprocal of the period
• Key (Western music)
  • Labels the “center” tone (the tonic) in a section of music
  • Standard smallest interval: the semitone, or “half step”
  • Standard pattern of semitones around the “center”
    • Ascending (major scale): 2, 2, 1, 2, 2, 2, 1
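The ascending semitone pattern can be walked directly to list the notes of a major key. The following is a minimal sketch; the names `NOTE_NAMES`, `MAJOR_STEPS` and `major_scale` are illustrative, and spelling every note with sharps is a simplification (F major, for example, would normally be written with B-flat rather than A-sharp).

```python
# Twelve pitch classes, one semitone apart (sharps only, for simplicity).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Ascending semitone pattern of a major key, as listed on the slide.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]

def major_scale(tonic):
    """Return the notes of the major scale starting on `tonic`, ending an octave up."""
    pitch_class = NOTE_NAMES.index(tonic)
    scale = [tonic]
    for step in MAJOR_STEPS:
        pitch_class = (pitch_class + step) % 12  # wrap around at the octave
        scale.append(NOTE_NAMES[pitch_class])
    return scale

print(major_scale("C"))
print(major_scale("G"))
```

The steps sum to 12 semitones, so the last note is always the tonic one octave higher.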
Human Perception of Pitch
• Limited range of perception
  • Typically 20 Hz – 20,000 Hz
  • Range tends to decrease with age
• Noticeable difference is coarser at low Hz
  • Less distance (Hz) between lower sounds
  • Around 1400 perceivable intervals
• Certain intervals (frequency ratios) sound relatively close
  • Thirds, fifths, octaves
The Spiral Array Model
• Helical structure
• Toroidal across octaves
• Distance in the 3D model approximates perceived closeness between pitches
• Pitch, chord and key can all map to the same space
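A small sketch can make the helical layout concrete. It assumes the usual Spiral Array parameterization, in which neighboring pitches on the helix are a perfect fifth apart and each step is a quarter turn plus a fixed rise. The radius `R = 1` and the rise `H = sqrt(2/15)` are assumptions chosen for illustration (with this ratio a perfect fifth and a major third come out equally far apart); the function names are hypothetical.

```python
import math

R = 1.0                    # helix radius (assumed)
H = math.sqrt(2.0 / 15.0)  # rise per step along the line of fifths (assumed)

def pitch_position(k):
    """3D position of the pitch k fifths above C (k may be negative)."""
    angle = k * math.pi / 2.0  # one quarter turn per fifth
    return (R * math.sin(angle), R * math.cos(angle), k * H)

def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

c, g, e, f_sharp = (pitch_position(k) for k in (0, 1, 4, 6))
print("C to G  (fifth):      ", distance(c, g))
print("C to E  (major third):", distance(c, e))
print("C to F# (tritone):    ", distance(c, f_sharp))
```

Under these assumptions the fifth and the major third are close to C, while the tritone is far away, which matches the idea that distance in the model tracks perceived closeness.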
Chords in the Spiral Array
• Standard chords (triads) are based on three supporting tones
• These tones form triangles in the 3D model
• Triangles are effectively continuous, as pitch is
• The centers of major and minor chords thus form helices
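As a sketch of how a chord becomes a single point, each triad can be represented by a weighted average of its three tones. The index offsets follow the line-of-fifths layout (the fifth is one step up, the major third four steps up, the minor third three steps down). The radius, the rise `H` and the weights `CHORD_WEIGHTS` are illustrative assumptions, not values taken from the presentation.

```python
import math

H = math.sqrt(2.0 / 15.0)        # rise per fifth, radius 1 (assumed)
CHORD_WEIGHTS = (0.5, 0.3, 0.2)  # root, fifth, third (assumed weights)

def pitch(k):
    """Position of the pitch k fifths above C on a unit-radius helix."""
    angle = k * math.pi / 2.0
    return (math.sin(angle), math.cos(angle), k * H)

def blend(points, weights):
    """Weighted average of 3D points; it lies inside the shape they span."""
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))

def major_chord(k):
    # root, perfect fifth (k + 1), major third (k + 4)
    return blend([pitch(k), pitch(k + 1), pitch(k + 4)], CHORD_WEIGHTS)

def minor_chord(k):
    # root, perfect fifth (k + 1), minor third (k - 3)
    return blend([pitch(k), pitch(k + 1), pitch(k - 3)], CHORD_WEIGHTS)

print("C major chord:", major_chord(0))
print("A minor chord:", minor_chord(3))
```

Because every chord is built the same way from its root index, stepping the root along the helix traces out a second, inner helix of chord points.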
Key in the Spiral Array
• Simple keys are based on three supporting chords
• These form triangles in 3D, based on the centers of the supporting chords’ triangles
• Triangles are effectively continuous, as chords are
• The centers of major and minor keys thus form helices
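The same averaging idea can be applied one level up: a key point is sketched below as a weighted average of its tonic, dominant and subdominant chord points. All weights are assumed for illustration. The minor key here uses a major dominant and a minor subdominant, which is one possible choice and an assumption of this sketch rather than a detail stated in the presentation.

```python
import math

H = math.sqrt(2.0 / 15.0)  # rise per fifth, radius 1 (assumed)
W = (0.5, 0.3, 0.2)        # assumed weights, reused for chords and keys

def pitch(k):
    angle = k * math.pi / 2.0
    return (math.sin(angle), math.cos(angle), k * H)

def blend(points, weights):
    """Weighted average of 3D points."""
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))

def major_chord(k):
    return blend([pitch(k), pitch(k + 1), pitch(k + 4)], W)

def minor_chord(k):
    return blend([pitch(k), pitch(k + 1), pitch(k - 3)], W)

def major_key(k):
    # tonic, dominant (k + 1) and subdominant (k - 1) major chords
    return blend([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)

def minor_key(k):
    # tonic minor chord, major dominant, minor subdominant (assumed mixture)
    return blend([minor_chord(k), major_chord(k + 1), minor_chord(k - 1)], W)

print("C major key:", major_key(0))
print("G major key:", major_key(1))
print("A minor key:", minor_key(3))
```

Moving the tonic one fifth up raises the key point by exactly `H` and rotates it a quarter turn, so the key points again lie on helices.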
Center of Effect
• Center of Effect (CE)
  • Relative location of a chord based on its supporting tones
  • Notes of different strengths shift the CE location
  • CEs of complex chords will not line up exactly on the model
Center of Effect Generator (CEG) Key-Finding
• The Center of Effect summarizes the positions of multiple pitches in the model
• The spatially closest key is the most likely key
• Correlates input music to the standard key structure
Helping Visualize the CEG Algorithm
• Each key exists as a triangle in 3-space
• The keys’ centers of effect make up two helices in the 3D model
• In standard intonation, keys are discrete (12 minor, 12 major)
Helping Visualize the CEG Algorithm
• From a complex audio signal, weighted values are calculated for bins on each discrete tone
• The weighted values approximate the current key’s location on the model
• The spatially closest key is the most likely match
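Putting the pieces together, the sketch below computes a center of effect from twelve pitch-class strengths and returns the nearest of 24 key points. The geometry constants, the weights, the minor-key mixture and the fixed mapping from pitch class to a single spelling (`fifths_index`) are all assumptions made for illustration. The strengths must not all be zero.

```python
import math

H = math.sqrt(2.0 / 15.0)  # rise per fifth, radius 1 (assumed)
W = (0.5, 0.3, 0.2)        # assumed weights for chords and keys

def pitch(k):
    a = k * math.pi / 2.0
    return (math.sin(a), math.cos(a), k * H)

def blend(points, weights):
    """Weighted average of 3D points (weights must not sum to zero)."""
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def major_chord(k): return blend([pitch(k), pitch(k + 1), pitch(k + 4)], W)
def minor_chord(k): return blend([pitch(k), pitch(k + 1), pitch(k - 3)], W)
def major_key(k): return blend([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)
def minor_key(k): return blend([minor_chord(k), major_chord(k + 1), minor_chord(k - 1)], W)

def name(k):
    """Pitch name k fifths above C, e.g. 0 -> C, 1 -> G, -2 -> Bb, 6 -> F#."""
    shift = (k + 1) // 7
    return "FCGDAEB"[(k + 1) % 7] + ("#" * shift if shift > 0 else "b" * -shift)

# 12 major keys (Db .. F#) and 12 minor keys (Eb .. G#).
KEYS = {name(k) + " major": major_key(k) for k in range(-5, 7)}
KEYS.update({name(k) + " minor": minor_key(k) for k in range(-3, 9)})

def fifths_index(pc):
    """Map pitch class 0..11 (C = 0) to one spelling near C on the line of fifths."""
    i = (7 * pc) % 12
    return i - 12 if i > 6 else i

def find_key(strengths):
    """Return the name of the key point closest to the center of effect."""
    ce = blend([pitch(fifths_index(pc)) for pc in range(12)], strengths)
    return min(KEYS, key=lambda key_name: dist(ce, KEYS[key_name]))

# Strengths for C, C#, D, ..., B: mostly the notes C, E and G.
strengths = [0.30, 0, 0.05, 0, 0.20, 0.10, 0, 0.25, 0, 0.05, 0, 0.05]
print(find_key(strengths))
```

With 24 candidates a linear scan is enough for the nearest-neighbor search; no spatial index is needed.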
CEG Key-Finding Algorithm
• Pitch detection
  • Extract pitch class and strength from the signal
• Key finding
  • Nearest-neighbor search in the Spiral Array
Fast Fourier Transform
• Efficient algorithm to compute the Discrete Fourier Transform
  • O(n log n) vs. O(n²)
• Transforms a function into its frequency-domain representation
• Widely used across many fields
  • Solving partial differential equations
  • Data compression
  • Polynomial multiplication
  • Spectral analysis
    • Frequency bands
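To illustrate the complexity difference, here is a minimal pure-Python sketch of a direct O(n²) DFT next to a recursive radix-2 FFT. The FFT version assumes the input length is a power of two. The function names are illustrative, and a real system would more likely call an optimized library routine.

```python
import cmath
import math

def dft(x):
    """Direct Discrete Fourier Transform, O(n^2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def fft(x):
    """Recursive radix-2 FFT, O(n log n); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for f in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * f / n) * odd[f]
        out[f] = even[f] + twiddle
        out[f + n // 2] = even[f] - twiddle
    return out

# A cosine that completes 3 cycles over 16 samples.
signal = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
magnitudes = [abs(v) for v in fft(signal)[:8]]
print("strongest bin:", magnitudes.index(max(magnitudes)))
```

Both functions produce the same spectrum; only the amount of work differs.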
Algorithm for Pitch Class/Strength from FFT
For each frequency spectrum in a 0.37-second period:
• For each frequency band, find the peak value
• For each pitch class $k$, its strength at time $j$, $F_{jk}$, is the sum of all peak values for that frequency band (and others related by octaves)
• Normalize
  • Divide all pitch-strength values by the largest: $F_{jk} / \max_i F_{ji}$
  • Divide all pitch-strength values by their sum: $F_{jk} / \sum_i F_{ji}$ $(k = 0, 1, \dots, 11)$
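The steps above can be sketched as follows for a single spectrum. The sketch assumes that each frequency band is one semitone wide and centered on an equal-tempered pitch with A4 = 440 Hz, and that the spectrum is given as parallel lists of bin frequencies and magnitudes. These choices, the note range and the function names are assumptions for illustration.

```python
import math

def pitch_class_strengths(freqs, mags, low_midi=24, high_midi=96):
    """Sum per-band peak magnitudes into 12 pitch classes (C = 0)."""
    band_peak = {}
    for f, m in zip(freqs, mags):
        if f <= 0:
            continue
        # Nearest equal-tempered note number (A4 = 440 Hz = note 69).
        midi = round(69 + 12 * math.log2(f / 440.0))
        if low_midi <= midi <= high_midi:
            band_peak[midi] = max(band_peak.get(midi, 0.0), m)  # peak in band
    strengths = [0.0] * 12
    for midi, peak in band_peak.items():
        strengths[midi % 12] += peak  # fold octaves together
    return strengths

def normalize(strengths):
    """Scale by the largest value, then by the sum, so the result sums to 1."""
    largest = max(strengths)
    if largest == 0:
        return list(strengths)
    scaled = [s / largest for s in strengths]
    total = sum(scaled)
    return [s / total for s in scaled]

# Two bins near A3, one at A4 and one near C4.
freqs = [220.0, 221.0, 440.0, 261.63]
mags = [1.0, 0.5, 2.0, 1.0]
print(normalize(pitch_class_strengths(freqs, mags)))
```

In this example the two bins near 220 Hz fall in the same band, so only the larger one counts, and the A3 and A4 peaks are added together in pitch class 9.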
CEG Algorithm
For the pitch classes and strengths from each 0.37 seconds:
• Assign pitch names to pitch classes:
  • Generate the CE for the previous 5 seconds; and
  • Assign pitch names to the current pitch classes by nearest-neighbor search in Spiral Array space
• Determine the key based on pitch names:
  • Generate the cumulative CE from the beginning to the current time
  • Perform a nearest-neighbor search to find the closest key
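The pitch-naming step can be sketched as a small nearest-neighbor search: a pitch class has several possible spellings (F# and Gb, for example), each at a different place on the helix, and the spelling closest to the recent center of effect is chosen. The geometry constants, the candidate index range and the example CE values are assumptions for illustration; in the full algorithm the CE would come from the previous 5 seconds of audio.

```python
import math

H = math.sqrt(2.0 / 15.0)  # rise per fifth, radius 1 (assumed)

def pitch(k):
    a = k * math.pi / 2.0
    return (math.sin(a), math.cos(a), k * H)

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def name(k):
    """Pitch name k fifths above C, e.g. 6 -> F#, -6 -> Gb."""
    shift = (k + 1) // 7
    return "FCGDAEB"[(k + 1) % 7] + ("#" * shift if shift > 0 else "b" * -shift)

def spell(pitch_class, ce, indices=range(-15, 20)):
    """Pick the line-of-fifths index for `pitch_class` that lies closest to `ce`."""
    candidates = [k for k in indices if (7 * k) % 12 == pitch_class]
    return min(candidates, key=lambda k: dist(pitch(k), ce))

ce_near_c_major = (0.15, 0.45, 1.45 * H)  # example CE in a "sharp-side" region
ce_flat_region = (0.0, 0.0, -4.0 * H)     # example CE in a "flat-side" region

print(name(spell(6, ce_near_c_major)))  # pitch class 6 near C major
print(name(spell(6, ce_flat_region)))   # the same pitch class in a flat context
```

The same pitch class is therefore named differently depending on the musical context, and the named pitches then feed the cumulative CE used for the key search.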
Questions?
Bibliography:
• Chuan, C. and Chew, E. Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm. IEEE International Conference on Multimedia & Expo, 2005.
• Chew, E. Towards a Mathematical Model of Tonality. Doctoral dissertation, MIT, 2000.