Tabla Gyan
Realtime tabla recognition and resynthesis
Parag Chordia (GTCMT), Alex Rae (GTCMT)
Overview
• What: stroke type
• When: stroke timing
• Resynthesis
• Transformation: timbre, rhythm
The Drum
• Dayan – treble drum
• Bayan – bass drum
Recognition Architecture
[Diagram: input music, onset detection, rhythm; training data feeding a statistical model (SVM, Bayesian, neural net); output is a stroke label such as ke, te, dhe, dha, tun, ge]
Build Model: Training Data
• Several datasets
• Professional musician
• Home recording
• Audio recordings manually edited and labeled
Build Model: Target Mapping
• Standardize idiosyncratic traditional naming conventions
• Map timbrally similar (or identical) strokes to the same category
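The target mapping above can be pictured as a lookup table from traditional stroke names to a smaller set of categories. The sketch below is only an illustration: the specific names and groupings in `STROKE_CATEGORY` and the helper `normalize_label` are hypothetical and are not taken from the slides.

```python
# Hypothetical mapping: these names and groupings are illustrative only,
# not the categories actually used by the system described in the slides.
STROKE_CATEGORY = {
    "na": "na",
    "ta": "na",   # assumed here to be timbrally equivalent to "na"
    "ge": "ge",
    "ghe": "ge",  # assumed spelling variant of "ge"
    "ke": "ke",
    "ka": "ke",   # assumed spelling variant of "ke"
}


def normalize_label(raw):
    """Map a hand-written stroke name to its standardized category."""
    key = raw.strip().lower()
    if key not in STROKE_CATEGORY:
        raise KeyError(f"no category defined for stroke name {raw!r}")
    return STROKE_CATEGORY[key]
```

Applying such a table to every label before training means that strokes written differently but sounding alike end up as one class.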
Build Model: Feature Extraction
Spectral features:
• MFCCs (24)
• Centroid
• Variance
• Skewness
• Kurtosis
• Slope
• Roll-off
[Diagram: spectral centroid, variance, kurtosis, ... combined into a feature vector F1, F2, F3, ..., Fn]
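Several of the spectral features listed above (centroid, variance, skewness, kurtosis, roll-off) can be computed by treating a magnitude spectrum as a distribution over frequency. The following is a minimal sketch under that assumption. The function names, the use of magnitudes rather than power, and the 0.85 roll-off fraction are choices made for illustration, not details from the slides. MFCCs and slope are omitted.

```python
def spectral_moments(freqs, mags):
    """Return centroid, variance, skewness and kurtosis of a magnitude spectrum.

    freqs: bin centre frequencies in Hz; mags: non-negative magnitudes.
    """
    total = sum(mags)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [m / total for m in mags]  # normalize so the weights sum to 1
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    if variance == 0:
        skewness = kurtosis = 0.0  # all energy in one bin: shape is undefined
    else:
        third = sum((f - centroid) ** 3 * w for f, w in zip(freqs, weights))
        fourth = sum((f - centroid) ** 4 * w for f, w in zip(freqs, weights))
        skewness = third / variance ** 1.5
        kurtosis = fourth / variance ** 2
    return {"centroid": centroid, "variance": variance,
            "skewness": skewness, "kurtosis": kurtosis}


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency below which `fraction` of the total magnitude lies."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]
```

A feature vector for one stroke would then be the concatenation of these values with the remaining features.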
Build Model: Trained Model
• WEKA machine learning package
• Support Vector Machine
• Models trained on different datasets can be saved for future use
Audio: Input
• Live audio is taken from a close-mic'd tabla
• Stereo signal provides partial separation of the two drums
Audio: Segmentation
• Onset detection done in Max using bonk~
• More recent parallel project uses spectral flux algorithm in Java
• End of stroke marked by next onset (1 sec buffer size)
• Onset times stored
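The segmentation rule above (a stroke ends where the next onset begins) can be sketched as a function from onset times to time segments. This is an illustration only. It assumes the 1 sec buffer acts as an upper bound on segment length, which the slide does not state explicitly, and `segment_strokes` and its parameters are hypothetical names.

```python
def segment_strokes(onsets, audio_end, max_len=1.0):
    """Turn sorted onset times (seconds) into (start, end) stroke segments.

    Each stroke ends at the next onset, or at `audio_end` for the last one.
    Assumption: no segment is longer than `max_len` seconds (the buffer size).
    """
    segments = []
    for i, start in enumerate(onsets):
        next_boundary = onsets[i + 1] if i + 1 < len(onsets) else audio_end
        end = min(next_boundary, start + max_len)
        segments.append((start, end))
    return segments
```

Each resulting segment is the span of audio from which a feature vector would be extracted.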
Audio: Feature Extraction
[Diagram: spectral centroid, variance, kurtosis, ... combined into a feature vector F1, F2, F3, ..., Fn]
Output: Classification
• Feature vector is fed to previously trained model
• Single category label returned
[Diagram: feature vector → SVM → label]
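The classification step has a simple shape: a trained model takes one feature vector and returns one label. The sketch below shows that interface with a nearest-centroid classifier, which is a deliberately simpler stand-in and not the WEKA support vector machine the slides describe. The function names and the toy data in the usage note are hypothetical.

```python
import math


def train_centroids(examples):
    """Build a stand-in model: the mean feature vector of each label.

    examples: iterable of (feature_vector, label) pairs.
    """
    sums, counts = {}, {}
    for vec, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(model, vec):
    """Return the single label whose centroid is closest to `vec`."""
    return min(model, key=lambda label: math.dist(model[label], vec))
```

For example, a model trained on `[([0.0, 0.0], "ge"), ([0.0, 2.0], "ge"), ([10.0, 10.0], "te")]` labels the vector `[1.0, 1.0]` as `"ge"`.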
Output: Symbolic Score
• Stroke label combined with timing and amplitude information
• Score stored in temporary buffer in Max patch
.3204 .9665 2
.3527 .5715 6
.3031 .3648 6
.3325 .9827 6
.2970 .4762 2
.3865 .5928 1
.3496 .6603 8
.7046 .4621 1
.3144 .5024 6
.7152 .2990 6
.3387 .8891 2
.2902 .7342 6
.3868 .9051 7
.3049 .5727 1
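A symbolic score of this kind can be modeled as a list of events, each holding a stroke label together with its timing and amplitude. The sketch below assumes one record per stroke. The `StrokeEvent` fields, their units, and the helper functions are hypothetical, and the sketch does not claim to reproduce the column layout of the buffer in the Max patch.

```python
from dataclasses import dataclass


@dataclass
class StrokeEvent:
    onset: float      # onset time in seconds (assumed unit)
    amplitude: float  # assumed to be normalized to the range 0..1
    label: int        # integer code of the stroke category


def build_score(onsets, amplitudes, labels):
    """Combine per-stroke timing, amplitude and label into one score."""
    if not (len(onsets) == len(amplitudes) == len(labels)):
        raise ValueError("onsets, amplitudes and labels must have equal length")
    return [StrokeEvent(o, a, l) for o, a, l in zip(onsets, amplitudes, labels)]


def inter_onset_intervals(score):
    """Time between consecutive strokes, useful for rhythmic transformations."""
    return [b.onset - a.onset for a, b in zip(score, score[1:])]
```

Keeping timing and label in the same record is what lets later stages change one (timbre or rhythm) while leaving the other untouched.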
Output: Timbre Remapping
• Stroke labels can be flexibly remapped
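Because the score is symbolic, timbre remapping amounts to substituting labels before resynthesis. A minimal sketch, assuming labels are plain strings and that unmapped strokes pass through unchanged; `remap_labels` and the example mapping are hypothetical.

```python
def remap_labels(labels, mapping):
    """Replace each stroke label according to `mapping`.

    Labels that do not appear in `mapping` are kept as they are.
    """
    return [mapping.get(label, label) for label in labels]
```

For instance, `remap_labels(["ge", "te", "ke"], {"ge": "ke", "ke": "ge"})` swaps the two mapped strokes and returns `["ke", "te", "ge"]`, so the rhythm is preserved while the sounds change.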
Future Directions
• Beat tracking
• Modeling specific types of improvisational forms (e.g. qaida, tihai …)
• Automate transformations
• Improve interface so it can be “played”
• Tracking of expressive parameters (e.g. bayan pitch modulation)
Conclusions
• Presented a realtime tabla interaction system
• Implemented as a Max Java external that uses machine learning to identify strokes
• Supports flexible transformations
• Foundation for a more general improvisation system