Further Development of a Classifier for Musical Genre Classification and Retrieval
Kris West
School of Computing Sciences, University of East Anglia
kristopher.west@uea.ac.uk
Outline
• Background
• Genre classification
• Approaches
• Classifiers
• Results
• Research Perspectives
• Reducing data processing costs
• More powerful decision strategies
• Other applications
• In terms of an Integrated MIR system
Background: Classical Approaches to Genre Classification
• Calculate features over a whole piece of audio, OR
• Calculate features per frame and average over the whole piece
  • Means and variances
• Model the distribution of classes
  • GMMs, KNN, LDA, MAP, SVMs etc.
• Classify novel input
  • Classify one vector of features
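The "means and variances" step of the classical approach can be sketched as below. This is a minimal illustration, not the author's code: the function name `summarise_piece` is hypothetical, and the use of the population variance (rather than the sample variance) is an assumption.

```python
from statistics import fmean, pvariance


def summarise_piece(frames):
    """Collapse per-frame feature vectors into one vector for the whole piece.

    The result holds the mean of every feature followed by the variance of
    every feature. Population variance is an assumption; the slides do not
    say which estimator was used.
    """
    if not frames:
        raise ValueError("need at least one frame")
    columns = list(zip(*frames))  # one tuple of values per feature
    means = [fmean(column) for column in columns]
    variances = [pvariance(column) for column in columns]
    return means + variances
```

A piece of any length is reduced to a single vector of twice the feature count, which is what the classifier then sees.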
Background: Our Approach to Genre Classification
• Calculate features for every frame
  • 23 ms frames (512 samples @ 22050 Hz)
• Model distribution of individual frames
• Classify novel input
  • Classify each individual frame
  • Classify whole piece by majority vote
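The frame length arithmetic and the majority vote can be sketched as follows. The function names are hypothetical, and breaking ties in favour of the label seen first is an assumption; the slides do not say how ties are handled.

```python
from collections import Counter

SAMPLE_RATE_HZ = 22050  # from the slide
FRAME_LENGTH = 512      # samples per frame, from the slide


def frame_duration_ms(frame_length=FRAME_LENGTH, sample_rate=SAMPLE_RATE_HZ):
    """Duration of one analysis frame in milliseconds."""
    return 1000.0 * frame_length / sample_rate


def classify_piece(frame_labels):
    """Label a whole piece with its most common per-frame label.

    Assumption: ties go to the label that appears first in the sequence.
    """
    if not frame_labels:
        raise ValueError("need at least one frame label")
    return Counter(frame_labels).most_common(1)[0][0]
```

For example, `classify_piece(["rock", "jazz", "rock"])` picks `"rock"`, and `frame_duration_ms()` gives roughly 23.2 ms, matching the figure on the slide.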
Background: Motivation
• Unlikely that all frames belonging to a genre will belong to a single distribution
• More likely that there are multiple distributions in feature space, each of which should be fitted with its own model
• Number of distributions per class will vary and is hard to predict
Background: Initial comparisons
• Classical approach
  • ~62% accuracy
• Our approach
  • ~60.5% accuracy
• Explanation
  • Classical approach used stronger feature set
  • Finer distribution of classes in our approach
    • i.e. harder to separate classes (non-linear)
  • Classifiers not up to task
    • Too much data for an SVM
Background: Develop a better classifier
• Base it on a decision tree
  • Recursive sub-division of feature space
• Model many distributions for a single class
• No limit on complexity
  • Keep on growing the tree
  • Only bounded by accuracy of sampling/feature calculation
• Don’t have to define number of components/distributions in advance
• Easy integration of disparate feature sets, including categorical variables
• Lots of existing research
  • Classification and Regression Trees
  • Breiman, L., Friedman, J., Olshen, R. & Stone, C. (1984)
Classification and Regression Trees: Disadvantages
• Computational complexity
  • Has to enumerate a large number of splits
  • Larger feature vectors = more splits to evaluate
  • Single splits = n possible splits
  • Linear combinations = n + n(n-2) possible splits
n = length of feature vector
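To make the cost concrete, here is a minimal sketch of the search over single-variable threshold splits at one node. The outer loop runs once per feature, which is why longer feature vectors mean more splits to evaluate. The Gini impurity criterion and midpoint thresholds are assumptions taken from standard CART practice, not details given on the slides, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_single_variable_split(rows, labels):
    """Search every feature for the threshold split `x[f] <= t` with lowest impurity.

    Returns (feature_index, threshold, weighted_impurity), or None when no
    feature has two distinct values to split between.
    """
    best = None
    n_features = len(rows[0]) if rows else 0
    for f in range(n_features):  # one pass per feature
        values = sorted({row[f] for row in rows})
        # Candidate thresholds lie midway between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best
```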
Classification and Regression Trees: Splitting techniques
• Traditionally, nodes are split by a threshold on a single variable, a linear combination of variables, or the value of a categorical variable
• Any classification scheme can be used to split a node in the tree
  • Form all combinations of classes within the data
  • Train a binary classifier for each combination
  • Evaluate splits and select the best
• Evaluated:
  • Gaussian classifiers (GAUSS-CART)
  • Fisher Criterion Linear Discriminant Analysis (LDA-CART)
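One reading of "form all combinations of classes" is that the classes present at a node are divided into two non-empty groups in every possible way, with one binary classifier trained per grouping. The sketch below enumerates those groupings only; it assumes that interpretation, and the function name is hypothetical. For k classes it yields 2^(k-1) - 1 groupings.

```python
from itertools import combinations


def class_bipartitions(classes):
    """Yield each way of dividing the classes into two non-empty groups, once.

    Assumption: a "combination of classes" means a two-way grouping of the
    classes present at the node. The first class (in sorted order) is pinned
    to the left group so that mirror-image groupings are not repeated.
    """
    ordered = sorted(set(classes))
    if len(ordered) < 2:
        return  # nothing to separate
    first, rest = ordered[0], ordered[1:]
    for size in range(len(rest)):  # the left group never takes every class
        for extra in combinations(rest, size):
            left = (first,) + extra
            right = tuple(c for c in rest if c not in extra)
            yield left, right
```

With three genres this gives three candidate splits; with six it gives 31, independent of the length of the feature vector.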
Classification and Regression Trees: Advantages
• Computational complexity reduced
  • Has to enumerate fewer splits
  • Usually fewer classes than features
  • Fewer combinations
  • Use all classes, no subsets
n = length of feature vector
Background: Initial comparisons
• Classical approach
  • ~62% accuracy (~66% with CART)
• Our approach (CART)
  • ~83% accuracy
• Explanation
  • Identifying individual timbres in a genre
Research Perspectives
• Reduce computational costs of technique
  • Use a segmentation technique
    • Calculate and append first- and second-order differentials of features, or another trajectory
    • Then use harmonic/simple temporal modelling
    • Grow simpler, more transparent decision trees?
  • Use Variable Bit Rate (VBR)
    • Calculate difference between frames
    • Use one frame and a count to represent many similar, sequential frames
    • Do we need to re-expand or adapt classifier training?
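Both cost-reduction ideas can be sketched in a few lines. This is an illustration under stated assumptions: differentials are taken as simple differences from the previous frame with zeros for the first frame, and two frames count as "similar" when every feature is within a fixed tolerance of the first frame in the run. Neither choice comes from the slides, and the function names are hypothetical.

```python
def append_differentials(frames):
    """Append first- and second-order differences to each frame's features.

    Assumptions: differences are taken against the previous frame, and the
    first frame (which has no predecessor) gets zero differences.
    """
    out = []
    prev_frame = None
    prev_delta = None
    for frame in frames:
        width = len(frame)
        delta = ([0.0] * width if prev_frame is None
                 else [a - b for a, b in zip(frame, prev_frame)])
        delta2 = ([0.0] * width if prev_delta is None
                  else [a - b for a, b in zip(delta, prev_delta)])
        out.append(list(frame) + delta + delta2)
        prev_frame, prev_delta = frame, delta
    return out


def collapse_similar_frames(frames, tolerance):
    """Represent each run of similar, sequential frames as (frame, count).

    Assumption: a frame joins the current run when every feature lies within
    `tolerance` of the run's first frame.
    """
    runs = []
    for frame in frames:
        if runs and all(abs(a - b) <= tolerance
                        for a, b in zip(frame, runs[-1][0])):
            runs[-1][1] += 1
        else:
            runs.append([list(frame), 1])
    return [(frame, count) for frame, count in runs]
```

The counts keep enough information to re-expand the sequence, or to weight each retained frame during training, which is the open question the slide raises.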
Research Perspectives
• Try stronger feature sets
  • Have tried timbral features such as Flux, Roll-off, Centroid (mean and banded)
    • No effect, or reduced accuracy
  • Try other instrument identification features
    • Rise time etc.
  • Try Beat, Pitch and Rhythmic features
Research Perspectives
• Use a more powerful technique than bag of frames to decide final output
  • Model common frame classification errors
• Markov chains
  • Ergodic Markov chains of frame sequences
• N-grams
  • Analogous to Markov chains
• Neural Nets
  • Train NNet on output, to decide final classification
  • Lots of input lines, one output line per class
• All of the above trained on re-substitution of training data, with independent test samples used to validate the tree
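As a minimal sketch of the Markov chain idea, the function below estimates first-order transition probabilities from a sequence of per-frame labels, which is equivalent to normalising bigram counts. The function name is hypothetical and no smoothing is applied; both are assumptions for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(symbols):
    """Estimate first-order transition probabilities from a symbol sequence.

    Returns {current: {next: probability}}. No smoothing is applied, so
    transitions that were never observed are simply absent.
    """
    counts = defaultdict(Counter)
    for current, following in zip(symbols, symbols[1:]):
        counts[current][following] += 1
    return {
        current: {
            following: count / sum(followers.values())
            for following, count in followers.items()
        }
        for current, followers in counts.items()
    }
```

One such table per genre, built from the frame classifications of the training pieces, would let a new frame sequence be scored against each genre's typical pattern of confusions rather than by vote alone.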
Research Perspectives
• Apply approach to Instrument Identification
  • Already performing successful timbral matching
  • Need database
    • Record it?
Research Perspectives
• Use tree to perform Timbral music similarity
  • Train tree according to genre
  • Pre-compute a distance measure between mean vectors of leaf nodes from all other nodes
    • Big matrix (unless tree has already been simplified)
  • Use co-occurrence of leaf nodes
  • Group similar nodes into a single symbol
    • Using a threshold of the distance measure
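The pre-computed distance matrix and the threshold grouping can be sketched as below. Euclidean distance and the greedy grouping rule are assumptions chosen for brevity; the slides name neither a distance measure nor a grouping procedure, and the function names are hypothetical.

```python
from math import dist


def leaf_distance_matrix(leaf_means):
    """Pairwise Euclidean distances between leaf-node mean vectors.

    The matrix has one row and column per leaf, so it grows quadratically
    with the number of leaves.
    """
    return [[dist(a, b) for b in leaf_means] for a in leaf_means]


def group_leaves(leaf_means, threshold):
    """Map each leaf to a symbol shared by nearby leaves.

    Assumption: a greedy rule. Each leaf joins the first group whose founding
    leaf is within `threshold`; otherwise it founds a new group.
    """
    founders = []
    symbols = []
    for mean in leaf_means:
        for symbol, founder in enumerate(founders):
            if dist(mean, founder) <= threshold:
                symbols.append(symbol)
                break
        else:
            founders.append(mean)
            symbols.append(len(founders) - 1)
    return symbols
```

A greedy rule depends on the order of the leaves; a proper clustering step could replace it without changing the rest of the idea.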
Research Perspectives
• Alternatively
  • Reduce music to symbol sequence (leaf node numbers)
  • Cache distance scores for each leaf node
  • Collect distances of all query nodes from training example nodes
  • Normalise for number of frames
    • For length invariance
  • Return lowest n scores, OR
  • Use tf*IDF on symbol sequence/Greenstone
    • Either use N-grams of symbols, OR
    • Treat each symbol as a word
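Treating each leaf-node symbol as a word, a tf*IDF weighting over symbol sequences can be sketched as follows. The particular variant (term frequency divided by sequence length, inverse document frequency as log(N/df)) is an assumption, the function name is hypothetical, and nothing here reflects how Greenstone itself computes its weights.

```python
from collections import Counter
from math import log


def tfidf_vectors(pieces):
    """Weight each symbol in each piece by tf*IDF.

    `pieces` is a list of symbol sequences (for example, leaf node numbers).
    Assumptions: tf is the symbol count divided by the piece length, which
    gives length invariance, and idf is log(N / df), so a symbol present in
    every piece gets weight 0.
    """
    n_pieces = len(pieces)
    document_frequency = Counter()
    for piece in pieces:
        document_frequency.update(set(piece))  # count each symbol once per piece
    vectors = []
    for piece in pieces:
        term_counts = Counter(piece)
        vectors.append({
            symbol: (count / len(piece)) * log(n_pieces / document_frequency[symbol])
            for symbol, count in term_counts.items()
        })
    return vectors
```

To use N-grams instead of single symbols, the same function can be applied to sequences of symbol tuples.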
Research Perspectives
• Use symbol sequence (leaf node numbers) to perform or augment onset detection
  • Identify nodes corresponding to transients
Research Perspectives
• Stick to timbral/Instrument identification/Semantic features
  • Platel, H., Baron, J.-C., Desgranges, B., Bernard, F. & Eustache, F. (2003). Semantic and episodic memory of music are subserved by distinct neural networks. NeuroImage, Volume 20, Issue 1, September 2003, Pages 244-256