Strategies for learning an accurate classifier when data are plentiful but labels are scarce, by choosing which unlabeled examples to have labeled. Techniques include Maximum Curiosity, Additive Curiosity, Minimum Marginal Hyperplane, Maximum Entropy, and Entropic Tradeoff.
Lots of data, very few labels • Choose which unknowns to have labeled • Methods differ in how they choose the next unknown • Goal: find the best classifier from a very small number of labeled examples
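The setup on this slide can be sketched as a generic selection loop. This is a minimal illustration, not the presenters' code: `select` (the method that chooses the next unknown) and `oracle` (whatever supplies the true label) are hypothetical callables introduced here.

```python
def active_learning_loop(labeled, unlabeled, select, oracle, budget):
    """Repeatedly pick one unknown, ask for its label, and grow the labeled set.

    select(labeled, pool) -> the unknown to label next (hypothetical hook)
    oracle(x)             -> the true label of x (hypothetical hook)
    budget                -> maximum number of labels we may request
    """
    labeled = list(labeled)   # copy so the caller's lists are not mutated
    pool = list(unlabeled)
    for _ in range(min(budget, len(pool))):
        x = select(labeled, pool)
        pool.remove(x)
        labeled.append((x, oracle(x)))
    return labeled, pool
```

Each method on the following slides amounts to a different choice of `select`.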
Maximum Curiosity • Generate new training sets by taking the known data and adding assumed values for all unknowns • Run each set through a learner and compute statistics on the results • Assume the highest r value (cross-validated correlation coefficient) results from the correct pairing
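A rough sketch of the idea, under several assumptions not stated on the slide: the learner is a 1-nearest-neighbor classifier, cross-validation is leave-one-out, r is the Pearson correlation between held-out predictions and labels, and unknowns are tried one at a time with each candidate label. All function names are illustrative.

```python
import math

def pearson_r(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def nn_predict(train, x):
    """Assumed learner: label of the nearest training point."""
    return min(train, key=lambda pair: math.dist(pair[0], x))[1]

def cross_validated_r(data):
    """Leave-one-out predictions correlated against the (assumed) labels."""
    preds = [nn_predict(data[:i] + data[i + 1:], x)
             for i, (x, _) in enumerate(data)]
    return pearson_r(preds, [y for _, y in data])

def max_curiosity(known, unknowns, labels=(0, 1)):
    """Return (r, unknown, assumed_label) for the pairing with the highest r."""
    best = None
    for x in unknowns:
        for y in labels:
            r = cross_validated_r(known + [(x, y)])
            if best is None or r > best[0]:
                best = (r, x, y)
    return best
```

Points are tuples of floats and labels are 0 or 1 in this sketch; a different learner or statistic could be substituted without changing the overall structure.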
Terrible Graphics
Additive Curiosity • Variant: sum, not max
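One possible reading of "sum, not max" is that each unknown is scored by the sum of its r values over the assumed labels rather than by the single best one. The sketch below illustrates that reading only; the slide does not say what is summed, and the `r_scores` table is made-up illustrative data.

```python
def pick_by_max(r_scores):
    """Maximum-style choice: the unknown whose best assumed label scores highest."""
    return max(r_scores, key=lambda x: max(r_scores[x].values()))

def pick_by_sum(r_scores):
    """Additive-style choice (assumed reading): highest total over assumed labels."""
    return max(r_scores, key=lambda x: sum(r_scores[x].values()))

# Hypothetical r values: r_scores[unknown][assumed_label]
r_scores = {
    "u1": {0: 0.9, 1: 0.1},
    "u2": {0: 0.7, 1: 0.6},
}
```

With these invented numbers the two rules disagree: the max rule prefers `u1`, while the sum rule prefers `u2`.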
Minimum Marginal Hyperplane • Based on Support Vector Machines • After training an SVM on the known data, pick the unknowns closest to the decision boundary, label them, and repeat • Takes advantage of the geometric features of SVMs
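The selection step can be sketched for a linear SVM, where the distance from a point x to the boundary is |w·x + b| / ||w||. SVM training itself is omitted here: `w` and `b` are placeholders standing in for a hyperplane learned on the known data, and the function names are illustrative.

```python
import math

def boundary_distance(w, b, x):
    """Geometric distance from x to the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

def closest_to_boundary(w, b, unknowns, k=1):
    """Return the k unknowns nearest the decision boundary."""
    return sorted(unknowns, key=lambda x: boundary_distance(w, b, x))[:k]
```

After the chosen points are labeled, the SVM would be retrained and the selection repeated with the new `w` and `b`.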
Maximum Entropy • Calculate entropy of assumed datasets • Assume that the most informative item is that which is most uncertain (highest entropy)
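As a small sketch of the selection rule, assume some model supplies class probabilities for each unknown; the unknown with the highest Shannon entropy is then the most uncertain. How the slide's "assumed datasets" produce those probabilities is not specified, so `class_probs` below is a hypothetical input.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def most_uncertain(class_probs):
    """class_probs maps each unknown to its list of predicted class probabilities."""
    return max(class_probs, key=lambda name: entropy(class_probs[name]))
```

For two classes the entropy peaks at 1 bit when the probabilities are 0.5 and 0.5, so an unknown the model cannot decide on is picked first.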
Entropic Tradeoff • Choose a mix of easily classified and highly informative unknowns • At each step, choose both the highest-entropy and the lowest-entropy unknowns to classify
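A minimal sketch of one selection step, assuming per-unknown class probabilities are available from some model (the hypothetical `class_probs` input): rank the unknowns by entropy and take one from each end.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropic_tradeoff(class_probs):
    """Return (lowest-entropy unknown, highest-entropy unknown)."""
    ranked = sorted(class_probs, key=lambda name: entropy(class_probs[name]))
    return ranked[0], ranked[-1]
```

The low-entropy pick is the easily classified example and the high-entropy pick is the informative one; with a single unknown in the pool, both returned values are the same item.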