Semi-supervised Machine Learning. Gergana Lazarova, Sofia University “St. Kliment Ohridski”
Semi-Supervised Learning • Labeled examples • Unlabeled examples • Training data • Usually, the number of unlabeled examples is much larger than the number of labeled ones • Unlabeled examples are easy to collect
Self-Training • At first, only the labeled instances are used to train a classifier • This classifier then predicts the labels of the unlabeled instances • A portion of the newly labeled examples (formerly unlabeled) augments the set of labeled examples and the classifier is retrained • An iterative procedure (see the sketch below)
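A minimal self-training sketch, assuming scikit-learn is available; the base learner, the confidence threshold, and the number of rounds are illustrative choices, not part of the original slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    """Iteratively move confidently predicted unlabeled examples into the labeled set."""
    clf = LogisticRegression(max_iter=1000)
    for _ in range(max_rounds):
        clf.fit(X_lab, y_lab)                      # train on the current labeled set
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)         # predict the unlabeled examples
        sure = proba.max(axis=1) >= threshold      # keep only confident predictions
        if not sure.any():
            break
        y_new = clf.classes_[proba[sure].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[sure]])  # augment the labeled set...
        y_lab = np.concatenate([y_lab, y_new])
        X_unlab = X_unlab[~sure]                   # ...and retrain on it next round
    return clf
```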
Cluster-then-label • It first clusters the instances (labeled and unlabeled) into k groups, using an unsupervised clustering algorithm • Then, for each cluster Cj, a supervised classifier is trained on the labeled examples in Cj and used to classify the unlabeled examples belonging to Cj (see the sketch below)
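A minimal cluster-then-label sketch, again assuming scikit-learn and integer class labels; k and the base learner (GaussianNB) are illustrative choices. Clusters containing no labeled examples are simply skipped here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB

def cluster_then_label(X_lab, y_lab, X_unlab, k=3):
    X_all = np.vstack([X_lab, X_unlab])
    # Unsupervised step: cluster labeled and unlabeled instances together.
    clusters = KMeans(n_clusters=k, n_init=10).fit_predict(X_all)
    c_lab, c_unlab = clusters[:len(X_lab)], clusters[len(X_lab):]
    y_unlab = np.full(len(X_unlab), -1, dtype=y_lab.dtype)
    for j in range(k):
        in_lab, in_unlab = c_lab == j, c_unlab == j
        if in_lab.any() and in_unlab.any():
            # Supervised step: train on the labeled examples of cluster Cj
            # and classify the unlabeled examples belonging to Cj.
            clf = GaussianNB().fit(X_lab[in_lab], y_lab[in_lab])
            y_unlab[in_unlab] = clf.predict(X_unlab[in_unlab])
    return y_unlab  # -1 marks examples in clusters with no labeled points
```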
Semi-supervised Support Vector Machines • Since unlabeled examples have no labels, we do not know on which side of the decision boundary they lie • Hat loss function: penalizes unlabeled points that fall inside the margin (a hedged reconstruction follows below) • Decision boundary
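The slide's equation images are unavailable; below is a hedged reconstruction of the standard S3VM objective from the semi-supervised learning literature, not necessarily the exact formula shown in the talk: hinge loss on the l labeled points, the hat loss max(1 − |f(x)|, 0) on the u unlabeled points, and a margin regularizer.

```latex
\min_{w,\,b}\;
\sum_{i=1}^{l} \max\bigl(1 - y_i f(x_i),\, 0\bigr)
+ \lambda_1 \lVert w \rVert^2
+ \lambda_2 \sum_{j=l+1}^{l+u} \max\bigl(1 - \lvert f(x_j) \rvert,\, 0\bigr),
\qquad f(x) = w^{\top} x + b
```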
Graph-based Semi-supervised Learning • Graph-based semi-supervised learning constructs a graph from the training examples. • The nodes of the graph are data points (labeled and unlabeled) and the edges represent similarities between points. Fig. 1 A semi-supervised graph
Graph-based Semi-supervised Learning • An edge between two vertices represents the similarity wij between them; the closer two vertices are, the higher the value of wij is • MinCut algorithm: find a minimum set of edges whose removal blocks the whole flow from the labeled nodes of one class to those of the other class (see the sketch below)
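A minimal sketch of the MinCut idea (in the spirit of Blum and Chawla's formulation), assuming binary ±1 labels and a Gaussian similarity; sigma and the graph construction are illustrative. In networkx, an edge without a capacity attribute has infinite capacity, which is how labeled points are tied to the virtual source and sink.

```python
import numpy as np
import networkx as nx

def mincut_ssl(X, y, sigma=1.0):
    """X: (n, d) points; y: +1/-1 for labeled points, 0 for unlabeled."""
    n = len(X)
    G = nx.Graph()
    # Edge capacities encode similarity w_ij: closer points, larger capacity.
    for i in range(n):
        for j in range(i + 1, n):
            w = np.exp(-np.sum((X[i] - X[j]) ** 2) / (2 * sigma ** 2))
            G.add_edge(i, j, capacity=w)
    # Tie labeled points to a virtual source/sink; omitting the capacity
    # attribute makes these edges infinite-capacity, so they are never cut.
    for i in range(n):
        if y[i] == +1:
            G.add_edge('s', i)
        elif y[i] == -1:
            G.add_edge(i, 't')
    # Removing the minimum cut blocks all flow between the two classes;
    # nodes left on the source side are labeled +1, the rest -1.
    _, (s_side, _) = nx.minimum_cut(G, 's', 't')
    return np.array([+1 if i in s_side else -1 for i in range(n)])
```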
Semi-supervised Multi-view Learning. Fig. 2 Semi-supervised Multi-view Learning
Multi-View Learning – examples. Fig. 3 Multiple Sources of Information
Semi-supervised Multi-view Learning • Co-training: the algorithm augments the set of labeled examples of each classifier based on the other learner's predictions (see the sketch below). It assumes that (1) each view (set of features) is sufficient for classification, and (2) the two views (feature sets of each instance) are conditionally independent given the class • Co-EM
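A minimal co-training sketch, assuming two feature views of the same examples and a shared labeled pool; the base learner, the number of rounds, and n_best (examples moved per round) are illustrative choices.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, U1, U2, rounds=10, n_best=5):
    """(X1, X2): the two views of the labeled data; (U1, U2): of the unlabeled data."""
    clf1, clf2 = GaussianNB(), GaussianNB()
    for _ in range(rounds):
        clf1.fit(X1, y)
        clf2.fit(X2, y)
        if len(U1) < 2 * n_best:
            break
        move = {}
        # Each learner picks the unlabeled examples it is most confident about;
        # its predictions then augment the labeled set seen by the other learner.
        for clf, U in ((clf1, U1), (clf2, U2)):
            proba = clf.predict_proba(U)
            for i in np.argsort(proba.max(axis=1))[-n_best:]:
                move.setdefault(int(i), clf.classes_[proba[i].argmax()])
        idx = np.array(sorted(move))
        X1 = np.vstack([X1, U1[idx]])
        X2 = np.vstack([X2, U2[idx]])
        y = np.concatenate([y, [move[i] for i in idx]])
        keep = np.setdiff1d(np.arange(len(U1)), idx)  # drop moved examples
        U1, U2 = U1[keep], U2[keep]
    return clf1, clf2
```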
Multi-View Learning – error minimization • Loss function: measures the amount of loss of a prediction • Risk: the risk associated with f is defined as the expectation of the loss function • Empirical risk: the average loss of f on a labeled training set • Multi-view minimization problem (a hedged reconstruction follows below)
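The slide's formula images are unavailable; the following is a hedged reconstruction of the quantities named above, using a common multi-view formulation: the empirical risk of each view's predictor on the labeled data, plus a disagreement penalty between views on the unlabeled data. The notation (l labeled and u unlabeled examples, k views, trade-off λ) is assumed rather than taken from the slides.

```latex
R(f) = \mathbb{E}\bigl[\ell(f(x), y)\bigr], \qquad
\hat{R}(f) = \frac{1}{l} \sum_{i=1}^{l} \ell\bigl(f(x_i), y_i\bigr)

\min_{f_1,\dots,f_k}\;
\sum_{v=1}^{k} \hat{R}(f_v)
\;+\; \lambda \sum_{v < v'} \frac{1}{u} \sum_{j=l+1}^{l+u}
\ell\bigl(f_v(x_j^{(v)}),\, f_{v'}(x_j^{(v')})\bigr)
```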
Semi-supervised Multi-view Genetic Algorithm • Minimizes the semi-supervised multi-view learning error • It can be applied to multiple sources of data • It works for convex and non-convex functions. Approaches based on gradient descent require a convex function; when a function is not convex, optimization becomes a hard problem.
Semi-supervised Multi-view Genetic Algorithm • Individual: • Fitness function: • Do not change the size of the chromosome and do not mix the features of different views when applying crossover and mutation (see the sketch below)
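A hedged sketch of such a genetic algorithm, assuming an individual is the concatenation of one real-valued weight vector per view (at least two views) and that `fitness` computes the semi-supervised multi-view error; the truncation selection and mutation parameters are illustrative. Crossover cuts only at view boundaries, which keeps the chromosome size fixed and never mixes features of different views. The slides report MAX_ITER = 20000 and population size N = 100 for the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossover(a, b, view_sizes):
    # Cut only at a view boundary, so features of different views never mix
    # and the chromosome length never changes. Assumes >= 2 views.
    cut = np.cumsum(view_sizes)[rng.integers(len(view_sizes) - 1)]
    return np.concatenate([a[:cut], b[cut:]])

def mutate(ind, rate=0.01, scale=0.1):
    ind = ind.copy()
    mask = rng.random(ind.shape) < rate       # perturb a few genes
    ind[mask] += rng.normal(scale=scale, size=mask.sum())
    return ind

def ga_minimize(fitness, view_sizes, pop_size=100, max_iter=1000):
    pop = rng.normal(size=(pop_size, sum(view_sizes)))
    for _ in range(max_iter):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[:pop_size // 2]]  # lower error = fitter
        children = [mutate(crossover(parents[rng.integers(len(parents))],
                                     parents[rng.integers(len(parents))],
                                     view_sizes))
                    for _ in range(pop_size - len(parents))]
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)]
```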
Experimental Results • “Diabetes” (UCI Machine Learning Repository) • Views: k = 2, x = (x(1), x(2)) • MAX_ITER = 20000, N = 100 • Comparison to supervised equivalents. Table 2 Comparison to supervised equivalents
Sentiment analysis in Bulgarian • Most of the research has been conducted in English • Sentiment analysis in Bulgarian suffers from a shortage of labeled examples • A sentiment analysis system in Bulgarian: each instance has attributes from multiple sources of data (a Bulgarian and an English view)
DataSet • English reviews: Amazon • Bulgarian reviews: www.cinexio.com
Big Data • Bulgarian view: 17099 features • English view: 12391 features. Fig. 4 Big Data Modelling
Examples (1) • Rating: ** • F(SSMVGA) = 1.965, F(supervised) = 3.13
Examples (2) • Rating: ** • F(SSMVGA) = 1.985, F(supervised) = 1.98
Examples (3) • Rating: ***** • F(SSMVGA) = 1.985, F(supervised) = 1.98
Multi-view Teaching Algorithm • A semi-supervised two-view learning algorithm • A modification of the standard co-training algorithm • Improves only the weaker classifier • Uses only the most confident examples of the stronger view • Combines the views • Application: object segmentation (see the sketch below)
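A hedged sketch of the teaching idea, not the paper's exact procedure: a small held-out set decides which view is currently stronger, and only the weaker classifier's training set is augmented with the stronger view's most confident predictions. The validation split, base learner, and n_best are illustrative assumptions.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def multi_view_teach(X1, X2, y, U1, U2, V1, V2, yv, rounds=10, n_best=10):
    """(V1, V2, yv): small held-out set used to rank the two views."""
    clf1, clf2 = GaussianNB(), GaussianNB()
    y1, y2 = y.copy(), y.copy()
    for _ in range(rounds):
        clf1.fit(X1, y1)
        clf2.fit(X2, y2)
        if len(U1) < n_best:
            break
        # Decide which view is currently stronger on held-out data.
        stronger_is_1 = clf1.score(V1, yv) >= clf2.score(V2, yv)
        strong, U_s = (clf1, U1) if stronger_is_1 else (clf2, U2)
        proba = strong.predict_proba(U_s)
        idx = np.argsort(proba.max(axis=1))[-n_best:]    # most confident
        labels = strong.classes_[proba[idx].argmax(axis=1)]
        # Only the weaker classifier's training set grows.
        if stronger_is_1:
            X2 = np.vstack([X2, U2[idx]]); y2 = np.concatenate([y2, labels])
        else:
            X1 = np.vstack([X1, U1[idx]]); y1 = np.concatenate([y1, labels])
        keep = np.setdiff1d(np.arange(len(U1)), idx)
        U1, U2 = U1[keep], U2[keep]
    return clf1, clf2
```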
A Semi-supervised Image Segmentation System • A “teacher” labels a few points of each class, giving the algorithm an idea of the clusters • The aim is to augment the training set with more labeled examples, yielding a better predictor • The first view contains the coordinates of the pixels (x, y): view1 = (X, Y) • The second view contains the RGB values of the pixels (red, green, and blue values ranging from 0 to 255); a sketch of building the two views follows below
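A minimal sketch of building the two pixel views from a NumPy image array of shape (height, width, 3); the function name is illustrative.

```python
import numpy as np

def pixel_views(image):
    """Return view1 = (x, y) coordinates and view2 = RGB values, one row per pixel."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    view1 = np.stack([xs.ravel(), ys.ravel()], axis=1)  # pixel coordinates
    view2 = image.reshape(-1, 3)                        # RGB values in [0, 255]
    return view1, view2
```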
DataSet • Fig. 5, Fig. 6, Fig. 7 Original images and their desired segmentations
Experimental Results • 2 experiments: • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (as the underlying learners), to a supervised naïve Bayes classifier • Comparison of the multi-view teaching algorithm based on the multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on the multivariate normal distribution (MND-SL)
Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (as the underlying learners), to a supervised naïve Bayes classifier • The image consists of 50700 pixels. At each cross-validation step, only a small number of labeled pixels is used. Multiple tests were run, varying the number of labeled examples (4, 6, 10, 16, 20, 50 pixels). Table 4 Accuracy based on the number of labeled examples
Results (1) • Comparison of the multi-view teaching algorithm, based on naïve Bayes classifiers (as the underlying learners), to a supervised naïve Bayes classifier • 16 labeled examples. Table 5 Comparison of the NB and MVTA algorithms
Results (2) • Comparison of the multi-view teaching algorithm based on the multivariate normal distribution (MND-MVTA) to a Bayesian supervised classifier based on the multivariate normal distribution (MND-SL) • 16 labeled examples. Table 6 Comparison of MND-MVTA and MND-SL
Examples • Segmentation results: Multi-view Teaching vs. Naïve Bayes Supervised
Thank you! Thank you for your attention!