OTL: A Framework of Online Transfer Learning
Peilin Zhao¹, Steven C.H. Hoi¹
¹Nanyang Technological University
Problem Setting
• Old/source data space: $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X} = \mathbb{R}^m$ and $\mathcal{Y} = \{-1, +1\}$.
• Kernel on the old/source data space: $K(\cdot,\cdot)\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.
• Old/source classifier: $h(\mathbf{x}) = \sum_{i=1}^{B} \alpha_i K(\mathbf{x}_i, \mathbf{x})$, where $\{\mathbf{x}_i\}_{i=1}^{B}$ is the set of support vectors for the old/source training data set and $\{\alpha_i\}_{i=1}^{B}$ are the coefficients of the support vectors.
• New/target domain: homogeneous domain, $\tilde{\mathcal{X}} = \mathcal{X}$; heterogeneous domain, $\tilde{\mathcal{X}} \neq \mathcal{X}$.
• A sequence of examples from the new/target domain: $\{(\tilde{\mathbf{x}}_t, y_t) \mid t = 1, \ldots, T\}$, where $\tilde{\mathbf{x}}_t \in \tilde{\mathcal{X}}$ and $y_t \in \{-1, +1\}$.
• Goal: learn a prediction function on the new/target domain online from the sequence of examples.

Homogeneous Domain: Online Transfer Learning (OTL) Algorithm
Basic idea: following the ensemble-learning concept for transferring knowledge from the old/source domain, combine the old classifier $h$ with an entirely new prediction function $f$ constructed only from the data in the target domain. (Both updates below are sketched in code after Theorem 2.)
• Initialization: $f_1 = 0$ and $w_{1,1} = w_{2,1} = \tfrac{1}{2}$, where $w_{1,t}$ and $w_{2,t}$ are the ensemble weights, with $w_{1,t} + w_{2,t} = 1$.
• Prediction: $\hat{y}_t = \operatorname{sign}\big(w_{1,t}\,\Pi(h(\mathbf{x}_t)) + w_{2,t}\,\Pi(f_t(\mathbf{x}_t)) - \tfrac{1}{2}\big)$, where $\Pi(a) = \max\{0, \min\{1, \tfrac{a+1}{2}\}\}$ normalizes a raw prediction into $[0, 1]$.
• Updating $f$: the Passive Aggressive rule; whenever the hinge loss $\ell_t = \max\{0,\, 1 - y_t f_t(\mathbf{x}_t)\}$ is positive, set $f_{t+1} = f_t + \tau_t y_t K(\mathbf{x}_t, \cdot)$ with $\tau_t = \min\{C,\, \ell_t / K(\mathbf{x}_t, \mathbf{x}_t)\}$.
• Updating weights: $w_{1,t+1} = \frac{w_{1,t}\, s_t(h)}{w_{1,t}\, s_t(h) + w_{2,t}\, s_t(f_t)}$ and $w_{2,t+1} = 1 - w_{1,t+1}$, where $s_t(g) = \exp\{-\eta\, \ell^*(g,t)\}$ and $\ell^*(g,t) = \big(\Pi(g(\mathbf{x}_t)) - \Pi(y_t)\big)^2$.

Mistake Bound for the OTL Algorithm
Theorem 1. Denote by $M$ the number of mistakes made by the OTL algorithm. Then $M$ is bounded from above, up to constants depending only on the learning rate $\eta$, by $\min\big\{\sum_{t=1}^{T} \ell^*(h,t),\ \sum_{t=1}^{T} \ell^*(f_t,t)\big\}$.
Remark: Denote by $M_h$ and $M_f$ the mistake bounds of model $h$ and of model $f$ learned alone, respectively. The ensemble tracks whichever of the two is better on the target sequence, so its bound is essentially $\min\{M_h, M_f\}$: using the old classifier helps whenever $h$ is accurate on the new domain, and costs only a constant otherwise.

Heterogeneous Domain: Co-regularized Online Transfer Learning (COTL) Algorithm and Its Mistake Bound
• Assumption: $\mathcal{X} \subset \tilde{\mathcal{X}}$; for simplicity, $\tilde{\mathcal{X}} = \mathbb{R}^n$ with $n > m$, where the first $m$ dimensions of $\tilde{\mathcal{X}}$ represent the old feature space $\mathcal{X}$.
• Basic idea: split each instance $\tilde{\mathbf{x}}_t$ into two views $\mathbf{x}_t^{(1)} \in \mathbb{R}^m$ and $\mathbf{x}_t^{(2)} \in \mathbb{R}^{n-m}$, then adopt a co-regularization principle to update the two classifiers $f_t^{(1)}$ and $f_t^{(2)}$ simultaneously for the two views (sketched in code below).
• Initialization: $f_1^{(1)} = h$ and $f_1^{(2)} = 0$.
• Prediction: $\hat{y}_t = \operatorname{sign}\big(\tfrac{1}{2} f_t^{(1)}(\mathbf{x}_t^{(1)}) + \tfrac{1}{2} f_t^{(2)}(\mathbf{x}_t^{(2)})\big)$.
• Updating: whenever the loss on $(\tilde{\mathbf{x}}_t, y_t)$ is positive, the two classifiers are updated jointly by minimizing a co-regularized objective: deviations from the current pair $(f_t^{(1)}, f_t^{(2)})$ are penalized with positive trade-off parameters $\gamma_1$ and $\gamma_2$, while the hinge loss on the current example is reduced.

Theorem 2. Let $\{(\tilde{\mathbf{x}}_t, y_t) \mid t = 1, \ldots, T\}$ be a sequence of examples, where $\tilde{\mathbf{x}}_t \in \tilde{\mathcal{X}}$ and $y_t \in \{-1, +1\}$ for all $t$, and split every instance into the two views $(\mathbf{x}_t^{(1)}, \mathbf{x}_t^{(2)})$. Then for any pair of reference classifiers $f^{(1)}$ and $f^{(2)}$, the number of mistakes $M$ made by the proposed COTL algorithm is bounded from above in terms of their cumulative hinge loss together with the weighted distances $\gamma_1 \|f^{(1)} - h\|^2$ and $\gamma_2 \|f^{(2)}\|^2$, where $\gamma_1$ and $\gamma_2$ are positive.
Remark: when the distance between $h$ and the optimal classifier is less than the distance between the optimal classifier and the zero function, this bound is tighter than that of learning from scratch; that is, we can transfer knowledge from the old classifier to the new domain.
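To make the homogeneous case concrete, here is a minimal sketch of one OTL round, assuming a linear kernel so that $f$ can be kept as a weight vector. It is an illustration, not the authors' implementation: the class name OTLHomogeneous, the default values of eta and C, and the helper pi are all placeholders chosen to mirror the formulas above.

```python
import numpy as np

def pi(a):
    """Normalize a raw prediction value into [0, 1]."""
    return max(0.0, min(1.0, (a + 1.0) / 2.0))

class OTLHomogeneous:
    """Sketch of the homogeneous OTL learner (linear kernel assumed).

    h is the fixed old/source classifier, given as a function x -> R.
    f is updated by the Passive Aggressive rule; the ensemble weights
    follow the exponential (Hedge-style) update described above.
    """

    def __init__(self, h, dim, eta=0.5, C=1.0):
        self.h = h
        self.f = np.zeros(dim)           # f_1 = 0
        self.w1 = self.w2 = 0.5          # w_{1,1} = w_{2,1} = 1/2
        self.eta, self.C = eta, C

    def predict(self, x):
        # sign(w1 * Pi(h(x)) + w2 * Pi(f(x)) - 1/2)
        score = self.w1 * pi(self.h(x)) + self.w2 * pi(self.f @ x) - 0.5
        return 1 if score >= 0 else -1

    def update(self, x, y):
        # 1) Hedge-style weight update from the squared losses in [0, 1].
        ph, pf, py = pi(self.h(x)), pi(self.f @ x), pi(float(y))
        s1 = np.exp(-self.eta * (ph - py) ** 2)
        s2 = np.exp(-self.eta * (pf - py) ** 2)
        z = self.w1 * s1 + self.w2 * s2
        self.w1, self.w2 = self.w1 * s1 / z, self.w2 * s2 / z
        # 2) Passive Aggressive update of f on the hinge loss.
        loss = max(0.0, 1.0 - y * (self.f @ x))
        if loss > 0:
            tau = min(self.C, loss / (x @ x))
            self.f = self.f + tau * y * x
```

On each round one would call predict on the incoming instance, observe the true label, and then call update.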
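For the heterogeneous case, the sketch below shows one natural closed-form instantiation of the joint co-regularized objective for linear classifiers: a single PA-style step whose two components are scaled by $1/\gamma_1$ and $1/\gamma_2$. The class name COTLHeterogeneous and the default parameter values are placeholders; the exact step in the paper may differ in constants.

```python
import numpy as np

class COTLHeterogeneous:
    """Sketch of the two-view co-regularized (COTL) learner.

    The instance is split as x = (x1, x2): x1 lives in the old feature
    space, so f1 starts from the old classifier h (given here as a
    weight vector); x2 covers the new dimensions, so f2 starts at zero.
    """

    def __init__(self, h_weights, dim2, gamma1=1.0, gamma2=1.0, C=1.0):
        self.f1 = np.asarray(h_weights, dtype=float).copy()  # f_1^(1) = h
        self.f2 = np.zeros(dim2)                             # f_1^(2) = 0
        self.g1, self.g2, self.C = gamma1, gamma2, C

    def margin(self, x1, x2):
        # (1/2) f1(x1) + (1/2) f2(x2)
        return 0.5 * (self.f1 @ x1) + 0.5 * (self.f2 @ x2)

    def predict(self, x1, x2):
        return 1 if self.margin(x1, x2) >= 0 else -1

    def update(self, x1, x2, y):
        # Hinge loss of the combined two-view prediction.
        loss = max(0.0, 1.0 - y * self.margin(x1, x2))
        if loss > 0:
            # Closed-form PA-style step for the co-regularized objective:
            # deviations from the current pair cost gamma1 and gamma2,
            # so each view moves inversely to its own penalty.
            denom = (x1 @ x1) / (4 * self.g1) + (x2 @ x2) / (4 * self.g2)
            tau = min(self.C, loss / denom)
            self.f1 += (tau * y / (2 * self.g1)) * x1
            self.f2 += (tau * y / (2 * self.g2)) * x2
```

Note the design choice: a large gamma1 keeps f1 anchored near the old classifier h, which is exactly the regime Theorem 2 rewards when h is already close to the optimal first-view classifier.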
Experimental Results
Experimental Testbed and Setup
• Datasets: "a7a", "w7a", "mushrooms", "spambase", "usenet2" and "newsgroup4", downloaded from:
  http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
  http://www.ics.uci.edu/~mlearn/MLRepository.html
  "newsgroup4" is a concept-drift dataset created from "newsgroup20"; its details are shown in Table 1.
• Concept drift: the target variable to be predicted changes over time during the learning process.
• Algorithms compared:
  – PA: the Passive Aggressive algorithm.
  – PAIO: PA initialized with h (using only the first view in the heterogeneous case).
  – OTL(fixed): OTL with both weights fixed at 0.5.
  – COTL0: COTL with the first-view classifier initialized with zero.
  – SCT: PAIO on the first view combined with PA on the second view, using fixed weights of 0.5.
• RBF kernel, with one fixed kernel parameter for the homogeneous setting and one for the heterogeneous setting, shared across all datasets and algorithms.
• Each experiment is repeated 20 times on every dataset.

Observations from the experimental results:
• Homogeneous:
  – On the datasets without concept drift, PA suffers more mistakes.
  – On the datasets with concept drift, OTL works best among all the algorithms.
  – OTL produces denser classifiers and spends more time.
• Heterogeneous:
  – PA, which uses no information from the old domain, achieves the highest mistake rate.
  – The COTL algorithm has the smallest mistake rate.
  – COTL produces denser classifiers and spends more time.

Conclusions
• We propose the first framework for Online Transfer Learning.
• We propose algorithms for two cases: the homogeneous domain and the heterogeneous domain.
• We offer theoretical analysis, and encouraging experimental results show that the proposed algorithms are effective.

The Twenty-Third Annual Conference on Neural Information Processing Systems (NIPS 2009)