Dual Transfer Learning
Mingsheng Long 1,2, Jianmin Wang 2, Guiguang Ding 2, Wei Cheng, Xiang Zhang, and Wei Wang
1 Department of Computer Science and Technology, 2 School of Software, Tsinghua University, Beijing 100084, China
Outline • Motivation • The Framework • Dual Transfer Learning • An Implementation • Joint Nonnegative Matrix Tri-Factorization • Experiments • Conclusion
Notations
• Domain: D = {X, P(x)}, a feature space X with a marginal distribution P(x)
  • Two domains are different if X_s ≠ X_t or P_s(x) ≠ P_t(x)
• Task: T = {Y, f(·)}, given a feature space X and a label space Y
  • Learn or estimate f(x) = P(y|x), where x ∈ X and y ∈ Y
  • Two tasks are different if Y_s ≠ Y_t or P_s(y|x) ≠ P_t(y|x)
Motivation
Source domain: comp.os; target domain: comp.hardware
• Latent factors: task scheduling, performance, power consumption, architecture
• Domain-specific latent factors cause the discrepancy between domains; common latent factors represent the commonality between domains
• Exploring the marginal distributions
Motivation
Source domain: comp.os; target domain: comp.hardware
• Model parameters: each latent factor is associated with a class label (task scheduling → comp, performance → comp, power consumption → comp, architecture → comp)
• These shared associations represent the commonality between tasks
• Exploring the conditional distributions
The Framework: Dual Transfer Learning (DTL)
• Simultaneously learning the marginal distribution and the conditional distribution
  • Marginal mapping: learning the marginal distribution
  • Conditional mapping: learning the conditional distribution
• Exploring the duality for mutual reinforcement (see the factorization below)
  • Learning one distribution can help to learn the other distribution
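One way to read the duality (my framing, not a formula from the slides): each domain's joint distribution factors into exactly the two pieces the framework learns jointly,

```latex
\[
P_\tau(x, y) \;=\; P_\tau(x)\, P_\tau(y \mid x), \qquad \tau \in \{s, t\}.
\]
```

Aligning the marginals P_s(x) and P_t(x) through shared latent factors makes the shared conditional model transfer better, and the shared conditional model in turn constrains which latent factors are useful; this is the mutual reinforcement referred to above.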
An Implementation: Joint NMTF
Source domain: comp.os; target domain: comp.hardware
• Marginal mapping: learning the marginal distribution via the latent factors (task scheduling, performance, power consumption, architecture)
• Domain-specific latent factors cause the discrepancy between domains; common latent factors represent the commonality between domains
An Implementation: Joint NMTF
Source domain: comp.os; target domain: comp.hardware
• Conditional mapping: learning the conditional distribution via the model parameters (task scheduling → comp, performance → comp, power consumption → comp, architecture → comp)
• These shared parameters represent the commonality between tasks
An Implementation: Joint NMTF
• Dual Transfer Learning instantiated as a Joint Nonnegative Matrix Tri-Factorization
• Solution to the Joint NMTF optimization problem via iterative updates (see the sketch below)
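The optimization formulas on this slide did not survive extraction. As a hedged reconstruction, assume each domain's term-document matrix X_tau is tri-factorized as X_tau ≈ F S G_tau^T, where the feature-cluster matrix F (marginal mapping) and the cluster-to-class association S (conditional mapping) are shared across domains, while G_tau holds per-domain document-class assignments. The sketch below uses plain multiplicative updates for this simplified objective; all names are illustrative assumptions, and the paper's actual model additionally splits latent factors into common and domain-specific parts, which is omitted here.

```python
import numpy as np

def joint_nmtf(Xs, Xt, k=8, c=2, n_iter=200, eps=1e-9, seed=0):
    """Hedged sketch of joint NMTF: X_tau ~ F @ S @ G_tau.T for
    tau in {source, target}, sharing F (feature clusters, the marginal
    mapping) and S (cluster-to-class association, the conditional
    mapping) across domains. Not the paper's exact algorithm."""
    rng = np.random.default_rng(seed)
    m = Xs.shape[0]                           # shared vocabulary size
    F = rng.random((m, k)) + eps              # shared feature-cluster matrix
    S = rng.random((k, c)) + eps              # shared association matrix
    Gs = rng.random((Xs.shape[1], c)) + eps   # source document clusters
    Gt = rng.random((Xt.shape[1], c)) + eps   # target document clusters

    for _ in range(n_iter):
        # Per-domain multiplicative update of document assignments G_tau.
        for X, G in ((Xs, Gs), (Xt, Gt)):
            G *= (X.T @ F @ S) / (G @ (S.T @ (F.T @ F) @ S) + eps)
        # Shared factors pool evidence from BOTH domains: this is where
        # knowledge transfers between source and target.
        GG = Gs.T @ Gs + Gt.T @ Gt
        F *= (Xs @ Gs @ S.T + Xt @ Gt @ S.T) / (F @ S @ GG @ S.T + eps)
        S *= (F.T @ (Xs @ Gs + Xt @ Gt)) / ((F.T @ F) @ S @ GG + eps)

    return F, S, Gs, Gt
```

In this simplified setting, target labels would be read off as `Gt.argmax(axis=1)`, and `Gs` would be seeded with the known source labels rather than random values so that the factorization is supervised on the source side.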
Joint NMTF: Theoretical Analysis
• Derivation
  • Formulate a Lagrange function for the optimization problem
  • Apply the KKT conditions
• Convergence
  • Proved by the auxiliary function approach [Ding et al. KDD'06] (sketched below)
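For reference, the auxiliary function argument cited here works as follows; this is the standard statement, while the specific auxiliary function constructed for Joint NMTF is in the paper and not reproduced here.

```latex
% Z(h, h') is an auxiliary function for the objective J(h) if
%   Z(h, h') >= J(h)  for all h, h',  and  Z(h, h) = J(h).
% The update h^{(t+1)} = argmin_h Z(h, h^{(t)}) then never increases J:
\[
J\big(h^{(t+1)}\big) \;\le\; Z\big(h^{(t+1)}, h^{(t)}\big)
  \;\le\; Z\big(h^{(t)}, h^{(t)}\big) \;=\; J\big(h^{(t)}\big),
\]
% so the multiplicative updates monotonically decrease the objective
% and the algorithm converges.
```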
Experiments
• Open data sets
  • 20-Newsgroups
  • Reuters-21578
• Each cross-domain data set contains roughly 8,000 documents and 15,000 features
• Evaluation criterion: classification accuracy on the target domain (formula below)
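The evaluation formula was dropped during extraction; presumably it is standard classification accuracy over the target-domain test data. A hedged reconstruction:

```latex
\[
\mathrm{Accuracy} \;=\;
\frac{\big|\{\, x \in \mathcal{X}_t : f(x) = y(x) \,\}\big|}{|\mathcal{X}_t|},
\]
% where f(x) is the predicted label and y(x) the ground-truth label
% of target-domain document x.
```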
Experiments
• Non-transfer baselines: NMF, SVM, Logistic Regression (LR), Transductive SVM (TSVM)
• Transfer learning methods:
  • Co-Clustering based Classification (CoCC) [Dai et al. KDD'07]
  • Matrix Tri-Factorization based Classification (MTrick) [Zhuang et al. SDM'10]
  • Dual Knowledge Transfer (DKT) [Wang et al. SIGIR'11]
Experiments
• Parameter sensitivity and algorithm convergence (plots not reproduced)
Conclusion
• We proposed a novel Dual Transfer Learning (DTL) framework
  • It explores the duality between the marginal distribution and the conditional distribution for mutual reinforcement
• We implemented a novel Joint NMTF algorithm based on the DTL framework
• Experimental results validated that DTL is superior to state-of-the-art single transfer learning methods
Any Questions? Thank you!