Dual Transfer Learning
Mingsheng Long 1,2, Jianmin Wang 2, Guiguang Ding 2, Wei Cheng, Xiang Zhang, and Wei Wang
1 Department of Computer Science and Technology, 2 School of Software, Tsinghua University, Beijing 100084, China
Outline • Motivation • The Framework • Dual Transfer Learning • An Implementation • Joint Nonnegative Matrix Tri-Factorization • Experiments • Conclusion
Notations
• Domain: D = {X, P(x)}, a feature space X with a marginal distribution P(x)
  • Two domains are different if X_s ≠ X_t or P_s(x) ≠ P_t(x)
• Task: T = {Y, f(·)}, given a feature space X and a label space Y
  • Learn or estimate f(x) = P(y|x), where x ∈ X and y ∈ Y
  • Two tasks are different if Y_s ≠ Y_t or P_s(y|x) ≠ P_t(y|x)
Motivation
Source domain: comp.os; target domain: comp.hardware
• Latent factors: task scheduling, performance, power consumption, architecture
• Domain-specific latent factors cause the discrepancy between domains; common latent factors represent the commonality between domains
• Exploring the marginal distributions
Motivation
Source domain: comp.os; target domain: comp.hardware
• Model parameters: each latent factor is associated with a class label (task scheduling → comp, performance → comp, power consumption → comp, architecture → comp)
• These shared associations represent the commonality between tasks
• Exploring the conditional distributions
The Framework: Dual Transfer Learning (DTL)
• Simultaneously learning the marginal distribution and the conditional distribution
  • Marginal mapping: learning the marginal distribution
  • Conditional mapping: learning the conditional distribution
• Exploring the duality for mutual reinforcement (see the factorization below)
  • Learning one distribution can help to learn the other distribution
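One way to read the duality (my framing, not a formula from the slides): each domain's joint distribution factors into exactly the two pieces the framework learns jointly,

```latex
\[
P_\tau(x, y) \;=\; P_\tau(x)\, P_\tau(y \mid x), \qquad \tau \in \{s, t\}.
\]
```

Aligning the marginals P_s(x) and P_t(x) through shared latent factors makes the shared conditional model transfer better, and the shared conditional model in turn constrains which latent factors are useful; this is the mutual reinforcement referred to above.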
An Implementation: Joint NMTF
Source domain: comp.os; target domain: comp.hardware
• Marginal mapping: learning the marginal distribution via the latent factors (task scheduling, performance, power consumption, architecture)
• Domain-specific latent factors cause the discrepancy between domains; common latent factors represent the commonality between domains
An Implementation: Joint NMTF
Source domain: comp.os; target domain: comp.hardware
• Conditional mapping: learning the conditional distribution via the model parameters (task scheduling → comp, performance → comp, power consumption → comp, architecture → comp)
• These shared parameters represent the commonality between tasks
An Implementation: Joint NMTF
• Dual Transfer Learning instantiated as a Joint Nonnegative Matrix Tri-Factorization
• Solution to the Joint NMTF optimization problem via iterative updates (see the sketch below)
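The optimization formulas on this slide did not survive extraction. As a hedged reconstruction, assume each domain's term-document matrix X_tau is tri-factorized as X_tau ≈ F S G_tau^T, where the feature-cluster matrix F (marginal mapping) and the cluster-to-class association S (conditional mapping) are shared across domains, while G_tau holds per-domain document-class assignments. The sketch below uses plain multiplicative updates for this simplified objective; all names are illustrative assumptions, and the paper's actual model additionally splits latent factors into common and domain-specific parts, which is omitted here.

```python
import numpy as np

def joint_nmtf(Xs, Xt, k=8, c=2, n_iter=200, eps=1e-9, seed=0):
    """Hedged sketch of joint NMTF: X_tau ~ F @ S @ G_tau.T for
    tau in {source, target}, sharing F (feature clusters, the marginal
    mapping) and S (cluster-to-class association, the conditional
    mapping) across domains. Not the paper's exact algorithm."""
    rng = np.random.default_rng(seed)
    m = Xs.shape[0]                           # shared vocabulary size
    F = rng.random((m, k)) + eps              # shared feature-cluster matrix
    S = rng.random((k, c)) + eps              # shared association matrix
    Gs = rng.random((Xs.shape[1], c)) + eps   # source document clusters
    Gt = rng.random((Xt.shape[1], c)) + eps   # target document clusters

    for _ in range(n_iter):
        # Per-domain multiplicative update of document assignments G_tau.
        for X, G in ((Xs, Gs), (Xt, Gt)):
            G *= (X.T @ F @ S) / (G @ (S.T @ (F.T @ F) @ S) + eps)
        # Shared factors pool evidence from BOTH domains: this is where
        # knowledge transfers between source and target.
        GG = Gs.T @ Gs + Gt.T @ Gt
        F *= (Xs @ Gs @ S.T + Xt @ Gt @ S.T) / (F @ S @ GG @ S.T + eps)
        S *= (F.T @ (Xs @ Gs + Xt @ Gt)) / ((F.T @ F) @ S @ GG + eps)

    return F, S, Gs, Gt
```

In this simplified setting, target labels would be read off as `Gt.argmax(axis=1)`, and `Gs` would be seeded with the known source labels rather than random values so that the factorization is supervised on the source side.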
Joint NMTF: Theoretical Analysis
• Derivation
  • Formulate a Lagrange function for the optimization problem
  • Apply the KKT conditions
• Convergence
  • Proved by the auxiliary function approach [Ding et al. KDD'06] (sketched below)
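For reference, the auxiliary function argument cited here works as follows; this is the standard statement, while the specific auxiliary function constructed for Joint NMTF is in the paper and not reproduced here.

```latex
% Z(h, h') is an auxiliary function for the objective J(h) if
%   Z(h, h') >= J(h)  for all h, h',  and  Z(h, h) = J(h).
% The update h^{(t+1)} = argmin_h Z(h, h^{(t)}) then never increases J:
\[
J\big(h^{(t+1)}\big) \;\le\; Z\big(h^{(t+1)}, h^{(t)}\big)
  \;\le\; Z\big(h^{(t)}, h^{(t)}\big) \;=\; J\big(h^{(t)}\big),
\]
% so the multiplicative updates monotonically decrease the objective
% and the algorithm converges.
```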
Experiments
• Open data sets
  • 20-Newsgroups
  • Reuters-21578
• Each cross-domain data set contains roughly 8,000 documents and 15,000 features
• Evaluation criterion: classification accuracy on the target domain (formula below)
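The evaluation formula was dropped during extraction; presumably it is standard classification accuracy over the target-domain test data. A hedged reconstruction:

```latex
\[
\mathrm{Accuracy} \;=\;
\frac{\big|\{\, x \in \mathcal{X}_t : f(x) = y(x) \,\}\big|}{|\mathcal{X}_t|},
\]
% where f(x) is the predicted label and y(x) the ground-truth label
% of target-domain document x.
```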
Experiments
• Non-transfer baselines: NMF, SVM, Logistic Regression (LR), Transductive SVM (TSVM)
• Transfer learning methods:
  • Co-Clustering based Classification (CoCC) [Dai et al. KDD'07]
  • Matrix Tri-Factorization based Classification (MTrick) [Zhuang et al. SDM'10]
  • Dual Knowledge Transfer (DKT) [Wang et al. SIGIR'11]
Experiments
• Parameter sensitivity and algorithm convergence (plots not reproduced)
Conclusion
• We proposed a novel Dual Transfer Learning (DTL) framework
  • It explores the duality between the marginal distribution and the conditional distribution for mutual reinforcement
• We implemented a novel Joint NMTF algorithm based on the DTL framework
• Experimental results validated that DTL is superior to state-of-the-art single transfer learning methods
Any Questions? Thank you!