Chang Wang, Sridhar Mahadevan. Heterogeneous Domain Adaptation Using Manifold Alignment
Layout • Problem Introduction • Problem Definition • Who cares • Previous work & challenges • Contribution • A glance at methods
Problem Introduction • Example problem • Input: three collections of documents in • English (sufficient labels) • Italian (sufficient labels) • Arabic (few labels) • Target: assign labels to the Arabic documents • One approach: find a common feature space for the three domains • Shared labels (sports, military) • No shared documents (no instance correspondence) • No word translations are available
Problem Introduction • [Figure: English docs, Italian docs, and Arabic docs. Shared label set: {sports, military}. No corresponding instances or words.] • Question: Can we construct a common feature space so that we can use the English docs and Italian docs to help classify the Arabic docs?
Problem Introduction • [Figure: English docs, Italian docs, and Arabic docs mapped into a common feature space]
Problem Introduction • Given: K input datasets in different domains, with different features, all sharing the same label set • The source domains have sufficient labeled instances • The target domain has few labeled instances • Question: Can we construct a common feature space to which all instances in the different domains can be mapped, so that the learning task can be performed there?
Problem Definition • [Figure: Source 1, ..., Source k and Target are mapped into a common feature space, where learning is performed] • Quantities defined on the slide: # instances (domain i), # features (domain i), dimension of the common feature space
Problem Definition • Input: K datasets from different domains • Each dataset k is a collection of instances • Instance i in dataset k is defined by the features of domain k • Goal: construct a common feature space of a chosen dimension for learning • Output: K mapping functions, one per domain • Each mapping function is a matrix
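The input and output described in this problem definition can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the names `apply_mapping`, `datasets`, `mappings`, and `common` are hypothetical, the numbers are made up, and each mapping function is assumed to be a linear map stored as a (# features in the domain) x (dimension of the common space) matrix.

```python
# Hypothetical sketch: K domains with different feature counts are mapped
# into one common space by a per-domain linear map (a matrix).

def apply_mapping(mapping, instance):
    """Project one instance (length p) with a p x d matrix; returns length d."""
    p = len(mapping)
    if len(instance) != p:
        raise ValueError("instance length must match the number of matrix rows")
    d = len(mapping[0])
    return [sum(mapping[r][c] * instance[r] for r in range(p)) for c in range(d)]

# Toy data (made up): domain 0 has 3 features, domain 1 has 2 features.
datasets = [
    [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]],    # 2 instances, 3 features each
    [[2.0, 1.0], [0.0, 3.0], [1.0, 1.0]],  # 3 instances, 2 features each
]

# One mapping matrix per domain; the common dimension is assumed to be 2.
mappings = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # 3 x 2
    [[1.0, 1.0], [0.0, 2.0]],              # 2 x 2
]

# After mapping, every instance from every domain has the same length.
common = [
    [apply_mapping(f_k, x) for x in X_k]
    for f_k, X_k in zip(mappings, datasets)
]
```

Once all instances live in the same space, a single classifier can be trained on the mapped source instances and applied to the mapped target instances.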
Who may benefit? • Search engines • Classify docs, rank docs, find doc topics • Businesses • Customer clustering • Biologists • Protein matching
Challenges • The target domain has few labels • No instance correspondence • The source and target domains have different feature spaces
Previous work • Most work assumes that the source domain and the target domain have the same features • Manifold regularization • Does not leverage source domain information • Transfer learning based on manifold alignment: uses both labeled and unlabeled instances to learn the mapping • Requires a small amount of instance correspondence
Contribution • Transfer learning perspective • Works across different feature spaces • Copes with multiple input domains • Can be combined with existing domain adaptation methods • Manifold alignment perspective • Needs no instance correspondence • Uses labels to learn the alignment
A glance at methods • Find a set of mapping functions • Each mapping function is a matrix • Three criteria • Instances from the same class (across domains) are mapped to similar locations • Instances from different classes (across domains) are mapped to separate locations • The topology of each original domain is preserved
A glance at methods • [Figure: English docs, Italian docs, and Arabic docs mapped into the common feature space, with numbered instances. Annotations: Minimise distance(1,5)=0; Maximise distance(1,6)=; Minimise distance(10,11)=1]
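A glance at methods • Encode the three criteria in a cost function • Minimize a cost built from three terms • A term that grows with the distance between the mapped instances, for any pair with the same label • A term that rewards a large distance between the mapped instances, for any pair with different labels • The distance between the mapped instances multiplied by the similarity of the pair, for any pair in one original domain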
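The three criteria can be made concrete with a small numeric sketch. Everything specific here is an assumption for illustration: the function names `sq_dist`, `gaussian_similarity`, and `alignment_cost` are hypothetical, squared Euclidean distance and a Gaussian similarity are assumed choices, and the slides do not state how the three terms are combined, so the ratio `(same + mu * topo) / diff` is just one possible combination.

```python
import math

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def gaussian_similarity(x, y):
    """Assumed similarity between two instances of the same original domain."""
    return math.exp(-sq_dist(x, y))

def alignment_cost(mapped, originals, labels, domains, mu=1.0):
    """Return (same, diff, topo, cost) for instances already mapped.

    same: summed distance over labeled pairs with the same label
    diff: summed distance over labeled pairs with different labels
    topo: distance times original-domain similarity, for same-domain pairs
    cost: (same + mu * topo) / diff, an assumed way to combine the terms
    """
    same = diff = topo = 0.0
    n = len(mapped)
    for i in range(n):
        for j in range(i + 1, n):
            d = sq_dist(mapped[i], mapped[j])
            if labels[i] is not None and labels[j] is not None:
                if labels[i] == labels[j]:
                    same += d
                else:
                    diff += d
            if domains[i] == domains[j]:
                topo += d * gaussian_similarity(originals[i], originals[j])
    if diff == 0.0:
        raise ValueError("need at least one separated pair with different labels")
    return same, diff, topo, (same + mu * topo) / diff

# Toy setup (made up): two domains, two instances each, two labels.
originals = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
labels = ["sports", "military", "sports", "military"]
domains = [0, 0, 1, 1]

# Mapping A puts same-label instances together; mapping B groups by domain.
mapped_a = [[0.0, 0.0], [4.0, 0.0], [0.0, 0.0], [4.0, 0.0]]
mapped_b = [[0.0, 0.0], [0.0, 0.0], [4.0, 0.0], [4.0, 0.0]]

cost_a = alignment_cost(mapped_a, originals, labels, domains)
cost_b = alignment_cost(mapped_b, originals, labels, domains)
```

In this toy case the mapping that places same-label instances together across domains gets the lower cost, which is the behaviour the three criteria ask for.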