Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. John Blitzer, Mark Dredze and Fernando Pereira, University of Pennsylvania, ACL '07.
Research Purposes • How to adapt classifiers across domains? • Books, DVDs, electronics, and kitchen appliances • Method: SCL • How to select domains to annotate that would be good proxies for many other domains? • Method: A-distance
Domain adaptation: SCL • SCL: structural correspondence learning • Two types of words: • Pivot features: words such as "excellent" and "awful" that appear in both domains • Domain-specific features: new words, e.g. "good-quality reception" in a cell phone review, "fast dual-core" in a computer review
SCL & SCL-MI • Select pivot features • SCL: select the m pivot features that occur most frequently in both domains (frequency criterion). • Frequency works well for POS tagging, where frequent features are very often function words, but it is less suitable for sentiment classification. • SCL-MI: among features that occur frequently in both domains, choose those with the highest mutual information with the source label (positive or negative).
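The two selection criteria above can be contrasted in a minimal sketch. It assumes binary document-by-feature matrices; the function names, the `min_docs` threshold argument, and the use of the smaller of the two document counts as the "frequency in both domains" score are all illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def mutual_information(feature_present, labels):
    """Mutual information (in nats) between a binary feature and a binary label."""
    mi = 0.0
    for f in (0, 1):
        for y in (0, 1):
            p_fy = np.mean((feature_present == f) & (labels == y))
            p_f = np.mean(feature_present == f)
            p_y = np.mean(labels == y)
            if p_fy > 0:
                mi += p_fy * np.log(p_fy / (p_f * p_y))
    return mi

def select_pivots(X_src, y_src, X_tgt, m, min_docs=5, use_mi=True):
    """Return indices of m pivot features.

    X_src, X_tgt: binary document-by-feature arrays (source / target domain).
    y_src: binary source labels (1 = positive, 0 = negative).
    """
    src_df = X_src.sum(axis=0)  # document frequency in the source domain
    tgt_df = X_tgt.sum(axis=0)  # document frequency in the target domain
    # Candidates must occur in more than min_docs documents in each domain.
    candidates = np.where((src_df > min_docs) & (tgt_df > min_docs))[0]
    if use_mi:
        # SCL-MI style: rank by mutual information with the source label.
        scores = np.array([mutual_information(X_src[:, j], y_src) for j in candidates])
    else:
        # SCL style (assumed scoring): rank by the smaller of the two counts.
        scores = np.minimum(src_df, tgt_df)[candidates].astype(float)
    order = np.argsort(-scores, kind="stable")
    return candidates[order[:m]]
```

A feature that appears in every document scores highest on frequency but has zero mutual information with the label, which is why the two criteria can pick different pivots.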
SCL & SCL-MI (exclusive pivots) • Top pivots selected by SCL but not SCL-MI (left), and vice versa (right) [table not preserved] • Observe feature vector x. • Weights: k pivots, d features • Apply the projection • Learn the predictor
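The steps listed above (weights, projection, predictor) can be sketched as follows. This is only an illustration under stated assumptions: the pivot predictors are fit with ridge regression for brevity rather than with whatever loss the authors used, and the names `scl_projection`, `augment`, `h` (number of retained singular vectors) and `ridge` are hypothetical.

```python
import numpy as np

def scl_projection(X_unlabeled, pivot_idx, h, ridge=1.0):
    """Return a projection matrix theta of shape (h, d).

    X_unlabeled: n x d array of unlabeled data pooled from both domains.
    pivot_idx:   indices of the k pivot features.
    """
    X = X_unlabeled.astype(float)
    d = X.shape[1]
    X_masked = X.copy()
    X_masked[:, pivot_idx] = 0.0  # hide the pivots from their own predictors
    # One target column per pivot: +1 if the pivot occurs in the document, else -1.
    targets = 2.0 * (X[:, pivot_idx] > 0) - 1.0
    # Assumed stand-in for the pivot predictors: ridge regression, W is d x k.
    A = X_masked.T @ X_masked + ridge * np.eye(d)
    W = np.linalg.solve(A, X_masked.T @ targets)
    # Keep the top h left singular vectors of W as the shared low-dimensional space.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T

def augment(X, theta):
    """Append the projected features theta x to each original feature vector x."""
    return np.hstack([X, X @ theta.T])
```

The final predictor is then trained on the augmented source vectors and applied to augmented target vectors; `h` must not exceed the smaller of d and k.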
Dataset • Amazon product reviews: books, DVDs, electronics and kitchen appliances. • Rating (0-5): 0-2 negative, 4-5 positive, 3 dropped. • Balanced composition (labeled): 1,000 positive and 1,000 negative examples for each domain. • Unlabeled: 3,685 DVDs and 5,945 kitchen instances.
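The labeling rule above is simple enough to write down directly. The function names and the `(text, stars)` input format are assumptions made for illustration.

```python
def rating_to_label(stars):
    """Map a star rating to 1 (positive), 0 (negative), or None (dropped)."""
    if stars < 3:
        return 0
    if stars > 3:
        return 1
    return None  # 3-star reviews are ambiguous and are discarded

def build_labeled_set(reviews):
    """reviews: iterable of (text, stars) pairs; returns (text, label) pairs."""
    labeled = []
    for text, stars in reviews:
        label = rating_to_label(stars)
        if label is not None:
            labeled.append((text, label))
    return labeled
```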
Baseline & experiment settings • Linear predictors on unigram and bigram features for classification. • Trained to minimize the Huber loss with stochastic gradient descent. • For SCL & SCL-MI: • Pivots must occur in more than five documents in each domain.
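A linear predictor trained by stochastic gradient descent can be sketched as below. The sketch assumes that the loss in question is the modified Huber loss commonly used for classification; the learning rate, number of epochs, and L2 penalty are illustrative values, not settings from the slides.

```python
import numpy as np

def modified_huber_loss(margin):
    """Modified Huber loss as a function of the margin y * (w . x)."""
    margin = np.asarray(margin, dtype=float)
    quadratic = np.maximum(0.0, 1.0 - margin) ** 2
    linear = -4.0 * margin
    return np.where(margin >= -1.0, quadratic, linear)

def modified_huber_grad(margin):
    """Derivative of the loss with respect to the margin."""
    if margin >= 1.0:
        return 0.0
    if margin >= -1.0:
        return -2.0 * (1.0 - margin)
    return -4.0

def train_sgd(X, y, epochs=20, lr=0.1, l2=1e-4, seed=0):
    """X: n x d array, y: labels in {-1, +1}. Returns the weight vector."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            margin = y[i] * (X[i] @ w)
            # Chain rule: d(loss)/dw = d(loss)/d(margin) * y * x, plus L2 penalty.
            grad = modified_huber_grad(margin) * y[i] * X[i] + l2 * w
            w -= lr * grad
    return w
```

In the setting described here, each row of `X` would hold unigram and bigram indicator features for one review.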
Experiments • Labeled set: 1,600 training instances and 400 test instances. • Baseline: linear classifier trained without adaptation • Upper bound: in-domain test (trained and tested on the same domain) • Example: baseline 72.8%, SCL-MI adaptation 79.7%, in-domain 80.4%. Adaptation loss is 80.4 - 72.8 = 7.6 points for the baseline and 80.4 - 79.7 = 0.7 points for SCL-MI, so the relative reduction in error due to adaptation for SCL-MI is (7.6 - 0.7) / 7.6 = 90.8%.
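The arithmetic in the example above can be checked with a few lines. The function names are illustrative only.

```python
def adaptation_loss(in_domain_acc, cross_domain_acc):
    """Accuracy lost (in percentage points) by training on a different domain."""
    return in_domain_acc - cross_domain_acc

def relative_reduction(baseline_loss, adapted_loss):
    """Percentage of the baseline adaptation loss removed by adaptation."""
    return 100.0 * (baseline_loss - adapted_loss) / baseline_loss

base = adaptation_loss(80.4, 72.8)    # baseline without adaptation
scl_mi = adaptation_loss(80.4, 79.7)  # after SCL-MI adaptation
print(round(base, 1), round(scl_mi, 1), round(relative_reduction(base, scl_mi), 1))
# prints 7.6 0.7 90.8
```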
Correcting Misalignments • Supervised training objective • v_s: source model weight vector • 50 labeled target-domain instances (few enough for a single engineer to label with minimal effort)
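The slide's equation is not preserved here, so the following is a sketch under an assumption about its form: a per-instance loss on the score w.x + v.(theta x), an L2 penalty on w, and a penalty that keeps v close to the source weights v_s. The names `lam`, `mu`, and `loss` are hypothetical.

```python
import numpy as np

def corrected_objective(w, v, X, y, theta, v_s, lam, mu, loss):
    """Assumed objective for correcting misalignments with a few target labels.

    w:     weights on the original features (length d)
    v:     weights on the projected features (length h)
    X, y:  small labeled target-domain sample (n x d array, n labels)
    theta: h x d projection learned by SCL
    v_s:   weights on the projected features learned on the source domain
    loss:  callable loss(score, label) returning a float
    """
    scores = X @ w + (X @ theta.T) @ v
    data_term = sum(loss(s, t) for s, t in zip(scores, y))
    # lam shrinks w toward zero; mu keeps v near the source solution v_s.
    return data_term + lam * (w @ w) + mu * ((v - v_s) @ (v - v_s))
```

With only about 50 target labels, the second penalty lets the source model act as a prior instead of being relearned from scratch.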
Experiment results: loss • Results are shown only for adaptation from the two domains on which SCL-MI performed worst relative to the supervised baseline.
Measuring Adaptability • The A-distance: two domains can differ in arbitrary ways, but we are only interested in the differences that affect classification accuracy. (A: the family of sets on which a linear classifier returns a positive value)
Use the Huber loss as a proxy for the A-distance. • Given two domains, compute the SCL representation, label each instance with its domain, and train a linear classifier to tell the domains apart. • Compute the empirical average per-instance Huber loss, then calculate 100*(1-loss). Refer to this as the proxy A-distance.
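The procedure above can be sketched as follows. The sketch assumes the modified Huber loss, a half-and-half train/held-out split, and a constant bias feature; these choices and all names and hyperparameters are illustrative, not taken from the slides.

```python
import numpy as np

def modified_huber_loss(margin):
    """Modified Huber loss as a function of the margin y * (w . x)."""
    margin = np.asarray(margin, dtype=float)
    return np.where(margin >= -1.0, np.maximum(0.0, 1.0 - margin) ** 2, -4.0 * margin)

def proxy_a_distance(Z_a, Z_b, epochs=30, lr=0.01, seed=0):
    """Z_a, Z_b: arrays of (SCL-projected) instances from two domains."""
    rng = np.random.default_rng(seed)
    X = np.vstack([Z_a, Z_b])
    X = np.hstack([X, np.ones((len(X), 1))])  # constant bias feature
    y = np.concatenate([-np.ones(len(Z_a)), np.ones(len(Z_b))])  # domain labels
    order = rng.permutation(len(y))
    half = len(y) // 2
    train, held_out = order[:half], order[half:]
    # Train a linear domain classifier by SGD on the modified Huber loss.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(train):
            margin = y[i] * (X[i] @ w)
            if margin >= 1.0:
                continue  # zero gradient in the flat region
            slope = -2.0 * (1.0 - margin) if margin >= -1.0 else -4.0
            w -= lr * slope * y[i] * X[i]
    # Average per-instance loss on held-out data, rescaled as on the slide.
    avg_loss = float(np.mean(modified_huber_loss(y[held_out] * (X[held_out] @ w))))
    return 100.0 * (1.0 - avg_loss)
```

Domains that are easy to tell apart give a low loss and therefore a high value; because this loss is not capped at 1, very similar domains can produce values at or below zero.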
Proxy A-distance & adaptation loss • When choosing domains to annotate, select books or DVDs, but not both.
Conclusion and future work • Domain adaptation is useful in sentiment classification: SCL is improved by selecting pivots with MI, and misalignments are corrected with a small amount of labeled target-domain data. • Select which domains to label using the A-distance. • Future work: addressing the ranking problem.