
CS 479, Section 1: Natural Language Processing


Presentation Transcript


  1. This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.
  CS 479, Section 1: Natural Language Processing
  Lecture #17: Text Classification; Naïve Bayes
  Thanks to Dan Klein of UC Berkeley for many of the materials used in this lecture.

  2. Announcements
  • Reading Report #7
    - M&S 7
    - Due: today
  • Mid-term Exam
    - Next Thu-Sat
    - Review on Wed
    - Prepare your 5 questions
  • Project #2, Part 1
    - No pair programming; you may collaborate (must acknowledge); do your own work
    - Help session: Tuesday, CS Conference Room, 4pm
    - Early: Monday after the mid-term
    - Due: Wednesday after the mid-term
  • ACM Programming Contest
    - Saturday, Oct. 13
    - http://acm.byu.edu/competition

  3. Objectives
  • Introduce the problem of text classification
  • Introduce the Naïve Bayes model
  • Revisit log-domain arithmetic

  4. Overview
  • So far: n-gram language models
    - Model fluency for noisy-channel processes (ASR, MT, etc.)
    - No representation of language structure or meaning
  • Now: Naïve Bayes models
    - Introduce a single new global variable c (for the class label)
    - Model a (hidden) global property of text (the label)
    - Still a very simplistic model family

  5. Text Classification
  • Goal: classify documents into broad semantic classes (e.g., sports, entertainment, technology, politics, etc.)
  • Which one is the politics document?
  • And how much deep processing did that decision require?
  • Motivates an approach: bag-of-words, Naïve-Bayes models
  • Another approach in an upcoming lecture …
  Example documents:
  1. "Democratic vice presidential candidate John Edwards on Sunday accused President Bush and Vice President Dick Cheney of misleading Americans by implying a link between deposed Iraqi President Saddam Hussein and the Sept. 11, 2001 terrorist attacks."
  2. "While No. 1 Southern California and No. 2 Oklahoma had no problems holding on to the top two spots with lopsided wins, four teams fell out of the rankings — Kansas State and Missouri from the Big 12 and Clemson from the Atlantic Coast Conference and Oregon from the Pac-10."

  6. Naïve-Bayes Models
  • Idea: pick a class, then generate a document using a language model given that class.
  • What are the independence assumptions in this model?

  7. Naïve-Bayes Models
  • Naïve-Bayes assumption: all words are conditionally independent of one another given the class.
  • Compare to a unigram language model (in symbols below).
  • We have to smooth these!
  [Figure: graphical model with class node c over word nodes w1, w2, …, wn, where wn = STOP]
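  In symbols (the slide's equations are not reproduced in the transcript; these are the standard forms being contrasted):

      Naïve Bayes:  P(c, w_1, \dots, w_n) = P(c) \prod_{i=1}^{n} P(w_i \mid c)
      Unigram LM:   P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i)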

  8. Estimating Class Probability with Naïve Bayes
  • For a chosen set of classes, we have a joint model of class label and document.
  • We can easily compute the posterior probability of a class given a document (it's just a conditional query on the model).
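  The transcript omits the slide's formulas; the standard ones are, for a document d = w_1 … w_n:

      Joint model:  P(c, d) = P(c) \prod_{i=1}^{n} P(w_i \mid c)
      Posterior:    P(c \mid d) = P(c, d) / \sum_{c'} P(c', d)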

  9. Classifying with Naïve Bayes
  • Given document d, choose the most probable class:
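  The resulting decision rule (standard Naïve Bayes; the slide's equation is not in the transcript):

      \hat{c} = \arg\max_c P(c \mid d) = \arg\max_c P(c) \prod_{i=1}^{n} P(w_i \mid c)

  The normalizer P(d) is the same for every class, so it drops out of the argmax.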

  10. Log(arithmic) Domain Photo credit: Nathan Davis and Aaron Davis, Spring 2007, Google Campus, Mountain View, CA

  11. Classifying using Log Domain
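  In the log domain, the same rule becomes a sum, which avoids the floating-point underflow caused by multiplying many small probabilities:

      \hat{c} = \arg\max_c \left[ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c) \right]

  A minimal Python sketch, assuming precomputed log_prior[c] and log_likelihood[c][w] tables (hypothetical names; a matching training sketch appears under slide 13):

      def classify(doc_tokens, log_prior, log_likelihood, classes, unk="<UNK>"):
          """Pick the class maximizing log P(c) + sum_i log P(w_i | c)."""
          best_class, best_score = None, float("-inf")
          for c in classes:
              score = log_prior[c]
              for w in doc_tokens:
                  # Words unseen in training fall back to the reserved <UNK> entry.
                  score += log_likelihood[c].get(w, log_likelihood[c][unk])
              if score > best_score:
                  best_class, best_score = c, score
          return best_class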

  12. Practical Matters
  • How easy is Naïve Bayes to train? To test?
  • What should we do with unknown words? (One common option is sketched below.)
  • Can work shockingly well for text classification (esp. in the wild).
  • How about NB for spam detection?
  • Can you use NB for word-sense disambiguation?
  • How can unigram models be so terrible for language modeling, but class-conditional unigram models work for text classification?
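  On the unknown-word question, one common option (an illustration, not necessarily the course's prescribed recipe) is add-one smoothing with an explicit UNK type:

      P(w \mid c) = (count(c, w) + 1) / (count(c) + |V| + 1)

  where V is the training vocabulary, every unseen word maps to UNK, and the extra +1 in the denominator reserves mass for UNK.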

  13. Insight for Project #2.1
  • Think of the local model terms P(w_i | c) as a class-dependent unigram model: one unigram distribution per class, as sketched below.
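  A minimal Python sketch of that idea: estimate one add-one-smoothed unigram distribution per class, plus a class prior. The function name train_nb and the smoothing choice are assumptions for illustration, not the project's required design:

      import math
      from collections import Counter, defaultdict

      def train_nb(labeled_docs):
          """Estimate log P(c) and smoothed log P(w | c) from (tokens, label) pairs."""
          class_counts = Counter()
          word_counts = defaultdict(Counter)
          vocab = set()
          for tokens, label in labeled_docs:
              class_counts[label] += 1
              word_counts[label].update(tokens)
              vocab.update(tokens)
          vocab.add("<UNK>")  # reserve a type for words unseen in training
          total_docs = sum(class_counts.values())
          log_prior = {c: math.log(n / total_docs) for c, n in class_counts.items()}
          log_likelihood = {}
          for c in class_counts:
              denom = sum(word_counts[c].values()) + len(vocab)  # add-one smoothing
              log_likelihood[c] = {w: math.log((word_counts[c][w] + 1) / denom)
                                   for w in vocab}
          return log_prior, log_likelihood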

  14. Proper Name Classification
  • Movie: Beastie Boys: Live in Glasgow
  • Person: Michelle Ford-Eriksson
  • Place: Ramsbury
  • Place: Market Bosworth
  • Drug: Dilotab
  • Drug: Cyanide Antidote Package
  • Person: Bill Johnson
  • Place: Ettalong
  • Movie: The Suicide Club
  • Place: Pézenas
  • Company: AMLI Residential Properties Trust
  • Drug: Diovan
  • Place: Bucknell
  • Movie: Marie, Nonna, la vierge et moi
  • Person: Chevy Chase
  [Figure: graphical model with class node c over character nodes c1, c2, …, cn; character-level evidence, i.e., "features"]
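  The same machinery transfers directly: treat a name as a "document" whose tokens are its characters. A toy usage of the hypothetical sketches above (the tiny training set is made up from the slide's examples):

      names = [("Ramsbury", "Place"), ("Diovan", "Drug"), ("Chevy Chase", "Person")]
      log_prior, log_likelihood = train_nb([(list(n), c) for n, c in names])
      print(classify(list("Bucknell"), log_prior, log_likelihood, log_prior.keys()))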

  15. Next
  • Read the project requirements
  • Start working through the tutorial
  • The mid-term exam covers up through using Naïve Bayes for classification
