1 / 21

Introduction to NLP

Explore key concepts in NLP, such as information extraction, sentiment analysis, and more. Learn about the challenges and tools for tackling ambiguity in natural language understanding.

hurt
Download Presentation

Introduction to NLP

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to NLP What is Natural Language Processing? Many slides reused from Dan Jurafsky/Christopher Manning (Stanford)

  2. Question Answering: IBM’s Watson • Won Jeopardyon February 16, 2011! WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OFWALLACHIA AND MOLDOVIA” INSPIRED THIS AUTHOR’S MOST FAMOUS NOVEL Bram Stoker

  3. Information Extraction Event: Curriculum mtg Date: Aug. 23, 2017 Start: 10:00am End:11:30am Where:JBHT 532 Subject: curriculum meeting Date: Aug. 22, 2017 To: Susan Gauch Hi Susan, we’ve now scheduled the curriculum meeting. It will be in JBHT 532 tomorrow from 10:00-11:30. -Dave Create new Calendar entry

  4. Information Extraction & Sentiment Analysis Attributes: zoom affordability size and weight flash ease of use • nice and compact to carry! • since the camera is small and light, I won't need to carry around those heavy, bulky professional cameras either! • the camera feels flimsy, is plastic and very light in weight you have to be very delicate in the handling of this camera Size and weight ✓ ✓ ✗

  5. Machine Translation • Helping human translators • Fully automatic Enter Source Text:  这 不过 是 一 个 时间 的 问题 . Translation from Stanford’s Phrasal: This is only a matter of time.

  6. Making progress on this problem… • The task is difficult! What tools do we need? • Knowledge about language • Knowledge about the world • A way to combine knowledge sources • How we generally do this: • probabilistic models built from language data • P(“maison”  “house”) high • P(“L’avocatgénéral”  “the general avocado”) low • Luckily, rough text features can often do half the job.

  7. Ambiguity makes NLP hard:“Crash blossoms” Violinist Linked to JAL Crash Blossoms Teacher Strikes Idle Kids Red Tape Holds Up New Bridges Hospitals Are Sued by 7 Foot Doctors Juvenile Court to Try Shooting Defendant Local High School Dropouts Cut in Half 100%REAL

  8. A few of the 83+ parses for The post office will hold out discounts and service concessions as incentives. [Shortened WSJsentence.] • S NP Aux VP The postoffice will V NP holdout NP Conj NP discounts and N N PP service concessions asincentives 8

  9. S NP Aux VP PP The postoffice will V NP asincentives holdout NP Conj NP discounts and serviceconcessions 9

  10. 1 • S NP Aux VP Conj VP The postoffice will VP and V NP V NP PP holdout discounts service concessions asincentives

  11. 1 • S NP Aux VP Conj VP The postoffice will VP and V NP V NP PP holdout discounts service concessions asincentives

  12. S NP Aux VP The postoffice will V PP PP hold P NP asincentives out NP Conj NP discounts and serviceconcessions 14

  13. 1 • S NP VP The post office willhold Conj VP VP NP PP V NP and V out discounts service concessions asincentives

  14. Famous Ambiguity Examples • Time flies like an arrow • Translation English -> Russian -> English: • The spirit is willing but the flesh is weak • Became “The vodka is good but the meat is rotten” • Punctuation matters • Czarina saved a man’s life by changing: • Pardon impossible, to be sent to Siberia • Became “Pardon, impossible to be sent to Siberia”

  15. Where do problems comein? • Syntax • Part of speech ambiguities • Attachment ambiguities • Semantics • Word senseambiguities • (Semantic interpretation and scope ambigui- ties) 15

  16. Why else is natural language understanding difficult? segmentation issues non-standard English idioms Great job @justinbieber! Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥ dark horse get cold feet lose face throw in the towel the New York-New Haven Railroad the New York-New Haven Railroad tricky entity names world knowledge neologisms unfriend Retweet bromance Where is A Bug’s Life playing … Let It Be was recorded … … a mutation on the for gene … Mary and Sue are sisters. Mary and Sue are mothers. But that’s what makes it fun!

  17. Language Technology making good progress Sentiment analysis still really hard mostly solved Best roast chicken in San Francisco! Question answering (QA) The waiter ignored us for 20 minutes. Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness? Coreference resolution Spam detection ✓ Let’s go to Agra! Paraphrase ✗ Carter told Mubarak he shouldn’t run again. Buy V1AGRA … Word sense disambiguation (WSD) XYZ acquired ABC yesterday ABC has been taken over by XYZ Part-of-speech (POS) tagging I need new batteries for my mouse. Summarization ADJ ADJ NOUN VERB ADV Parsing Colorless green ideas sleep furiously. The Dow Jones is up Economy is good The S&P500 jumped I can see Alcatraz from the window! Named entity recognition (NER) Housing prices rose Machine translation (MT) Dialog PERSON ORG LOC 第13届上海国际电影节开幕… Where is Citizen Kane playing in SF? Einstein met with UN officials in Princeton The 13th Shanghai International Film Festival… Castro Theatre at 7:30. Do you want a ticket? Information extraction (IE) PartyMay 27add You’re invited to our dinner party, Friday May 27 at 8:30

  18. Statistical NLPmethods • Involve deriving numerical data from text • Are usually but not always probabilistic • Many techniques areused: • – n-grams, history-based models, decision trees / de- cision lists, memory-based learning, loglinearmod- els, HMMs, neural networks, vector spaces, graphi- calmodels,…

  19. This class • Teaches key theory and methods for statistical NLP: • Probability and information theory classifiers • N-gram language modeling • Statistical Parsing • Inverted index, tf-idf, vector models of meaning • For practical, robust real-world applications • Information extraction • Spelling correction • Information retrieval • Sentiment analysis

  20. Skills you’ll need • Simple linear algebra (vectors, matrices) • Basic probability theory • C++ or Java or Python programming • Knowledge of data structures

  21. Introduction to NLP What is Natural Language Processing?

More Related