
Natural Language Processing at NYU: the Proteus Project

Learn how to extract vital information from vast text data using NLP techniques at NYU's Proteus Project. Discover methods to understand and process language for accurate insights. Explore weakly supervised learning and active learning for robust knowledge discovery. Join the course for in-depth understanding.


Presentation Transcript


  1. NYU Natural Language Processing at NYU: the Proteus Project Ralph Grishman September 2009

  2. Proteus Project Faculty • Ralph Grishman • Satoshi Sekine • Adam Meyers http://nlp.cs.nyu.edu/

  3. ‘Just the Facts’ • Vast amount of information is now available on-line in text form • but getting ‘the facts’ can be very hard and slow • Where has Secretary Clinton been over the last month? • Which places on the East Coast have had swine flu outbreaks this month? • To move from search to question answering we need more than a bag of words • we need to figure out who-did-what-to-whom
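
A small illustration of why a bag of words is not enough, assuming two made-up sentences and hand-written who-did-what-to-whom triples (none of this comes from the slides; a real system would get the triples from a parser):

```python
from collections import Counter


def bag_of_words(sentence):
    """Lowercase, drop the final period, and count the words."""
    return Counter(sentence.lower().rstrip(".").split())


# Two invented sentences that use exactly the same words.
a = "The rebels attacked the army."
b = "The army attacked the rebels."

# As bags of words the two sentences are indistinguishable.
same_bag = bag_of_words(a) == bag_of_words(b)

# Hand-written (subject, verb, object) triples; these are assumptions
# standing in for the output of a syntactic analysis.
triple_a = ("rebels", "attacked", "army")
triple_b = ("army", "attacked", "rebels")

print(same_bag)              # the bags match
print(triple_a == triple_b)  # the who-did-what-to-whom structure does not
```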

  4. Understanding natural language isn’t easy • The rebels strafed the car … with automatic weapons fire. … with the Minister and his deputy. • They … died instantly. … were promptly arrested. Understanding language requires a lot of knowledge.

  5. How to get all this knowledge? • By hand … too expensive • Use weakly supervised learning • Give a few examples (‘seeds’) • Use very large text corpus to learn similar examples

  6. Knowledge Discovery: An Example • Goal: want to keep track of all the hirings and departures of executives; need to find all the ways such events are described • Method: • identify a few seed patterns • retrieve documents containing patterns • find a subject-verb-object pattern with high frequency in retrieved documents and relatively high frequency in retrieved docs vs. other docs • add pattern to seed set and repeat

  7. #1: pick seed pattern Seed: < person retires >

  8. #2: retrieve relevant documents Seed: < person retires > Fred retired. ... Harry was named president. Maki retired. ... Yuki was named president. Relevant documents Other documents

  9. #3: pick new pattern Seed: < person retires > < person was named president > appears in several relevant documents Fred retired. ... Harry was named president. Maki retired. ... Yuki was named president.

  10. #4: add new pattern to pattern set Pattern set: < person retires > < person was named president >
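
Steps #1 to #4 can be sketched as a short loop. This is a minimal illustration, not the Proteus implementation: the toy corpus, the function name `bootstrap`, the `min_count` threshold, and the scoring ratio are all assumptions made here, and each document is already reduced to a hand-written set of patterns where a real system would extract subject-verb-object patterns with a parser.

```python
from collections import Counter

# Toy corpus (invented): each document is the set of patterns found in it.
DOCS = [
    {"person retires", "person was named president"},
    {"person retires", "person was named president"},
    {"person retires", "person resigns"},
    {"person was named president", "person resigns"},
    {"company reports earnings"},
    {"company reports earnings", "person visits city"},
    {"person visits city"},
]


def bootstrap(docs, seeds, iterations=5, min_count=2):
    """Grow a pattern set from seeds by repeated retrieval and selection."""
    patterns = set(seeds)
    for _ in range(iterations):
        # Step 2: a document is relevant if it contains any known pattern.
        relevant = [d for d in docs if d & patterns]
        other = [d for d in docs if not (d & patterns)]

        # Count candidate patterns in relevant documents and in the rest.
        in_relevant = Counter(p for d in relevant for p in d if p not in patterns)
        in_other = Counter(p for d in other for p in d)

        # Step 3: keep candidates that are frequent in relevant documents,
        # ranked by the share of their occurrences that are in relevant ones.
        # This ratio is a simple stand-in for the two criteria on the slide.
        candidates = []
        for pattern, count in in_relevant.items():
            if count >= min_count:
                ratio = count / (count + in_other[pattern])
                candidates.append((ratio, count, pattern))
        if not candidates:
            break  # nothing new qualifies, so stop early

        # Step 4: add the best candidate to the pattern set and repeat.
        patterns.add(max(candidates)[2])
    return patterns


print(sorted(bootstrap(DOCS, {"person retires"})))
```

On this toy corpus the first iteration adds < person was named president >, as in the slides. The enlarged pattern set then retrieves one more document, which is enough for < person resigns > to qualify on the second iteration.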

  11. Results • For some event types, unsupervised learning can do as well as manual pattern development • [Figure: recall and precision as a function of the number of iterations of the learner]

  12. Robust Learning • Quality of learned patterns is uneven • ambiguity of language leads us to learn incorrect patterns • Need to identify cases of uncertainty • Potential linguistic ambiguities • With multiple classifiers using distinct features, cases where they disagree • Query user for selected uncertain examples • Weakly supervised learning + active learning → robust, rapid knowledge discovery
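
One way to picture the disagreement criterion is the sketch below. It assumes two deliberately simple, hypothetical classifiers, one looking only at the verb and one only at the type of the subject, and invented examples; only the examples on which they disagree would be sent to the user.

```python
def by_trigger_verb(example):
    """Hypothetical classifier 1: decides from the verb alone."""
    return example["verb"] in {"retire", "resign", "name", "hire"}


def by_subject_type(example):
    """Hypothetical classifier 2: decides from the subject's type alone."""
    return example["subject_type"] == "person"


def select_for_review(examples, classifiers):
    """Return the examples on which the classifiers do not all agree."""
    return [
        ex for ex in examples
        if len({clf(ex) for clf in classifiers}) > 1
    ]


# Invented examples with hand-assigned features.
EXAMPLES = [
    {"text": "Fred retired.", "verb": "retire", "subject_type": "person"},
    {"text": "The company retired its debt.", "verb": "retire", "subject_type": "organization"},
    {"text": "Maki visited Tokyo.", "verb": "visit", "subject_type": "person"},
    {"text": "The company reported earnings.", "verb": "report", "subject_type": "organization"},
]

uncertain = select_for_review(EXAMPLES, [by_trigger_verb, by_subject_type])
for ex in uncertain:
    print("ask the user about:", ex["text"])
```

Here the two classifiers agree on the first and last examples, so only the middle two would be queried, which keeps the labeling effort focused on the ambiguous cases.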

  13. For More Information • Project web site nlp.cs.nyu.edu • Course G22.2590 - Natural Language Processing (Spring 2010)
