The START Information Access System

Boris Katz, http://www.ai.mit.edu/projects/infolab/


Presentation Transcript


  1. The START Information Access System. Boris Katz, http://www.ai.mit.edu/projects/infolab/

  2. The Problem:
     • Finding information online
     • Two approaches:
       1. Keyword search (search engines, e.g., AltaVista)
       2. Natural language processing

  3. What’s Wrong with Keyword Search?

  4. What’s Right About Natural Language Processing?

  5. What’s Wrong with Natural Language Processing (today)?
     1. Too hard
        • Full-text NL understanding still beyond reach
        • Intersentential reference
        • Paraphrasing
        • Summarization
        • Common sense implication
     2. Too slow
     3. Not all information is language
        • Most Web resources are not textual
        • Maps and images
        • Sound and video
        • Multimedia
        • Web resources are distributed across numerous non-traditional databases

  6. What is START?
     START (SynTactic Analysis using Reversible Transformations) provides multimedia information access using natural language.
     • Natural language
       • Natural language is human language. You don’t have to learn a special language to use START. Ask your questions in English; enter information using English.
     • Multimedia access using natural language annotations
       • START lets you use English to access any kind of information: text, pictures, movies, and more.
     • “Just the right information”
       • START gives you the answer you want without including a thousand others.
     • Virtual collaboration
       • START retrieves information from its own knowledge base and from databases all over the Web.

  7. Natural Language
     • Natural language is human language. You don’t have to learn a special language to use START. Ask your questions in English; enter information using English.

  8. Multimedia Access Using Natural Language Annotations
     • START lets you use English to access any kind of information: text, pictures, movies, and more.

  9. Just the Right Information
     • START gives you the answer you want without including a thousand other answers.

  10. Virtual Collaboration
     • START retrieves information from its own knowledge base and from databases all over the Web.

  11. Natural Language Annotations
     • Bridge the gap between our ability to analyze natural language sentences and other information and our desire to access the huge amount of data now available on the Web.
     • Annotations are collections of natural language sentences and phrases that describe the content of various information segments.
     • START
       • analyzes these annotations
       • creates the necessary representational structures
       • produces special pointers to the information segments summarized by the annotations.

  12. Natural Language Annotations
     [Diagram: an Information Provider supplies a document together with the annotation “Neptune was discovered using mathematics.” to START servers; an Information Seeker submits the question “How was Neptune discovered?” and the annotated document is retrieved. Arrow labels: negotiation, submitted, retrieved.]
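The annotation idea on slides 11 and 12 can be illustrated with a small Python sketch. This is not START's implementation: the two regular expressions, the `Annotation` class, the `annotate` and `answer` functions, and the pointer `docs/neptune.html` are all hypothetical stand-ins for START's syntactic analysis, chosen only so that the Neptune example from the slide runs end to end.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy patterns. START uses a real parser; these two regular
# expressions only cover the sentence shapes of the Neptune example.
ANNOTATION_PATTERN = re.compile(r"^(\w+) was (\w+) using (\w+)\.?$")
QUESTION_PATTERN = re.compile(r"^How was (\w+) (\w+)\?$")


@dataclass(frozen=True)
class Annotation:
    subject: str
    verb: str
    means: str
    pointer: str  # location of the information segment the annotation describes


def annotate(sentence: str, pointer: str) -> Annotation:
    """Analyze an annotation sentence and attach a pointer to its document."""
    m = ANNOTATION_PATTERN.match(sentence)
    if m is None:
        raise ValueError(f"unsupported annotation: {sentence!r}")
    subject, verb, means = (g.lower() for g in m.groups())
    return Annotation(subject, verb, means, pointer)


def answer(question: str, index: List[Annotation]) -> Optional[str]:
    """Return the pointer of the first annotation that matches the question."""
    m = QUESTION_PATTERN.match(question)
    if m is None:
        return None
    subject, verb = (g.lower() for g in m.groups())
    for ann in index:
        if ann.subject == subject and ann.verb == verb:
            return ann.pointer
    return None


# The pointer below is a made-up placeholder path.
index = [annotate("Neptune was discovered using mathematics.", "docs/neptune.html")]
print(answer("How was Neptune discovered?", index))
```

The question never has to share wording with the document body; it only has to match the analyzed annotation, which is why the same mechanism can point at pictures or video as easily as at text.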

  13. Uniform Access
     [Diagram labels: NL questions, IMDb, Queries, U.S. Census, START, Omnibase, Fortune500 Data, Multimedia responses, POTUS, HPKB]
     • Local knowledge base of ternary expressions
     • Core vocabulary
     • Uniform interface to multiple database formats (Web, text, etc.)
     • Extended lexicon
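A uniform interface over differently shaped data sources might look like the following Python sketch. Everything here is an assumption made for illustration: the `Source` protocol, the two adapter classes, the `UniformAccess` dispatcher, and the placeholder entities and values are invented and do not describe how Omnibase is actually built.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class Source(Protocol):
    """Hypothetical common interface: look up one attribute of one entity."""

    def get(self, entity: str, attribute: str) -> Optional[str]: ...


class NestedDictSource:
    """Adapter for data stored as {entity: {attribute: value}}."""

    def __init__(self, data: Dict[str, Dict[str, str]]) -> None:
        self._data = data

    def get(self, entity: str, attribute: str) -> Optional[str]:
        return self._data.get(entity, {}).get(attribute)


class RowListSource:
    """Adapter for data stored as (entity, attribute, value) rows."""

    def __init__(self, rows: List[Tuple[str, str, str]]) -> None:
        self._rows = rows

    def get(self, entity: str, attribute: str) -> Optional[str]:
        for name, attr, value in self._rows:
            if name == entity and attr == attribute:
                return value
        return None


class UniformAccess:
    """Routes every query through the same call, whatever the source format."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def query(self, source_name: str, entity: str, attribute: str) -> Optional[str]:
        source = self._sources.get(source_name)
        return None if source is None else source.get(entity, attribute)


# Placeholder data only; no real database content is reproduced here.
access = UniformAccess()
access.register("source-a", NestedDictSource({"entity-a": {"attr-x": "value-1"}}))
access.register("source-b", RowListSource([("entity-b", "attr-y", "value-2")]))
print(access.query("source-a", "entity-a", "attr-x"))
print(access.query("source-b", "entity-b", "attr-y"))
```

The caller sees one `query` signature; supporting another database format means writing one more adapter rather than changing the question-answering side.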

  14. How START Works
     [Diagram labels: Omnibase (external knowledge), Scripts, Potus, IMDb, U.S. Census, World Factbook, WWW, Web browser, START, HTML, English, Parser, Generator, Input T-exps, Matcher, Annotations, Native knowledge, T-exps from KB, Database of T-exps]
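The labels on this slide suggest a flow from a parser that turns English into T-exps (ternary expressions), through a matcher that compares them with T-exps from the knowledge base, to a generator that turns the result back into English. The Python sketch below is a toy reading of that flow, not START's code: the `parse`, `match`, `generate`, and `ask` functions, the three-word sentence patterns, and the use of `None` as the unknown slot are all assumptions.

```python
import re
from typing import List, Optional, Tuple

# A T-exp is modelled here as (subject, relation, object).
# None marks the slot a question is asking about (an assumed convention).
TExp = Tuple[Optional[str], str, str]

# Toy grammar: three-word statements and "What/Who <relation> <object>?" questions.
STATEMENT = re.compile(r"^(\w+) (\w+) (\w+)\.$")
QUESTION = re.compile(r"^(?:What|Who) (\w+) (\w+)\?$")


def parse(text: str) -> TExp:
    """Parser: English in, one T-exp out."""
    m = QUESTION.match(text)
    if m:
        return (None, m.group(1).lower(), m.group(2).lower())
    m = STATEMENT.match(text)
    if m:
        subject, relation, obj = m.groups()
        return (subject, relation.lower(), obj.lower())
    raise ValueError(f"unsupported sentence: {text!r}")


def match(query: TExp, kb: List[TExp]) -> Optional[TExp]:
    """Matcher: find the first stored T-exp that agrees with every known slot."""
    for fact in kb:
        if all(q is None or q.lower() == str(f).lower() for q, f in zip(query, fact)):
            return fact
    return None


def generate(fact: TExp) -> str:
    """Generator: T-exp in, English out."""
    return f"{fact[0]} {fact[1]} {fact[2]}."


def ask(question: str, kb: List[TExp]) -> str:
    fact = match(parse(question), kb)
    return generate(fact) if fact is not None else "I don't know."


# Knowledge is entered in English and stored as T-exps.
kb = [parse("START retrieves information.")]
print(ask("What retrieves information?", kb))  # START retrieves information.
```

Because the same T-exp form is used for entering knowledge and for asking questions, one matcher serves both directions, which is roughly what the slide's Parser and Generator boxes around a shared database of T-exps convey.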
