The START Information Access System

The START Information Access System Boris Katz http://www.ai.mit.edu/projects/infolab/

The Problem: • Finding information online • Two Approaches: • 1. Keyword search (search engines, e.g., AltaVista) • 2. Natural language processing

What’s Wrong with Keyword Search?

What’s Right About Natural Language Processing?

What’s Wrong with Natural Language Processing (today)? • 1. Too hard • Full-text NL understanding still beyond reach • Intersentential reference • Paraphrasing • Summarization • Common sense implication • 2. Too slow • 3. Not all information is language • Most Web resources are not textual • Maps and Images • Sound and Video • Multimedia • Web resources are distributed across numerous non-traditional databases

What is START? • START (SynTactic Analysis using Reversible Transformations) provides multimedia information access using natural language. • Natural language • Natural language is human language. You don’t have to learn a special language to use START. Ask your questions in English; enter information using English. • Multimedia access using natural language annotations • START lets you use English to access any kind of information: text, pictures, movies, and more. • “Just the right information” • START gives you the answer you want without including a thousand others. • Virtual collaboration • START retrieves information from its own knowledge base and from databases all over the Web.

Natural Language • Natural language is human language. You don’t have to learn a special language to use START. Ask your questions in English; enter information using English.

Multimedia Access Using Natural Language Annotations • START lets you use English to access any kind of information: text, pictures, movies, and more.

Just the Right Information • START gives you the answer you want without including a thousand other answers.

Virtual Collaboration • START retrieves information from its own knowledge base and from databases all over the Web.

Natural Language Annotations • Bridge the gap between our ability to analyze natural language sentences and other information and our desire to access the huge amount of data now available on the Web. • Annotations are collections of natural language sentences and phrases that describe the content of various information segments. • START • analyzes these annotations • creates the necessary representational structures • produces special pointers to the information segments summarized by the annotations.

Natural Language Annotations • [Diagram: an Information Provider's document carries the annotation “Neptune was discovered using mathematics.”; an Information Seeker's question “How was Neptune discovered?” retrieves that document. Labels: Document, Annotation, START Server, Information Provider, Information Seeker, Question, (submitted), (negotiation), (retrieved).]
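The Neptune example can be illustrated with a small Python sketch of annotation-based retrieval. This is not START's implementation: the `TExp` triples are written by hand (a real system would produce them by parsing the annotation and the question), and the names `TExp`, `AnnotationIndex`, `doc:neptune`, and `doc:pluto` are hypothetical. It only shows the idea of storing a pointer to an information segment under the structures derived from its annotation, then matching a question against those structures.

```python
from typing import NamedTuple, Optional


class TExp(NamedTuple):
    """Hypothetical ternary expression; None acts as a wildcard in patterns."""
    subject: Optional[str]
    relation: Optional[str]
    obj: Optional[str]


def matches(pattern: TExp, fact: TExp) -> bool:
    # A pattern slot matches if it is a wildcard or equals the fact's slot.
    return all(p is None or p == f for p, f in zip(pattern, fact))


class AnnotationIndex:
    """Maps pointers (here, plain strings) to the T-exps of their annotations."""

    def __init__(self) -> None:
        self._index: dict[str, list[TExp]] = {}

    def add(self, pointer: str, texps: list[TExp]) -> None:
        self._index.setdefault(pointer, []).extend(texps)

    def retrieve(self, patterns: list[TExp]) -> list[str]:
        # A segment is returned only if every question pattern is matched
        # by at least one T-exp from its annotation.
        return [
            pointer
            for pointer, facts in self._index.items()
            if all(any(matches(p, f) for f in facts) for p in patterns)
        ]


index = AnnotationIndex()

# Hand-encoded stand-in for "Neptune was discovered using mathematics."
index.add("doc:neptune", [
    TExp("somebody", "discover", "Neptune"),
    TExp("discover", "using", "mathematics"),
])
# A second, illustrative segment that should not match the question.
index.add("doc:pluto", [TExp("Tombaugh", "discover", "Pluto")])

# Hand-encoded stand-in for "How was Neptune discovered?"
question = [
    TExp(None, "discover", "Neptune"),
    TExp("discover", None, None),
]

print(index.retrieve(question))
```

The seeker never sees the annotation itself; the match yields a pointer, and the segment behind it (text, image, or other media) is what gets returned.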

Uniform Access • [Diagram labels: NL questions, Multimedia responses, START, Queries, Data, Omnibase, IMDb, U.S. Census, Fortune500, POTUS, HPKB] • Local knowledge base of ternary expressions • Core vocabulary • Uniform interface to multiple database formats (Web, text, etc.) • Extended lexicon
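The "uniform interface to multiple database formats" bullet can be sketched in Python as follows. The slides do not describe Omnibase's actual query model, so everything here is an assumption: the (object, property) lookup shape, the names `UniformInterface`, `dict_source`, and `csv_source`, and the placeholder records are all hypothetical. The point is only that sources stored in different formats can sit behind one call signature.

```python
import csv
import io
from typing import Callable, Dict, Optional

# Every source, whatever its format, is wrapped as lookup(obj, prop) -> value.
Lookup = Callable[[str, str], Optional[str]]


def dict_source(records: Dict[str, Dict[str, str]]) -> Lookup:
    """Wrap an in-memory dictionary of records."""
    def lookup(obj: str, prop: str) -> Optional[str]:
        return records.get(obj, {}).get(prop)
    return lookup


def csv_source(text: str, key_column: str) -> Lookup:
    """Wrap CSV text, keyed by the given column."""
    rows = {row[key_column]: row for row in csv.DictReader(io.StringIO(text))}

    def lookup(obj: str, prop: str) -> Optional[str]:
        return rows.get(obj, {}).get(prop)
    return lookup


class UniformInterface:
    """One query method in front of many differently formatted sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, Lookup] = {}

    def register(self, name: str, lookup: Lookup) -> None:
        self._sources[name] = lookup

    def query(self, name: str, obj: str, prop: str) -> Optional[str]:
        if name not in self._sources:
            raise KeyError(f"unknown source: {name}")
        return self._sources[name](obj, prop)


# Placeholder data only; not taken from any real database.
omni = UniformInterface()
omni.register("films", dict_source({"Example Film": {"director": "Example Director"}}))
omni.register("places", csv_source("name,capital\nExampleland,Example City\n", "name"))

print(omni.query("films", "Example Film", "director"))
print(omni.query("places", "Exampleland", "capital"))
```

With this shape, adding a new source means writing one wrapper function; the caller's query code stays the same.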

How START Works • [Diagram labels: Web browser, WWW, HTML, English, START, Parser, Generator, Input T-exps, Matcher, Annotations, Native knowledge, Database of T-exps, T-exps from KB, Omnibase (external knowledge), Scripts, Potus, IMDb, U.S. Census, World Factbook]
