LaSIE: The Large Scale Information Extraction System

Presentation Transcript


  1. LaSIE: The Large Scale Information Extraction System
     Robert Gaizauskas
     Natural Language Processing Group
     Department of Computer Science
     University of Sheffield

  2. Outline
     • History: The Evolution of an IE System
     • Applications: Projects using LaSIE
     • Demo
     AKT Workshop

  3. History: The Evolution of an IE System
     • January 1, 1995: LaSIE official “birth date”
     • EPSRC 3 year/3 person grant called Large Scale Information Extraction
       “to develop GATE, an architecture for combining modules to produce a system to extract information from large-scale running text to data templates in a specified domain
       to build and evaluate a particular set of modules and submit it to standard evaluation”
       GATE  LaSIE
     • Prehistory:
       • TIC/POETIC systems at Sussex 1987-1993
       • CRL MUC-5 System at NMSU 1993(?)

  4. History: Chronology of LaSIE
     • September 1995: LaSIE v1.0 participates in MUC-6
       • Four tasks: Named Entity, Coreference, Template Element, Scenario Template (management succession)
       • Scores - P & R: NE: 89 (96); CO: 71/51 (72/59); TE: 70 (80); ST: 49 (56)
     • November 1996: GATE v1.0
       • Contains VIE (Vanilla Information Extraction) system
       • VIE = LaSIE v1.5
       • LaSIE v1.5 has essentially the same functionality as LaSIE v1.0, but is embedded in GATE

  5. History: Chronology of LaSIE (cont)
     • April 1997: LaSIE v2.0 participates in MUC-7
       • Five tasks: NE, CO, TE, ST and Template Relations (TR), new for MUC-7
       • Scores – P & R: NE: 86 (93); CO: 62 (62); TE: 77 (87); TR: 55 (76); ST: 51 (44)
       • Only site to participate in all 5 tasks (“inside GATE”)
     • 1997-2000: LaSIE serves as the basis for a number of IE applications (below)
     • October 2000: LaSIE v2.1 (rolled-up changes since v2.0)

  6. History: LaSIE People
     • Initial RAs: Hamish Cunningham, Kevin Humphreys, Takahiro Wakao
     • Others:
       • Saliha Azzam
       • Mark Hepple
       • Chris Huyck
       • Brian Mitchell
       • Sandy Robertson
       • Pete Rodgers
       • Yorick Wilks

  7. LaSIE: IE Applications
     • PASTA
       • Protein Active Site Template Acquisition
       • BBSRC
     • EMPathIE
       • Enzyme and Metabolic Pathways IE
       • GlaxoWellcome/Elsevier
     • Competitor/Market Intelligence
       • New project launches/person tracking
       • British Gas; Mars Foods
     • STOBS
       • Structured Transcription of Broadcast Speech
       • EPSRC
     • EXALT
       • Extracting Amendments from Legal Text
     • Venns
       • Extracting info from Biographical Dictionaries

  8. LaSIE: Applications (continued)
     • TRESTLE
       • Text Retrieval Extraction and Summarisation Technologies for Large Enterprises
       • GlaxoWellcome
     • Question Answering
       • TREC-8 and TREC-9 QA Track
     • CLARITY
       • Cross-language information retrieval
       • EC FW-5
