AMITIES Progress Report
Consortium Meeting, 2/3 May 2002
University of Sheffield
Overview
• Demonstrator
• Natural Language Understanding
• GALAXY Issues
• Corpus
• Transcription Issues
Demonstrator
• Bilateral Sheffield/SUNY
• Base Dialogue Manager
• Generation Module
• PGDBS
• Missing NLU
Natural Language Understanding
• Extracting ‘semantic’ information
  • Names
  • Locations
  • Dates
  • Account numbers
• Information Extraction
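To make the idea of extracting ‘semantic’ information concrete, here is a minimal pattern-based sketch in Python. It is not the AMITIES implementation: the `Entity` type, the `PATTERNS` table, the `extract` function, and in particular the 8-digit account number format are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    kind: str   # e.g. "date" or "account_number"
    text: str   # the matched surface string
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


MONTHS = (
    "january|february|march|april|may|june|july|"
    "august|september|october|november|december"
)

# Hypothetical patterns: real account formats and date styles will differ.
PATTERNS = {
    "account_number": re.compile(r"\b\d{8}\b"),
    "date": re.compile(rf"\b\d{{1,2}} (?:{MONTHS})(?: \d{{4}})?\b", re.IGNORECASE),
}


def extract(utterance: str) -> List[Entity]:
    """Return every pattern match in the utterance, ordered by position."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(utterance):
            found.append(Entity(kind, match.group(), match.start(), match.end()))
    return sorted(found, key=lambda entity: entity.start)


if __name__ == "__main__":
    for entity in extract("my account number is 12345678 and i moved on 3 may 2002"):
        print(entity.kind, "->", entity.text)
```

Names and locations usually need gazetteer lists and context rules rather than a single regular expression, so they are left out of this sketch.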
GATE for AMITIES
• ANNIE
  • A Nearly New Information Extraction system
  • Finite-state transducer (FASTUS)
  • Series of ‘grammars’ which look for objects
  • Developed largely for prose (MUC)
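The idea of a series of ‘grammars’ applied in stages can be sketched as a two-stage cascade: a lookup stage tags each token, then a finite-state pattern over the tag sequence picks out an object. This is a toy illustration in Python, not ANNIE or GATE code; the word lists, the tag codes, and the function names are hypothetical.

```python
import re
from typing import List

# Hypothetical gazetteer lists; real systems use much larger resources.
TITLES = {"mr", "mrs", "ms", "dr"}
FIRST_NAMES = {"john", "mary"}


def tag_tokens(tokens: List[str]) -> str:
    """Stage 1: map each token to a one-letter tag (T=title, F=first name, W=other)."""
    tags = []
    for token in tokens:
        if token in TITLES:
            tags.append("T")
        elif token in FIRST_NAMES:
            tags.append("F")
        else:
            tags.append("W")
    return "".join(tags)


# Stage 2: a finite-state pattern over tags: title, optional first name, one more word.
PERSON = re.compile(r"TF?W")


def find_persons(utterance: str) -> List[str]:
    """Run both stages and return the token spans matched as person names."""
    tokens = utterance.split()
    tags = tag_tokens(tokens)
    # One tag per token, so match offsets in the tag string are token indices.
    return [" ".join(tokens[m.start():m.end()]) for m in PERSON.finditer(tags)]


if __name__ == "__main__":
    print(find_persons("i am dr mary jones from leeds"))
```

Because each stage only sees the output of the previous one, further stages (for example, one that links a person to an account number) can be added without changing the earlier ones.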
ANNIE grammars
• ACE
  • Automatic Content Extraction
  • Includes extraction from media such as:
    • ASR
    • OCR
• Adapting Sheffield ACE grammars
Adapting ACE Grammars
• Remove orthographic information
  • Case
  • Punctuation
  • Diacritics
• Generate Multilingual Capability
  • Separate grammars for French, German
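The three kinds of orthographic information listed above can be illustrated with a small text normalizer in Python. The slides concern adapting the grammars so they do not rely on these cues; this sketch only shows what stripping them from text looks like. The function name `strip_orthography` is an assumption, not part of GATE or ACE.

```python
import unicodedata


def strip_orthography(text: str) -> str:
    """Remove case, punctuation, and diacritics, roughly as ASR output lacks them."""
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = []
    for ch in decomposed:
        category = unicodedata.category(ch)
        if category.startswith("M"):
            continue          # drop combining marks (diacritics)
        if category.startswith("P"):
            kept.append(" ")  # replace punctuation with a space
            continue
        kept.append(ch)
    # Lowercase and collapse runs of whitespace.
    return " ".join("".join(kept).lower().split())


if __name__ == "__main__":
    print(strip_orthography("Hello, Mr. Müller!"))
    print(strip_orthography("Crédit Lyonnais, s'il vous plaît"))
```

Letters that have no Unicode decomposition, such as the German ß, pass through unchanged, so a language-specific mapping would still be needed for them.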
GALAXY Issues
• Address multilinguality within GALAXY
  • Character encoding
• Relationship between GATE and GALAXY
  • Embed GALAXY API within GATE – a specific linguistic resource for creating GALAXY servers
NLU To Do
• Adapt ANNIE grammars for poor ASR output
• Acquire high-quality French/German resources (POS taggers…)
• Learn information from the corpus
Corpus
• Working closely with GE CRD and GESC Leeds
• Iteratively improving transcription performance
• Dealing with complex cases, to prepare corpus for eventual public release
Corpus Analysis
• First-cut breakdown of corpus content
• Automatic call classification algorithm
• First-cut utterance classification using DAMSL dialogue act set