
Experiences from large NLP Projects

Experiences from large NLP Projects. Jan Alexandersson, German Research Center for Artificial Intelligence GmbH, Stuhlsatzenhausweg 3, Geb. 43.1, 66123 Saarbrücken. Tel.: (0681) 302-5347. Email: janal@dfki.de. www.dfki.de/~janal


Presentation Transcript


  1. Experiences from large NLP Projects • Jan Alexandersson • German Research Center for Artificial Intelligence GmbH • Stuhlsatzenhausweg 3, Geb. 43.1, 66123 Saarbrücken • Tel.: (0681) 302-5347 • Email: janal@dfki.de • www.dfki.de/~janal

  2. Overview • Introduction • What was VerbMobil • What is SmartKom • Scaling • Experiences from VerbMobil • Conclusion

  3. What was... http://verbmobil.dfki.de ?

  4. VerbMobil - What was it? • Speech-to-speech translation system • Robust processing of spontaneous dialogs • Speaker independent (adaptive) • Languages: English, German, Japanese • Domains: appointment scheduling, travel planning and hotel reservation, remote PC maintenance • Summary of the dialogue automatically generated by the system • The system mediates between two humans; it does not play an active role • There is no control of the ongoing dialog by the system

  5. The Verbmobil Partners

  6. The Verbmobil Partners (map of partner sites) • Partners: Prof. v. Hahn (Univ. Hamburg); Prof. Gibbon (Univ. Bielefeld); Prof. Paulus (TU Braunschweig); Dr. Eisele (Philips, Aachen); Prof. Ney (RWTH Aachen); Prof. Mahr (TU Berlin); Dr. Klein, Dr. Wolf (DLR, PT); Prof. Blauert (Univ. Bochum); Prof. Hess (Univ. Bonn); Dr. Reuse (BMBF Referat 524); Prof. Hoffmann (TU Dresden); Prof. Waibel (CMU, Pittsburgh); Prof. Sag (CSLI, Stanford, USA); Prof. Kurematsu (ATR International, Kyoto, Japan); Prof. Görz, Prof. Niemann (Univ. Erlangen); Prof. Tillmann (LMU München); Dr. Ruske (TU München); Dr. Block (Siemens, München); Prof. Pinkal (Univ. d. Saarlandes); Prof. Uszkoreit, Prof. Wahlster (DFKI, Saarbrücken); A. Klüter (DFKI, Kaiserslautern); R. Reng (Temic, Ulm); Dipl.-Ing. Mangold (DaimlerChrysler, Ulm); Prof. Waibel (Univ. Karlsruhe); Prof. Hinrichs (Univ. Tübingen); Prof. Rohrer (Univ. Stuttgart) • Work areas shown on the map (implementation languages in parentheses): Wizard-of-Oz experiments, data collection; signal-level evaluation; multilingual word lists; context evaluation (LISP, Prolog, Java); speaker adaptation; Aachen recognizer; statistical transfer (C++, C); acoustic synthesis (C, C++); data collection; multilingual prosody control (C++, C); data collection, recognition; syntax (C, C++, Prolog); repair, prosody D, E (C); syntax, robust semantics, dialog (LISP, Prolog); system integration (C++, Tcl-Tk); data collection, integrated processing (C, C++, LISP, Prolog); DC recognizer, speech control (C, C++, Fortran); multilingual recognizers (C, C++); chunk parser (Prolog); transfer (Prolog)

  7. Facts About the Project • 23 participating institutions (in Verbmobil II), from Germany and the USA • Over 900 full-time employees and students involved over the whole duration • Funded by the German Ministry for Education and Science and the participating companies: • BMBF funding Phase I, 1.01.93 – 31.12.96: 62.7 Mio. DM (31.6 Mio €) • BMBF funding Phase II, 1.01.97 – 30.9.2000: 53.3 Mio. DM (27 Mio €) • Industrial investment I+II: 32.6 Mio. DM (16.5 Mio €) • Related industrial R & D activities: ca. 20 Mio. DM (ca. 10 Mio €) • Total: 168.6 Mio. DM (85.1 Mio €)

  8. Project Organization • German Federal Ministry for Research and Education • Steering Committee: DLR, G. Klein • Verbmobil Consortium • Verbmobil Advisory Board • Scientific Management: Scientific Head W. Wahlster, Deputy Scientific Head A. Waibel • Head of Project Management Group: R. Karger • Head of System Integration Group: A. Klüter • Module Coordinator: N. Reithinger • Group of Module Managers: Manager Module 1 ... Manager Module n

  9. Challenges for Language Engineering (levels listed in order of increasing complexity; Verbmobil sits at the highest level) • Input Conditions: Close-Speaking Microphone/Headset, Push-to-talk → Telephone, Pause-based Segmentation → Open Microphone, GSM Quality • Naturalness: Isolated Words → Read Continuous Speech → Spontaneous Speech • Adaptability: Speaker Dependent → Speaker Independent → Speaker Adaptive • Dialog Capabilities: Monolog Dictation → Information-seeking Dialog → Multiparty Negotiation

  10. Classification of Machine Translation Methods (diagram, from Source Language to Target Language) • Direct Translation: Morphologic Analysis → Word Structure → Word Structure → Morphologic Generation • Syntactic Transfer: Syntactic Analysis → Syntactic Structure → Syntactic Structure → Syntactic Generation • Semantic Transfer: Semantic Analysis → Semantic Structure → Semantic Structure → Semantic Generation • Interlingua

  11. The VerbMobil Case (same diagram, extended by a speech level, from Source Language to Target Language) • Speech level: Speech Signal → Prosodic Analysis ... Prosodic Annotation → Speech Signal • Direct Translation: Morphologic Analysis → Word Structure → Word Structure → Morphologic Generation • Syntactic Transfer: Syntactic Analysis → Syntactic Structure → Syntactic Structure → Syntactic Generation • Semantic Transfer: Semantic Analysis → Semantic Structure → Semantic Structure → Semantic Generation • Interlingua

  12. The Graphical User Interface

  13. Focuses of Speech Recognition in Verbmobil • Focuses: Multilinguality, Robustness, Large Vocabulary • Sites: DaimlerChrysler, University of Karlsruhe, RWTH Aachen

  14. General Speech Recognition Task • Audio Signal → Recognizers (German, English, Japanese) → Word Hypotheses Graph • The Word Hypotheses Graph is the interface between acoustic and linguistic processing
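
A word hypotheses graph (WHG) can be pictured as a directed graph whose edges carry competing word hypotheses with scores. The following is a minimal sketch, not Verbmobil's actual data format: the edge layout, the scores, and the function name `best_path` are all assumptions made for illustration.

```python
# Hypothetical WHG: edges are (start_node, end_node, word, score).
# Scores are made-up log-probabilities; higher is better.
EDGES = [
    (0, 1, "I", -0.1),
    (1, 2, "need", -0.3),
    (1, 2, "knee", -1.2),
    (2, 3, "a", -0.2),
    (3, 4, "car", -0.4),
    (3, 4, "cart", -0.9),
]

def best_path(edges, start, end):
    """Return (score, words) of the highest-scoring path, or None.

    Assumes node ids are integers that increase along every edge, so
    visiting edges sorted by start node is a valid topological order.
    """
    best = {start: (0.0, [])}
    for src, dst, word, score in sorted(edges):
        if src not in best:
            continue  # node not reachable from the start node
        cand_score = best[src][0] + score
        if dst not in best or cand_score > best[dst][0]:
            best[dst] = (cand_score, best[src][1] + [word])
    return best.get(end)

score, words = best_path(EDGES, 0, 4)
print(" ".join(words))  # I need a car
```

Linguistic modules that search the whole graph rather than only the best chain can recover from recognition errors that the best path alone would commit them to.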

  15. What Linguistic Analysis Really Needs • Syntactic boundaries: "He saw ? the man ? with the telescope". Prosody cannot help here. • Dialog act boundaries: "No, I have no time at all on Thursday. D But how about on Friday?" Dialog acts are pragmatic units that chunk the input into units which can be processed alone. • Prosodic syntactic boundaries: "Of course ? not ? on Saturday". These are syntactic boundaries that correlate with the acoustic-phonetic reality; they help during analysis within one chunk/dialog act and are important in spontaneous speech with elliptical utterances.

  16. Prosody in Verbmobil • Input: Speech Signal, Word Hypotheses Graph • Multilingual Prosody Module: computes prosodic features (F0, duration, energy, ....) • Output: Boundary Information, Sentence Mood, Accented Words, Prosodic Feature Vector • Uses: Dialog Act Segmentation and Recognition, Search Space Restriction, Lexical Choice, Speaker Adaptation, Constraints for Transfer • Consumers: Dialog Understanding, Translation, Parsing, Generation, Speech Synthesis

  17. Facts about Repairs in the Verbmobil Corpus • 21% of all turns in the Verbmobil corpus (79 562 turns) contain at least one self-correction • The syntactic category is preserved in most cases (for example, out of a sample of 266 verb replacements, 224 are again mapped to verbs) • Repairs take place in a restricted context (in 98% of cases the reparandum consists of fewer than 5 words) • Repair sequences follow certain regularities

  18. The Understanding of Spontaneous Speech Repairs • Original Utterance: "I need a car next Tuesday oops Monday" • Reparandum: "Tuesday" • Editing Term: "oops" (Editing Phase) • Reparans: "Monday" (Repair Phase) • Processing: Recognition of Substitutions, then Transformation of the Word Hypotheses Graph • Result: "I need a car next Monday"
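
The substitution in this example can be sketched on a flat token list. This is a deliberately simplified illustration, assuming that reparandum, editing term, and reparans are each exactly one token long; the editing-term list and the function name `apply_repair` are hypothetical, and the real system operates on the word hypotheses graph rather than on a single word chain.

```python
EDITING_TERMS = {"oops", "äh"}  # hypothetical list of editing terms

def apply_repair(tokens, editing_terms=EDITING_TERMS):
    """Replace a one-word reparandum by a one-word reparans.

    Simplifying assumption: the pattern is
    <reparandum> <editing term> <reparans>, one token each.
    """
    out = []
    i = 0
    while i < len(tokens):
        is_edit = tokens[i].lower() in editing_terms
        if is_edit and out and i + 1 < len(tokens):
            out[-1] = tokens[i + 1]  # reparans overwrites reparandum
            i += 2                   # skip editing term and reparans
        else:
            out.append(tokens[i])
            i += 1
    return out

repaired = apply_repair("I need a car next Tuesday oops Monday".split())
print(" ".join(repaired))  # I need a car next Monday
```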

  19. Architecture of Repair Processing “On Thursday I cannot no I can meet äh after one”

  20. Multiple Approaches • Mono-cultural approaches are dangerous: humans vs. viruses → diversity; Microsoft vs. ILOVEYOU and copycats → alternative software solutions • Some sources of errors in a speech translation system • external: spontaneous speech (not well formed, hesitations, repairs), bad acoustic conditions, human dialog behavior • internal: knowledge gaps in modules, software errors, probabilistic processing • → Use multiple engines and varying approaches at various stages of processing

  21. Multiple Approaches in Verbmobil • Exclusive alternatives: three different 16 kHz German speech recognizers with various capabilities • Competing approaches: three parsers (HPSG, chunk, statistical); five translation tracks (case-based, dialog-act based, statistical, substring-based, linguistic (deep) semantic translation) • Needed: selection and combination of results from competing tracks • parsers: combination of partial analyses in the semantic processing modules • translation: pre-selection module

  22. Multiple Translation Tracks - Approaches and Advantages • Case-based: uses examples from the aligned bilingual Verbmobil corpus. Advantage: good translation if the input matches an example in the corpus • Dialog-act based: extracts the core intention (dialog act) and content. Advantage: robust with respect to recognition errors • Statistical: uses statistical language and translation models. Advantage: guaranteed translation with high approximate correctness • Substring-based: combines statistical word alignment with precomputation of translation "chunks" and contextual clustering. Advantage: guaranteed translation with high approximate correctness • Linguistic (deep) semantic translation: "classic" approach using semantic transfer. Advantage: high-quality translation in case of success
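
Choosing among competing tracks can be sketched as picking the candidate with the highest confidence. This is only an illustration under strong assumptions: that every track reports a confidence value on a comparable scale, and that a track may return nothing. The type `TrackResult`, the function `select_translation`, and all values are hypothetical and do not describe the actual Verbmobil selection module.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TrackResult:
    track: str
    translation: Optional[str]  # None if the track produced nothing
    confidence: float           # assumed comparable across tracks

def select_translation(results: List[TrackResult]) -> Optional[TrackResult]:
    """Return the available result with the highest confidence, if any."""
    candidates = [r for r in results if r.translation is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.confidence)

# Made-up results for one utterance.
results = [
    TrackResult("case-based", None, 0.0),
    TrackResult("statistical", "How about Friday?", 0.7),
    TrackResult("deep", "What about Friday?", 0.9),
]
chosen = select_translation(results)
print(chosen.track, "->", chosen.translation)
```

In practice confidence values from different engines are rarely comparable out of the box, so a real selector would have to normalize or learn how to weigh them.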

  23. Example Based Translation • Result: Translation and a confidence value • Benefit: Improves Verbmobil's translation capabilities through an additional translation path • Responsible: DFKI, Kaiserslautern • Task: Provide a translation based on translation templates and partial linguistic analysis • Input: WHGs or best hypothesis • Method: Definite Clause Grammar (DCG), graph matching algorithms

  24. Dialog-Act Based Translation • Result: Translation and a confidence value, plus content descriptions for the dialog module • Benefit: Robust translation and content extraction even when the recognition is erroneous • Responsible: DFKI, Saarbrücken • Task: Robustly provide a translation of core intentions and contents of the domain • Input: Prosodically annotated best hypothesis (flat WHG) • Method: Statistical dialog-act classifier and finite state transducers

  25. Statistical Translation • Result: Translation and a confidence value • Benefit: Approximately correct translation for spontaneous speech • Responsible: RWTH Aachen • Task: Provide approximately correct translations • Input: Prosodically annotated best hypothesis (flat WHG) • Method: Use statistical language and translation models

  26. Deep Translation • Result: Translation containing content information, suited for high-quality speech synthesis • Benefit: Delivers the highest quality, but is sensitive to recognition errors and spontaneous speech phenomena • Responsible: Siemens AG, DFKI Saarbrücken, Universität Tübingen, Universität des Saarlandes, Universität Stuttgart, TU Berlin, CSLI Stanford • Task: Provide high-quality translations • Input: Prosodically annotated WHG and contextual information • Method: Use syntactic and semantic approaches to analysis, transfer, and generation

  27. Modules Involved • Deep Analysis: HPSG parser • Dialog Semantics: combination of parsing results and semantic resolution • Transfer: VIT to VIT transfer • Generation: TAG generation from VITs • Dialog+Context: provides contextual information • Integrated processing comprises: search through the WHG, statistical parser, chunk parser • Semantic Construction provides VITs from statistical and chunk parser output

  28. The Multi-Parser Approach • Verbmobil uses three different syntactic parsers: an HPSG parser, a chunk parser, and a probabilistic LR parser. • Each parser offers a different level of parsing accuracy, depth of syntactic analysis, and robustness of the analysis process. • Chunk parser: most robust but least accurate analysis • HPSG parser: most accurate but least robust analysis • Probabilistic parser: level of accuracy and robustness between HPSG and chunk parser

  29. HPSG Processing • Result: Source language VITs • Benefit: Delivers the highest quality, but is sensitive to recognition errors and spontaneous speech phenomena • Responsible: DFKI Saarbrücken, CSLI Stanford • Task: Thorough syntactic analysis • Input: Word chains from integrated processing • Method: Apply HPSG analysis

  30. The Result is a Syntactic Tree “Alright, and that should get us there about nine in the evening.”

  31. ... but analysis is not always spanning “The train arise at seven thirty. We could take a cab it to the hotel problem train station.”

  32. Semantic Construction • Result: VITs • Benefit: Provides results of the shallow parsers to the deep analysis track • Responsible: Universität Stuttgart (IMS) • Task: Convert and extend syntax trees to VITs • Input: Syntax trees from the statistical and chunk parsers • Method: Compositional construction using a semantic lexicon

  33. Schematic Processing • Input: Syntactic tree • Lexicon access and interpretation of the grammatical roles • Intermediate representation: Application Tree • Compositional semantic construction • Intermediate representation: VIT • Non-compositional semantic construction using transfer rule engine • Intermediate representation: Resulting VIT

  34. Dialog Semantics • Result: VIT ready for transfer • Benefit: Enhances robustness of deep analysis and provides vital information for transfer • Responsible: Universität des Saarlandes, Saarbrücken • Task: Combine results from various parsers, reinterpret and correct VITs, and resolve non-local ambiguities • Input: VITs from different parsers • Method: VIT models and rule-based approaches

  35. Combining Analyses from Various Parsers • Parsers deliver VITs for segments of a turn • These may be spanning analyses or just partial fragments • Combination is necessary, both for analyses from one parser and for analyses from various parsers • Combination criteria • HPSG parser is better than statistical parser, which is better than chunk parser • Integrated results are better than fragments • Longer results are better than short ones
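
The three combination criteria can be expressed as a sort key. The sketch below assumes the criteria apply in the order listed (parser first, then integrated versus fragment, then length); the slide does not state how they are weighed against each other, so this lexicographic priority, the `Analysis` type, and the sample data are all hypothetical.

```python
from dataclasses import dataclass

# Lower rank is preferred: HPSG > statistical > chunk.
PARSER_RANK = {"hpsg": 0, "statistical": 1, "chunk": 2}

@dataclass(frozen=True)
class Analysis:
    parser: str
    start: int      # index of the first word covered
    end: int        # index after the last word covered
    spanning: bool  # True for an integrated result, False for a fragment

def preference_key(a):
    """Smaller key means more preferred (assumed lexicographic order)."""
    length = a.end - a.start
    return (PARSER_RANK[a.parser], not a.spanning, -length)

def best_analysis(analyses):
    return min(analyses, key=preference_key)

candidates = [
    Analysis("chunk", 0, 8, True),
    Analysis("statistical", 0, 8, True),
    Analysis("statistical", 0, 5, False),
    Analysis("hpsg", 0, 3, False),
]
print(best_analysis(candidates).parser)
without_hpsg = [c for c in candidates if c.parser != "hpsg"]
print(best_analysis(without_hpsg))
```

Under this ordering a short HPSG fragment outranks a spanning statistical analysis, which may or may not be desirable; a weighted score over the three criteria would be an equally plausible reading of the slide.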

  36. Semantic Based Transfer • Result: VITs for generation • Benefit: Translates VITs inside the deep translation path • Responsible: Universität Stuttgart (IMS) • Task: Transfer VITs from the source to the target language • Input: VITs • Method: Rule-based transfer

  37. Context Evaluation • Result: Disambiguated transfer requests • Benefit: Higher quality of transfer results • Responsible: Technical University (TU) Berlin • Task: Resolve ambiguities in the dialog context during semantic transfer • Input: Requests from transfer • Method: Using world knowledge and rules

  38. Dialog Processing • Result: Context information, dialog summaries, and minutes • Benefit: Verbmobil knows what happens throughout the dialog and can present it • Responsible: DFKI, Saarbrücken • Task: Provide dialog context for all tracks and compute the main information for dialog summaries • Input: Data from many modules • Method: Frame-like topic structuring and rules

  39. Dialog Information in Semantic Transfer • Components: Syntactic Analysis, Probabilistic Analysis of Dialog Acts (HMM), Robust Dialog Semantics, Recognition of Dialog Plans (Plan Operators), Semantic Transfer • Information passed between them: VIT, Dialog Act, Dialog Phase

  40. The Intentional Structure • Dialogue Level: VM_Dialogue • Phase Level: PH_Greet, PH_Nego • Game Level: G_Greet, G_Nego, G_Nego • Move Level: M_Greet, M_Tr_Init, M_Init, M_Resp, M_Greet • DA Level: Greet, Pol_Form, Request, Suggest, Reject, Introduce, Feedback • Speaker: A, B, A, B

  41. Collaboration for a New Functionality: Summaries • Provide the users with a summary of the topics that were agreed on • Two benefits • have a piece of information to use in calendars etc. • control the translation • Approach: exploit already existing modules for • content extraction • dialog interpretation • planning the summary • generation • transfer

  42. Summaries • The dialog module keeps track of the dialog: dialog model, context extraction, translations, dialog history • Three types of documents: • Minutes: relevant exchanges • Summary: dialog results • Scripts: complete dialog script

  43. Multilingual Summaries • Multilinguality: integration of the transfer module • Components shown: Context, Syndialog, Dialog, VM-PROTO, Transfer (GE), GENGER, GENENG • Data passed: VITs, document structure • Output: German Summary (HTML), English Summary (HTML)

  44. Result Summary

  45. Generation • Result: Strings, enriched with content-to-speech (CTS) information to support synthesis • Benefit: Output from the semantic transfer track • Responsible: DFKI, Saarbrücken • Task: Robustly generate the output of the semantic transfer in German, English, or Japanese • Input: VITs from transfer • Method: Constraint system for micro-planning, TAG grammar (reusing HPSG grammars) for syntactic realization

  46. Multiple Translation Tracks: approx. correct translation (bar chart; values listed for WA > 50% / WA > 75% / WA > 80%) • case based: 37 / 44 / 46 • statistical: 69 / 79 / 81 • DA based: 40 / 45 / 46 • Sem. based: 40 / 47 / 49 • Substring: 65 / 75 / 79 • Selection (Automatic): 57 / 66 / 68 • Selection (Learning): 78 / 83 / 85 • Selection (Manual): 88 / 95 / 97
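
To read the chart values on this slide, it helps to compare learned selection against the best single track at each threshold. The snippet below is a small worked calculation using only the numbers from the slide; the variable names are illustrative.

```python
# Values from the slide, per threshold (WA > 50%, WA > 75%, WA > 80%).
TRACKS = {
    "case based": (37, 44, 46),
    "statistical": (69, 79, 81),
    "DA based": (40, 45, 46),
    "Sem. based": (40, 47, 49),
    "Substring": (65, 75, 79),
}
SELECTION_LEARNING = (78, 83, 85)
THRESHOLDS = ("WA > 50%", "WA > 75%", "WA > 80%")

gains = []
for i, threshold in enumerate(THRESHOLDS):
    # Best individual track at this threshold.
    best_track = max(TRACKS, key=lambda name: TRACKS[name][i])
    gain = SELECTION_LEARNING[i] - TRACKS[best_track][i]
    gains.append((threshold, best_track, gain))
    print(f"{threshold}: best single track = {best_track}, "
          f"learned selection gain = {gain}")
```

At every threshold the statistical track is the best single track, and learned selection adds 9, 4, and 4 points respectively, while manual selection (88 / 95 / 97) marks what a better selector could still gain.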

  47. Verbmobil – The Book There are over 600 refereed papers on the various aspects of and achievements in Verbmobil. Wolfgang Wahlster (ed.): "Verbmobil: Foundations of Speech-to-Speech Translation" Springer-Verlag Berlin Heidelberg New York. 679 Pages ISBN 3-540-67783-6

  48. What is... http://smartkom.dfki.de ?

  49. Reference Architecture for Multimodal Systems (Dagstuhl Seminar "Fusion and Coordination in Multimodal Interaction", 2 Nov. 2001, edited by M. Maybury) • Main blocks: Media Input Processing, Interaction Management, Media Output Rendering, Application Interface • Modalities: Language, Graphics, Gesture, Sound, Biometrics, User ID, Animated Presentation Agent • Components shown: Mode Analysis, Mode Coordination, Multimodal Fusion, Multimodal Reference Resolution, Discourse Management, Context Management, Expectation Management, Intention Recognition, Action Planning, User Modeling, Presentation Design, Mode Design • Operations shown: Initiate, Terminate, Request, Respond, Integrate, Select Content, Design, Allocate, Coordinate, Layout • Representation and Inference, States and Histories: Domain Model, Task Model, User Model, Discourse Model, Context Model, Media Models, Application Models • Connected parties: User(s); Information, Applications, People

  50. Situated Delegation-oriented Dialog Paradigm: Collaborative Problem Solving • User: specifies goal, delegates task • Personalized Interaction Agent: cooperates on problems, asks questions, presents results • IT Services: Service 1, Service 2, Service 3
