The LIFELIKE SYSTEM ASK ALEX STORYBOARD ISL
LifeLike System Overview
• Multi-modal voice-input avatar with National Science Foundation Industry & University Cooperative Research Program (I/UCRC) expertise
• LifeLike Recognizer utilizes untrained Automatic Speech Recognition methods enhanced by context-specific grammars
• LifeLike Dialog Manager implements a Context-Based Reasoning architecture driven by multiple user-centered goals
• Funded by the National Science Foundation
[Diagram: User Input (Voice Input) → LifeLike Recognizer (Leon-Barth, Dookhoo) → Recognized Input → LifeLike Dialog Manager (Nguyen, Hung) → Response String → LifeLike Speech Output System (UIC) → Synthesized Response (LifeLike Output). A Context link connects the Dialog Manager and the Recognizer.]
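The data flow on this slide can be sketched as three stages chained per user turn. This is only an illustrative sketch: the function names, the `greeting` and `center_id` context labels, and the canned responses are assumptions, and text stands in for real audio.

```python
from typing import Tuple


def recognize(voice_input: str, context: str) -> str:
    # Stand-in for the LifeLike Recognizer: the "voice input" is already text here,
    # so recognition is reduced to normalizing case and whitespace.
    return " ".join(voice_input.lower().split())


def manage_dialog(recognized_input: str, context: str) -> Tuple[str, str]:
    # Stand-in for the LifeLike Dialog Manager: returns (response string, next context).
    # The context names and replies are hypothetical.
    if context == "greeting":
        return "Which center are you from?", "center_id"
    return f"You said: {recognized_input}", context


def synthesize(response_string: str) -> str:
    # Stand-in for the LifeLike Speech Output System: tags the text as spoken output.
    return f"[spoken] {response_string}"


def run_turn(voice_input: str, context: str) -> Tuple[str, str]:
    # One pass through the pipeline: recognizer -> dialog manager -> speech output.
    recognized = recognize(voice_input, context)
    response, next_context = manage_dialog(recognized, context)
    return synthesize(response), next_context
```

Calling `run_turn` repeatedly, feeding each returned context into the next call, mimics how the context travels between the dialog manager and the recognizer.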
LifeLike Recognizer (Leon-Barth, Dookhoo)
• Chant Middleware provides dictation and grammar control
• SAPI 5.1 recognizes speech and provides services
[Diagram: LifeLike Recognizer on top of ChantSR, with the SAPI4 Recognizer, SAPI5 Recognizer, IBM ViaVoice, Dragon NaturallySpeaking, and Nuance VoCon 3200 beneath it. Layer labels: Smart Layer, SRC Layer, SRE Layer. Engine labels: VoCon, Dragon, SAPI4, SMAPI, SAPI5.]
LifeLike Recognizer Architecture (Leon-Barth, Dookhoo)
• Grammar rule content follows the W3C Speech Recognition Grammar Specification (SRGS)
• Grammars contain limited context vocabulary
• Matching phonemes with grammars improves recognition accuracy and speed
[Diagram labels: Synchronization; LifeLike Recognizer; ChantSR; MS SAPI 5.1; Grammar Recognition; Dictation Mode; Text Repository; Grammars XML; LifeLike Dialog System]
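The two recognition paths in the diagram (grammar recognition and dictation mode) can be illustrated with a small sketch. It assumes the recognizer first tries the limited vocabulary of the active context and falls back to free dictation. That ordering, the function name, and the sample vocabulary are assumptions, not details confirmed by the slides.

```python
from typing import Dict, Set, Tuple


def recognize_with_grammar(utterance: str,
                           grammars: Dict[str, Set[str]],
                           context: str) -> Tuple[str, str]:
    # Normalize the utterance the same way the grammar phrases are stored.
    phrase = " ".join(utterance.lower().split())
    # Grammar recognition: only the phrases of the active context are candidates,
    # which keeps the search space small.
    if phrase in grammars.get(context, set()):
        return "grammar", phrase
    # Dictation mode: no grammar matched, so pass the raw text through.
    return "dictation", phrase


# Hypothetical context grammar for illustration.
GRAMMARS = {"CenterID": {"u c f", "central florida", "u i c"}}
```

For example, `recognize_with_grammar("Central Florida", GRAMMARS, "CenterID")` takes the grammar path, while an out-of-vocabulary sentence is returned as dictation text for the dialog system to disambiguate.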
Microsoft SAPI 5.1 & Chant
• Speech recognition:
  • Converting an acoustic signal (i.e., audio data) captured by a microphone into text
• Microsoft Speech SDK:
  • Tool for building speech engines and applications for Microsoft Windows
• CHANT SpeechKit:
  • Speech recognition management class that provides a productive way to develop software that listens
  • Your application sets properties and invokes methods through the speech recognition management class
  • Handles the low-level functions with speech recognition engines (i.e., recognizers)
[Diagram labels: Voice; Chant/C#; MS SDK SAPI; Speech Recognition]
XML W3C Grammars
  <GRAMMAR LANGID="409">
    <!-- Main Rule -->
    <RULE TOPLEVEL="ACTIVE" NAME="CenterID">
      <PHRASE>
        <!--Welcome back avelino which center are you from?-->
        <RULEREF NAME="agencyCenters"/>
      </PHRASE>
    </RULE>
    <!-- Dictation Rule -->
    <P>
      <DICTATION MIN="1" MAX="3"/>
    </P>
    <!-- Phrase List -->
    <LIST PROPNAME="agencyCenters">
      <PHRASE VALSTR="UT">u t</PHRASE>
      <PHRASE VALSTR="UCF">central florida</PHRASE>
      <PHRASE VALSTR="UIC">illinois ?at chicago</PHRASE>
      <PHRASE VALSTR="UT">university ?of texas</PHRASE>
      <PHRASE VALSTR="UIC">u i c</PHRASE>
      <PHRASE VALSTR="UCF">u c f</PHRASE>
    </LIST>
  </GRAMMAR>
• Tradeoff between coverage and conflicts
• Standardized way to define grammars for recognizers
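To make the phrase list concrete, the sketch below parses it with Python's standard `xml.etree.ElementTree` and builds a lookup from spoken phrase to `VALSTR` value. It assumes that a `?` prefix marks an optional word, so `illinois ?at chicago` yields both `illinois chicago` and `illinois at chicago`. The helper names are hypothetical, and this is not how the recognizer itself compiles grammars.

```python
import itertools
import xml.etree.ElementTree as ET
from typing import Dict, List

PHRASE_LIST = """
<LIST PROPNAME="agencyCenters">
  <PHRASE VALSTR="UT">u t</PHRASE>
  <PHRASE VALSTR="UCF">central florida</PHRASE>
  <PHRASE VALSTR="UIC">illinois ?at chicago</PHRASE>
  <PHRASE VALSTR="UT">university ?of texas</PHRASE>
  <PHRASE VALSTR="UIC">u i c</PHRASE>
  <PHRASE VALSTR="UCF">u c f</PHRASE>
</LIST>
"""


def expand(phrase: str) -> List[str]:
    # Each token contributes its alternatives: an optional "?word" may be
    # absent ("") or present ("word"); a plain word is always present.
    options = []
    for token in phrase.split():
        if token.startswith("?"):
            options.append(["", token[1:]])
        else:
            options.append([token])
    # Combine the alternatives and drop the empty placeholders.
    return [" ".join(w for w in combo if w) for combo in itertools.product(*options)]


def build_lookup(xml_text: str) -> Dict[str, str]:
    # Map every spoken variant to the VALSTR of the phrase it came from.
    root = ET.fromstring(xml_text)
    lookup = {}
    for phrase in root.iter("PHRASE"):
        for variant in expand(phrase.text or ""):
            lookup[variant] = phrase.get("VALSTR")
    return lookup


LOOKUP = build_lookup(PHRASE_LIST)
```

The lookup also shows the coverage-versus-conflict tradeoff: each optional word adds spoken variants, and every added variant is another phrase that could collide with one in a different list.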
LifeLike Dialog System
• Speech Disambiguator
  • Spell Check (contextualized spelling and phonetic matching)
  • Semantics Check (linguistic analysis using NLP toolkit)
• Knowledge Manager
  • User-centered (NSF user profiles in XML format)
  • Domain-specific (AskAlex Ontology)
  • General knowledge (WordNet, ConceptNet, Semantic Web)
  • Subsets of data extracted into Context Specific Knowledge
• Context-Based Dialogue Manager
  • Context-Based Reasoning architecture
  • Multiple, asynchronous user goal recognition
  • Conversational Primitives
  • Domain-Specific Contexts
  • AIML (Artificial Intelligence Markup Language) chatbot models
[Diagram: LifeLike Dialog System (Hung, Nguyen), containing the Speech Disambiguator (Hung), Knowledge Manager (Nguyen, Hung), and Context-based Dialogue Manager (Hung), placed between the LifeLike Recognizer (Leon-Barth, Dookhoo) and LifeLike Speech Output (UIC)]
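The contextualized spell check can be illustrated with a minimal sketch that snaps each dictated word to the closest word in the active context vocabulary. It uses `difflib` string similarity from the Python standard library as a stand-in. The actual disambiguator's spelling and phonetic matching is not described in the slides, and the function name, cutoff, and vocabulary are assumptions.

```python
import difflib
from typing import List


def disambiguate(dictation: str, context_vocab: List[str], cutoff: float = 0.75) -> str:
    corrected = []
    for word in dictation.lower().split():
        if word in context_vocab:
            # Already a known context word: keep it unchanged.
            corrected.append(word)
            continue
        # Otherwise take the single closest vocabulary word, if it is similar enough.
        match = difflib.get_close_matches(word, context_vocab, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)


# Hypothetical vocabulary for a "which center are you from" context.
CENTER_VOCAB = ["central", "florida", "texas", "illinois", "chicago"]
```

Restricting candidates to the context vocabulary is what makes the check contextual: a misrecognized word is only corrected toward words that are plausible in the current dialog state, and unrelated words pass through untouched.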
LifeLike Dialog System Architecture
[Diagram: LifeLike Dialog System (Hung, Nguyen) between the LifeLike Recognizer (Leon-Barth, Dookhoo) and LifeLike Speech Output (UIC). Components: Speech Disambiguator (Hung) with Spell Check and Semantics Check; Knowledge Manager (Nguyen, Hung) with NSF User Data, AskAlex Ontology, General Knowledge, and Context Specific Knowledge; Context-based Dialogue Manager (Hung). Flow labels: Dictation String, Context, Phrase String, Dataset, Disambiguated String, Response String, Updated Data.]
Knowledge Manager Repository Building
• Problem: Disconnect between user profile and avatar
• Objective: Create a relational memory profile of the user
• Approach
  • XML Representation
  • AskAlex Ontology
  • Relational Interaction
System Communication Protocol
• Avatar and LifeLike Dialog System (LDS)
  • Shared Memory and Socket
  • Variable Length Delimiter Protocol
  • Avatar Origin - start, stop
  • LDS Origin - Speech, Text, and Documents
  • In-dialog XML-style markup
• LifeLike Recognizer (LR) and LDS
  • Socket
  • Variable Length Delimiter Protocol
  • LR Origin - speech interpretation
  • LDS Origin - start, stop, contextual information
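A variable-length, delimiter-terminated protocol over a socket needs a reader that reassembles messages from arbitrary chunks. The sketch below shows one common way to do that. The NUL byte delimiter, the UTF-8 encoding, and the class and function names are assumptions, since the slides do not specify the actual delimiter or message format.

```python
from typing import List

DELIMITER = b"\x00"  # assumed delimiter; the real protocol's delimiter is not given


def frame(message: str, delimiter: bytes = DELIMITER) -> bytes:
    # Sender side: append the delimiter so the receiver can find the message end.
    return message.encode("utf-8") + delimiter


class DelimitedReader:
    """Accumulates socket chunks and yields only complete messages."""

    def __init__(self, delimiter: bytes = DELIMITER) -> None:
        self.delimiter = delimiter
        self.buffer = b""

    def feed(self, chunk: bytes) -> List[str]:
        # Append the new bytes, split on the delimiter, and keep the trailing
        # (possibly incomplete) piece in the buffer for the next call.
        self.buffer += chunk
        *complete, self.buffer = self.buffer.split(self.delimiter)
        return [message.decode("utf-8") for message in complete]
```

In use, each result of `socket.recv` would be passed to `feed`, so a `start` command split across two reads is still delivered as one message.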
January Storyboard Sequence
1. Initiate Interaction
  • Avatar → IE
  • IE → Avatar, SR
2. User Speaks
  • SR → IE
3. Avatar Reacts
  • IE → Avatar, SR
4. Go to 2 until complete
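The storyboard sequence amounts to one initiation step followed by a speak-and-react loop. The sketch below generates the message trace for a given number of user utterances. It assumes each pair of component names on the slide denotes a sender and its receivers (for example, SR sends to IE), and the function name is hypothetical.

```python
from typing import List, Tuple


def storyboard_trace(user_utterances: List[str]) -> List[Tuple[str, str]]:
    # Step 1: the avatar initiates, and IE notifies both the avatar and SR.
    trace = [
        ("Initiate Interaction", "Avatar -> IE"),
        ("Initiate Interaction", "IE -> Avatar, SR"),
    ]
    # Steps 2 and 3 repeat once per user utterance until the dialog is complete.
    for _ in user_utterances:
        trace.append(("User Speaks", "SR -> IE"))
        trace.append(("Avatar Reacts", "IE -> Avatar, SR"))
    return trace
```

Each utterance adds exactly two messages to the trace, which makes the loop easy to check against logs during integration testing.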
Open Discussion
• Recognition
• Dialog system
• Synchronization
• Animation system
• Communication Protocols
• Testing/Integration