Towards Interactive Training with an Avatar-based Human-Computer Interface
University of Central Florida (UCF)
University of Illinois at Chicago (UIC)
Topics
• What we are doing
• Why we are doing it
• How we are doing it
• How this relates to simulation and training
• Results
• Summary
What are we doing? … briefly
• Building a “guru” system with an avatar interface capable of communicating with humans via spoken language
• Our current emphasis is on the avatar interface: lifelike in its appearance and dialog
• Project funded by NSF, carried out by a collaborative research team at the University of Central Florida (UCF) and the University of Illinois at Chicago (UIC)
The What … (continued)
• The initial objective: replace a specific human being – Dr. Alex Schwarzkopf at the NSF
  • Founding and long-time director of the Industry/University Cooperative Research Centers (I/UCRC) program
  • Recently retired
• Follows a separate three-year research project at UCF to capture Dr. Schwarzkopf’s knowledge about the I/UCRC and encode it in an easily retrievable system (called AskAlex)
Challenges
We face several challenges in this project:
• Make an avatar that is visually recognizable as Dr. Schwarzkopf, both in facial features and in his mannerisms and tendencies
• Provide a natural language interface that understands the user’s questions
• Manage the dialog in a way that is natural as well as effective
• Interface the system to AskAlex, which answers questions posed by users on or about the I/UCRC
• Simulate his voice, word choice, and inflections as closely as possible
Why are we doing this?
• We seek to show the feasibility of effective interfaces that employ natural language and a recognizable, human-like avatar
  • Is it practical?
  • Do humans respond better to lifelike avatars than to text-based interfaces?
• We are trying to preserve the legacy of Dr. Schwarzkopf: not only his knowledge but, in many ways, the man himself
How are we doing this?
Two parts to the work:
• UIC is charged with the visual elements of the avatar
• UCF is charged with the communication and general intelligence of the avatar
Creating a Virtual Alex
• Vicon motion capture hardware
Creating a Virtual Alex (cont.)
• FaceGen software for expressions
Alex at Work
• Alex appears in a familiar environment: sitting at his desk in his NSF office, wearing a suit
• Alex provides information by speaking and by showing textual or graphical information on the TV screen in his office
Interacting with the User
• Alex is displayed on a 50” screen
• Shows his upper body, with room for him to gesture
• Multiple microphones are used so Alex can turn toward the speaker
• Blinking and other involuntary motions are based on Alex’s mannerisms
• Responsive Avatar Framework (see the sketch below):
  • Skeleton Animation Synthesizer
  • Facial Expression Synthesizer
  • Lip Synchronizer
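To make the composition of these three synthesizers concrete, here is a minimal per-frame update loop. This is a hedged sketch only: the class names come from the slide above, but every method, signature, and constant is an assumption, not the project's actual code.

import time

class SkeletonAnimationSynthesizer:
    def pose(self, t: float) -> dict:
        # Stub: blend idle motion-capture clips with purposeful gestures.
        return {"head_yaw_deg": 10.0 if int(t) % 6 < 3 else -10.0}

class FacialExpressionSynthesizer:
    BLINK_PERIOD_S = 4.0  # assumed cadence, modeled on the subject's mannerisms

    def expression(self, t: float) -> dict:
        # Involuntary motions such as blinks are scheduled automatically.
        return {"blink": (t % self.BLINK_PERIOD_S) < 0.15}

class LipSynchronizer:
    VISEMES = {"AA": "open", "M": "closed", "F": "lower-lip"}  # toy mapping

    def viseme(self, phoneme: str) -> str:
        # Map the phoneme currently being spoken to a mouth shape.
        return self.VISEMES.get(phoneme, "neutral")

class ResponsiveAvatarFramework:
    def __init__(self):
        self.skeleton = SkeletonAnimationSynthesizer()
        self.face = FacialExpressionSynthesizer()
        self.lips = LipSynchronizer()

    def update(self, t: float, phoneme: str) -> dict:
        # Compose one frame of avatar state from all three synthesizers.
        return {
            "pose": self.skeleton.pose(t),
            "expression": self.face.expression(t),
            "mouth": self.lips.viseme(phoneme),
        }

if __name__ == "__main__":
    raf = ResponsiveAvatarFramework()
    print(raf.update(time.time(), "AA"))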
Speech Recognizer Architecture
[Architecture diagrams, recovered from slide text:
• Layered Recognition Design – ChantSR synchronizes several recognition engines and speech APIs (IBM ViaVoice via SMAPI, Dragon NaturallySpeaking via SAPI4, Nuance VoCon 3200, SAPI4/SAPI5 recognizers), layering grammar recognition (XML grammars from a text repository) over dictation mode.
• Operational View – the LifeLike Recognizer uses ChantSR with MS SAPI 5.1 in dictation mode and feeds the LifeLike Dialog System.]
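The layered design can be illustrated as a small fallback chain: a constrained grammar recognizer is consulted first, and open-vocabulary dictation is the fallback. This is a sketch under assumed interfaces; it does not reproduce ChantSR's actual API.

from abc import ABC, abstractmethod
from typing import Optional

class RecognizerEngine(ABC):
    @abstractmethod
    def recognize(self, audio: bytes) -> Optional[str]:
        # Return a transcript, or None if there is no confident hypothesis.
        ...

class GrammarRecognizer(RecognizerEngine):
    # Constrained, high-precision recognition against an XML grammar.
    def __init__(self, grammar_xml: str):
        self.grammar_xml = grammar_xml  # e.g., loaded from a text repository

    def recognize(self, audio: bytes) -> Optional[str]:
        return None  # stub: no in-grammar match found

class DictationRecognizer(RecognizerEngine):
    # Open-vocabulary dictation mode (e.g., an SAPI 5.1 engine underneath).
    def recognize(self, audio: bytes) -> Optional[str]:
        return "<free-form transcript>"  # stub

class LayeredRecognizer:
    # Try each layer in order; the first confident hypothesis wins.
    def __init__(self, layers: list[RecognizerEngine]):
        self.layers = layers

    def recognize(self, audio: bytes) -> Optional[str]:
        for engine in self.layers:
            result = engine.recognize(audio)
            if result is not None:
                return result
        return None

recognizer = LayeredRecognizer([GrammarRecognizer("<grammar/>"), DictationRecognizer()])
print(recognizer.recognize(b""))  # falls through to dictation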
Dialog System Architecture
[Architecture diagram, recovered from slide text: the LifeLike Recognizer passes a dictation string to the Speech Disambiguator, which runs spell and semantics checks against a context-phrase dataset; the disambiguated string goes to the Knowledge Manager, which draws on NSF user data, the AskAlex ontology, context-specific knowledge, and general knowledge; the Context-based Dialogue Manager selects a response string for LifeLike Speech Output and updates the datasets as the conversation proceeds.]
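The same pipeline, reduced to three functions. All function names and data shapes here are assumptions made for illustration; the real system's interfaces are not shown in the slides.

def disambiguate(dictation: str, context_phrases: set[str]) -> str:
    # Spell/semantics check of recognizer output against known context phrases.
    words = dictation.split()
    fixed = [w if w in context_phrases else w.lower() for w in words]  # stub
    return " ".join(fixed)

def lookup_knowledge(query: str, sources: dict[str, dict]) -> str:
    # Consult user-profile, then domain (AskAlex), then general knowledge.
    key = query.lower()
    for name in ("user_profile", "askalex", "general"):
        answer = sources.get(name, {}).get(key)
        if answer:
            return answer
    return "I don't have that information."

def respond(dictation: str, context_phrases: set[str], sources: dict) -> str:
    query = disambiguate(dictation, context_phrases)
    return lookup_knowledge(query, sources)

sources = {"askalex": {"what is the i/ucrc": "An NSF program of cooperative research centers..."}}
print(respond("What Is The I/UCRC", {"I/UCRC"}, sources))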
Dialog Management
• Semantic Disambiguator
  • Spelling and semantic check of SR input
  • Uses contextual matching processes
• Knowledge Manager
  • Three sources of knowledge: general, domain-specific, user profile
  • Contextualized knowledge base
• CxBR-based Dialog Manager (see the goal-stack sketch below)
  • Goal recognition – inference engine determines the state of the conversation for contextual relevance
  • Goal management – goal stack keeps track of conversational goals
  • Agent actions – context topology dictates conversational output using the inference engine and goal stack
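A minimal sketch of the goal-stack mechanism in a context-based (CxBR-style) dialog manager: matching an utterance to a context is the goal-recognition step, the stack records open goals, and the matched context's output is the agent action. Every name and the toy keyword-matching rule are hypothetical.

class Context:
    # One node in the context topology: recognizes a goal, produces output.
    def __init__(self, name: str, keywords: set[str], response: str):
        self.name, self.keywords, self.response = name, keywords, response

    def matches(self, utterance: str) -> bool:
        return any(k in utterance.lower() for k in self.keywords)

class DialogManager:
    def __init__(self, topology: list[Context]):
        self.topology = topology          # context topology
        self.goal_stack: list[str] = []   # open conversational goals

    def handle(self, utterance: str) -> str:
        # Goal recognition: a (toy) inference step picks the relevant context.
        for ctx in self.topology:
            if ctx.matches(utterance):
                self.goal_stack.append(ctx.name)  # goal management
                return ctx.response               # agent action
        if self.goal_stack:
            return f"We were discussing {self.goal_stack[-1]}; please go on."
        return "What would you like to know about the I/UCRC?"

dm = DialogManager([Context("membership", {"join", "member"},
                            "Centers are funded jointly by NSF and industry members...")])
print(dm.handle("How do I join a center?"))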
Relation to S&T
• Perform functions done by humans
  • Mixed reality training, AAR, concierges, interrogator training, intelligent tutoring systems
• Multimedia integration
  • Hybrid speech-and-pointing interface with the user
  • Text, graphics, and links to clarifying information or documents
• Customizable environments support changing mission requirements
  • Characteristics (age, gender, appearance) can be varied independently of domain knowledge to match each trainee
  • The avatar can represent a specific individual – a CO or DI
  • The surroundings the avatar occupies can be updated
Video
• <insert new video here>
Initial Evaluation
• Controlled experiment
  • 30 diverse students from UIC
  • The study displayed ten 30-second life-size videos of the AlexAvatar
  • Videos were paired to compare/contrast: eye contact, head motion, body motion, and pre-recorded vs. computer-generated voice
  • Baseline comparison to our previous avatar implementation from a year ago
• Test subjects preferred by a significant margin:
  • Body movement over a still body
  • An avatar with purposeful motions incorporating motion-capture data from Alex
  • More detailed and realistic textures
• No clear preference between pre-recorded and computer-generated voice
Summary
• Currently in year two of a three-year project
• Current results are encouraging; several challenges remain ahead
• Focusing on:
  • The avatar handling conversation with more than one human
  • Increasing capacity for understanding spoken language
  • Expanding the capabilities of the dialog system
  • Evaluating the effect of the “whiteboard” feature
• Expect to finish the second edition of the avatar soon, for demo in early January at an NSF conference
• Looking for partnerships with military applications