The Architecture Dream Team

Schloss Dagstuhl, Germany, October 2001.


Presentation Transcript


  1. The Architecture Dream Team. Schloss Dagstuhl, Germany, October 2001

  2. Would you build your dream house without a blueprint?

  3. What you hope to get

  4. … what you might get

  5. Today’s Conventional Architecture: Presentation; Dialog Control; Application Interface; Information, Applications, People; User(s)

  6. CHAMELEON Platform (IntelliMedia WorkBench), Paul McKevitt: NL parser; Speech recognizer; Dialogue manager; Gesture recognizer; Blackboard; Topsy; Speech synthesizer; Frame semantics; Laser pointer; Microphone array; Domain model

  7. Microsoft, Derek Jacoby: A Typical DrWho App; MIPAD Architecture

  8. Harry Bunt: Input Interpretation; Output Synthesis; Context (linguistic, semantic, physical, perceptual, cognitive, social); Pending Context; Context Management; Dialogue Management; API; Application

  9. Art Exploration, Oliviero Stock: explicit input (e.g., pointing); implicit input (e.g., movement); input analyzer; Physical space model; interaction history; composer engine; Hypermedia information; visitor models; presentation links and image to UI; Audio message to headphone

  10. COLLAGEN Sidner et al.

  11. IBM’s Responsive Information Architect (RIA), Michelle Zhou: Visual Designer; Language Designer; Presentation Broker; Media Producer; Models of: Design, Domain, User, Conversation, Environment; IRIS Info Server; user; Conversational Facilitator; speech; Multimodal Interpreter; gesture

  12. Interact, Kristiina Jokinen: Information Storage; Input Manager; Presentation Manager; ASR; TTS; Language Understanding; Dialogue Manager; Dialogue Agents/Acts (e.g., Q, A, State); Generator Agents; Topic Recognition; Task Agents/Acts; Database

  13. EMBASSI Conceptual Architecture
      • Z-Axis:
        • Underlying HW-Infrastructure
        • Software-Infrastructure (Agent / Distr. Comp. Middleware)
        • Functional building blocks of conceptual architecture (Multimodal Assistant Componentware, MAC)
        • Application-level Assistants (not shown)
      • XY-Plane of MAC
        • Dialogic Assistance
        • Effectual Assistance
        • Situational Assistance
      • Explicit and implied generic (= application-independent) ontologies, defining component interfaces

  14. SMARTKOM Wolfgang Wahlster

  15. DARPA Galaxy Communicator. The Galaxy Communicator Software Infrastructure (GCSI) is a distributed, message-based, hub-and-spoke infrastructure optimized for constructing spoken dialogue systems. Components: Hub; Language Generation; Text-to-Speech Conversion; Dialogue Management; Audio Server; Application Backend; Speech Recognition; Context Tracking; Frame Construction. Open source and documentation available at fofoca.mitre.org and sourceforge.net/projects/communicator
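The hub-and-spoke, message-based idea on slide 15 can be illustrated with a small sketch. This is not the GCSI API: the `Hub` class, the frame layout (a dict with `type` and `payload`), and the three toy servers are all hypothetical names chosen for illustration. The sketch only shows the routing pattern, in which servers never call each other directly and every message passes through the hub.

```python
from typing import Any, Callable, Dict, List

# Hypothetical frame format: a dict with a "type" and a "payload".
Frame = Dict[str, Any]
Handler = Callable[[Frame], Frame]


class Hub:
    """Routes frames between registered servers (the spokes)."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}
        self._routes: Dict[str, str] = {}  # frame type -> server name
        self.log: List[str] = []           # order in which servers were called

    def register(self, name: str, accepts: str, handler: Handler) -> None:
        self._servers[name] = handler
        self._routes[accepts] = name

    def dispatch(self, frame: Frame) -> Frame:
        # Keep routing until no server accepts the current frame type.
        # A server must return a different frame type, or this loops forever.
        while frame["type"] in self._routes:
            name = self._routes[frame["type"]]
            self.log.append(name)
            frame = self._servers[name](frame)
        return frame


def recognize(frame: Frame) -> Frame:
    # Stand-in for speech recognition: the "audio" payload is already text.
    return {"type": "text", "payload": frame["payload"].lower()}


def construct_frame(frame: Frame) -> Frame:
    # Stand-in for frame construction: first word is the action.
    words = frame["payload"].split()
    return {"type": "semantic_frame",
            "payload": {"action": words[0], "object": " ".join(words[1:])}}


def manage_dialogue(frame: Frame) -> Frame:
    sem = frame["payload"]
    return {"type": "reply",
            "payload": f"OK, I will {sem['action']} {sem['object']}."}


hub = Hub()
hub.register("speech_recognition", "audio", recognize)
hub.register("frame_construction", "text", construct_frame)
hub.register("dialogue_management", "semantic_frame", manage_dialogue)

result = hub.dispatch({"type": "audio", "payload": "SHOW FLIGHTS TO BOSTON"})
print(result["payload"])
print(hub.log)
```

Because routing lives only in the hub, a server can be swapped or added by changing one `register` call, which is the main appeal of this style for composing dialogue systems.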

  16. Definitions
      • Abstract Architecture
        • Components, connections (protocols), and constraints (IEEE definition)
        • Data/knowledge structures, data flow and protocols, control flow
      • Consider use cases, e.g.,
        • In-car navigation system
        • Desktop, kiosk, mobile device interaction
        • Media conversion
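One way to make the "components, connections, and constraints" definition on slide 16 concrete is to write an architecture down as data and check it. The sketch below is an assumption-laden illustration: the class names, the two example constraints, and the in-car navigation component names are all hypothetical and do not come from the slides or from the IEEE definition itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class Connection:
    source: str
    target: str
    protocol: str  # e.g. the message format spoken on this link


@dataclass
class Architecture:
    components: Set[str]
    connections: List[Connection]
    constraints: List[Callable[["Architecture"], bool]] = field(default_factory=list)

    def violations(self) -> List[str]:
        """Names of the constraints this architecture fails."""
        return [c.__name__ for c in self.constraints if not c(self)]


def endpoints_declared(arch: Architecture) -> bool:
    # Every connection must join two declared components.
    return all(c.source in arch.components and c.target in arch.components
               for c in arch.connections)


def no_self_loops(arch: Architecture) -> bool:
    return all(c.source != c.target for c in arch.connections)


# Hypothetical in-car navigation use case.
car = Architecture(
    components={"speech_recognizer", "dialogue_manager",
                "route_planner", "speech_synthesizer"},
    connections=[
        Connection("speech_recognizer", "dialogue_manager", "text"),
        Connection("dialogue_manager", "route_planner", "query"),
        Connection("dialogue_manager", "speech_synthesizer", "text"),
    ],
    constraints=[endpoints_declared, no_self_loops],
)

# Same system, plus a link to a component nobody declared.
broken = Architecture(
    components=car.components,
    connections=car.connections + [Connection("route_planner", "map_display", "image")],
    constraints=car.constraints,
)

print(car.violations())
print(broken.violations())
```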

  17. Requirements
      • Functional
        • Modality integration (input and output)
        • Situation-appropriate (user, task, application) real-time sensing/response (e.g., supporting barge-in, perceptual sensing/feedback)
        • Representation of level of granularity (modules and data structures)
        • Manage feedback: local and global, when/where?
        • Support incremental processing
        • Support incremental development (and scalability)
      • System/Technical
        • Support for processing/fusing multimodal input (e.g., parallel processing)
        • Modular, composable (possibly distributed processing)
        • Efficient implementation
        • Time scale, temporal and spatial resolution
        • Accessible (even partial) data structures
        • Open and extensible protocols
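The requirements on fusing multimodal input and on temporal resolution (slide 17) can be illustrated with a toy time-window fusion step. This is a minimal sketch under stated assumptions, not a method described in the slides: the `Event` type, the `fuse` function, the one-second window, and the nearest-in-time pairing rule are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    modality: str   # "speech" or "gesture"
    t: float        # timestamp in seconds
    content: str


def fuse(events: List[Event],
         window: float = 1.0) -> List[Tuple[Event, Optional[Event]]]:
    """Pair each speech event with the nearest unused gesture within `window` seconds."""
    speech = sorted((e for e in events if e.modality == "speech"),
                    key=lambda e: e.t)
    gestures = [e for e in events if e.modality == "gesture"]
    used = set()
    fused = []
    for s in speech:
        candidates = [g for g in gestures
                      if g not in used and abs(g.t - s.t) <= window]
        # Nearest gesture in time, or None when nothing falls inside the window.
        best = min(candidates, key=lambda g: abs(g.t - s.t), default=None)
        if best is not None:
            used.add(best)
        fused.append((s, best))
    return fused


events = [
    Event("speech", 0.2, "move that"),
    Event("gesture", 0.5, "point:lamp"),
    Event("speech", 1.4, "over there"),
    Event("gesture", 1.6, "point:table"),
    Event("speech", 5.0, "thanks"),
]

pairs = [(s.content, g.content if g else None) for s, g in fuse(events)]
print(pairs)
```

The window size is where the "time scale" requirement bites: too small and pointing gestures that slightly lead or lag the speech are missed, too large and unrelated gestures get attached to an utterance.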

  18. Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998): Media Input Processing; Interaction Management; Media/Mode Analysis (Language, Graphics, Gesture, Biometrics); Media Fusion; Discourse Modeling; Intention Recognition; User Modeling; Presentation Design; Media/Mode Design (Language, Graphics, Gesture, Animated Presentation Agent); Media Output Rendering; Application Interface; Dialog Control; Presentation; Representation and Inference (User Model, Task Model, Domain Model, Media Models, Discourse Model); Information, Applications, People; User(s)

  19. Architecture: Media Input Processing; Media Output Rendering; Interaction Management; Mode Coordination; Media/Mode Analysis; Multimodal Fusion; Multimodal Reference Resolution; Discourse Management; Context Management; Lexicon Management; Intention Recognition; Action Planning; User Modeling; User ID; Presentation Design; Media/Mode Design; Animated Presentation Agent; Layout; Language; Graphics; Gesture; Sound; Biometrics; Application Interface; Initiate; Terminate; Request; Respond; Select Content; Design; Integrate; Allocate; Coordinate; Information, Applications, People; User(s); Representation and Inference, States and Histories (Domain Model, Task Model, User Model, Discourse Model, Context Model, Media Models, Application Models)

  20. The Architecture Dream Team. Schloss Dagstuhl, Germany, October 2001

  21. Media Fusion: Media/Mode Analysis (Spoken Language, Lip Reading, Gesture); Media Fusion

  22. COLLAGEN, Sidner et al.: Speech interpretation; Speech; Agent; Planning and discourse; USER; Mel; ViaVoice; Window events; Application; Student Model
