IntelliMedia
TeleMorph & TeleTuras: Bandwidth determined Mobile MultiModal Presentation
Student: Anthony J. Solon
Supervisors: Prof. Paul Mc Kevitt, Kevin Curran
School of Computing and Intelligent Systems, Faculty of Engineering, University of Ulster, Magee
Aims of Research
• To develop an architecture, TeleMorph, that dynamically morphs between output modalities depending on available network bandwidth. The mobile device's output presentation (unimodal/multimodal) depends on:
  - available network bandwidth
  - network latency and bit error rate
  - mobile device display, available output abilities, memory, CPU
  - user modality preferences, cost incurred
  - user's cognitive load, determined by Cognitive Load Theory (CLT)
• Utilise Causal Probabilistic Networks (CPNs) to analyse the union of constraints, giving the optimal multimodal output presentation
• Implement TeleTuras, a tourist information guide for the city of Derry
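The morphing idea above can be illustrated with a minimal sketch. It is not the TeleMorph implementation: the per-modality bandwidth figures, the `Constraints` record and the greedy `select_modalities` function are all hypothetical names and values chosen for illustration, and the greedy rule stands in for the CPN-based decision the slides propose.

```python
from dataclasses import dataclass

# Hypothetical bandwidth needs per modality in kbps (illustrative values only).
REQUIRED_KBPS = {"text": 1, "speech": 16, "graphics": 40, "video": 300}


@dataclass
class Constraints:
    bandwidth_kbps: float         # currently available network bandwidth
    device_modalities: frozenset  # modalities the device can render
    user_preferences: tuple       # modalities in the user's order of preference


def select_modalities(c: Constraints) -> list:
    """Greedily keep each preferred modality that still fits the bandwidth budget."""
    remaining = c.bandwidth_kbps
    chosen = []
    for modality in c.user_preferences:
        cost = REQUIRED_KBPS[modality]
        if modality in c.device_modalities and cost <= remaining:
            chosen.append(modality)
            remaining -= cost
    return chosen


everything = frozenset(REQUIRED_KBPS)
prefs = ("video", "graphics", "speech", "text")
# At a GPRS-like 56 kbps the output is multimodal: ['graphics', 'speech']
print(select_modalities(Constraints(56, everything, prefs)))
# At a 2G-like 9.6 kbps it degrades to unimodal text: ['text']
print(select_modalities(Constraints(9.6, everything, prefs)))
```

The same request therefore yields a multimodal or a unimodal presentation depending only on the measured bandwidth, which is the behaviour the first aim describes.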
Objectives of Research
• Receive and interpret questions from the user
• Map questions to a multimodal semantic representation
• Match the multimodal representation to the knowledge base to retrieve an answer
• Map answers to a multimodal semantic representation
• Monitor user preference or client-side choice variations
• Query bandwidth status
• Detect client device constraints and limitations
• Combine the effect of all constraints imposed on the system using CPNs
• Generate the optimal multimodal presentation based on bandwidth constraint data
Wireless Telecommunications
• Generations of mobile networks:
  - 1G: analog voice service with no data services
  - 2G: circuit-based digital networks, capable of data transmission speeds averaging around 9.6 kbps
  - 2.5G (GPRS): technology upgrades to 2G, boosting data transmission speeds to around 56 kbps; allows packet-based "always on" connectivity
  - 3G (UMTS): digital multimedia, different infrastructure required, data transmission speeds of 144 kbps, 384 kbps or 2 Mbps
  - 4G: IP-based mobile/wireless networks, Wireless Personal Area Networks (PANs), "anywhere and anytime" ubiquitous services; speeds up to 100 Mbps
• Network-adaptive multimedia models:
  - Transcoding proxies
  - End-to-end approach
  - Combination approach
• Mobile/Nomadic computing
• Active networks
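To make the gap between generations concrete, the following sketch estimates how long one media item would take to download at the nominal rates quoted above. The 100 kB item size is an assumed example, the function name is hypothetical, and the calculation ignores latency, protocol overhead and bit errors, so real transfers would be slower.

```python
# Nominal data rates from the slide, in kilobits per second.
NOMINAL_KBPS = {
    "2G": 9.6,
    "2.5G (GPRS)": 56,
    "3G (UMTS, 384 kbps)": 384,
    "3G (UMTS, 2 Mbps)": 2000,
}


def transfer_seconds(size_kilobytes: float, rate_kbps: float) -> float:
    """Ideal transfer time: size in kB times 8 bits per byte, divided by the rate."""
    return size_kilobytes * 8 / rate_kbps


# Assumed example: a 100 kB map image.
for network, rate in NOMINAL_KBPS.items():
    print(f"{network}: {transfer_seconds(100, rate):.1f} s")
```

Under these assumptions the same image takes well over a minute on 2G and under a second at the top 3G rate, which is why a presentation that suits one network can be unusable on another.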
Mobile Intelligent MultiMedia Systems
[Figure: screenshot of a mobile interface with the prompt "Please select a parking place from the map"]
• SmartKom (Wahlster, 2003)
  - Mobile, Public, Home/office
  - Saarbrücken, Germany
  - Combines speech, gesture and facial expressions on input & output
  - Integrated trip planning, Internet access, communication applications, personal organising
• VoiceLog (BBN, 2002)
  - BBN Technologies in Cambridge, Massachusetts
  - Views/diagrams of military vehicles and direct connection to support
  - Damage identified & ordering of parts using diagrams
• MUST (Almeida et al., 2002)
  - MUltimodal multilingual information Services for small mobile Terminals
  - EURESCOM, Heidelberg, Germany
  - Future multimodal and multilingual services on mobile networks
Intelligent MultiMedia Presentation
• Flexibly generate various presentations to meet individual requirements of: 1) users, 2) situations, 3) domains
• Intelligent MultiMedia Presentation can be divided into the following processes:
  - determination of communicative intent
  - content selection
  - structuring and ordering
  - allocation to particular media
  - realisation in specific media
  - coordination across media
  - layout design
• Key research problems:
  - Semantic representation
  - Fusion, integration & coordination
• Semantic representation: represents the meaning of media information
  - Frame-based representations: CHAMELEON, REA
  - XML-based representations: M3L (SmartKom), MXML (MUST), SMIL, MPEG-7
• Fusion, integration & coordination of modalities
  - Integrating different media in a consistent and coherent manner
  - Multimedia coordination leads to effective integrated multiple media in output
• Synchronising modalities
  - Time threshold between modalities, e.g. Input: "What building is this?", Output: "This is the Millennium Forum"
  - Not synchronised => side effect can be contradiction
  - SMIL modality synchronisation and timing elements
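As an illustration of the SMIL synchronisation elements mentioned above, the sketch below builds a small SMIL document in which a spoken answer and an image are played in parallel inside a `<par>` element. The `build_presentation` function and the media file names are hypothetical; only the SMIL element and attribute names (`smil`, `body`, `par`, `audio`, `img`, `src`, `dur`) come from the SMIL specification.

```python
import xml.etree.ElementTree as ET


def build_presentation(audio_src: str, image_src: str) -> str:
    """Return a SMIL document that plays audio and shows an image in parallel."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    # Children of <par> start together, keeping speech and picture synchronised.
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "audio", src=audio_src)
    ET.SubElement(par, "img", src=image_src, dur="5s")
    return ET.tostring(smil, encoding="unicode")


# Hypothetical media files for the answer "This is the Millennium Forum".
print(build_presentation("answer.wav", "forum.png"))
```

Replacing `<par>` with `<seq>` would play the two items one after the other, which is the kind of unsynchronised output the slide warns can appear contradictory.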
Intelligent MultiMedia Presentation Systems
• Automatically generate coordinated intelligent multimedia presentations
• User-determined presentation:
• COMET (Feiner & McKeown, 1991)
  - COordinated Multimedia Explanation Testbed
  - Generates instructions for maintenance and repair of military radio receiver-transmitters
  - Coordinates text and 3D graphics of mechanical devices
• WIP (Wahlster et al., 1992)
  - Intelligent multimedia authoring system
  - Presents instructions for assembling/using/maintaining/repairing devices (e.g. espresso machines, lawn mowers, modems)
• IMPROVISE (Zhou & Feiner, 1998)
  - Graphics generation system
  - Constructive/parameterised graphics generation approaches
  - Uses an extensible formalism to represent a visual lexicon for graphics generation
Intelligent MultiMedia Interfaces & Agents
• Intelligent multimedia interfaces
  - Parse integrated input and generate coordinated output
  - XTRA
    - Interface to an expert system providing tax form assistance
    - Generates & interprets natural language text and pointing gestures automatically; relies on pre-stored graphics
    - Displays relevant tax form and natural language input/output panes
• Intelligent multimedia agents
  - Embodied Conversational Agents (e.g. MS Agent, REA)
  - Natural human face-to-face communication: speech, facial expressions, hand gestures & body stance
  - MS Agent
    - Set of programmable services for interactive presentation
    - Speech, gesture, audio & text output; speech & haptic input
Project Proposal
• Research and implement a mobile intelligent multimedia presentation architecture called TeleMorph
• Dynamically generates a multimedia presentation determined by the bandwidth available, and also by other constraints:
  - Network latency, bit error rate
  - Mobile device display, available output abilities, memory, CPU
  - User modality preferences, cost incurred
  - Cognitive Load Theory (CLT)
• Causal Probabilistic Networks (CPNs) for analysing the union of constraints, giving the optimal multimodal output presentation
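The role of a CPN in combining constraints can be sketched with a toy network evaluated by brute-force enumeration. This does not use HUGIN or reproduce the project's actual network: the two parent variables, all probabilities and the `p_video_ok` function are invented for illustration only.

```python
from itertools import product

# Hypothetical prior beliefs about two constraint variables.
P_BANDWIDTH = {"high": 0.3, "low": 0.7}
P_LOAD = {"high": 0.4, "low": 0.6}  # user's cognitive load

# Hypothetical conditional table: P(video is a suitable output | bandwidth, load).
P_VIDEO_OK = {
    ("high", "low"): 0.9,
    ("high", "high"): 0.5,
    ("low", "low"): 0.2,
    ("low", "high"): 0.05,
}


def p_video_ok(bandwidth=None, load=None) -> float:
    """P(video suitable | evidence), summing over any unobserved parent."""
    total = 0.0
    norm = 0.0
    for b, l in product(P_BANDWIDTH, P_LOAD):
        if bandwidth is not None and b != bandwidth:
            continue  # inconsistent with the observed bandwidth
        if load is not None and l != load:
            continue  # inconsistent with the observed cognitive load
        weight = P_BANDWIDTH[b] * P_LOAD[l]
        total += weight * P_VIDEO_OK[(b, l)]
        norm += weight
    return total / norm


print(p_video_ok())                  # no evidence yet
print(p_video_ok(bandwidth="low"))   # after a low bandwidth measurement
print(p_video_ok(bandwidth="high", load="low"))
```

Observing a constraint simply restricts the enumeration, so each new measurement (bandwidth, device status, user preference) updates the belief about which output modality is appropriate.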
• Implement TeleTuras, a tourist information guide for the city of Derry, providing a testbed for TeleMorph incorporating:
  - route planning, maps, spoken presentations, graphics of points of interest & animations
  - output modalities used & effectiveness of communication
• TeleTuras examples:
  - "Where is the Millennium Forum?"
  - "How do I get to the Guildhall?"
  - "What buildings are of interest in this area?"
  - "Is there a Chinese restaurant in this area?"
Data flow of TeleMorph
[Figures: high-level data flow; media analysis data flow]
Software Analysis
• Client output:
  - SMIL media player (InterObject)
  - Java Speech API Markup Language (JSML)
  - Autonomous agent (MSAgent)
• Client input:
  - Java Speech API Grammar Format (JSGF)
  - J2ME graphics APIs
  - J2ME networking
• Client device status:
  - SysInfo MIDlet (type/memory/screen/protocols/input abilities/CPU speed)
• TeleMorph server tools:
  - SMIL & MPEG-7
  - HUGIN (CPNs)
  - JATLite/OAA
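The client device status step can be pictured as a small record of the properties listed for the SysInfo MIDlet, from which the server derives what the device is able to render. The sketch below is written in Python for brevity rather than J2ME; the `DeviceStatus` fields, the thresholds and the `renderable_modalities` function are hypothetical and do not reflect the actual SysInfo output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceStatus:
    device_type: str
    free_memory_kb: int
    screen_width: int
    screen_height: int
    has_audio_output: bool
    cpu_mhz: int


def renderable_modalities(status: DeviceStatus) -> set:
    """Map reported device properties to output modalities (assumed thresholds)."""
    modalities = {"text"}  # assumption: every client can show text
    if status.has_audio_output:
        modalities.add("speech")
    if status.screen_width >= 120 and status.screen_height >= 120:
        modalities.add("graphics")
    # Assumed rule: video needs a usable screen plus spare memory and CPU.
    if ("graphics" in modalities
            and status.free_memory_kb >= 1024
            and status.cpu_mhz >= 100):
        modalities.add("video")
    return modalities


phone = DeviceStatus("phone", 512, 128, 160, True, 50)
print(sorted(renderable_modalities(phone)))  # ['graphics', 'speech', 'text']
```

A set produced this way could then serve as one of the device constraints that the server combines with bandwidth and user preferences.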
Conclusion
• A Mobile Intelligent MultiModal Presentation Architecture called TeleMorph will be developed
• Dynamically morphing between output modalities depending on available network bandwidth in conjunction with other relevant constraints
• CPNs for analysing the union of constraints, giving the optimal multimodal output presentation
• TeleTuras will be used as a testbed for TeleMorph
• Corpora of questions to test TeleTuras (prospective users/tourists)