Working group on multimodal meaning representation

Dagstuhl workshop, Oct. 2001
http://www.dagstuhl.de/DATA/Title/01441.html

Scope
• What should we consider as meaning?
  • How processing the input leads to an update of the system's information state (domain model, discourse model, user model, etc.)
• Such a representation:
  • Should support both interpretation and generation
  • Should support any kind of multimodal input and output
  • Should support the variety of semantic theories

Objectives
• Provide interface formats within a multimodal (MM) dialogue architecture
  • Incremental construction (reference interpretation, etc.) up to a final representation (e.g. fixed frame à la MUC) or system action/feedback
  • Should also be a basis for defining annotation schemes for MM semantic content
• Specification and comparison of application-specific representations
  • Towards a framework allowing one to compare existing representations (e.g. M3L) or define a new one, while ensuring some level of interoperability between them

What it is not
• A domain model representation, or an ontology
  • OIL, DAML, Topic Maps, etc. already exist for that
• A representation of lower-level linguistic or gestural information (e.g. syntax)
  • Some features may be percolated, though, or pointed to…
• A representation of the underlying processes
  • Focus on the output of what is done by a given module

Basic constraints
• Uniformity (one representation for various types of inputs and outputs)
• Incrementality (usable at various stages)
  • Before/after fusion, semantic/pragmatic aspects
• Extensibility
  • Method for designing schemas (XML or others), rather than one specific schema
• Clear and explicit semantics

Methodology
• Basic components
  • Represent the general organization of any semantic structure
  • Parameterized by
    • data categories taken from a common registry
    • application-specific data categories
• General mechanisms
  • To make the thing work
• General categories
  • Descriptive categories available to all formats

Basic components (1)
• Temporal structures (“events”)
  • Dialogue turns/utterances
  • Gestures
  • Actions on/in the task
• Referential structures (“participants”)
  • Individuals and objects participating in an event
  • Comprises spatial structures
• Propositional content (BG)

Basic components (2)
• Restrictions (on temporal and referential structures)
  • E.g. gesture types, linguistic modifiers, dialogue acts, etc.
• Dependency structures (linking events and referential structures)
  • E.g. participant roles (cf. AGENT-SOURCE-GOAL), discourse/rhetorical structure, temporal relations
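
The basic components listed on these two slides can be pictured as plain data structures. The following is only a minimal sketch under stated assumptions: the class names `Event`, `Participant`, `Relation` and `SemRep`, and the helper `roles_of`, are hypothetical and are not defined by the working group. Restrictions are stored as free-form key/value pairs.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Temporal structure: a dialogue turn, a gesture, or a task action."""
    id: str
    restrictions: dict = field(default_factory=dict)


@dataclass
class Participant:
    """Referential structure: an individual or object taking part in an event."""
    id: str
    restrictions: dict = field(default_factory=dict)


@dataclass
class Relation:
    """Dependency structure linking a source structure to a target structure."""
    source: str
    target: str
    role: str


@dataclass
class SemRep:
    """Hypothetical container for one semantic representation."""
    id: str
    events: list = field(default_factory=list)
    participants: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def roles_of(self, event_id):
        """Map each role name to the id of the structure filling it for one event."""
        return {r.role: r.source for r in self.relations if r.target == event_id}


# "I want to go ...": one event, one participant, one agent relation.
rep = SemRep(
    id="rep1",
    events=[Event("e1", {"tense": "present", "evtType": "wanttogo"})],
    participants=[Participant("x", {"num": "sing"})],
    relations=[Relation("x", "e1", "agent")],
)
print(rep.roles_of("e1"))  # {'agent': 'x'}
```

Keeping restrictions as open dictionaries reflects the idea that the structures stay fixed while the data categories are chosen per format.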

Example
• Typology of gesture types (registry)
  • Communicative gestures (wave)
  • Designation gestures
  • Shrug
  • Nod
  • Movement attributes (intensity, etc.)
• One given format will choose among these, or even define its own categories
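
To make the "choose among these, or define its own" idea concrete, here is a small sketch. It assumes a flat registry of gesture types taken from the slide; the names `GESTURE_REGISTRY` and `define_format` and the application-specific category `circling` are hypothetical.

```python
# Hypothetical common registry, flattened from the gesture types on the slide.
GESTURE_REGISTRY = {"communicative", "designation", "shrug", "nod"}


def define_format(chosen, own=()):
    """Return a format's gesture types: registry picks plus its own categories.

    Raises ValueError if a supposedly shared category is not in the registry.
    """
    unknown = set(chosen) - GESTURE_REGISTRY
    if unknown:
        raise ValueError(f"not in registry: {sorted(unknown)}")
    return set(chosen) | set(own)


# A format that reuses two registry categories and adds one of its own.
pointing_format = define_format({"designation", "nod"}, own={"circling"})
print("designation" in pointing_format)  # True
print("shrug" in pointing_format)        # False
```

Two formats built this way remain comparable on the categories they share through the registry, which is the interoperability point made in the objectives.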

General mechanisms
• Links
  • Internal links
  • To lower levels (syntactic structures, prosodic cues, gestural trajectories, etc.)
  • To domain model (types and instances)
• Alternatives (cf. ambiguities)
  • E.g. disjunction of internal links
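
The alternatives mechanism can be sketched as a disjunction of candidate values, each carrying a certainty score. This is an illustration only: the `Alternative` class and the helpers `best` and `prune` are hypothetical, and the two dialogue-act values with their scores are borrowed from the Paris/Nancy example in this deck.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    """One member of a disjunction, with the producer's certainty."""
    value: str
    cert: float


def best(alternatives):
    """Return the most certain alternative."""
    return max(alternatives, key=lambda a: a.cert)


def prune(alternatives, threshold):
    """Keep only alternatives at or above a certainty threshold."""
    return [a for a in alternatives if a.cert >= threshold]


dial_acts = [Alternative("Order", 0.8), Alternative("Inform", 0.3)]
print(best(dial_acts).value)                   # Order
print([a.value for a in prune(dial_acts, 0.5)])  # ['Order']
```

A consuming module can either commit to the best candidate or keep the pruned disjunction for later stages, in line with the incrementality constraint.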

General categories
• Architectural
  • Producer (consumer?) of the information, confidence, devices
• Environmental
  • Time stamps, spatial information (speaker’s position, graphical configurations, gestural trajectories, etc.)
• Interactional
  • Speaker (user state?), other addressees, etc.

Combining basic components and data categories
Just to illustrate things…

Example
<semRep id="rep1">
  <event id="e0">
    <cat>utterance</cat>
    <!-- pointer to speaker's characteristics -->
    <speaker target="Peter"/>
    <adressee target="System"/>
  </event>
  <event id="e1">
    <tense>present</tense>
    <evtType>wanttogo</evtType>
    …
  </event>
  <participant id="x">
    <num>sing</num>
  </participant>
  <relation source="x" target="e1">
    <role>agent</role>
  </relation>
</semRep>
• In black: basic components and mechanisms (meta-model of semantic representation)
• In blue: parameter components chosen from reference registries
  • Categories
  • Values
Peter: I want to go …

<semRep id="rep1">
  <event id="e0">
    <evtCat>utterance</evtCat>
    <speaker target="Peter"/>
    <adressee target="System"/>
    <alt>
      <dialAct cert="0.8">Order</dialAct>
      <dialAct cert="0.3">Inform</dialAct>
    </alt>
  </event>
  <event id="e1">
    <tense>present</tense>
    <voice>active</voice>
    <wh>none</wh>
    <evtType>wanttogo</evtType>
    …
  </event>
  <participant id="x">
    <lex>I</lex>
    <synCat>Pronoun</synCat>
    <num>sing</num>
    <pers>first</pers>
    …
  </participant>
  <participant id="y">
    <lex>Paris</lex>
    <synCat>ProperNoun</synCat>
    <pers>third</pers>
    …
  </participant>
  <participant id="z">
    <lex>Nancy</lex>
    <synCat>ProperNoun</synCat>
    <pers>third</pers>
    …
  </participant>
  <relation source="x" target="e1">
    <role>agent</role>
  </relation>
  <relation source="y" target="e1">
    <role>source</role>
  </relation>
  <relation source="z" target="e1">
    <role>goal</role>
  </relation>
</semRep>
I want to go from Paris to Nancy
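
As a rough illustration of how a consuming module might read such a representation, the sketch below parses a trimmed copy of the example with Python's standard `xml.etree.ElementTree`. It is not part of the proposal. It assumes straight quotes and matching end tags, and it assumes the goal relation is meant to point at participant `z` (Nancy).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example; the goal relation is assumed to target "z".
SEMREP = """
<semRep id="rep1">
  <event id="e0">
    <evtCat>utterance</evtCat>
    <alt>
      <dialAct cert="0.8">Order</dialAct>
      <dialAct cert="0.3">Inform</dialAct>
    </alt>
  </event>
  <event id="e1">
    <evtType>wanttogo</evtType>
  </event>
  <participant id="x"><lex>I</lex></participant>
  <participant id="y"><lex>Paris</lex></participant>
  <participant id="z"><lex>Nancy</lex></participant>
  <relation source="x" target="e1"><role>agent</role></relation>
  <relation source="y" target="e1"><role>source</role></relation>
  <relation source="z" target="e1"><role>goal</role></relation>
</semRep>
"""

root = ET.fromstring(SEMREP)

# Index participants by id so relations can be resolved to surface forms.
lex = {p.get("id"): p.findtext("lex") for p in root.iter("participant")}

# Collect the participant roles attached to event e1.
roles = {
    r.findtext("role"): lex[r.get("source")]
    for r in root.iter("relation")
    if r.get("target") == "e1"
}

# Resolve the <alt> disjunction by picking the most certain dialogue act.
best_act = max(root.iter("dialAct"), key=lambda d: float(d.get("cert")))

print(roles)          # {'agent': 'I', 'source': 'Paris', 'goal': 'Nancy'}
print(best_act.text)  # Order
```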

<semRep id="rep1">
  <event id="e0">
    <evtCat>utterance</evtCat>
    <agent target="Peter"/>
    <adressee target="System"/>
    <dialAct>Order</dialAct>
  </event>
  <event id="e1">
    <tense>present</tense>
    <voice>active</voice>
    <wh>none</wh>
    <evtType>wanttogo</evtType>
    …
  </event>
  <event id="e2">
    <evtCat>gestural</evtCat>
    <agent target="Peter"/>
    <when>2001-11-1:xxxxxx</when>
    <gestType>designation</gestType>
    <graphContext target="ctxt23"/>
  </event>
  <participant id="x">
    <lex>I</lex>
    …
  </participant>
  <participant id="y">
    <lex>here</lex>
    <synCat>adverb</synCat>
  </participant>
  <participant id="z">
    <lex>there</lex>
    <synCat>adverb</synCat>
  </participant>
  <relation source="y" target="e2">
    <MMLink>co-designation</MMLink>
  </relation>
</semRep>
I want to go from here to there
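
The multimodal link in this example can be followed programmatically. The sketch below is one possible reading, assuming that `<graphContext>` is an empty element and that a `co-designation` relation ties a deictic participant to a gestural event. The helper name `co_designations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example, keeping only what the link resolution needs.
SEMREP = """
<semRep id="rep1">
  <event id="e2">
    <evtCat>gestural</evtCat>
    <gestType>designation</gestType>
    <graphContext target="ctxt23"/>
  </event>
  <participant id="y"><lex>here</lex></participant>
  <participant id="z"><lex>there</lex></participant>
  <relation source="y" target="e2"><MMLink>co-designation</MMLink></relation>
</semRep>
"""


def co_designations(root):
    """List (word, gesture type, graphical context) for each co-designation link."""
    events = {e.get("id"): e for e in root.iter("event")}
    lex = {p.get("id"): p.findtext("lex") for p in root.iter("participant")}
    pairs = []
    for rel in root.iter("relation"):
        if rel.findtext("MMLink") == "co-designation":
            gesture = events[rel.get("target")]
            pairs.append((
                lex[rel.get("source")],
                gesture.findtext("gestType"),
                gesture.find("graphContext").get("target"),
            ))
    return pairs


root = ET.fromstring(SEMREP)
print(co_designations(root))  # [('here', 'designation', 'ctxt23')]
```

In this trimmed copy only "here" is linked to a gesture, so "there" stays unresolved, which is the kind of pre-fusion state the incrementality constraint is meant to allow.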

Future work
• SIGSEM Working group on meaning representations (ACL)
• Liaison with ISO TC37/SC4 - linguistic resources
• Preparation of a working draft
• Liaison with ISLE
• Liaison with SIGMedia and SIGDial
• W3C/VoiceXML
