This workshop talk questions whether NLG can be decomposed into independent STEC modules, and discusses funding models and empirical research in NLG, using GenIE as an example. Topics include discourse, user modeling, media and presentation, corpus and controlled studies, and input/output representation problems. The GenIE work emphasizes making arguments comprehensible to a lay audience.
NLG STEC Workshop
April 20-21, 2007
Arlington, VA
Nancy Green
Univ. of North Carolina Greensboro, USA

NLG Pipeline Model & STEC
• Pro-STEC assumptions:
  • (All/most/worth-funding) NLG can be decomposed into well-defined, independent STEC modules such that improving each one will advance NLG
  • Input/output representation for STEC is non-controversial

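The first assumption can be pictured as a chain of swappable modules with agreed interfaces. The following is only a minimal sketch of that idea, not a description of any real system: the stage names, the `Message` and `SentencePlan` types, and the toy data are all hypothetical, and a real pipeline would need far richer representations.

```python
from typing import Callable, Dict, List

# Hypothetical interface types; a real STEC would have to standardize these.
Message = Dict[str, str]      # one unit of content to express
SentencePlan = List[str]      # ordered phrases for one sentence


def select_content(facts: List[Message]) -> List[Message]:
    """Toy content selection: keep only facts flagged as relevant."""
    return [f for f in facts if f.get("relevant") == "yes"]


def plan_sentence(msg: Message) -> SentencePlan:
    """Toy sentence planning: fixed subject-verb-object order."""
    return [msg["subject"], msg["verb"], msg["object"]]


def realize(plan: SentencePlan) -> str:
    """Toy surface realization: join phrases, capitalize, add a period."""
    text = " ".join(plan)
    return text[0].upper() + text[1:] + "."


def run_pipeline(
    facts: List[Message],
    content: Callable[[List[Message]], List[Message]] = select_content,
    planner: Callable[[Message], SentencePlan] = plan_sentence,
    realizer: Callable[[SentencePlan], str] = realize,
) -> str:
    """Compose the stages; any stage can be replaced if it keeps its interface."""
    return " ".join(realizer(planner(m)) for m in content(facts))


facts = [
    {"subject": "the test", "verb": "confirmed", "object": "the diagnosis",
     "relevant": "yes"},
    {"subject": "the clinic", "verb": "opened", "object": "in 1990",
     "relevant": "no"},
]
letter = run_pipeline(facts)
```

Swapping in a different `realizer` only works because every stage agrees on `SentencePlan`, which is exactly the input/output agreement the second assumption takes for granted.
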
NLG ‘Pipeline’ = Tip of Iceberg
• Below the pipeline (diagram labels): Media/Presentation-related KR&R, Discourse KR&R, Domain Communication KR&R, User Model KR&R
• Who will pay for NLG research outside of the classical pipeline? It is essential empirical research and a major cost, but the concern is that it would fall outside of the STEC funding model

Example NLG System KR&R
GenIE: generates letters to genetics clinic patients; the goal is to justify medical experts’ conclusions such that all arguments are comprehensible to a lay person
• Discourse: argumentation
• Domain Communication: conceptual causal model underlying expert-lay communication (not domain model)
• User Model: model of appraisal
• Media/Presentation: how presentation affects argument comprehension

Lesson from GenIE
• NLG Pipeline = global control + sentence planning/realization
  • can use existing surface realizers, standard domain ontology, and lexical resources
• Main cost has been KR&R modules; mainly empirical work:
  • Goal: find non-domain-specific principles/guidelines to optimize lay audience’s comprehension of arguments
  • Corpus studies: very useful but not sufficient
  • Controlled studies: necessary, and cannot afford to wait for other disciplines (HCI, learning sciences, etc.) to do them for us

GenIE Corpus Studies
• Intercoder reliability of content annotation scheme: used to justify domain communication model
• Argumentation schemes (non-domain-specific, both normative and affective)
• Stylistic (lexical/syntactic) features of author perspective
• Argument presentation features (order, cue words, explicitness)

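The slides do not say which agreement statistic was used for the intercoder reliability study. As an illustration only, Cohen's kappa is one common chance-corrected measure for two coders; the sketch below computes it for hypothetical labels (`finding`, `diagnosis`) that are not taken from the GenIE annotation scheme.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement from each coder's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of four text segments by two coders.
coder_a = ["finding", "finding", "diagnosis", "diagnosis"]
coder_b = ["finding", "diagnosis", "diagnosis", "diagnosis"]
kappa = cohen_kappa(coder_a, coder_b)
```

Here the coders agree on 3 of 4 items (0.75) while chance agreement is 0.5, so kappa is 0.5. Raw agreement alone would overstate reliability, which is why a chance-corrected figure is the usual basis for defending an annotation scheme.
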
GenIE Controlled Studies
• How multimedia layout and cross-media cue words affect comprehension
• How argument presentation (explicit vs. implied claim, cue words) affects recognition of argument components (Claim vs. Data) and of the dependence of the final claim on intermediate claims

STEC Input/Output Problem
• Different input representations needed for different types of output; e.g. compare requirements for:
  • Fixed-format text (original scope of NLG)
  • Task-appropriate, user-friendly text format (e.g. line length, paragraphing, headings, font)
  • Text and (reported or quoted) dialogue in story
  • Dialogue spoken by animated emoting conversational agent
  • Integrated text and images or data graphics
  • Text referring to physical or visual properties of presentation (‘The red line in Fig. 2 shows sales in 2002.’)

Big Challenges
Empirical research to test computation-oriented, general theories, principles, guidelines to answer:
• What makes a “text” (i.e. including spoken dialogue, MMPs, etc.)
  • Coherent? In story dialogue, believable?
  • User-friendly? Task-appropriate?
  • Comprehensible? Pedagogically effective?
  • Entertaining (suspenseful, funny, etc.)?

Ex. Challenges (cont.)
• How does channel change answer?
  • E.g. HCI research: cannot assume findings for paper apply to computer screen
• How does length change answer?
  • E.g. learning sciences: 300-word summary vs. 3-page science argument for middle school
• How do individual differences matter?
  • E.g. cognitive impairments, affect

Conclusions
• Need some NLG research with massively interdisciplinary view: cognitive science, communication studies, etc.
• Need some NLG research motivated by search for answers to general questions such as above
• Will STEC approach effectively kill the above kind of NLG research?