What's next? Target Concept Identification and Sequencing
Lee Becker 1, Rodney Nielsen 1,2, Ifeyinwa Okoye 1, Tamara Sumner 1 and Wayne Ward 1,2
1 Center for Computational Language and EducAtion Research (CLEAR), University of Colorado at Boulder
2 Boulder Language Technologies
Goals:
• Introduce Target Concept Identification (TCI)
  • Potentially the most important QG-related task
• Encourage discussion related to TCI
  • Define a TCI-based shared task
• Illustrate viability
  • via baseline and straw-man systems
• Challenge the QG community to consider TCI
Overview
• Define the Target Concept Identification and Sequencing tasks
• Describe component and baseline systems
• Discuss the utility of these subtasks in the context of the full Question Generation task
• Final thoughts
QG as a Dialogue Process
• Question Generation:
  • is much more than surface-form realization
  • depends not only on the text or knowledge source
  • but also on the context of all previous interactions
The Stages of Question Generation
• What to talk about next? (e.g., direction of flow, or series circuits)
• How to ask it?
  • Definition question
  • Prediction question
  • Hypothesis question
• Final natural language output: "What will happen to the flow of electricity if you flip the battery around?"
Target Concept Identification
• Out of the limitless number of concepts related to the current dialogue, which one should be used to construct the question?
• Inputs:
  • Knowledge sources
  • Dialogue context / interaction history
• Output:
  • The next target concept
• Subtasks:
  • Key Concept Identification
  • Concept Relation Identification and Classification
  • Concept Sequencing
Key Concept Identification
• Goal: extract important concepts from a knowledge source (plain text, structured databases, etc.)
• Want not just the concepts, but the concepts most critical to learning
• Preferably, identify core versus supporting concepts
Key Concept Identification: CLICK
• CLICK - Customized Learning Service for Concept Knowledge [Gu et al. 2008]
• Personalized learning system
• Utilizes Key Concept Identification to:
  • Assess the learner's work
  • Recommend digital library resources to help the learner remedy diagnosed deficiencies
• Driven by concept maps:
  • An expert concept map
  • Automatically derived concept maps
Key Concept Identification: CLICK: Building a gold-standard concept map
• Source data:
  • 20 digital library resources
  • Textbook-like web text
  • Collectively considered to contain all the information a high school graduate should know about earthquakes and plate tectonics
Key Concept Identification: CLICK: Building a gold-standard concept map
• Experts were asked to extract, and potentially paraphrase, spans of text (concepts) from each resource:
  • Concept 19: Mantle convection is the process that carries heat from the core and up to the crust and drives the plumes of magma that come up to the surface and makes islands like Hawaii.
  • Concept 21: asthenosphere is hot, soft, flowing rock
  • Concept 176: The Theory of Plate tectonics
  • Concept 224: a plate is a large, rigid slab of solid rock
Key Concept Identification: CLICK: Building a gold-standard concept map
• Experts linked and labeled concepts (i.e., built a map) for each of the 20 resources
• Open-ended label vocabulary:
  • Discourse-style relations: elaborates, cause, defines, evidence, etc.
  • Domain-specific relations: technique, type of, indicates, etc.
• The 10 most frequent labels account for 64% of all labels
Key Concept Identification: CLICK: Building a gold-standard concept map
• Experts individually combined the 20 resource maps to span the whole domain
• Experts then collaboratively combined their individual maps to create a final concept map
Key Concept Identification: CLICK: Automated Approach
(figure)
Key Concept Identification: Concept Extraction
• COGENT system [De la Chica 2008]
  • MEAD [Radev et al. 2004] - multi-document summarizer
  • Supplemented with additional features to tie into educational goals
• Run on the 20 digital library resources used to construct the expert concept map
• Extracted concepts evaluated against expert map concepts:
  • ROUGE-L F-measure: 0.6001
  • Cosine similarity: 0.8325
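As a rough illustration of the cosine-similarity check used to score extracted concepts against expert concepts, the sketch below compares two texts over simple term-count vectors. The tokenizer and example strings are hypothetical, not the actual COGENT evaluation setup.

```python
import math
import re
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity over simple term-count vectors
    (illustrative stand-in; the real preprocessing may differ)."""
    va = Counter(re.findall(r"[a-z]+", a.lower()))
    vb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

# Toy pair: a system-extracted span vs. an expert concept
extracted = "the asthenosphere is hot soft flowing rock"
expert = "asthenosphere is hot, soft, flowing rock"
print(round(cosine(extracted, expert), 3))  # high similarity, near 1.0
```

Averaging such scores over the best-matching expert concept for each extraction yields a corpus-level similarity figure like the 0.8325 reported above.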
Key Concept Identification: Concept Relation ID and Classification
• Concept Relation Identification (a.k.a. Link Identification)
  • Given two concepts, determine if they should be linked
• Concept Relation Classification (a.k.a. Link Classification)
  • Given a linked pair of concepts, assign a label describing their relationship
• This information can be useful both for concept sequencing and for question realization
• Could potentially comprise a separate task
Key Concept Identification: Concept Relation Identification
• Given two concepts, determine if they should be linked
• Approach [De la Chica et al. 2008]:
  • SVM-based classifier
  • Lexical, syntactic, semantic, and document-structure features
• Performance:
  • P = 0.2061
  • R = 0.0153
• The data set is extremely unbalanced:
  • The majority classification (no-link) overwhelmingly dominates
• A good starting point for a challenging task worthy of further investigation
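The system described here uses an SVM over lexical, syntactic, semantic, and document-structure features. As a dependency-free stand-in, the sketch below trains a class-weighted perceptron on toy concept-pair features to show how the no-link majority can be counteracted; the feature names, distributions, and weights are all invented for illustration.

```python
import random

random.seed(0)

def make_pair():
    """Toy concept pair: [lexical_overlap, bias]; label 1 = linked."""
    linked = random.random() < 0.1  # links are the rare class
    overlap = random.uniform(0.5, 1.0) if linked else random.uniform(0.0, 0.4)
    return [overlap, 1.0], 1 if linked else 0

data = [make_pair() for _ in range(300)]

# Class-weighted perceptron: mistakes on the rare "link" class get
# larger updates, mirroring how one counteracts the no-link majority.
w = [0.0, 0.0]
for _ in range(50):
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
        if pred != y:
            lr = 9.0 if y == 1 else 1.0   # upweight the minority class
            sign = 1 if y == 1 else -1
            w = [wi + sign * lr * xi for wi, xi in zip(w, x)]

predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
print(predict([0.9, 1.0]), predict([0.1, 1.0]))  # high overlap -> link
```

Without the class weighting (or an equivalent mechanism in the SVM), a classifier on this kind of skewed data tends toward the trivial all-no-link answer, which is consistent with the low recall reported above.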
Key Concept Identification: Concept Relation Classification
• Towards a gold standard:
  • Experts labeled links on concept maps [Ahmad et al. 2008]
  • Discourse-like labels: cause, evidence, defines, elaborates, etc.
  • Domain-specific labels: technique, type of, slower than
  • Vocabulary unspecified
  • The 10 most frequent labels account for 64% of the links
  • With some refinement, RST or Penn Discourse labels could be used to create a gold standard
• Next steps:
  • Create a more reliable link classifier
  • Develop a link relation classifier
Key Concept Identification: Graph Analysis
• Given a concept map (graph), identify the key or central concepts (versus supporting concepts)
• Approach:
  • Graph analysis using the PageRank + HITS algorithms
  • Key concepts are the intersection of:
    • Concepts selected by PageRank + HITS
    • Concepts with the highest ratio of incoming vs. outgoing links
    • Concepts with the highest term density
• Evaluation:
  • No gold-standard set of core concepts
  • Experts were asked to identify subtopic regions on the concept map
    • Earthquake types, tsunamis, theory of continental drift, etc.
  • 80% core-concept coverage of the 25 subtopics
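The PageRank side of this analysis can be sketched as a small power iteration over a toy concept map. The node names and edges below are invented for illustration; the real system additionally combines HITS, incoming/outgoing link ratios, and term density.

```python
# Toy concept map: an edge (a, b) means concept a links to concept b.
edges = [
    ("plate tectonics", "earthquakes"),
    ("plate tectonics", "continental drift"),
    ("mantle convection", "plate tectonics"),
    ("earthquakes", "tsunamis"),
    ("continental drift", "plate tectonics"),
]
nodes = sorted({n for e in edges for n in e})
out = {n: [b for a, b in edges if a == n] for n in nodes}

d = 0.85                                        # damping factor
rank = {n: 1.0 / len(nodes) for n in nodes}
for _ in range(50):                             # power iteration
    new = {n: (1 - d) / len(nodes) for n in nodes}
    for n in nodes:
        if out[n]:
            share = d * rank[n] / len(out[n])
            for m in out[n]:
                new[m] += share
        else:                                   # dangling node: spread mass
            for m in nodes:
                new[m] += d * rank[n] / len(nodes)
    rank = new

print(max(rank, key=rank.get))  # "plate tectonics" scores highest
```

The highest-ranked node is the one that accumulates the most link mass, here the hub that other concepts point into, which matches the intuition that central concepts in a well-built map are the key ones.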
Concept Sequencing
• Goal: create a directed acyclic graph representing the logical order in which concepts should be introduced in a lesson or tutorial dialogue (with respect to a pedagogy)
• Partial ordering
• Example:
  1. Pitch represents the perceived fundamental frequency of a sound.
  2. A shorter string produces a higher pitch.
  3. A tighter string produces a higher pitch.
  4. A discussion of the difference in pitch across each of the strings of a violin and a cello.
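The partial ordering in the pitch example can be captured as a small DAG. The sketch below groups concepts into levels by the length of their longest prerequisite chain; the edge set is one plausible reading of the example (the definition precedes the two string facts, which both precede the discussion).

```python
# Concepts from the pitch example, keyed by the numbers above.
concepts = {
    1: "Pitch represents the perceived fundamental frequency of a sound.",
    2: "A shorter string produces a higher pitch.",
    3: "A tighter string produces a higher pitch.",
    4: "Discussion of pitch differences across violin and cello strings.",
}
# Edge (a, b): concept a should be introduced before concept b.
edges = {(1, 2), (1, 3), (2, 4), (3, 4)}

def topological_levels(nodes, edges):
    """Assign each node a level: the length of the longest
    prerequisite chain leading to it (0 = no prerequisites)."""
    preds = {n: {a for (a, b) in edges if b == n} for n in nodes}
    level = {}
    while len(level) < len(nodes):
        for n in nodes:
            if n not in level and all(p in level for p in preds[n]):
                level[n] = max((level[p] + 1 for p in preds[n]), default=0)
    return level

print(topological_levels(concepts, edges))  # {1: 0, 2: 1, 3: 1, 4: 2}
```

Concepts 2 and 3 share a level, reflecting the partial (rather than total) ordering: either may be taught first.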
Concept Sequencing: Straw-Man Approach
• Aim: show the viability of a concept sequencing task
• Intuition: concepts that should precede other concepts will exhibit this behavior across the corpus of digital library resources
• Issues:
  • Concepts may not appear in their entirety in a document
  • Aspects of concepts may show up earlier than the concept as a whole
• Approach: treat concept-to-document alignment as an information retrieval task
Concept Sequencing: Implementation
• Indexed the original 20 CLICK resources at the sentence level using Lucene (Standard Analyzer, similarity score threshold = 0.26)
• Concepts are queries
• A concept's position in a resource is the sentence number of the earliest matching sentence
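A minimal stand-in for the Lucene lookup: the sketch below scores each sentence against a concept query with a simple Jaccard term overlap in place of Lucene's similarity, and returns the earliest sentence clearing the threshold. The resource sentences and the use of Jaccard scoring are illustrative assumptions.

```python
import re

def tokens(text):
    """Crude tokenizer standing in for Lucene's StandardAnalyzer."""
    return set(re.findall(r"[a-z]+", text.lower()))

def earliest_position(concept, sentences, threshold=0.26):
    """Earliest (1-based) sentence whose overlap with the concept
    query clears the threshold, or None if the concept is absent."""
    q = tokens(concept)
    for i, sent in enumerate(sentences, start=1):
        s = tokens(sent)
        score = len(q & s) / len(q | s) if q | s else 0.0  # Jaccard
        if score >= threshold:
            return i
    return None

# Toy resource (invented sentences in the spirit of the CLICK corpus)
resource = [
    "The Earth's crust is broken into large rigid plates.",
    "A plate is a large rigid slab of solid rock.",
    "Earthquakes occur when plates slip suddenly.",
]
print(earliest_position("a plate is a large rigid slab of rock", resource))
```

Here the concept matches sentence 2 strongly, so its position in this resource is 2; a concept with no sufficiently similar sentence yields None and is excluded from pairwise comparisons involving that resource.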
Concept Sequencing: Implementation
• With concept positions identified and tabulated, compute pairwise comparisons between all concepts' sentence numbers
• If a concept does not appear in a resource, it is not included in the comparison
• Concepts with an identical number of predecessors are considered to be at the same level
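The pairwise tabulation can be sketched as follows, with toy sentence positions standing in for the real Lucene output; the tie-breaking policy (counting a predecessor only when it wins more comparisons) is one simple reading of the method.

```python
# concept -> {resource: earliest sentence number}; a missing resource
# means the concept did not match any sentence there.
positions = {
    "A": {"r1": 1, "r2": 2},
    "B": {"r1": 4, "r2": 5, "r3": 1},
    "C": {"r2": 9, "r3": 3},
}

def precedes(a, b):
    """Count resources where both concepts appear and a comes first."""
    shared = positions[a].keys() & positions[b].keys()
    return sum(positions[a][r] < positions[b][r] for r in shared)

concepts = list(positions)
# A concept o counts as a predecessor of c if o precedes c more often
# than c precedes o across the shared resources.
n_preds = {c: sum(precedes(o, c) > precedes(c, o)
                  for o in concepts if o != c)
           for c in concepts}
# Concepts with the same number of predecessors sit at the same level.
print(n_preds)  # {'A': 0, 'B': 1, 'C': 2}
```

In this toy case A always appears first, B follows A, and C follows both, so the predecessor counts 0, 1, 2 place them on three successive levels.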
Concept Sequencing: Results
(figure: system-output concept sequence)
Concept Sequencing: Evaluation Data
• Currently there is no canonical concept sequence for the CLICK data
• Instead, gold-standard evaluation data were derived from a set of expert-provided remediation strategies for individual students' essays
(figure: remediation order)
Concept Sequencing: Evaluation Data
• Of the 55 key concepts:
  • 14 did not occur in any of the remediation strategies
  • 41 were left to define the concept sequence evaluation
• Used the frequency of precedence across remediations to create a first-pass concept sequence
• Manually removed loops and errant orderings
Concept Sequencing: Evaluation Data
(figure: gold-standard evaluation sequence)
Concept Sequencing: Evaluation
• F1-measure combining:
  • Average instance recall (IR) over all gold-standard key concepts that have predecessors
  • Average instance precision (IP) over all of the non-initial system-output concepts that are aligned to gold-standard key concepts
• G_i: all predecessors of the i-th gold-standard key concept
• O_j: all predecessors of the j-th system-output concept
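Under the definitions above, instance recall, instance precision, and F1 can be computed directly from the predecessor sets G_i and O_j. The gold and system maps below are toy examples, not the CLICK data.

```python
# Predecessor sets per concept (toy data): gold concept C should be
# preceded by A and B; the system recovers A but not B, and also
# outputs a concept D that is not aligned to any gold concept.
gold_preds = {"B": {"A"}, "C": {"A", "B"}}          # G_i per gold concept
sys_preds = {"B": {"A"}, "C": {"A"}, "D": {"A"}}    # O_j per system concept

# Precision is averaged only over non-initial system concepts that
# align to gold key concepts, so D is excluded.
aligned = gold_preds.keys() & sys_preds.keys()

recall = sum(len(gold_preds[c] & sys_preds.get(c, set())) / len(gold_preds[c])
             for c in gold_preds) / len(gold_preds)
precision = sum(len(sys_preds[c] & gold_preds[c]) / len(sys_preds[c])
                for c in aligned) / len(aligned)
f1 = 2 * precision * recall / (precision + recall)
print(round(recall, 3), round(precision, 3), round(f1, 3))
```

Here recall is 0.75 (concept C's predecessor B was missed), precision is 1.0 (every predicted predecessor of an aligned concept is correct), and F1 is about 0.857.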
Concept Sequencing: Results and Discussion
• F1 = 0.526 (P = 0.383, R = 0.726)
• Gold standard:
  • Multiple initial nodes
• System output:
  • One single initial node
  • Linear hierarchies
  • All nodes with the same number of predecessors at the same level
  • The all-inclusive ordering favors recall
• Future work:
  • Utilize pairwise data to produce less densely packed graphs
  • More sophisticated measures of semantic similarity
  • Make use of concept map link relationships (cause, define, etc.)
  • Conduct expert studies to obtain gold-standard sequences and concepts
Tutorial Dialogue and Question Realization
• Dialogue-based ITSs are labor-intensive:
  • Effort centers on authoring of dialogue content and flow
  • Design of dialogue states is non-trivial
Tutorial Dialogue and Question Realization
• So what does Target Concept Identification buy us?
  • Critical steps towards more automated ITS creation
  • Decreased effort
  • Scalability
  • Contextual grounding
• TCI mappings to dialogue management:
  • Key concepts = states or frames
  • Concept sequence = default dialogue management strategy
Tutorial Dialogue and Question Realization
• Example:
  • Concept 486: an earthquake is the sudden slip of part of the Earth's crust...
  • Concept 561: ...When the stress in a particular location is great enough... an earthquake begins
  • The two concepts are linked by a caused-by relation
• Suppose the student has stated a paraphrase of Concept 486
• The ITS can produce: "Now that you have defined what an earthquake is, can you explain what causes them?"
Final Thoughts
• Defined Target Concept Identification
• Baseline and past results suggest the feasibility of the TCI subtasks
• We challenge the QG community to continue to think of QG as the product of several tasks, including TCI
Acknowledgements
• Advisers and colleagues at:
  • The University of Colorado at Boulder
  • The Center for Computational Language and EducAtion Research (CLEAR)
  • Boulder Language Technologies
• Support from:
  • The National Science Foundation (DRL-0733322, DRL-0733323, DRL-0835393, IIS-0537194)
  • The Institute of Education Sciences (R3053070434)
• Any findings, recommendations, or conclusions are those of the authors and do not necessarily represent the views of NSF or IES.
References
1. F. Ahmad, S. de la Chica, K. Butcher, T. Sumner, and J.H. Martin. Towards automatic conceptual personalization tools. In Proc. 7th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, 2007.
2. I.L. Beck, M.G. McKeown, C. Sandora, L. Kucan, and J. Worthy. Questioning the author: A year-long classroom implementation to engage students with text. The Elementary School Journal, 98:385-414, 1996.
3. B.S. Bloom. Taxonomy of Educational Objectives: The Classification of Educational Goals. Susan Fauer Company, Inc., 1956.
4. S. de la Chica, F. Ahmad, J.H. Martin, and T. Sumner. Pedagogically useful extractive summaries for science education. In Proc. COLING, volume 1, pages 177-184. Association for Computational Linguistics, 2008.
5. A. Graesser, V. Rus, and Z. Cai. Question classification schemes. In Proc. WS on the QGSTEC, 2008.
6. Q. Gu, S. Chica, F. Ahmad, H. Khan, T. Sumner, J.H. Martin, and K. Butcher. Personalizing the selection of digital library resources to support intentional learning. In Proc. European Conference on Research and Advanced Technology for Digital Libraries, 2008.
7. P.W. Jordan, B. Hall, M. Ringenberg, Y. Cue, and C. Rose. Tools for authoring a dialogue agent that participates in learning studies. In Proc. AIED, pages 43-50, Amsterdam, The Netherlands, 2007. IOS Press.
8. W.C. Mann and S.A. Thompson. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243-281, 1988.
9. R.D. Nielsen. Question generation: Proposed challenge tasks and their evaluation. In Proc. WS on the QGSTEC, 2008.
10. R.D. Nielsen, J. Buckingham, G. Knoll, B. Marsh, and L. Palen. A taxonomy of questions for question generation. In Proc. WS on the Question Generation Shared Task and Evaluation Challenge, 2008.
11. R. Prasad, N. Dinesh, A. Lee, E. Miltsakaki, L. Robaldo, A. Joshi, and B. Webber. The Penn Discourse TreeBank 2.0. In Proc. LREC, 2008.
12. R. Prasad and A. Joshi. A discourse-based approach to generating why-questions from texts. In Proc. WS on the QGSTEC, 2008.
13. D. Radev, T. Allison, S. Blair-Goldensohn, J. Blitzer, A. Çelebi, S. Dimitrov, E. Drabek, A. Hakim, W. Lam, D. Liu, J. Otterbacher, H. Qi, H. Saggion, S. Teufel, M. Topper, A. Winkel, and Z. Zhang. MEAD - a platform for multidocument multilingual text summarization. In Proc. LREC 2004, 2004.
14. C.M. Reigeluth. The elaboration theory: Guidance for scope and sequence decisions. In Instructional-Design Theories and Models: A New Paradigm of Instructional Theory. Lawrence Erlbaum Assoc., 1999.
15. V. Rus, Z. Cai, and A.C. Graesser. Question generation: An example of a multi-year evaluation campaign. In Proc. WS on the QGSTEC, 2008.
16. R. Soricut and D. Marcu. Sentence level discourse parsing using syntactic and lexical information. In Proc. HLT/NAACL, pages 228-235, 2003.
17. S. Susarla, A. Adcock, R. Van Eck, K. Moreno, A.C. Graesser, and the Tutoring Research Group. Development and evaluation of a lesson authoring tool for AutoTutor. In V. Aleven, U. Hoppe, J. Kay, R. Mizoguchi, H. Pain, F. Verdejo, and K. Yacef, editors, Proc. AIED 2003, pages 378-387, 2003.
18. L. Vanderwende. The importance of being important. In Proc. WS on the QGSTEC, 2008.
19. H. Wainer. Computer-Adaptive Testing: A Primer. 2000.