
Data mining with DataShop

Presentation Transcript


  1. Data mining with DataShop Ken Koedinger CMU Director of PSLC Professor of Human-Computer Interaction & Psychology Carnegie Mellon University

  2. “Knowledge components are the germ of transfer” Goal of the week: What does Ken mean by this?

  3. Overview • Motivation for data mining • Better understanding of students => better instructional design • Exploratory Data Analysis • DataShop demo, Excel • Learning curves & Learning Factors Analysis • Example project from last summer

  4. Data Mining Questions & Methods • What is going on with student learning & performance? • Exploratory data analysis • Summary & visualization tools in DataShop • Tools in Excel: Auto filter, Pivot Tables, Solver • How to reliably model student achievement? • Item Response Theory (IRT) • Basis for standardized tests, SAT, GRE, TIMSS… • Version of “logistic regression”
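Slide 4 describes IRT as a version of "logistic regression". As a concrete illustration (not DataShop's implementation), here is a minimal sketch of fitting the simplest IRT model, the Rasch/1PL model, as a logistic regression; the toy data and the theta/beta column labels are invented for illustration:

```python
# Minimal sketch: Rasch (1PL) IRT fit as a logistic regression.
# logit P(correct) = theta_student - beta_item
# Toy data and labels are illustrative, not from a real DataShop export.
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = pd.DataFrame({
    "student": ["s1", "s1", "s2", "s2", "s3", "s3"],
    "item":    ["q1", "q2", "q1", "q2", "q1", "q2"],
    "correct": [1, 0, 1, 1, 0, 0],   # first-attempt correctness
})

# +1 dummies for students (proficiency theta), -1 dummies for items
# (difficulty beta), so the fitted coefficients read off directly.
students = pd.get_dummies(data["student"], prefix="theta").astype(float)
items = -pd.get_dummies(data["item"], prefix="beta").astype(float)
X = pd.concat([students, items], axis=1)

# No intercept; the default L2 penalty keeps this over-parameterized model
# identifiable (theta and beta are only defined up to an additive constant).
model = LogisticRegression(fit_intercept=False, max_iter=1000)
model.fit(X, data["correct"])

for name, coef in zip(X.columns, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```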

  5. Data Mining Questions & Methods 2 • What’s the nature of knowledge students are learning? How can we discover cognitive models of student learning that fit their learning curves? • Learning Factors Analysis (LFA) • Extends IRT to account for learning • Search algorithm: Discover cognitive model(s) that capture how student learning transfers over tasks over time • What features of a tutor lead to the most learning? • Learning Decomposition • Extends LFA to explore different rates of learning due to different forms of instruction • How to extract reliable inferences about causal mechanisms from correlations in data? • Causal modeling using Tetrad
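To make "extends IRT to account for learning" concrete: Learning Factors Analysis builds on the Additive Factor Model, where each knowledge component (KC) gets an easiness and a learning-rate parameter, and predicted success grows with practice opportunities. A minimal sketch, with invented parameter values (LFA estimates these from log data):

```python
# Minimal sketch of the Additive Factor Model (AFM) behind LFA:
# logit P(correct) = theta_student + sum over KCs of (beta_kc + gamma_kc * opportunities_kc)
import math

def afm_p_correct(theta, kcs, beta, gamma, opportunities):
    """P(correct) for one student on one step exercising the given KCs."""
    logit = theta + sum(beta[k] + gamma[k] * opportunities[k] for k in kcs)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical "circle-area" KC: before and after 5 practice opportunities.
beta = {"circle-area": -1.0}    # easiness (negative = hard at first)
gamma = {"circle-area": 0.3}    # learning rate per opportunity
print(afm_p_correct(0.5, ["circle-area"], beta, gamma, {"circle-area": 0}))  # ~0.38
print(afm_p_correct(0.5, ["circle-area"], beta, gamma, {"circle-area": 5}))  # ~0.73
```

A smooth learning curve corresponds to a positive learning rate (gamma) for the KC; LFA's search then tries alternative KC codings to find the model that best captures how learning transfers across tasks.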

  6. Overview • Motivation for data mining • Better understanding of students => better instructional design • Exploratory Data Analysis • Demo: DataShop, Excel • Learning curves & Learning Factors Analysis • Example project from last summer Next

  7. DataShop Demo …

  8. Before going to DataShop, let’s look at the tutor (1997 version!) that generated the example data set we’ll be exploring

  9. TWO_CIRCLES_IN_SQUARE problem: Initial screen

  10. TWO_CIRCLES_IN_SQUARE problem: An error a few steps later

  11. TWO_CIRCLES_IN_SQUARE problem: Student follows the hint & completes the problem

  12. How to get to the DataShop: Go to http://learnlab.org & click …

  13. PSLC’s DataShop • Researchers get data access, visualizations, statistical tools • Learning curves track student learning over time • Discover what concepts & skills students need help with

  14. PSLC’s DataShop • Learning curves reveal over- and under-practiced knowledge components • Rectangle-area has a low initial error rate, yet it is practiced often (an over-practiced KC)

  15. Other DataShop Features • Error Reports • Identify misconceptions by looking for common student errors • When do students ask for hints? • Are there alternative correct strategies? • Performance Profiler • Export Data • Get all or part of the data in tab-delimited file • Use your favorite analysis tools …

  16. Exported File Loaded into Excel
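As a sketch of the "use your favorite analysis tools" step with pandas instead of Excel, the following loads a tab-delimited DataShop export and computes an error-rate learning curve per knowledge component. The file name ds_export.txt is hypothetical, and the column names assume the usual student-step export headers; check them against your own export:

```python
# Minimal sketch: error-rate learning curves from a DataShop student-step export.
# File name and exact column headers are assumptions; adjust to your export.
import pandas as pd

steps = pd.read_csv("ds_export.txt", sep="\t")

# 1 if the first attempt on the step was an error or a hint request, else 0.
steps["error"] = (steps["First Attempt"] != "correct").astype(int)

# Mean error rate at each practice opportunity, per KC: the learning curve.
curve = (steps
         .groupby(["KC (Default)", "Opportunity (Default)"])["error"]
         .mean()
         .reset_index())

print(curve.head())
# Plot opportunity (x) vs. error rate (y) per KC in Excel or matplotlib;
# a smooth drop suggests the KC model captures how learning transfers.
```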

  17. Overview • Motivation for data mining • Better understanding of students => better instructional design • Exploratory Data Analysis • DataShop demo, Excel • Learning curves & Learning Factors Analysis • Example project from last summer Next

  18. Cognitive Model drives behavior of intelligent tutor systems … • Cognitive Model: expert component of intelligent tutors that models how students solve problems. Example problem: 3(2x - 5) = 9, with three production rules and the next steps they produce:
  • If goal is solve a(bx + c) = d, then rewrite as abx + ac = d -> 6x - 15 = 9 (correct: distribute a)
  • If goal is solve a(bx + c) = d, then rewrite as abx + c = d -> 6x - 5 = 9 (buggy: c is not multiplied by a)
  • If goal is solve a(bx + c) = d, then rewrite as bx + c = d/a -> 2x - 5 = 3 (alternative correct strategy: divide both sides by a)
  • Model Tracing: Follows the student through their individual approach to a problem -> context-sensitive instruction

  19. Cognitive Model drives behavior of intelligent tutor systems … • Cognitive Model: expert component of intelligent tutors that models how students solve problems. For the same problem, 3(2x - 5) = 9, the tutor attaches messages and skill estimates to the production rules:
  • Hint message on the correct rule (abx + ac = d -> 6x - 15 = 9): “Distribute a across the parentheses.”
  • Bug message on the buggy rule (abx + c = d -> 6x - 5 = 9): “You need to multiply c by a also.”
  • Skill estimates (“Known? = 85%”, “Known? = 45%”) track how likely the student is to know each rule.
  • Model Tracing: Follows the student through their individual approach to a problem -> context-sensitive instruction
  • Knowledge Tracing: Assesses the student's knowledge growth -> individualized activity selection and pacing
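The “Known? = 85%” and “Known? = 45%” estimates come from knowledge tracing, which in cognitive tutors is usually implemented as Bayesian Knowledge Tracing. A minimal sketch of the standard BKT update, with made-up slip, guess, and learning parameters (real tutors fit these per KC):

```python
# Minimal sketch of a Bayesian Knowledge Tracing update.
# Parameter values are illustrative; tutors estimate them per knowledge component.
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """Return updated P(student knows the KC) after one observed attempt."""
    if correct:
        evidence = p_known * (1 - p_slip) + (1 - p_known) * p_guess
        posterior = p_known * (1 - p_slip) / evidence
    else:
        evidence = p_known * p_slip + (1 - p_known) * (1 - p_guess)
        posterior = p_known * p_slip / evidence
    # Chance the student learned the KC from this practice opportunity.
    return posterior + (1 - posterior) * p_learn

# A skill bar starting at "Known? = 45%" rises with mostly-correct attempts.
p = 0.45
for outcome in [True, True, False, True]:
    p = bkt_update(p, outcome)
    print(f"P(known) = {p:.2f}")
```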

  20. Cognitive Modeling Challenge • Problem: Intelligent Tutoring Systems depend on Cognitive Model, which is hard to get right • Hard to program, but more importantly … • A high quality cognitive model requires a deep understanding of student thinking • Cognitive models created by intuition are often wrong (e.g., Koedinger & Nathan, 2004)

  21. Significance of improving a cognitive model • A better cognitive model means: • better feedback & hints (model tracing) • better problem selection & pacing (knowledge tracing) • Making cognitive models better advances basic cognitive science

  22. How can we use student data to build better cognitive models? • Cognitive Task Analysis methods • Think alouds, Difficulty Factors Assessment • General lecture Tuesday • Peer collaboration dialog analysis • TagHelper track • Newer: • Data mining of student interactions with on-line tutors

  23. Back to DataShop to illustrate

  24. Use log data to test alternative knowledge representations • Which “knowledge component” analysis is correct is an empirical question! • Log data from tutors provides data to compare different KC analyses • Find which “germ” accounts for student learning behaviors

  25. Not a smooth learning curve -> this knowledge component model is wrong. Does not capture genuine student difficulties.

  26. This more specific knowledge component (KC) model (2 KCs) is also wrong -- still no smooth drop in error rate.

  27. Ah! Now we are getting a smooth learning curve. This even more specific decomposition (12 KCs) better tracks the nature of student difficulties & transfer from one problem situation to another.
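Since which KC analysis is correct is an empirical question (slide 24), one minimal way to compare the alternative codings is to fit a simple opportunity-based logistic regression for each and compare fit statistics such as BIC. The sketch below omits per-student proficiency for brevity (unlike LFA/AFM), and the column names (kc_1geom, opp_1geom, kc_12geom, opp_12geom) are hypothetical labels for the 1-KC and 12-KC codings:

```python
# Minimal sketch: compare two candidate KC models on the same log data by BIC.
# Column names are hypothetical; a real analysis would use DataShop's KC model
# columns (and, like LFA/AFM, would also include a student proficiency term).
import pandas as pd
import statsmodels.formula.api as smf

steps = pd.read_csv("ds_export.txt", sep="\t")
steps["correct"] = (steps["First Attempt"] == "correct").astype(int)

def fit_kc_model(df, kc_col, opp_col):
    # Per-KC intercept plus a per-KC learning slope over practice opportunities.
    formula = f"correct ~ C({kc_col}) + C({kc_col}):{opp_col}"
    return smf.logit(formula, data=df).fit(disp=False)

m1 = fit_kc_model(steps, "kc_1geom", "opp_1geom")
m12 = fit_kc_model(steps, "kc_12geom", "opp_12geom")
print("1-KC model BIC: ", round(m1.bic, 1))
print("12-KC model BIC:", round(m12.bic, 1))
# The coding with lower BIC (and smoothly dropping curves) better captures
# how student learning transfers from one problem to another.
```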

  28. Overview • Motivation for data mining • Better understanding of students => better instructional design • Exploratory Data Analysis • Demo: DataShop, Excel • Learning curves & Learning Factors Analysis • Example project from last summer Next

  29. Example project from 2006 • Rafferty (Stanford) & Yudelson (U Pitt) • Analyzed a data set from Geometry • Applied Learning Factors Analysis (LFA) • Driving questions: • Are students learning at the same rate as assumed in prior LFA models? • Do we need different cognitive models (KC models) to account for low-achieving vs. high-achieving students?

  30. Rafferty & Yudelson Results 1 • Different student learning rates? • Yes

  31. Rafferty & Yudelson Results 2 • Is it “faster” learning or “different” learning? • A more “compact” model fits better for the low-pretest, high-learning-rate students • Students with an apparent faster learning rate are learning a more “compact”, general and transferable domain model • (Became the basis of Anna Rafferty’s master’s thesis)

  32. Data Mining / DataShop Offerings Tomorrow (lectures in 3501 Newell-Simon Hall, activities here in Wean 5202)
  1. Educational data mining overview & introduction to using the DataShop • Follow-up activities: exercise in using DataShop for exploratory data analysis; use the tutor/course that generated the target data set; begin data export, data scrubbing, and exploratory data analysis
  2. Learning from learning curves: Item Response Theory, Learning Factors Analysis
  3. Other data mining techniques: learning decomposition, causal models with Tetrad • Follow-up activity: define metrics to address your driving question, begin analysis

  33. Questions?

  34. What’s next? • Tomorrow: • Do you know which offerings you will go to tomorrow? • Any conflicts -- two you want to go to that are at the same time?

  35. END
