
Some Useful Design Tactics for Mining ITS Data

Jack Mostow, Project LISTEN (www.cs.cmu.edu/~listen), Carnegie Mellon University. Funding: National Science Foundation. ITS 04 Workshop on Analyzing Student-Tutor Interaction Logs to Improve Educational Outcomes, Maceio, Brazil.


Presentation Transcript


  1. Some Useful Design Tactics for Mining ITS Data • Jack Mostow • Project LISTEN (www.cs.cmu.edu/~listen) • Carnegie Mellon University • Funding: National Science Foundation • ITS 04 Workshop on Analyzing Student-Tutor Interaction Logs to Improve Educational Outcomes, Maceio, Brazil

  2. Outline • Project LISTEN’s Reading Tutor • Modify tutor to get mineable data • Map data stream to analyzable data set • Mine data set to discover insights

  3. Project LISTEN’s Reading Tutor (video)

  4. Project LISTEN’s Reading Tutor (video) • John Rubin (2002). The Sounds of Speech (Show 3). On Reading Rockets (Public Television series commissioned by U.S. Department of Education). Washington, DC: WETA. • Available at www.cs.cmu.edu/~listen.

  5. Thanks to fellow LISTENers • Tutoring: Dr. Joseph Beck, mining tutorial data; Prof. Albert Corbett, cognitive tutors; Prof. Rollanda O’Connor, reading; Prof. Kathy Ayres, stories for children; Joe Valeri, activities and interventions; Becky Kennedy, linguist • Listening: Dr. Mosur Ravishankar, recognizer; Dr. Evandro Gouvea, acoustic training; John Helman, transcriber • Programmers: Andrew Cuneo, application; Karen Wong, Teacher Tool • Field staff: Dr. Roy Taylor; Kristin Bagwell; Julie Sleasman • Grad students: Hao Cen, HCI; Cecily Heiner, MCALL; Peter Kant, Education; Shanna Tellerman, ETC • Plus: Advisory board; Research partners: DePaul, UBC, U. Toronto; Schools

  6. Project LISTEN’s Reading Tutor: A rich source of experimental data • 2003-2004 database: 9 schools; > 200 computers; > 50,000 sessions; > 1.5M tutor responses; > 10M words recognized • Embedded experiments • Randomized trials

  7. Modify tutor to get mineable data • Log operations at grain size and level of interest • Click <x, y> at time t: motor control • Click “Goldilocks”: item selection • Reify operations to log them analyzably • Handwriting or speech → typed input • Freehand drawing → graphical palette (Geometry Tutor) • Free-form responses → menu selection (Self 88) • Natural language → sentence starters (Goodman 03) • Time student and tutor actions • Time allocation reflects motivation (ITS 02) • Hasty responses indicate guessing (TICL 04) • Latency reflects automaticity (TICL 04)
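The logging and timing tactics on this slide can be sketched in Python. This is a minimal illustration, not Project LISTEN's actual logging code: the `LogEvent` record, its field names, and the `latency` helper are all hypothetical. The idea is to log the operation at the level of interest (item selection rather than raw click coordinates) and to time both tutor and student actions so that durations and latencies can be computed later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """One logged operation (hypothetical schema, for illustration only)."""
    actor: str      # "tutor" or "student"
    operation: str  # semantic operation, e.g. "select_item", not a raw <x, y> click
    target: str     # what the operation applied to, e.g. a story title
    start: float    # seconds since session start, when the action began
    end: float      # seconds since session start, when the action finished

    @property
    def duration(self) -> float:
        """How long the action itself took."""
        return self.end - self.start


def latency(prompt: LogEvent, response: LogEvent) -> float:
    """Delay between the end of a prompt and the start of the response to it."""
    return response.start - prompt.end


# Toy events with made-up times: the tutor shows a menu, the student picks a story.
prompt = LogEvent("tutor", "show_menu", "story list", start=0.0, end=1.5)
click = LogEvent("student", "select_item", "Goldilocks", start=4.0, end=4.25)
print(click.operation, click.target, latency(prompt, click))
```

Storing start and end times for both actors, rather than a single timestamp, is what makes latency and time-allocation measures recoverable from the log afterwards.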

  8. Modify tutor: add relevant data • Randomize tutorial decisions • What skill to test, what help to give • Probe skills • Assess cognitive development (Arroyo 00) • Test vocabulary words (IJAIE 01) • Insert automated comprehension questions (TICL 04) • Import student data • Gender, age, IQ (Shute 96) • Prior knowledge (Corbett 00) • Pretest scores (TICL 04) • Hand-label when appropriate • Transcribe (some) spoken input (FLET 04)
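Randomizing a tutorial decision and logging it might look like the following Python sketch. It is illustrative only: the `choose_help` function and the log record layout are hypothetical, the three help types are just a subset of those named later in the talk, and choosing uniformly at random is an assumption (the slides do not say how choices were weighted).

```python
import random

# A few of the help types named later in the talk (subset chosen for illustration).
HELP_TYPES = ["Say Word", "Rhymes With", "Sound Out"]


def choose_help(word: str, rng: random.Random, log: list) -> str:
    """Pick a help type at random and record the decision so it can be mined later."""
    help_type = rng.choice(HELP_TYPES)  # uniform choice: an assumption
    log.append({"word": word, "decision": help_type})
    return help_type


rng = random.Random(0)  # seeded only to make the example reproducible
log = []
choice = choose_help("read", rng, log)
print(choice, log)
```

Because the decision is random rather than driven by the student's state, later differences in outcome between help types can be attributed to the help itself, which is the point of embedding the experiment in the tutor.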

  9. Modify tutor: an example • Randomize: explain some new words but not others. • Probe: test each new word the next day. • Did kids do better on explained vs. unexplained words? • Overall: NO; 38% vs. 36%, N = 3,171 trials (IJAIE 01). • Rare, 1-sense words tested 1-2 days later: YES! 44% >> 26%, N = 189.

  10. Map data stream to data set: structure data into a single type • Data stream: heterogeneous events over time • Data set: elements with the same features • Segment into shorter episodes • Tutorial action(s) + student response (Beck 00) • Slice into narrower strands • Successive encounters of a specific word (AMLDP 98) • Successive instances of a specific skill (learning curves) • Measure aggregated events • Allocation of time among activities (ITS 02) • Formulate data as experimental trials • Context where the trial occurred • Decision made in this trial • Outcome based on subsequent events
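As a rough Python sketch of "slicing into narrower strands", the function below groups a mixed event stream into one time-ordered strand per student and word. The event dictionaries, their keys, and the `slice_by_word` name are hypothetical, and the three events are toy data made up for the example.

```python
from collections import defaultdict


def slice_by_word(events):
    """Group word encounters into per-(student, word) strands, in time order."""
    strands = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["time"]):
        strands[(ev["student"], ev["word"])].append(ev)
    return dict(strands)


# Toy stream, deliberately out of time order.
events = [
    {"student": "s1", "word": "read", "time": 30, "fluent": True},
    {"student": "s1", "word": "book", "time": 20, "fluent": True},
    {"student": "s1", "word": "read", "time": 10, "fluent": False},
]
strands = slice_by_word(events)
for key, strand in strands.items():
    print(key, [e["fluent"] for e in strand])
```

Each strand has the same shape (a sequence of encounters of one word by one student), so strands can be compared or averaged into learning curves even though the original stream was heterogeneous.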

  11. Map data stream to data set: Formulate data as experimental trials • Data stream: student is reading a story (‘People sit down and …’); student needs help on a word (student clicks ‘read’); tutor chooses what help to give; student continues reading (‘… read a book.’); time passes…; student sees word in a later sentence (‘I love to read stories.’) • Context: student is reading a story and needs help on a word • Decision (randomized): tutor chooses what help to give • Outcome: read fluently?

  12. Map data stream to data set: trials • Context: • Decision: • Outcome:
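The context / decision / outcome formulation can be sketched in Python as follows. This is an assumption-laden illustration, not the Reading Tutor's actual schema: the `Trial` class, the event keys, and `build_trials` are hypothetical. The two sentences come from the slides, while the help type and the fluency value in the toy stream are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    context: str             # sentence in which the student asked for help
    decision: str            # help type the tutor chose (randomized)
    outcome: Optional[bool]  # fluent at the next encounter? None if never seen again


def build_trials(stream: List[dict]) -> List[Trial]:
    """Turn each help event into a trial whose outcome is the next encounter
    of the same word by the same student in a different sentence."""
    trials = []
    for i, ev in enumerate(stream):
        if ev["kind"] != "help":
            continue
        outcome = None
        for later in stream[i + 1:]:
            if (later["kind"] == "encounter"
                    and later["student"] == ev["student"]
                    and later["word"] == ev["word"]
                    and later["sentence"] != ev["sentence"]):
                outcome = later["fluent"]
                break
        trials.append(Trial(ev["sentence"], ev["help"], outcome))
    return trials


stream = [
    {"kind": "help", "student": "s1", "word": "read",
     "sentence": "People sit down and read a book.", "help": "Rhymes With"},
    {"kind": "encounter", "student": "s1", "word": "read",
     "sentence": "I love to read stories.", "fluent": True},
]
trials = build_trials(stream)
print(trials)
```

Requiring the outcome to come from a different sentence keeps the immediate effect of the help itself from being counted as evidence of learning.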

  13. Mine data set to make discoveries • Count outcome frequency • Success rate of each help type (ICALL 04) • Fit a parametric model • Knowledge tracing (Corbett 95) • Train a model • Statistics, e.g. regression (TICL 04) • Machine learning, e.g. decision trees (AIED 01)
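As one concrete instance of "fit a parametric model", the standard knowledge tracing update can be written in a few lines of Python. This sketch shows only the per-observation update with its four usual parameters (prior knowledge, guess, slip, learn); it does not fit those parameters to data, and the numeric values below are arbitrary illustrations, not estimates from the Reading Tutor.

```python
def knowledge_tracing_update(p_known: float, correct: bool,
                             p_guess: float, p_slip: float,
                             p_learn: float) -> float:
    """Return the updated probability that the skill is known after one observation."""
    if correct:
        evidence_known = p_known * (1 - p_slip)      # knew it and did not slip
        evidence_unknown = (1 - p_known) * p_guess   # did not know it but guessed
    else:
        evidence_known = p_known * p_slip            # knew it but slipped
        evidence_unknown = (1 - p_known) * (1 - p_guess)
    posterior = evidence_known / (evidence_known + evidence_unknown)
    # The student may also learn the skill from this practice opportunity.
    return posterior + (1 - posterior) * p_learn


# Arbitrary illustrative parameters.
after_correct = knowledge_tracing_update(0.5, True, p_guess=0.2, p_slip=0.1, p_learn=0.1)
after_wrong = knowledge_tracing_update(0.5, False, p_guess=0.2, p_slip=0.1, p_learn=0.1)
print(round(after_correct, 3), round(after_wrong, 3))
```

Applying the update along a strand of successive encounters of one skill gives a running estimate of mastery; fitting the model means choosing the four parameters so that these estimates best predict the observed responses.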

  14. Count outcome frequency: which help types worked best? • Best: Rhymes With 69.2% ± 0.4% • Worst: Recue 55.6% ± 0.4% • Compare within level to control for word difficulty. • Supplying the word helped best in the short term… • But rhyming hints had longer-lasting benefits.

  15. Summary: modify, map, mine. • Modify tutor to make data mineable. • Log, reify, time, hand-label, import, probe, randomize. • Map data streams to data sets. • Segment, slice, measure. • Mine data set to make discoveries. • Count, fit, train. • See videos, papers, etc. at www.cs.cmu.edu/~listen. • Thank you! Questions?

  16. Modify tutor to get mineable data • Word features

  17. Structure of Reading Tutor database • Session: Reading Tutor lists readers; student logs in • Story Encounter: Reading Tutor lists stories; student picks stories • Sentence Encounter: Reading Tutor shows one sentence at a time; student reads sentence • Word Encounter: Reading Tutor listens and helps; student reads each word

  18. Map data stream to data set: formulate data as experimental trials • Context where the trial occurred • Decision made in this trial • Outcome based on subsequent events

  19. Learning curves for students’ help requests • Try to predict subset: Grade 1-2 level; 1-6 prior encounters • Selected data: 53 students; 175,961 words; 29,278 help requests • Train predictive model: count help requests 5x • Predict other kids’ data: 71% accuracy

  20. Count outcome frequency (average success rate 66.1%) • Example: ‘People sit down and read a book.’ • Whole word: 24,841 Say In Context; 56,791 Say Word • Decomposition: 6,280 Syllabify; 14,223 Onset Rime; 19,677 Sound Out; 22,933 One Grapheme • Analogy: 13,165 Rhymes With; 13,671 Starts Like • Semantic: 14,685 Recue; 2,285 Show Picture; 488 Sound Effect • Which types stood out? • Best: Rhymes With 69.2% ± 0.4% • Worst: Recue 55.6% ± 0.4%
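The slide does not say how the ± 0.4% margins were computed. One plausible reading, offered here only as an assumption, is a binomial standard error on each success rate, treating the count listed beside each help type as its number of trials. The Python sketch below (with a hypothetical helper name, `binomial_se`) applies that formula to the two rates and counts reported on the slide.

```python
import math


def binomial_se(rate: float, n_trials: int) -> float:
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(rate * (1 - rate) / n_trials)


# Rates and counts as reported on the slide; treating each count as the
# number of trials for that help type is an assumption.
se_rhymes = binomial_se(0.692, 13_165)  # Rhymes With
se_recue = binomial_se(0.556, 14_685)   # Recue
print(f"Rhymes With: 69.2% +/- {100 * se_rhymes:.1f}%")
print(f"Recue:       55.6% +/- {100 * se_recue:.1f}%")
```

Under that assumption both margins come out at about 0.4 percentage points, so the gap of more than 13 points between the best and worst help types is large relative to the sampling noise.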
