IERI educational data mining panel

Explore the challenges, solutions, and examples in educational data mining, focusing on data collection, organization, and intervention design for effective analysis and insights.

Presentation Transcript


  1. IERI educational data mining panel • Joseph E. Beck • Project LISTEN, Center for Automated Learning and Discovery, Carnegie Mellon University • Funding: National Science Foundation

  2. Discussion question “What do data mining tools and methods provide as output, what do they require in terms of input and expertise, which ones seem especially appropriate for educational data mining, and why?”

  4. Overview • Focusing on input, since it is the first step • Providing three big lessons learned • Work done in the context of Project LISTEN’s Reading Tutor

  5. Project LISTEN’s Reading Tutor

  6. Why isn’t the educational data mining process smoother? • Problems with data collection: • Not collecting sufficiently detailed data • Data are a mess • Data are observational

  7. Problem: not collecting sufficiently detailed data • Solution: Instrument your software! • Record everything you can think of • You’ll probably find a use for it later • Common to think of research questions after the fact • It’s nice to be able to answer them

  8. Examples of data to collect • Start/end times for sessions, modules, help, etc. • Every item the tutor displays/says • Why the tutor produced that output • What else the tutor could have done • Student typed input • Student mouse clicks • 7,600 hours of fine-grained data required only 15 GB of disk space (a 320 GB disk is only $140)
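
The kinds of events listed on this slide could be captured with a small structured logger. The following is a minimal sketch, not the Reading Tutor's actual logging code: the class name `EventLogger`, the event names, and the detail fields (`reason`, `alternatives`, and so on) are all assumptions made for illustration.

```python
import io
import json
import time


class EventLogger:
    """Hypothetical logger: writes one JSON object per line for each tutor or student event."""

    def __init__(self, stream, session_id):
        self.stream = stream
        self.session_id = session_id
        self.seq = 0  # running sequence number within this session

    def log(self, event, **details):
        self.seq += 1
        record = {
            "seq": self.seq,
            "session": self.session_id,
            "time": time.time(),   # wall-clock timestamp of the event
            "event": event,
            "details": details,    # free-form fields, so new ones can be added later
        }
        self.stream.write(json.dumps(record) + "\n")
        return record


log_file = io.StringIO()  # stands in for a real log file
logger = EventLogger(log_file, session_id="demo-session")

logger.log("session_start")
# Record what the tutor said, why, and what else it could have done (illustrative values).
logger.log(
    "tutor_output",
    text="Please pronounce 'borrowed'",
    reason="assessment",
    alternatives=["give_hint", "skip_word"],
)
logger.log("student_click", x=120, y=45)
logger.log("session_end")
```

Keeping `details` open-ended is one way to follow the "record everything you can think of" advice without redesigning the log format each time a new field becomes interesting.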

  9. Problem: Data are a mess • You’ve recorded everything you can think of, and wind up with something like:
     16466, Notice, "Tue Apr 10 12:30:20.387 2001", 10763200, "CListener::FinalizeUtterance", "EndUtterance"
     16467, Notice, "Tue Apr 10 12:30:20.417 2001", 10763200, "CCapture::WriteWaveFile(int)", "Wrote File: d:\\listen\\cd\\Tue-Sep-19-23-44-58.093-2000\\Capture\\fAT6-6-1994-08-01\\dec-fAT6-6-1994-08-01-Apr10-01-12-30-14-902.wav"

  10. Solution: use a database • Database enables tabular representation of data • Greatly speeds analyses • Removes problem of parsing logfiles
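
As a rough illustration of moving from logfile to table, the sketch below parses the two log records from the previous slide and loads them into an in-memory SQLite database. The column names are guesses: the slides do not say what each field means, so the fourth field is stored as `unknown_id`, and the table name `event` is likewise hypothetical.

```python
import csv
import io
import sqlite3
from datetime import datetime

# The two records shown on the slide, copied verbatim (raw string keeps the backslashes).
RAW_LOG = r'''16466, Notice, "Tue Apr 10 12:30:20.387 2001", 10763200, "CListener::FinalizeUtterance", "EndUtterance"
16467, Notice, "Tue Apr 10 12:30:20.417 2001", 10763200, "CCapture::WriteWaveFile(int)", "Wrote File: d:\\listen\\cd\\Tue-Sep-19-23-44-58.093-2000\\Capture\\fAT6-6-1994-08-01\\dec-fAT6-6-1994-08-01-Apr10-01-12-30-14-902.wav"'''


def parse_log(text):
    """Turn comma-separated log lines into typed tuples (field meanings are assumed)."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for seq, level, stamp, unknown_id, source, message in reader:
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S.%f %Y")
        rows.append((int(seq), level, when.isoformat(), int(unknown_id), source, message))
    return rows


def load(conn, rows):
    """Create the (hypothetical) event table and insert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event ("
        "seq INTEGER PRIMARY KEY, level TEXT, logged_at TEXT, "
        "unknown_id INTEGER, source TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


rows = parse_log(RAW_LOG)
conn = sqlite3.connect(":memory:")
load(conn, rows)

# Once the data are tabular, a question becomes a query instead of a parsing script.
listener_events = conn.execute(
    "SELECT COUNT(*) FROM event WHERE source LIKE 'CListener::%'"
).fetchone()[0]
```

In practice it is usually simpler to have the instrumented software write to the database directly, so that this parsing step is never needed.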

  11. Problem: data are observational • Many possible questions • Does an intervention work? • (What is an intervention?) • For whom? In what contexts? • Difficult to answer such questions observationally • Need experimental trials

  12. Solution • Design intervention carefully to answer questions about its effectiveness • Two properties • Assess its own effectiveness • Enable causal conclusions • “Embedded experiments”

  13. Example • Suppose we have an intervention that teaches a student to pronounce a word

  14-21. Example, built up step by step across slides 14 to 21 • Student: 3rd grade reading proficiency • Intervention: how to pronounce a word • Select good intervention words: “travelers”, “borrowed” • Flip coin to decide randomly which word to teach • Teach “heads” word: borrowed • Don’t teach “tails” word: travelers • Assess both words: “Please pronounce ‘travelers’”, “Please pronounce ‘borrowed’” • Record details of trial: student, words, success on assessment, type of intervention, etc.
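
The trial flow in this example can be sketched in a few lines. This is only an illustration of the embedded-experiment idea under stated assumptions, not the Reading Tutor's implementation: `run_embedded_trial`, `teach_stub`, and `assess_stub` are hypothetical names, and the assessment stub returns `None` because a real outcome would come from the student's actual attempt.

```python
import random


def run_embedded_trial(student_id, word_pair, teach, assess, rng):
    """Randomly teach one word of the pair, assess both, and return one record per word."""
    words = list(word_pair)
    rng.shuffle(words)  # the coin flip: the first word after shuffling gets taught
    taught_word, control_word = words

    teach(student_id, taught_word)  # the intervention is applied to the "heads" word only

    records = []
    for word, condition in ((taught_word, "taught"), (control_word, "not_taught")):
        records.append({
            "student": student_id,
            "word": word,
            "condition": condition,
            "correct": assess(student_id, word),  # both words are assessed
        })
    return records


# Hypothetical stand-ins for the tutor's teaching and assessment routines.
taught_calls = []


def teach_stub(student_id, word):
    taught_calls.append((student_id, word))


def assess_stub(student_id, word):
    return None  # placeholder: no real outcome is simulated here


records = run_embedded_trial(
    "student-001", ("travelers", "borrowed"), teach_stub, assess_stub, random.Random(0)
)
```

Because the taught word is chosen at random, a difference in later success between taught and untaught words can be attributed to the intervention rather than to how the words were selected.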

  22. (Trimmed) Example of recorded data • Can use preferred modeling approach (e.g. logistic regression, decision tree, etc.)
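
As a simple stand-in for the modeling approaches named on this slide, the sketch below compares success rates between conditions and computes the log odds ratio. With a single binary predictor (taught vs. not taught), the log odds ratio equals the coefficient a logistic regression would estimate for that predictor. The rows are made-up placeholders for illustration only, not data from the Reading Tutor, and the field names are assumptions.

```python
import math

# Made-up placeholder rows; NOT results from any real study.
trials = [
    {"condition": "taught", "correct": True},
    {"condition": "taught", "correct": True},
    {"condition": "taught", "correct": True},
    {"condition": "taught", "correct": False},
    {"condition": "not_taught", "correct": True},
    {"condition": "not_taught", "correct": False},
    {"condition": "not_taught", "correct": False},
    {"condition": "not_taught", "correct": False},
]


def count(condition, correct):
    """Number of trials in the given condition with the given outcome."""
    return sum(1 for t in trials if t["condition"] == condition and t["correct"] == correct)


def success_rate(condition):
    right = count(condition, True)
    wrong = count(condition, False)
    return right / (right + wrong)


rate_taught = success_rate("taught")
rate_control = success_rate("not_taught")

# Odds ratio from the 2x2 table; this breaks down if any cell is zero,
# in which case a continuity correction or a fitted model would be needed.
odds_ratio = (count("taught", True) * count("not_taught", False)) / (
    count("taught", False) * count("not_taught", True)
)
log_odds_ratio = math.log(odds_ratio)
```

A real analysis would also account for repeated trials from the same student and for word difficulty, which is where a fuller model such as logistic regression with additional predictors becomes useful.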

  23. Lessons • Instrument and record everything • You can't analyze what isn’t there • Use a database • Most analysis techniques require tabular format • An intervention should be able to assess itself • So embed randomized controlled trials in it to allow causal inference
