Explore the challenges, solutions, and examples in educational data mining, focusing on data collection, organization, and intervention design for effective analysis and insights.
IERI educational data mining panel. Joseph E. Beck, Project LISTEN, Center for Automated Learning and Discovery, Carnegie Mellon University. Funding: National Science Foundation.
Discussion question “What do data mining tools and methods provide as output, what do they require in terms of input and expertise, which ones seem especially appropriate for educational data mining, and why?”
Overview • Focusing on input, since it is the first step • Providing three big lessons learned • Work done in the context of Project LISTEN’s Reading Tutor
Why isn’t the educational data mining process smoother? • Problems with data collection • Not collecting sufficiently detailed data • Data are a mess • Data are observational
Problem: not collecting sufficiently detailed data • Solution: Instrument your software! • Record everything you can think of • You’ll probably find a use for it later • Common to think of research questions after the fact • It’s nice to be able to answer them
Examples of data to collect • Start/end times for sessions, modules, help, etc. • Every item the tutor displays/says • Why the tutor produced that output • What else the tutor could have done • Student typed input • Student mouse clicks • 7,600 hours of fine-grained data required only 15 GB of disk space (a 320 GB disk is only $140)
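The kinds of data listed above can be captured with a small event logger. This is a minimal sketch, not the Reading Tutor's actual logging code: the class names `TutorEvent` and `EventLog`, the field names, and the JSON-lines output are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class TutorEvent:
    """One logged event (hypothetical schema, not from the original slides)."""
    session_id: str
    event_type: str          # e.g. "tutor_output", "typed_input", "mouse_click"
    start_time: float        # seconds since session start
    end_time: float          # equal to start_time for instantaneous events
    payload: Dict[str, Any] = field(default_factory=dict)


class EventLog:
    """Collects events in memory and serializes them one JSON object per line."""

    def __init__(self) -> None:
        self.events: List[TutorEvent] = []

    def record(self, session_id: str, event_type: str, start_time: float,
               end_time: Optional[float] = None, **payload: Any) -> TutorEvent:
        # Instantaneous events (a click) get end_time == start_time.
        if end_time is None:
            end_time = start_time
        event = TutorEvent(session_id, event_type, start_time, end_time, payload)
        self.events.append(event)
        return event

    def to_json_lines(self) -> str:
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self.events)


if __name__ == "__main__":
    log = EventLog()
    # Record what the tutor said, why, and what else it could have done.
    log.record("session-1", "tutor_output", 10.0, 12.5,
               text="Please pronounce 'borrowed'",
               reason="assessment",
               alternatives=["give hint", "skip word"])
    log.record("session-1", "mouse_click", 13.0, x=120, y=48)
    print(log.to_json_lines())
```

Storing the reason and the alternatives next to each tutor output is what later makes "why did the tutor do that?" questions answerable.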
Problem: Data are a mess • You’ve recorded everything you can think of, and wind up with something like:
16466, Notice, "Tue Apr 10 12:30:20.387 2001", 10763200, "CListener::FinalizeUtterance", "EndUtterance"
16467, Notice, "Tue Apr 10 12:30:20.417 2001", 10763200, "CCapture::WriteWaveFile(int)", "Wrote File: d:\\listen\\cd\\Tue-Sep-19-23-44-58.093-2000\\Capture\\fAT6-6-1994-08-01\\dec-fAT6-6-1994-08-01-Apr10-01-12-30-14-902.wav"
Solution: use a database • Database enables tabular representation of data • Greatly speeds analyses • Removes problem of parsing logfiles
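As a rough illustration of moving from logfile lines to a table, the sketch below parses one log line in the comma-separated shape shown earlier and loads it into an in-memory SQLite database. The table name `log_event` and all column names are guesses made for this example. In particular, the meaning of the fourth field is not stated in the slides, so it is stored under the neutral name `field4`.

```python
import csv
import sqlite3

# One line in the format shown on the "Data are a mess" slide.
RAW_LOG = (
    '16466, Notice, "Tue Apr 10 12:30:20.387 2001", 10763200, '
    '"CListener::FinalizeUtterance", "EndUtterance"'
)


def parse_log_lines(text):
    """Split each log line into six fields; skip lines that do not fit."""
    reader = csv.reader(text.splitlines(), skipinitialspace=True)
    rows = []
    for row in reader:
        if len(row) != 6:
            continue  # malformed line: ignored in this sketch
        seq, level, logged_at, field4, source, message = row
        rows.append((int(seq), level, logged_at, int(field4), source, message))
    return rows


def load_into_db(rows):
    """Create the (hypothetical) log_event table and insert the parsed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log_event ("
        "seq INTEGER PRIMARY KEY, level TEXT, logged_at TEXT, "
        "field4 INTEGER, source TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO log_event VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = load_into_db(parse_log_lines(RAW_LOG))
    # Once the data are tabular, a question becomes a query, not a parser.
    for row in conn.execute(
        "SELECT seq, message FROM log_event WHERE source LIKE 'CListener%'"
    ):
        print(row)
```

A system that writes directly to a database would not need the parsing step at all; the sketch only shows the one-time conversion of existing logfiles.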
Problem: data are observational • Many possible questions • Does an intervention work? • (What is an intervention?) • For whom? In what contexts? • Difficult to answer such questions observationally • Need experimental trials
Solution • Design intervention carefully to answer questions about its effectiveness • Two properties • Assess its own effectiveness • Enable causal conclusions • “Embedded experiments”
Example • An intervention that teaches a student to pronounce a word
Example (full sequence)
• Student: 3rd grade reading proficiency
• Intervention: how to pronounce a word
• Select good intervention words: “travelers”, “borrowed”
• Flip a coin to decide randomly which word to teach
• Teach the “heads” word (borrowed); don’t teach the “tails” word (travelers)
• Assess both words: “Please pronounce ‘travelers’”, “Please pronounce ‘borrowed’”
• Record details of the trial: student, words, success on assessment, type of intervention, etc.
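The trial sequence above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not Project LISTEN's implementation: `TrialRecord`, `assign_words`, `run_trial`, and the `teach` and `assess` callbacks are hypothetical names standing in for whatever the tutor actually does.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class TrialRecord:
    """What gets stored for one embedded-experiment trial (hypothetical schema)."""
    student_id: str
    taught_word: str
    control_word: str
    taught_correct: bool
    control_correct: bool
    intervention_type: str


def assign_words(word_pair: Tuple[str, str], rng: random.Random) -> Tuple[str, str]:
    """Flip a fair coin: return (word to teach, word left untaught)."""
    first, second = word_pair
    if rng.random() < 0.5:
        return first, second
    return second, first


def run_trial(student_id: str, word_pair: Tuple[str, str],
              teach: Callable[[str], None], assess: Callable[[str], bool],
              rng: random.Random,
              intervention_type: str = "pronunciation_help") -> TrialRecord:
    taught, control = assign_words(word_pair, rng)
    teach(taught)                      # only the "heads" word is taught
    return TrialRecord(
        student_id=student_id,
        taught_word=taught,
        control_word=control,
        taught_correct=assess(taught),    # both words are assessed
        control_correct=assess(control),
        intervention_type=intervention_type,
    )


if __name__ == "__main__":
    # Stand-in callbacks; a real tutor would present help and listen to the student.
    taught_so_far = []
    record = run_trial("student-42", ("travelers", "borrowed"),
                       teach=taught_so_far.append,
                       assess=lambda word: word in taught_so_far,
                       rng=random.Random())
    print(record)
```

Because the coin flip, not the tutor or the student, decides which word is taught, a difference in assessment outcomes between the two words can be read causally.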
(Trimmed) Example of recorded data • Can use your preferred modeling approach (e.g. logistic regression or a decision tree)
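Before fitting a model such as logistic regression, a first look at recorded trials can be a plain comparison of success rates for taught versus untaught words. The sketch below assumes each recorded row reduces to a pair `(was_taught, was_correct)`. The rows in `PLACEHOLDER_TRIALS` are invented purely to make the code runnable and are not results from the Reading Tutor.

```python
def success_rates(trials):
    """Return {was_taught: fraction correct} from (was_taught, was_correct) pairs."""
    counts = {}
    for was_taught, was_correct in trials:
        n, k = counts.get(was_taught, (0, 0))
        counts[was_taught] = (n + 1, k + int(was_correct))
    return {cond: k / n for cond, (n, k) in counts.items()}


def effect_estimate(trials):
    """Difference in success rate: taught minus untaught."""
    rates = success_rates(trials)
    return rates[True] - rates[False]


# Invented placeholder rows, NOT real data: (was_taught, was_correct).
PLACEHOLDER_TRIALS = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
]

if __name__ == "__main__":
    print(success_rates(PLACEHOLDER_TRIALS))
    print(effect_estimate(PLACEHOLDER_TRIALS))
```

With random assignment in place, this difference is an estimate of the intervention's effect. A regression or decision tree adds the ability to ask for whom and in what contexts it works, using the other recorded columns.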
Lessons • Instrument and record everything • You can't analyze what isn’t there • Use a database • Most analysis techniques require tabular format • An intervention should be able to assess itself • So embed randomized controlled trials in it to allow causal inference