In-vivo research on learning Charles Perfetti PSLC Summer School 2007
When is a learning study in-vivo? • Vitro, vivo • “On-Line” course? • An ITS? • A real class; real students; an intervention that counts.
Why in-vivo is the gold standard • Noisy, uncontrolled environment • Content of instruction is validated • Built-in generalization to classroom learning
Problems faced by an in-vivo researcher • Noisy, uncontrolled environment • As for your experiment: • Students have other things to do • Instructors have other things to do
Examples of in-vivo studies • Algebra, Physics, Chemistry, Geometry, French, Chinese, English • Some with computer tutors in a major role • ITS • Practice tutors • Some without tutors, or with tutors in a minor role
Liu, Wang, Perfetti Chinese tone perception study • In-vivo study • Traditional classroom (not online) • Materials from students’ textbook • New materials each week for 8 weeks of term 1 • Term 2 continued this, and added novel syllables unfamiliar to the student • 3 instructional conditions • tone number + pinyin; contour + pinyin; contour only • Hint system • (CTAT) Tutors presented materials in 3 different instructional interfaces, according to the 3 conditions • DataShop logged individual student data
Learning Measures • Across-session error rates (transfer to new items) • Post-test tone judgments presented by tutor • Two successive syllables heard. Are they same or different in tone? (transfer to different task) • Nature of syllable pairs • Tone same, segments different, e.g., /duan/3 /liang/3 • Same onset and rime, e.g., shi2 – shi3 • Share rime only, e.g., dao2 – kao3 • Share neither onset nor rime, e.g., duo2 – gong3
Studies with a major role for a computer tutor • Formative evaluation. How can the tutor be improved? • Summative evaluation. Is the tutor effective? • Both of these apply to all instructional interventions, whether tutor-based or not
Formative Evaluation Examples • User interface testing • Early, before the rest of the tutor is built • Engage students and instructors • Get detailed responses from students viewing the tutor, using talk-aloud procedures • Wizard of Oz • Human (the Wizard) in the next room watches a copy of the screen • Responds when the student presses the Hint button or makes an error • User interface evaluation • Does the Wizard have enough information? • Can the Wizard intervene early enough? • Tutor tactics evaluation. What did the Wizard do, and when?
Formative Example 3: Snapshot critiques • Procedure: ITS log file • Select student help events from the log file • Experts examine the context leading up to each help message, noting the help they would provide • Examine the match between help from experts and help from the ITS • Compare with the match between two experts • Modify ITS help messages according to reliable expert input
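The comparison in the snapshot-critique procedure can be sketched as a simple agreement rate. This is a minimal illustration, not the procedure's actual tooling: the function name `agreement_rate`, the help categories, and the five coded events are all hypothetical, and it assumes each help event has already been coded into one category per source.

```python
def agreement_rate(labels_a, labels_b):
    """Proportion of help events on which two sources gave the same help category."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical codes for five help events selected from a log file.
its      = ["hint_rule", "hint_rule", "bottom_out", "hint_rule", "hint_goal"]
expert_1 = ["hint_goal", "hint_rule", "bottom_out", "hint_goal", "hint_goal"]
expert_2 = ["hint_goal", "hint_rule", "bottom_out", "hint_rule", "hint_goal"]

# Expert-vs-ITS match, to be read against the expert-vs-expert match.
expert_vs_its = agreement_rate(expert_1, its)
expert_vs_expert = agreement_rate(expert_1, expert_2)
print(f"expert vs ITS: {expert_vs_its:.2f}, expert vs expert: {expert_vs_expert:.2f}")
```

The expert-vs-expert rate serves as a ceiling: if two experts agree only modestly, a modest expert-vs-ITS match says little about the tutor. Raw agreement does not correct for chance; a chance-corrected statistic may be preferable in practice.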
Summative evaluations • Question: Is the tutor (or other instructional intervention) more effective than a control? • Typical design • Experimental group gets the instructional intervention (the tutor). • Control group learns via the “traditional” or “current practice” method • Pre & post tests • Data analysis • Did the tutor group “do better” than the control?
Control conditions • Typical control conditions • Existing classroom instruction • Textbook & exercise problems (feedback?) • Another tutoring system • Human tutoring • Define a control condition early • Study the existing instruction in detail • Results of this study should influence the design of the tutor
Learning Assessments • Pre-test • Immediate post-test (post − pre = Learning) • Delayed post-test (Long-term retention) • How long is long? • Post-test using a new dimension (content, presentation mode, response mode, etc.) (Transfer) • Learning measures on new content (Accelerated future learning)
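As a small worked example of the "post minus pre" idea, the sketch below computes mean gains for an immediate and a delayed post-test. The scores are invented for illustration, and `gain_scores` is a hypothetical helper name; it assumes each list holds one score per student, in the same student order.

```python
from statistics import mean


def gain_scores(pre, post):
    """Per-student gain: post-test score minus pre-test score."""
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same students")
    return [after - before for before, after in zip(pre, post)]


# Invented scores for four students (same order in every list).
pre       = [4, 5, 3, 6]
immediate = [7, 8, 6, 8]
delayed   = [6, 8, 5, 7]

learning = mean(gain_scores(pre, immediate))   # immediate post - pre
retention = mean(gain_scores(pre, delayed))    # delayed post - pre
print(f"mean learning gain: {learning}, mean retained gain: {retention}")
```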
Data from Liu et al. tone study • [Figure: learning curves, week by week; 2nd-term transfer items]
Multiple kinds of transfer • Liu et al. show 2 kinds of materials transfer • Within the term 1 learning sessions, each syllable to be learned was different but familiar. So: transfer of learning to familiar items • In the second term, there were unfamiliar syllables. So: transfer of learning to unfamiliar items. (Not so good.)
Accelerated future learning • Example of acceleration of future learning (Min Chi & VanLehn) • First probability, then physics. During probability only: • Half of the students were taught an explicit strategy • Half were not taught a strategy (normal instruction) • [Figure: two panels plotting Score at Pre and Post, one for Probability Training and one for Physics Training; annotation: "Ordinary transfer"]
Composing a post-test • General strategy: • Guided by cognitive task analysis (pre-test as well) including learning goals and specific knowledge components • Include some items from the pre-test • Check for basic learning • Some items similar to training items • Measures near-transfer • Some problems dissimilar to training problems • Measures far-transfer
Mistakes to avoid in test design • Tests that are • Too difficult • Too easy • Too long • Tests that • Fail to represent instructed content • Missing content; oversampling from some content • Depend too much on background knowledge • Notice problems in test means • Notice variances
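One way to "notice problems in test means" and variances is a quick descriptive check on pilot scores. The sketch below is only illustrative: `describe_test` is a hypothetical helper, and the 0.2 and 0.8 cut-offs for "too difficult" and "too easy" are assumptions chosen for the example, not values from the slides.

```python
from statistics import mean, pvariance


def describe_test(scores, max_score, floor=0.2, ceiling=0.8):
    """Summarize a test and flag possible floor or ceiling effects.

    floor and ceiling are assumed cut-offs on the mean proportion correct.
    """
    proportion = mean(scores) / max_score
    return {
        "mean_proportion": proportion,
        "variance": pvariance(scores),   # population variance of raw scores
        "too_difficult": proportion < floor,
        "too_easy": proportion > ceiling,
    }


# Invented pilot scores on a 10-point test: nearly everyone is at ceiling.
report = describe_test([9, 10, 10, 9, 10], max_score=10)
print(report)
```

A mean near the maximum together with a very small variance leaves little room to detect differences between conditions.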
Interpreting test results as learning • Post-test in relation to pre-test. 2 strategies: • ANOVA on gain scores • First check pre-test equivalence • Not recommended if pre-tests are not equivalent • Pre-test and post-test as a within-subjects variable (t-tests for non-independent samples) • ANCOVA. Post-test scores are the dependent variable; pre-test scores are the covariate
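To make the ANCOVA idea concrete, the sketch below computes only the covariate-adjusted difference between two groups: the post-test difference corrected for the pre-test difference using a common within-group slope. It is a point estimate under the usual equal-slopes assumption, with invented data and hypothetical function names; a real analysis would use a statistics package to obtain the F test and check assumptions.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Common slope of post on pre, pooled across groups of (pre, post) lists."""
    sxy = sxx = 0.0
    for pre, post in groups:
        m_pre, m_post = mean(pre), mean(post)
        sxy += sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
        sxx += sum((x - m_pre) ** 2 for x in pre)
    if sxx == 0:
        raise ValueError("pre-test scores have no within-group variance")
    return sxy / sxx


def adjusted_difference(tutor, control):
    """Tutor-minus-control post-test difference, adjusted for pre-test scores."""
    slope = pooled_within_slope([tutor, control])
    post_diff = mean(tutor[1]) - mean(control[1])
    pre_diff = mean(tutor[0]) - mean(control[0])
    return post_diff - slope * pre_diff


# Invented (pre, post) scores; the tutor group starts slightly higher.
tutor = ([2, 4, 6, 8], [7, 9, 11, 13])
control = ([1, 3, 5, 7], [3, 5, 7, 9])

print("raw post-test difference:", mean(tutor[1]) - mean(control[1]))
print("pre-test-adjusted difference:", adjusted_difference(tutor, control))
```

Here the raw post-test difference overstates the effect because the tutor group began with higher pre-test scores; the adjusted difference removes that head start.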
Plot learning results • Bar graphs for instructional conditions • Differences due to conditions • Learning Curves • Growth over time/instruction
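A learning curve of the kind described here can be built by aggregating logged attempts into an error rate per session. The sketch below assumes a simplified, hypothetical log format of `(session, correct)` pairs; real tutor logs carry far more fields, and the data shown are invented.

```python
from collections import defaultdict


def error_rate_by_session(attempts):
    """Map each session number to its error rate.

    attempts: iterable of (session, correct) pairs, where correct is a bool.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for session, correct in attempts:
        totals[session] += 1
        if not correct:
            errors[session] += 1
    return {s: errors[s] / totals[s] for s in sorted(totals)}


# Invented attempts across three weekly sessions.
log = [
    (1, False), (1, False), (1, True), (1, True),
    (2, False), (2, True), (2, True), (2, True),
    (3, True), (3, True), (3, True), (3, True),
]
curve = error_rate_by_session(log)
for session, rate in curve.items():
    print(f"session {session}: error rate {rate:.2f}")
```

Plotting these rates against session number, one line per instructional condition, gives the growth-over-time view that bar graphs of condition means do not show.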
Learning Curves • [Figure: error rate plotted across weekly sessions over 2 terms]
Learning curves can pinpoint intervention effects • [Figure labels: B1-2: implicit; B3-4: explicit; EXPLICIT RULES (< B3)]