1. Using Think-Aloud Protocols to Understand Student Thinking
Sept 22: Class 3: Lecture 2
Intelligent Tutoring Systems & Cognitive Modeling
Slides modified from Ken Koedinger & Vincent Aleven
2. Overview
Where Think-Aloud Fits
Tutor development process -> Cognitive Task Analysis -> Think Aloud
Think-Aloud Studies
Why do it & background
How to collect data
How to analyze
Example
Think aloud in Statistics Tutor design
3. Cognitive Tutor Course Development Process
1. Client & problem identification
2. Identify the target task & interface
3. Perform Cognitive Task Analysis (CTA)
4. Create Cognitive Model & Tutor
a. Enhance interface based on CTA
b. Create Cognitive Model based on CTA
c. Build a curriculum based on CTA
5. Pilot & Parametric Studies
6. Classroom Use & Dissemination
4. Focus on step #3 in tutor development process ...
1. Client & problem identification
2. Identify the target task & interface
3. Perform Cognitive Task Analysis (CTA)
4. Create Cognitive Model & Tutor
a. Enhance interface based on CTA
b. Create Cognitive Model based on CTA
c. Build a curriculum based on CTA
5. Pilot & Parametric Studies
6. Classroom Use & Dissemination
5. Kinds of Cognitive Task Analysis
2 Kinds of Approaches
Empirical: Based on observation, data, experiments.
Analytical: Based on theory, modeling.
2 Kinds of Goals
Descriptive: How students actually solve problems. What Ss need to learn.
Prescriptive: How students should solve problems. What Ss need to know.
4 Combinations ...
6. Kinds of Cognitive Task Analysis
7. Cognitive Analysis & HCI Methods
Theory
ACT-R cognitive modeling
Quantitative Experimental data
Difficulty Factors Assessments
Parametric experiments with system alternatives
Qualitative data
Think alouds, tutor-student log files
Classroom observations, contextual inquiry
Participatory Design
Practitioners (teachers) are paid team members, 50% teaching + 50% on design team
8. Steps In Task Analysis
What are instructional objectives?
Standards, existing tests, signature tasks
Has someone done the work for you? Don't reinvent the wheel. Do a literature review!
Specify space of tasks
Do either or both:
Theoretical task analysis: Use a theory, like ACT-R, to create a process model that is sufficient to deal with space of tasks
Empirical task analysis: Do Think-Aloud, Difficulty Factors Assessment, ...
9. Overview
Where Think-Aloud Fits
Tutor development process -> Cognitive Task Analysis -> Think Aloud
Think-Aloud Studies
Why do it & background
How to collect data
How to analyze
Example
Think aloud in Statistics Tutor design
10. What is a Think-Aloud Study?
Basically, ask users to think aloud as they work...
... on a task you believe is interesting or important
... while you observe (& audio or videotape)
... either in context (school) or in a usability lab equipped for this purpose
... possibly using paper/storyboard/interface you are interested in improving
11. The Roots of Think-Aloud Usability Studies
Think-aloud protocols
Allen Newell and Herb Simon developed the technique in the 1970s
Applied in their 1972 book, Human Problem Solving
Anders Ericsson & Herb Simon wrote a book explaining and validating the technique: Protocol Analysis: Verbal Reports as Data (1984; revised 1993)
12. The Cognitive Psychology Theory behind Think-Aloud Protocols
People can easily verbalize the linguistic contents of Working Memory (WM)
People cannot directly verbalize:
The processes performed on the contents of WM
Procedural knowledge, which drives what we do, is outside our conscious awareness; it is tacit, implicit knowledge.
People are better at articulating external states & some internal goals; they are not good at articulating operations & reasons for choices
Non-linguistic contents of WM, like visual images
People can attempt to verbalize procedural or non-linguistic knowledge, however, doing so:
May alter the thinking process (for better or worse)
May interfere with the task at hand, slowing performance
13. How to Collect Data in a Think-Aloud Study (Gomoll, 1990, is a good guide)
1. Set up the observation
write the tasks
recruit the users
set up a realistic situation
provide instrumentation for the prototype if possible
2. Describe the purpose of the observation in general terms
3. Tell the user that it's OK to quit at any time
give the participant the consent form and ask for a signature
4. Talk about and demonstrate the equipment in the room
14. How to Collect Think-Aloud Data (cont.)
5. Explain how to think aloud
give a demonstration
give a practice task having nothing to do with the system you are testing, e.g., solitaire
6. Explain that you will not provide help
7. Describe the tasks and introduce the product
8. Ask if there are any questions before you start; then begin the observation
say "please keep talking" if the participant falls silent for 5 seconds or more
be sensitive to a severe desire to quit
9. Conclude the observation
10. Use the results
15. The Value of Think-Aloud Studies in HCI Practice
"...may be the single most valuable usability engineering method" (Nielsen, 1993, Usability Engineering, p. 195)
16. Ethics of Empirical Studies with Human Participants
Informed consent
Testing the interface, NOT the participant
Maintain anonymity
Store the data under a code number
Store participants' names separately from their data
Voluntary
always make it OK for people to quit
be sensitive to the many ways people express a desire to quit
17. Overview
Where Think-Aloud Fits
Tutor development process -> Cognitive Task Analysis -> Think Aloud
Think-Aloud Studies
Why do it & background
How to collect data
How to analyze
Example
Think aloud in Statistics Tutor design
18. Think Alouds in Statistics Tutor Development
Task: Exploratory Data Analysis
Given problem description and data set
Inspect data to generate summaries & conclusions
Evaluate the level of support for conclusions
Example Problem
In men's golf, professional players compete in either the regular tour (if they're under 51 years old) or in the senior tour (if they are 51 or older). Your friend wants to know if there is a difference in the amount of prize money won by the players in the 2 tours. This friend has recorded the prize money of the top 30 players in each tour. The variable money contains the money won by each of the players last year. The variable tour indicates which tour the player competed in, 1=regular, 2=senior. The variable rank indicates player rank, 1=top in the tour.
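A minimal sketch of the exploratory analysis this problem asks for, in plain Python with invented prize-money values (the real data would be the recorded top-30 lists):

```python
from statistics import median

# Invented sample values; the problem's real data set has 30 players per tour.
money = [2_000_000, 1_500_000, 900_000, 800_000]
tour  = [1, 1, 2, 2]   # 1 = regular, 2 = senior

# Split money by tour and summarize each group.
by_tour = {t: [m for m, tt in zip(money, tour) if tt == t] for t in (1, 2)}
medians = {t: median(vals) for t, vals in by_tour.items()}
# A boxplot of money split by tour would display exactly these medians.
```

A student's analysis would compare such summaries (and side-by-side boxplots) across the two tours to evaluate support for a difference.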
19. Task Analysis of Major Goals in Statistical Analysis
This is an analytic prescriptive form of CTA
ACT-R emphasizes goal-factored knowledge elements
Break down task:
7 major goals
Each goal involves multiple steps or subgoals to perform
Key productions react to major goals & set subgoals
20. Sample Transcript
21. Observations about this verbal report
No evidence for goal 3, characterize the problem
Line 10: student simply jumps to selecting a data representation (goal 4) without thinking about why.
No evidence for goal 7, evaluate evidence
Minor interpretation error
Line 13: student mentions the average when in fact boxplots display the median, not the mean
Note: These observations should be indicated in the annotation column of the transcript (I left them off given limited space).
22. Comparing Think Aloud Results with Task Analysis
Percentages to the right of each step represent the percentage of students in the think-aloud study who showed explicit evidence of engaging in that step.
Step 3 is totally absent!
A tutor can help students to do & remember to do step 3
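Those per-step percentages come from tallying coded transcripts. A sketch with invented codings (each set lists the goal numbers a student explicitly verbalized; the real study's codings are not reproduced here):

```python
# Hypothetical codings for four students; goals are numbered 1-7.
coded_transcripts = [
    {1, 2, 4, 5, 6},
    {1, 2, 4, 6, 7},
    {2, 4, 5, 6},
    {1, 2, 4, 5},
]

n = len(coded_transcripts)
# Percentage of students showing explicit evidence of each goal.
percent = {g: 100 * sum(g in c for c in coded_transcripts) / n
           for g in range(1, 8)}
# With these invented codings, goal 3 never appears -- the pattern
# the slide reports ("Step 3 is totally absent!").
```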
23. Inspiration for Production Rules
Missing production (to set goal 3): Characterize problem
If the goal is to do an exploratory data analysis
& relevant variables have been identified
then set a subgoal to identify variable types

Buggy production (skipping from goal 2 to 4): Select any data representation
If the goal is to do an exploratory data analysis
& relevant variables have been identified
then set a subgoal to conduct an analysis by picking any data representation
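The two productions above can be sketched as condition-action functions over a working-memory dict. This is an illustrative Python rendering, not ACT-R syntax; all names are made up:

```python
def characterize_problem(wm):
    """The missing production: would set the goal-3 subgoal."""
    if wm.get("goal") == "exploratory-data-analysis" and wm.get("variables-identified"):
        return "identify-variable-types"

def select_any_representation(wm):
    """The buggy production: matches the same condition but jumps
    from goal 2 straight to goal 4, skipping problem characterization."""
    if wm.get("goal") == "exploratory-data-analysis" and wm.get("variables-identified"):
        return "pick-any-data-representation"

wm = {"goal": "exploratory-data-analysis", "variables-identified": True}
```

Both productions fire on the same working-memory condition, which is exactly why a student can skip goal 3 without noticing: nothing in the condition distinguishes the correct path from the shortcut.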
24. Statistics Tutor: Original Goal Scaffolding Plan
25. Statistics Tutor: Transfer Appropriate Goal Scaffolding
27. Summary
4 Kinds of Cognitive Task Analysis
Descriptive vs. Prescriptive; Empirical vs. Analytical
Empirical CTA Methods
Think aloud & difficulty factors assessment
Think aloud
Get subjects to talk while solving; do not have them explain
Prescriptive: what do experts know? Identify hidden thinking skills
Descriptive: what is difficult for novices?
28. END
29. One thing to note first is that "step skipping" is in the eye of the beholder, if you will. That is, it is relative to the desired level of detail required of a proof or explanation (what I called the "execution space" in the paper). To a logician, for instance, the steps in the *final* proofs given in the paper still skip steps. Logicians would want to see proofs in terms of basic inference rules like modus ponens and these proofs would be much longer. So the question becomes at what grain size are the rules of thinking acquired in a domain? Do we learn detailed logic rules before learning geometry rules? A related question is whether we learn new rules of thinking by composing together existing smaller rules (a deductive process) or by inducing them from experience or both?
The hypothesis made in the paper (and elaborated more in my 1990 paper) is that the rules of geometry thinking (I called them schemas) are, at least in part, induced from experience with various configurations of diagrams and thus can be directly acquired at a more coarse grain than the textbook rules of geometry.
There are a couple pieces to the argument, only one of which is mentioned in the paper, and vaguely so, as you point out! It is the more complex one, so let me start with a simpler one. In the protocols I often observed subjects quickly making inferences during planning whose details they later struggled much longer to execute (e.g., the last step of problem RICK7 shown in Fig 1.4 was a common example -- I suspect you may find this inference easy, but harder to formally justify). You might say, well, that's just the nature of expertise development -- some of the steps involved in the initial slow, deliberate process novices go through get compiled away during practice, later skipped, and then eventually forgotten through lack of practice. But this argument does not fly here because these subjects could not avoid practicing these lower-level steps by skipping them -- in practicing proof problems they do need, in the end, to perform these low-level steps. Hm, well, this argument's not that simple either, but I do stand by it.
The "more complex" argument referenced in the paper starts with the observation that there are lots of ways smaller rules can get composed into bigger ones (e.g., rules A, B, & C could compose to AB & C or A & BC). In geometry, there are tons of possibilities. However, what we found in experts is behavior that corresponds to only a small subset of these. A mechanism that composes rules fired "next" to each other would produce a much greater variety of chunks or macro-operators. Inducing schemas from diagrams yields the smaller set we observe in experts.
Have I totally confused you? Let me try what might be a more intuitive argument. Consider this problem and think aloud while you solve it: Bill and his cat Spike get on a scale and together weigh 200 lbs. Bill's friend Ted and Spike also weigh 200 lbs together. Do Bill and Ted weigh the same amount?
How many "thinking steps" did it take you to come to a conclusion? Did you say anything in your think aloud? I suspect that the answer to this question came to you quickly, maybe even in one step (without having a chance to say anything), but probably not more than 2 or 3. You might have found yourself justifying your answer after you came to it -- those steps don't count! They are in the execution phase.
This problem is analogous to the following geometry problem (and some others):
A      B  C      D
0------0--0------0
If AC congruent to BD, is AB congruent to CD?
In the Geometry textbooks used at the time of GPT's development, proving this takes 7 steps!
1. AC congruent to BD Given
2. mAC = mBD Def of congruence (1)
(read above as: "measure of AC equals measure of BD")
3. mAC = mAB + mBC Def of betweenness
4. mBD = mBC + mCD Def of betweenness
5. mAB + mBC = mBC + mCD Transitivity (2,3,4)
6. mBC = mBC Reflexive property
7. mAB = mCD Subtraction (5,6)
8. AB congruent to CD Def of congruent (7)
I suspect you did not go through all these steps to solve the problem above. You might not even understand all these steps or recall the associated geometry rules. I think you'd agree, then, that you did not learn whatever thinking rule (or rules) you used to solve the problems above, by composing together these lower level rules.
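The common structure of the weight puzzle and the segment problem -- a one-step "subtraction schema" rather than the eight-line formal proof -- can be sketched as follows. The numeric values are invented for illustration:

```python
def equal_after_removing_shared(whole1, whole2, shared_part):
    """If two equal wholes each contain the same shared part,
    the leftover parts are equal: the one-step 'subtraction schema'."""
    return whole1 == whole2 and (whole1 - shared_part) == (whole2 - shared_part)

# Weights: Bill + Spike = 200 and Ted + Spike = 200  =>  Bill == Ted.
spike = 12                       # invented weight for Spike
bill_ted_equal = equal_after_removing_shared(200, 200, spike)

# Segments: mAC = mBD, and both AC and BD contain BC  =>  mAB == mCD.
ab, bc, cd = 6, 2, 6             # invented measures chosen so mAC == mBD
segments_equal = equal_after_removing_shared(ab + bc, bc + cd, bc)
```

The point of the analogy: solvers apply this one schema directly, rather than composing the definition-of-congruence, betweenness, transitivity, reflexive, and subtraction rules that the textbook proof requires.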
30. Overview
Think-Aloud Studies
Why do it & background
How to collect data
How to analyze
Geometry example (paper)
Statistics example
31. Why do think alouds with domain experts?
Learning is the transition from novice to expert
Instructional designers ideally should identify:
Nature of desired knowledge (expert)
Nature of current knowledge (novice)
Nature of transition process (intermediate learner models & learning mechanisms)
Studying experts is a key piece of the larger ideal process
32. Different routes to expertise
1. Part training
Start with pieces of the skill and build larger pieces
Math example: Learn to count, then add single digits, then multiple digits
2. Whole task using a novice strategy first
Start with alternative simpler strategy, move to successively more complex strategies
Skiing: Learn to snowplow, then stem, then parallel
Geometry: GPT's rule focus
3. Whole task using scaffolded expert strategy
Start with whole, target skill but scaffold performance
Skiing: Use short skis, parallel from beginning
Geometry: ANGLE's schema focus
33. ANGLE Development Methodology
1. Identify the execution space
2. Look for implicit planning in verbal reports
3. Model this implicit planning
4. Use model to drive tutor design
5. Test the tutor implementation
34. Execution Space
Problem space
{States, operators, initial state, goal state}
Execution space
A problem space made up of states that correspond to the steps that problem solvers conventionally write down or execute in solving problems in a domain.
The basic problem space (Newell & Simon, 1972)
Other problem spaces may better characterize the thinking in a domain
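The four components listed above can be written down as a plain data structure. This is an illustrative sketch of Newell & Simon's framing, not GPT's or ANGLE's actual representation:

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List

@dataclass
class ProblemSpace:
    """{States, operators, initial state, goal state}; names are illustrative."""
    states: set
    operators: List[Callable[[Any], Iterable[Any]]]  # state -> successor states
    initial_state: Any
    goal_state: Any

# A toy space: states are integers 0..4, one operator increments the state.
toy = ProblemSpace(states=set(range(5)),
                   operators=[lambda s: [s + 1]],
                   initial_state=0,
                   goal_state=4)
```

An execution space is then the special case where each state transition corresponds to a step a solver would conventionally write down.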
35. Study execution space through task analysis
27 domain or textbook rules: definitions, postulates, or theorems
Estimate how hard the task is.
What is size of execution space for geometry proof?
Analysis of problem RICK7, which takes 6 steps
Ply 1: How many different rules apply to given state of the problem?
7 rules, but in 45 different ways
Ply 2: How many rules apply to the results of those?
563 different rule instances
Ply 3: 563 different rule instances
Ply 4: >100,000 possible rule instances
...
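A crude constant-branching model reproduces the flavor of these counts. The branching factor of 12 is an assumption fitted to the ply-1 and ply-2 figures above, not a measured value:

```python
def rule_instances_at_ply(first_ply, branching, ply):
    """Constant-branching estimate of rule instances reachable at a given ply."""
    return first_ply * branching ** (ply - 1)

ply2 = rule_instances_at_ply(45, 12, 2)   # 540, near the observed 563
ply4 = rule_instances_at_ply(45, 12, 4)   # ~78,000; the slides report >100,000,
                                          # so branching evidently grows with depth
```

Even this conservative model shows why a 6-step proof like RICK7 is hard to find by blind search in the execution space, motivating the planning-level analysis that follows.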
36. Think-aloud protocol analysis identifies implicit planning (step-skipping)
See problem and protocol in paper
Subject only mentions key steps
Analysis by comparing steps mentioned with complete set of execution space steps
Key observation:
Different subjects on different problems skip the same kinds of steps
37. Modeling Implicit Planning
Diagram Configuration Schemas
See examples in paper
Expert knowledge for abstract planning organized around concrete domain objects
Knowledge appears to be induced from experience with properties of objects
Though certainly also guided by declarative learning of the textbook rules
Connection with Inductive Support
38. Diagram configuration schemas: Object-based & efficient
39. Model-based tutor design
In paper, see Tables 1.3 & 1.4 and Figures 1.4-1.5
Design heuristics:
Use representation in cognitive model
To design interface notations
Vocabulary of tutor advice
Use processes in cognitive model
To design interface actions
Hints & explanations
40. ANGLE: Tutor for geometry proof
41. Tutor Evaluation
Overall results: ANGLE vs. GPT
ANGLE somewhat worse on execution
Somewhat better on key planning steps
Formative results are most important
Log file analysis
Devil is in the details
Flexibility issues, bottom-out hint, ...
42. Follow-up Classroom Study with ANGLE
As an add-on in classes where tutor use was not well-integrated into the course, ANGLE neither harmed nor helped.
When well-integrated with the curriculum, ANGLE led to a 1 sd effect over comparison classes with the same teacher.
43. Limitations
Method is not experimentally validated
Few are; in fact, few methods have even been articulated for ITS design
In addition to experts, we want a model of novices and of learners along the way
Using think alouds is expensive; can we get similar information through other means?