Chapter 22 Planning who, what, when, and where
Intro
• We have a strategy:
  • purpose
  • type of data to collect
  • system
  • constraints
• Now:
  • choose users (population sample)
  • create timetable (and stick to it)
  • prepare task descriptions (script it!)
  • decide where to evaluate (field or lab)
Choosing participants
• Each participant should be:
  • a real (actual) user, or
  • a representative user (from requirements), or
  • a usability or domain expert
• Participants should not be:
  • chosen at random
  • (hmm, this is contrary to “traditional” experimental criteria… why?)
Screening or pretesting
• To gauge whether a user fits the desired subject profile, you may need to screen users
• For example, if testing a Spanish language training program, don’t use fluent Spanish speakers
• Questionnaires may be used to record users’ experience levels (can be useful in analysis later, e.g., to discard experts’ data)
Working alone or in pairs?
• Usually users are tested alone
• When may pairs be a good idea?
  • working cooperatively, sharing a computer
  • different culture (e.g., Japanese)
  • they prefer it (e.g., husband/wife team)
• Hire a facilitator / caretaker / custodian?
  • when working with children, users with disabilities, etc.
  • when an interpreter is needed
How many participants?
• Depends on the problem and the stage of testing
  • “trivial” or “easy” trouble spots will be identified quickly (and repeatedly, if there are many users), so only a few users are needed during early stages
  • OTOH, if a couple of users find no problems, does that mean the UI is acceptable in general?
• Key issue: generalizability
  • Ideally, you’d want to conduct a power analysis of the experiment
  • But one typically goes with a “rule of thumb”: start with 5, go to 10, etc.
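As a rough illustration of the power analysis mentioned above, the sketch below estimates how many participants per group a comparison of two designs would need. It uses the standard normal approximation for a two-sided, two-sample comparison of means, n ≈ 2((z₁₋α/₂ + z₁₋β) / d)². The function name, the effect sizes, and the defaults (alpha = 0.05, power = 0.80) are illustrative assumptions, not values from the book.

```python
import math
from statistics import NormalDist


def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardized difference (Cohen's d) you hope to
    detect; all defaults here are assumed, conventional values.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    for d in (0.8, 0.5, 0.2):
        print(f"d = {d}: about {participants_per_group(d)} participants per group")
```

Under these assumptions a large effect (d = 0.8) needs about 25 participants per group and a medium effect (d = 0.5) about 63, which is well beyond the 5 to 10 rule of thumb. The rule of thumb suits finding obvious problems early; it does not support statistical generalization.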
University participants
• The book’s discussion seems to be aimed at practitioners
• What about at the Uni?
  • use the Psych pool, other students, etc.
  • problems:
    • restricted age group
    • users may not have the required expertise (e.g., evaluating a Fortran debugger)
    • motivation may be wanting (e.g., doing it for credit, not so much for science or the “good of humanity” :)
Incentive
• If possible, compensate users:
  • extra credit
  • real credit (e.g., on an e-commerce web site)
  • food
  • knick-knacks (mugs, pens, etc.)
  • soap (true story :)
  • money is always good, if you have enough to spare
Global Warming App Users
• How they picked users for the Global Warming App study:
  • email solicitation
  • experienced users (not novices)
  • various disciplines (e.g., not necessarily CS)
  • 10 users
  • no incentives
Create a timetable
• Timetable:
  • How long do you need per evaluation session?
    • may need to run a quick pilot study to determine
  • How much time will the whole process take?
    • quick “back of the envelope” calculation: 100 subjects × 10 minutes each = 1,000 minutes, or about 17 hours (not counting introductions, filling out questionnaires, lunch, dinner, classes, interruptions, Survivor episodes, etc.)
    • how many sessions can you run per day? Maybe 4-5 hours’ worth?
    • so what’s a realistic estimate for 100 subjects? A week? Two weeks?
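The back-of-the-envelope calculation above can be sketched in a few lines. This is a minimal estimate, not a scheduling tool: the function name, the 20-minute per-session overhead (introductions, questionnaires), and the 4.5 hours of testing per day are assumptions chosen for illustration.

```python
import math


def estimate_schedule(subjects, task_minutes, overhead_minutes=0, hours_per_day=4.5):
    """Return (total_hours, sessions_per_day, days_needed) for a study.

    overhead_minutes covers introductions, questionnaires, and so on;
    hours_per_day is how much testing you can realistically run daily.
    Both are assumed values that a pilot study should refine.
    """
    session_minutes = task_minutes + overhead_minutes
    total_hours = subjects * session_minutes / 60
    # Only whole sessions fit into a day.
    sessions_per_day = math.floor(hours_per_day * 60 / session_minutes)
    if sessions_per_day < 1:
        raise ValueError("a single session does not fit into one day")
    days_needed = math.ceil(subjects / sessions_per_day)
    return total_hours, sessions_per_day, days_needed


if __name__ == "__main__":
    # Task time only: 100 subjects x 10 minutes.
    print(estimate_schedule(100, 10))
    # Same study with an assumed 20 minutes of overhead per session.
    print(estimate_schedule(100, 10, overhead_minutes=20))
```

With task time alone, 100 subjects fit into 4 testing days; adding the assumed 20 minutes of overhead per session raises the total to 50 hours and 12 testing days, which is closer to the "two weeks" end of the estimate.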
Timetable (cont.)
• Keep each evaluation session short; try not to exceed 1 hour (subjects will get bored, tired)
• Create a timetable “sign up sheet”
  • very useful for signing up subjects and reserving lab space and equipment (e.g., eye trackers)
• Allocate time for analysis
  • 80% of time spent in analysis (true? I dunno, just guessing)
  • just like debugging code?
Task Descriptions
• Create task descriptions
  • similar to the idea of scripts for evaluators (so they know what to say and say the same thing to each participant, thereby reducing bias)
  • these are scripts for users
• I think they (task cards) are a good idea, so long as they’re not too detailed
  • case study: in my Navy usability study, participants read directions from a script. This was too easy; everyone performed similarly, and no clear performance problems were identified
Where to do evaluation?
• Field studies
  • observations in the field: most realistic environment, obviously
  • but lacks control
• Controlled studies
  • in a lab, usually
  • or some kind of mock scenario (e.g., “shoot house” for cops, SWAT personnel)
Usability Lab
• How to build a good usability lab:
  • http://www.stcsig.org/usability/topics/usability-labs.html
  • often a separate room is used for participants (one-way mirror)
  • various logging devices, e.g., cameras, keystroke logging software, etc.
• do we have one at Clemson? We should…