How to run any kind of Evaluation • 3/6/14 • HCC 729, Human Centered Design • Amy Hurst
Getting started • Share inspirations, reading reflections • http://hcc729s2014.wordpress.com/student-blogs/ • Homework check in (paper prototypes)
Paper prototypes • Activity (10 minutes) • Pair up with another group • Pick one task from your task list • Have other group test your task with prototype, 5 minutes • Switch • What worked? Any changes needed to your paper prototype? Anything missing?
Reflection on paper prototype testing • What did you learn? • Anything important missing from your prototypes? • Any obvious changes to make?
Why User Test? • Any testing is better than none – even a few users! • Saves time and money in the development process by preventing errors • Hard to tell how good or bad a UI is until people use it! • Examining real users gets us away from the “expert blind spot” • It is hard to predict what actual users will do • User testing mitigates risk • You don’t need a flawless experimental protocol to get useful usability measures • Critical to evaluate the IMPORTANT aspects of your design
Expert-based evaluation • Aren’t there experts who can look at your site and identify problems? • Sort of… yeah. • This usually happens too late. “We’re going live in two weeks; do you have time to look over our site?” • Experts don’t always have the characteristics of your users, whom you studied so carefully before starting
Risks of Late User Testing… • Sometimes in software development, users are brought in only at the beta-test stage • What are some of the risks of doing this? • By then most of the budget has been spent • Correcting an error at that point is far more expensive than catching it early • Avoid this: test early and often…
3 Types of Evaluations • Formative: during development (explorative) • Summative: at completion (assessment and validation) • Comparison: testing design alternatives against each other
Usability Methods in Chapter 7 • 7.1 Observation • 7.2 Questionnaires and Interviews • 7.3 Focus Groups • 7.4 Logging Actual Use • Combining Logging with Follow-Up Interviews • 7.5 User Feedback • 7.6 Choosing Usability Methods • Combining Usability Methods
Nielsen’s Categories for Usability Methods, Chapter 7 Usability Engineering
Steps for an evaluation • Planning & preparation • Designing the test • Choosing participants • Selecting the task • Running the test • During the session • Collecting the data • Debriefing the subject • Analyzing the data and disseminating your findings
How to Run Any Evaluation • Planning and Preparation • Running the Test • Analysis and Dissemination
Planning and Preparation: Participants • Select the appropriate participants • Who are the ideal participants? Who are acceptable participants? • Aim for the actual users of the system; if unavailable, aim for the closest approximation • Target population users may have specific characteristics • Domain-specific vocabulary • Often possess particular domain knowledge • Have a history with existing systems, methods, etc. • Note: novices and experts • Why not just novices? Why not just experts? • Novice and expert mental models differ – the system won’t support both if it isn’t tested on both • Don’t forget your user analysis, and think about how your design may bias your results
I was at the post office one day, and a student came up to the woman behind the counter and asked, “Who has the hardest job in the world?” She answered, “The President of the United States.” He wrote this down, turned to me, and asked, “Who has the hardest job in the world?” What kind of results do you expect this student will get? What would you change about how this student is administering this survey? Who has the hardest job in the world?
Always think about how you are biasing (distorting, impacting, controlling) your results. Your goal is to gather data that is reliable and repeatable. Avoid bias in your evaluations!
How can you prevent bias? Three simple factors you can control: • Participants • Location of the evaluation • Your behavior and actions
“My roommate thought the buttons were too small” • “My mom really liked my color choices” • “My girlfriend found the following typos” • Who should you recruit for your study?
3 Kinds of Bias • Undercoverage • Nonresponse • Voluntary response
3 Kinds of Bias: Undercoverage • Undercoverage occurs when some members of the population are inadequately represented in the sample. • Example: the Literary Digest voter survey, which predicted that Alfred Landon would beat Franklin Roosevelt in the 1936 presidential election. The sample undercovered low-income voters, who tended to be Democrats.
3 Kinds of Bias: Nonresponse • Nonresponse bias. Sometimes, individuals chosen for the sample are unwilling or unable to participate in the survey. • The Literary Digest survey illustrates this problem. Respondents tended to be Landon supporters, and nonrespondents Roosevelt supporters. Since only 25% of the sampled voters actually completed the mail-in survey, the results overestimated voter support for Landon.
3 Kinds of Bias: Voluntary Response • Voluntary response bias occurs when sample members are self-selected volunteers. • Example: call-in radio shows that solicit audience participation in surveys on controversial topics (abortion, affirmative action, gun control, etc.). The resulting sample tends to overrepresent individuals with strong opinions.
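These biases show up clearly in numbers. A minimal sketch, using hypothetical figures loosely modeled on the Literary Digest example, simulating how differential nonresponse skews a survey estimate:

```python
import random

random.seed(1)

# Hypothetical population: 60% support candidate A, 40% candidate B.
population = ["A"] * 60000 + ["B"] * 40000

# Draw a perfectly random sample of 10,000 voters.
sample = random.sample(population, 10000)

# Nonresponse: suppose B supporters mail the survey back 50% of the
# time but A supporters only 15% -- the respondents no longer
# resemble the sample, even though the sample itself was unbiased.
respondents = [v for v in sample
               if random.random() < (0.15 if v == "A" else 0.50)]

true_share = population.count("A") / len(population)
observed = respondents.count("A") / len(respondents)

print(f"true support for A:      {true_share:.0%}")
print(f"estimate from responses: {observed:.0%}")  # well below 60%
```

The sampling step is flawless; only the response rates differ, yet the estimate lands far from the truth, which is exactly what happened to the Literary Digest.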
Select appropriate participants • Who are the ideal versus acceptable participants? • Aim for the actual users of the system; if unavailable, aim for the closest approximation • Things to consider: age, culture, experience • Domain-specific vocabulary • Often possess particular domain knowledge • Have a history with existing systems, methods, etc. • Where did you find these people? Others?
Where should you conduct your user study? • Laboratory vs. real-world studies • Remember your environmental analysis?
Exploring the role of environment • What changes if your user is… • Waiting for the train at a crowded MARC station • Sitting on the grass in the park on a sunny day • Curious during a movie • In an office that is quiet and dull • Working at home • Working in a coffee shop
Conduct an “environmental analysis” and control the evaluation environment • Understand where your interface will be used – usually best done through interviews or observations of real-world use • A few things to consider… • Be as faithful to real situations as possible (get creative) • Consider more complicated aspects of the environment: include distractions and stress if appropriate (noise/heat) • Consider how the environment will affect machine performance (internet lag, sensors not working, etc.) • Does this really matter? Example: your speech recognizer achieved 98.7% word accuracy in your user study, but the real-world deployment of your system will be on an airport tarmac… Why does this matter?
What is the IRB? What is a consent form? Why do I care?
Informed consent • Main points to include (UMBC has its own forms) • General purpose • Participation is voluntary • Results will be confidential • There is no benefit to you, other than agreed-upon payment • There is no risk to you • 18 or over • Signature and date
IRB Slides • Institutional Review Board • http://www.umbc.edu/irb/ • Human Subjects • Training modules to conduct research • If you aren’t going to publish: Researchers conducting no more than minimal risk research • If you might publish: Social / Behavioral Research
How to Run Any Evaluation • Planning and Preparation • Running the Test • Analysis and Dissemination
During the session • Write a task script • I literally write down everything I am going to say • Prepare the user • “I am testing the system and not you” • “We expect problems, that’s why we are doing this” • “You can stop at any time, for any reason” • “I need to know what you are thinking as you go” (if appropriate) • Have the task ready • Written down • Give the same verbal instructions each time
Choosing your actors • How many people will be in the room? • What roles will they have? • Should the greeter, facilitator and observer all be the same person? • What kind of persona should they take on? • Manager / task master? • Student / paid worker? • Researcher?
Does my behavior really matter? • “Always wear blue in court” • “Wearing green makes people think of money”
Yes, your behavior matters! • Unfortunately, some variables are hard to control: your gender, age, ethnicity; being in a position of “power” • To avoid bias, you want to control for as many variables as possible: make the experience the same for each user • What are some experimenter factors that could impact results? • Clothing • Attitude (are they grumpy, or not paying attention?) • What is said to the participant
Ensure consistency: use a task script • Give each participant the same experience • Make sure they get the same instructions • Make sure you ask all questions the same way • Helps control evaluation duration • Makes it easier for you to repeat the study • Write a task script of everything that will happen in the study; treat it like a script for a play • I literally include everything that happens from “hello” to “goodbye”
Collecting the data • Write down observations • Consider how this may bias the user’s behavior • Record actions • Video/audio recording • Camtasia or other screen recording • You will have to spend time “coding” the data to understand what happened during the evaluation • Take detailed notes immediately after the session • Best to postpone doing anything else immediately afterward • You want to capture everything in your head while it is fresh • Risk: you may have forgotten details
Debrief • This is where you usually administer questionnaires • Make sure it happens before any interview or discussion • Ask for any comments the users might have on the system • Ask for clarifications on areas where the participant had trouble • Thank participant and give them a method for contacting you in the future
For next week • Assignment • Readings
Readings • Required • Controlled experiments • Optional • Statistics in usability research • Usability Testing: current and future
Assignment: Test Paper Prototypes • Use the think aloud protocol to test your paper prototypes • KEEP YOUR PAPER PROTOTYPES (turn them in next week) • Complete appropriate Critical Incident UARs • Example paper prototype test: http://www.youtube.com/watch?v=9wQkLthhHKA&feature=related • (you should probably let the user drive more)
Assignment: Test Paper Prototypes • Perform a think aloud with three people who each represent someone from your user analysis, and have them use the prototype you created. Have your users perform the 5 tasks you created these paper prototypes for. • Complete CI UARs based on what you saw • Fill out the top part for all users first • Aggregate across all users • Then complete the bottom half • Write 200 words about what you learned
In-Class Activity • Verify your paper prototypes are complete • Test your other tasks with a different group • Take notes: are your prototypes complete? Any obviously missing parts? • Fix it before you complete the assignment • At the end of your test: • Testers: any obvious changes? • Participants: any bias? Feedback on the procedure?
HE Notes • For the final report
Notes about the HE method • Don’t forget that you (the designer) are supposed to AGGREGATE your evaluators’ UARs into a final set • Search for duplicates • If your evaluators gave you severity ratings, aggregate them • You provide overall severity ratings • You provide solution recommendations