Ch 14. Testing & modeling users

Steven Pautz, Lauren Sullivan, Jessica Herron, Chris Moore

The aims • Describe how to do user testing. • Discuss the differences between user testing, usability testing and research experiments. • Discuss the role of user testing in usability testing. • Discuss how to design simple experiments. • Describe GOMS, the keystroke level model, Fitts’ law and discuss when these techniques are useful. • Describe how to do a keystroke level analysis.

User Testing • Developers test whether product is usable by the intended user population • Similar to experimentation, but different from research • Aim is to improve existing design, rather than to discover new knowledge • Not designed to be replicable • Often not published

Testing MedlinePlus • User testing was done to evaluate changes made after heuristic analysis (chapter 13) • Goal of study was to identify usability problems in the revised interface • Categorization of information • Users’ navigation strategies

MedlinePlus: Testing Setup • 9 participants from the Washington, DC area • 7 were female, all had some web experience • 5 tasks were developed, from frequently-asked questions by web visitors • 5 scripts were created, one for each stage of the test • Testing was performed in a laboratory setting

MedlinePlus: Testing • Every participant worked through the 5 tasks • Their progress through the site and “thinking aloud” comments were recorded • After all tasks were completed, a post-test questionnaire was given, and participants were asked their opinions • Many different data items were collected

MedlinePlus: Data Analysis • Data was analyzed with a focus on: • Website organization • Browsing efficiency • Search features • Conclusions showed several areas for improvement, which were given to the developers through a presentation and report

MedlinePlus: Pros and Cons • Pros: • Appropriate size, qualities of user group • Adequate briefing material was created • Informed consent form • Cons: • Focused heavily on some user qualities (location, etc.) at the expense of others • Not much variance among tasks tested

Doing User Testing • Central idea: controlling the test conditions. • Careful planning is necessary! • Use DECIDE framework for successful study.

Determine the goals & Explore the questions • User testing is most suitable for testing prototypes. • Focus the study: • “Can users complete a certain task within a certain time, or find a particular item, etc.”

Choose a paradigm • Usability testing paradigm • Involves: • Recording data by combination of video and interaction logging. • User satisfaction questionnaires. • Interviews.

Identifying the practical issues: Design typical tasks • Quantitative measures are obtained. • Types of data produced: • Time to complete a task. • Time to complete a task after a specified time away from the product. • Number and type of errors per task. • Number of errors per unit of time. • Number of navigations to online help or manuals. • Number of users making a particular error. • Number of users completing a task successfully.
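The measures listed above can all be derived from a simple per-task record. The following is a minimal sketch, not a prescribed tool: the `TaskRecord` fields, the `summarize` function, and the sample values are all hypothetical and only illustrate how the counts and rates relate to each other.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    task: str
    seconds: float      # time spent on the task
    errors: int         # number of errors observed
    help_visits: int    # navigations to online help or manuals
    completed: bool     # whether the task was finished successfully


def summarize(records):
    """Compute the quantitative measures for a list of attempts at one task."""
    n = len(records)
    total_time = sum(r.seconds for r in records)
    total_errors = sum(r.errors for r in records)
    return {
        "mean_time_s": total_time / n,
        "errors_per_task": total_errors / n,
        "errors_per_minute": total_errors / (total_time / 60),
        "help_visits": sum(r.help_visits for r in records),
        "users_with_errors": len({r.participant for r in records if r.errors > 0}),
        "users_completing": len({r.participant for r in records if r.completed}),
    }


# Made-up data for three participants attempting the same task.
records = [
    TaskRecord("P1", "find-article", 60.0, 1, 0, True),
    TaskRecord("P2", "find-article", 90.0, 0, 1, True),
    TaskRecord("P3", "find-article", 150.0, 3, 2, False),
]
summary = summarize(records)
print(summary)
```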

Usability Engineering Specifications • Current level of performance. • Minimum acceptable level of performance. • Target level of performance.
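One way to use the three levels is to rate each measured result against them. This is only a sketch under stated assumptions: the measure is a task time (lower is better), and the function names and thresholds are hypothetical.

```python
def rate_time(measured_s, target_s, minimum_acceptable_s):
    """Rate a measured task time against a specification (lower is better)."""
    if target_s > minimum_acceptable_s:
        raise ValueError("target must be at least as strict as the minimum acceptable level")
    if measured_s <= target_s:
        return "meets target"
    if measured_s <= minimum_acceptable_s:
        return "acceptable"
    return "below minimum acceptable"


def improvement_over_current(measured_s, current_s):
    """Fraction by which the measured time improves on the current level."""
    return (current_s - measured_s) / current_s


# Hypothetical specification: current 60 s, minimum acceptable 60 s, target 45 s.
print(rate_time(50.0, target_s=45.0, minimum_acceptable_s=60.0))
print(improvement_over_current(45.0, current_s=60.0))
```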

Identifying practical issues: Select typical users • Find the ‘typical user’. • Most important characteristic: • Previous experience with similar systems. • Ex. HutchWorld – targeted towards cancer patients.

Identifying practical issues: Prepare the testing conditions • Testing conditions include: • Controlled environment. • Observation room. • Viewing room. • Reception area.

Deal with ethical issues • Consent form. • Point out the presence of: • One-way mirrors • Video cameras • Use of interaction logging.

Evaluate, analyze, and present the data • Performance measures are recorded from video and transaction logs. • User tests involve a small number of participants. • Basic descriptive statistics enable the evaluators to compare performance on different prototypes or systems.
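As an illustration of the descriptive statistics mentioned above, the sketch below compares task times for two prototypes using Python's standard `statistics` module. The times and the names `prototype_a` and `prototype_b` are invented for the example; with so few participants, the numbers support comparison, not statistical inference.

```python
import statistics


def describe(times):
    """Basic descriptive statistics for a list of task times in seconds."""
    return {
        "n": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stdev": statistics.stdev(times),  # sample standard deviation
        "min": min(times),
        "max": max(times),
    }


# Made-up completion times (seconds) for five participants per prototype.
prototype_a = [62.0, 75.0, 58.0, 90.0, 70.0]
prototype_b = [48.0, 55.0, 60.0, 52.0, 50.0]

stats_a = describe(prototype_a)
stats_b = describe(prototype_b)
print("A:", stats_a)
print("B:", stats_b)
```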

Experimental Conditions • In order to test a hypothesis, the experimenter has to set up the experimental conditions and find ways to control other variables that could influence the test result. • Condition 1: read a screen of text in Helvetica • Condition 2: read a screen of text in Times New Roman • Control condition: read both on printed paper

Experimental Designs • Different participants - single group of participants is allocated randomly to the experimental conditions. • Same participants - all participants appear in both conditions. • Matched participants - participants are matched in pairs, e.g., based on expertise, gender.
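The three designs differ mainly in how participants are assigned to conditions. The sketch below shows one possible assignment procedure for each; the function names and participant labels are hypothetical, and rotating the order of conditions in the same-participants design (counterbalancing) is an added assumption, a common way to spread order effects across conditions.

```python
import random


def allocate_different(participants, conditions, seed=0):
    """Different participants: shuffle, then deal into equally sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, p in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(p)
    return groups


def order_same(participants, conditions):
    """Same participants: everyone does every condition, in a rotated order."""
    orders = {}
    for i, p in enumerate(participants):
        shift = i % len(conditions)
        orders[p] = conditions[shift:] + conditions[:shift]
    return orders


def allocate_matched(pairs, conditions, seed=0):
    """Matched participants: split each matched pair across two conditions."""
    rng = random.Random(seed)
    groups = {c: [] for c in conditions}
    for a, b in pairs:
        first, second = (a, b) if rng.random() < 0.5 else (b, a)
        groups[conditions[0]].append(first)
        groups[conditions[1]].append(second)
    return groups


people = ["P1", "P2", "P3", "P4", "P5", "P6"]
conds = ["Helvetica", "Times New Roman"]
different = allocate_different(people, conds)
same = order_same(people, conds)
matched = allocate_matched([("P1", "P2"), ("P3", "P4"), ("P5", "P6")], conds)
print(different, same, matched, sep="\n")
```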

Advantages & disadvantages

Other Issues • You need to determine where the experiment will take place as well as how the equipment will be set up. • You also need to determine how the participants will be introduced to the study and what scripts will be needed to standardize the procedure. • Pilot studies are useful to aid in this process

Menu Structure • Study was designed to determine whether breadth is preferable to depth in organizing navigable menus. • 3 Experimental Conditions: (top level, sub level, content level) • 8 x 8 x 8 (slowest) • 16 x 32 (fastest) • 32 x 16 (middle) • 19 experienced web users were given 8 random and unique searches for each condition. • Breadth IS preferable to Depth
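A quick arithmetic check helps explain why the three conditions are comparable. Assuming each number is the count of choices offered at one menu level, every structure leads to the same number of end items, so only the shape of the hierarchy varies:

```python
from math import prod

# Choices per level for each condition, as listed on the slide.
structures = {
    "8 x 8 x 8": (8, 8, 8),
    "16 x 32": (16, 32),
    "32 x 16": (32, 16),
}

# Depth is the number of levels; reachable items is the product of the choices.
reachable = {name: prod(levels) for name, levels in structures.items()}
for name, levels in structures.items():
    print(f"{name}: depth {len(levels)}, reachable items {reachable[name]}")
```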

Predictive Models • Provide a way of evaluating products or designs without directly involving users • Psychological models of users are used to test designs • Less expensive than user testing • Usefulness limited to systems with predictable tasks - e.g., telephone answering systems, mobiles, etc. • Based on expert behavior • GOMS is the most well-known predictive modeling technique in human-computer interaction along with its “daughter”, the keystroke level model

GOMS • Goals - the state the user wants to achieve e.g., find a website • Operators - the cognitive processes & physical actions performed to attain those goals, e.g., decide which search engine to use • Methods - the procedures for accomplishing the goals, e.g., drag mouse over field, type in keywords, press the go button • Selection rules - determine which method to select when there is more than one available
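The four GOMS components can be written down as plain data plus one rule. This is a toy sketch built on the slide's "find a website" example: the second method ("type-url"), its operators, and the `url_known` condition are invented to give the selection rule something to choose between.

```python
# Goal: the state the user wants to achieve.
GOAL = "find a website"

# Methods: named procedures, each a sequence of operators.
METHODS = {
    "search-engine": [
        "decide which search engine to use",
        "drag mouse over field",
        "type in keywords",
        "press the go button",
    ],
    # Hypothetical alternative method, not from the slide.
    "type-url": [
        "click the address bar",
        "type the URL",
        "press Enter",
    ],
}


def select_method(url_known):
    """Selection rule: pick a method when more than one is available."""
    return "type-url" if url_known else "search-engine"


chosen = select_method(url_known=False)
print(GOAL, "->", chosen, "->", METHODS[chosen])
```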

Keystroke Level Model GOMS has also been developed further into a quantitative model - the keystroke level model. This model allows predictions to be made about how long it takes an expert user to perform a task.

Response times for keystroke level operators
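A keystroke level analysis adds up a standard time for each operator in the task sequence. The sketch below assumes the commonly cited estimates from Card, Moran and Newell (K = 0.20 s for an average skilled typist, P = 1.10 s, H = 0.40 s, M = 1.35 s); these are stated here as assumptions and may differ from the values in the slide's own table. The task sequence is hypothetical, and a mouse click is counted as a K.

```python
# Assumed operator times in seconds (commonly cited estimates, not from the slide).
OPERATOR_SECONDS = {
    "K": 0.20,  # press a key or button (average skilled typist)
    "P": 1.10,  # point with a mouse at a target on the display
    "H": 0.40,  # home hands on the keyboard or other device
    "M": 1.35,  # mentally prepare
}


def klm_time(sequence, response_times=()):
    """Predicted expert, error-free time: sum of operator times plus system waits."""
    total = sum(OPERATOR_SECONDS[op] for op in sequence)
    return total + sum(response_times)


# Hypothetical task: think, reach for the mouse, point at a field, click it,
# think again, then type four characters.
sequence = ["M", "H", "P", "K", "M", "K", "K", "K", "K"]
predicted = klm_time(sequence)
print(f"Predicted time: {predicted:.2f} s")
```

Comparing two designs then amounts to writing out the operator sequence for the same task on each design and comparing the two totals.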

Benefits and Limitations • Benefits • Allows comparative analyses for different interfaces or computer systems • Example: Project Ernestine • Viewed 2 systems to determine which would be better • Findings showed that a new system would slow down production • Limitations • Not good for evaluations • Does not allow for errors to be modeled, only expert performance • Does not reflect multi-tasking (non-sequential procedures)

Fitts’ Law (Paul Fitts 1954) • The law predicts that the time to point at an object using a device is a function of the distance from the target object & the object’s size. • The further away & the smaller the object, the longer the time to locate it and point. • Useful for evaluating systems for which the time to locate an object is important such as handheld devices like mobile phones
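The relationship can be made concrete with one widely used formulation of the law, the Shannon form T = a + b · log2(D/W + 1), where D is the distance to the target and W is its width. This is a sketch under assumptions: other formulations exist, and the constants `a` and `b` below are placeholders, since in practice they are fitted empirically for each pointing device.

```python
import math


def fitts_time(distance, width, a=0.0, b=0.1):
    """Predicted pointing time (seconds) using T = a + b * log2(D/W + 1).

    a and b are placeholder constants; real values are fitted per device.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1)


# Same units for distance and width (for example, pixels).
near_large = fitts_time(distance=100, width=100)
far_small = fitts_time(distance=800, width=10)
print(f"near, large target: {near_large:.3f} s")
print(f"far, small target:  {far_small:.3f} s")
```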

Summary & Key Points • User testing is a central part of usability testing • Testing is done in controlled conditions • User testing is an adapted form of experimentation • Experiments aim to test hypotheses by manipulating certain variables while keeping others constant • The experimenter controls the independent variable(s) but not the dependent variable(s) • There are three types of experimental design: different-participants, same-participants, & matched-participants • GOMS, Keystroke level model, & Fitts’ Law predict expert, error-free performance • Predictive models are used to evaluate systems with predictable tasks such as telephones
