Evaluation Chapter 9
Evaluation • A very significant aspect of HCI design that separates it from SE • To test usability and usefulness • Can be done in the lab and/or in the field • Evaluate the design (early) and the implementation (later) • Evaluate the implementation during design (formative) and after it has been deployed and used by customers (summative)
Goals • Assess the functionality and usefulness of the interactive system: • Does it match requirements specifications? • Does it match the expectations of designers and users? • Does it meet the stated performance goals? • Assess the effect of the interface on the user: usability • Identify specific problems with the system and its interface, and develop remedies.
Styles • Laboratory Studies • Advantages • specialized equipment can be used • environment can be controlled • Disadvantages • Cost • Unnatural/intimidating • Difficult to observe users in their “natural state” • Use when • collecting data in the actual environment is impractical • the environment needs to be controlled • experimental psychology techniques are to be used
Styles • Field Studies • Advantages • Observe users in their “natural” setting • Can observe effects of context on system use • For longitudinal studies that require several days/weeks/months • Disadvantages • Cost • Can’t control the environment: noise, interruptions, etc. • Use when • A longitudinal study is required • Data on actual use conditions is desired • Strict control of the experimental conditions is unimportant
Techniques • Cognitive Walkthrough • Heuristic Evaluation • Review-based Evaluation • Model-based Evaluation • Usability Evaluation • Observational Methods • Query Techniques • Experimental Evaluation
Cognitive Walkthrough • Like code inspection/walkthrough in SE • Evaluates the design, via a prototype, for how well it supports the user in learning to do the task through exploration • Usually performed by an expert who “walks through” the design to identify potential problems
Cognitive Walkthrough • Starts with the expert being provided with: • A prototype of the system • Task, action and user descriptions developed during the design • For each task identified in task analysis, CW considers the following: • What impact will interaction have on the user? • What cognitive processes are required? • What learning/interaction problems may occur?
Cognitive Walkthrough • For each action the user needs to carry out to accomplish a goal, the expert asks: • Can the user recognize the correct action that is required? • Is the action (i.e. how to carry it out) visible at the interface? • Is there a difference between its intended effect (what the user wants to happen) and its actual effect (what the system does)? • Is the user able to carry out the action successfully? • Can the user then successfully interpret the feedback provided by the system? • A negative answer to any of these indicates a potential usability problem • See the worked example in Section 11.4.1 of the text
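A walkthrough session ends up producing one yes/no answer (plus notes) per question, per action. As a minimal sketch of how those answers could be recorded and turned into a problem list (the class and field names here are illustrative assumptions, not anything prescribed by the text):

```python
from dataclasses import dataclass

@dataclass
class ActionRecord:
    """One action examined during a cognitive walkthrough (hypothetical structure)."""
    action: str                   # e.g. "Press the 'timed record' button"
    recognizable: bool            # Will the user recognize this as the correct action?
    visible: bool                 # Is the action visible at the interface?
    effect_matches_intent: bool   # Does the actual effect match the intended effect?
    feedback_interpretable: bool  # Can the user interpret the system's feedback?
    notes: str = ""

def usability_problems(records: list[ActionRecord]) -> list[str]:
    """Any 'no' answer flags the action as a potential usability problem."""
    problems = []
    for r in records:
        if not all([r.recognizable, r.visible,
                    r.effect_matches_intent, r.feedback_interpretable]):
            problems.append(
                f"{r.action}: {r.notes or 'one or more walkthrough questions answered no'}")
    return problems
```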
Heuristic Evaluation • A set of usability criteria (called heuristics) is identified • see the list on p. 413 • E.g. • System behaves in a predictable way in response to all user actions • System behaves in a consistent way • System provides feedback for all correct and incorrect actions • Then the design and/or the prototype is examined by experts to see whether any of these are violated • This is called a “usability inspection technique”
Heuristic Evaluation • Select the heuristics from those proposed in the literature by experts like Jakob Nielsen • Develop system and task specific questions to verify if the design/prototype satisfies each selected heuristic • Have multiple experts independently evaluate the system/prototype using these questions
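Because the experts work independently, their findings have to be consolidated afterwards. One simple way to do this, sketched below with entirely made-up findings (the heuristic names and the merge rule are assumptions, not part of the method as defined in the text), is to merge the problem lists and note how many evaluators reported each problem:

```python
from collections import defaultdict

# Hypothetical findings: each expert returns (heuristic, problem description) pairs.
expert_findings = {
    "expert_1": [("consistency", "Delete icon differs between screens"),
                 ("feedback", "No confirmation after saving")],
    "expert_2": [("feedback", "No confirmation after saving")],
    "expert_3": [("predictability", "Back button sometimes closes the app")],
}

# Merge: count how many independent evaluators reported each problem.
reported_by = defaultdict(set)
for expert, findings in expert_findings.items():
    for heuristic, problem in findings:
        reported_by[(heuristic, problem)].add(expert)

# Problems reported by more evaluators are listed first.
for (heuristic, problem), experts in sorted(reported_by.items(),
                                            key=lambda kv: -len(kv[1])):
    print(f"[{heuristic}] {problem} -- reported by "
          f"{len(experts)} of {len(expert_findings)} evaluators")
```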
Other Evaluation Techniques • Review-based Evaluation • Review the technical HCI literature to see if similar designs/systems have been evaluated • Not commonly practiced • Model-based evaluation • An analytical approach in which models developed during the design process, such as ATN, GOMS, TDH, etc., are analyzed to discover potential problems • Usability Specification & Evaluation • Selecting specific usability attributes, setting target levels for them, and measuring them: the usability specification table (a small sketch follows below) • Already covered; read Chapter 8 of the reference text if you haven’t yet done so!
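As a reminder of what a usability specification table boils down to, here is a minimal sketch of checking measured values against worst-acceptable and planned target levels. The attributes, the numbers, and the assumption that lower values are always better are purely illustrative:

```python
# A minimal sketch of a usability specification check.
# Assumption: for every attribute listed here, lower is better (times, error counts).
spec = [
    # (attribute, worst acceptable, planned target, measured value)
    ("time to install (min)", 30, 10, 12),
    ("errors per task",        3,  1,  2),
]

for attribute, worst, planned, measured in spec:
    if measured <= planned:
        verdict = "meets planned target"
    elif measured <= worst:
        verdict = "acceptable, but below planned target"
    else:
        verdict = "fails the worst-acceptable level"
    print(f"{attribute}: measured={measured} ({verdict})")
```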
Observational Methods • A class of techniques called “protocol analyses” • Experimenter note-taking • cheap but limited • User notebooks • subjective, coarse-level data • but useful user insights • good for longitudinal studies • beepers/PDAs used for reminders • Audio-taping • may miss actions, gestures, etc. • transcription difficult
Protocol Analyses • Video-taping • more complete record • but special equipment needed • obtrusive • transcription difficult • Computer-logging • automatic and unobtrusive • but voluminous data • Some combination of these is typically used
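Computer-logging typically amounts to time-stamping every user event and appending it to a file for later analysis. A minimal, hypothetical sketch (the event names, fields, and file format are assumptions, not taken from the text):

```python
import json
import time

LOG_PATH = "interaction_log.jsonl"   # assumed file name

def log_event(event_type: str, **details) -> None:
    """Append one time-stamped interaction event as a JSON line."""
    record = {"t": time.time(), "event": event_type, **details}
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example: calls an instrumented interface might make.
log_event("menu_open", menu="File")
log_event("command", name="Save As", duration_ms=850)
log_event("error_dialog", message="Disk full")
```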
Concurrent Think-aloud Protocols • Method • User is observed while doing the task • User is asked to say aloud what they are doing, why, and what they are thinking/expecting to happen, etc. • Advantages • Simple technique • Can provide insights into the user’s cognitive processes • Can reveal causes of errors • Disadvantages • Highly subjective • Voluminous raw data • Talking may alter performance
Cooperative Protocols • Method • Variation of think-aloud in which the subject cooperates with the experimenter in asking and answering questions • Advantages • Advantages of think-aloud • Less constrained than think-aloud • User is encouraged to criticize the system and provide clarifications • Disadvantages • Disadvantages of think-aloud
Retrospective Protocols • Also called Post-task Walkthrough • Method • After doing the task, the user reflects on what happened. • User is asked questions to fill in details. • Sometimes combined with a during-task protocol collection • Advantages • Experimenter can focus on relevant incidents • Task interruption due to talking is avoided • Disadvantages • Memory limitations • Post-hoc interpretation of what happened is likely to be subjective
Query Techniques • Advantages: informal, cheap, simple • Main disadvantage: subjective • Techniques • Structured Interviews • The experimenter questions each user, after they have worked with the system, using a prepared set of questions. • Written/Oral • Advantages: • Different questions for different users and tasks • Issues can be explored fully • Provides significant user input • Disadvantages • Time consuming
Query Techniques • Questionnaires/Surveys • A fixed, typically multiple-choice, written questionnaire is given to users to fill out. • Careful design of questions and of data analysis methods is needed. • Advantages: • Quick, useful for large numbers of users • Data can be quantified and statistical analysis is possible • Provides significant user input • Disadvantages • Less flexible than interviews • Less deeply probing
Query Techniques • Questionnaires/Surveys • Careful design of questions and data analysis methods needed. • Question Styles: • General: to characterize the subject • Open-ended: to elicit opinions/suggestions • Scalar: Likert Scale • Multiple-choice • Ranked • Examples • See worked exercise p. 433 of text • See p. 228 of reference
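For scalar (Likert) items the responses are ordinal, so a per-question median and response distribution is usually a safer summary than a mean. A small sketch with invented questions and responses:

```python
from statistics import median
from collections import Counter

# Hypothetical 1-5 Likert responses (1 = strongly disagree, 5 = strongly agree).
responses = {
    "The system was easy to learn": [4, 5, 3, 4, 2, 5],
    "Error messages were helpful":  [2, 1, 3, 2, 2, 4],
}

for question, scores in responses.items():
    dist = Counter(scores)
    print(f"{question}: median={median(scores)}, "
          f"distribution={dict(sorted(dist.items()))}")
```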