Evaluation CS2391 Lecture n+1: Robert Stevens http://img.cs.man.ac.uk/stevens
Introduction • You've gathered requirements, designed your system, built the artefact… but does it fulfil the user's requirements? • Basic usability • Basic evaluation • Evaluation styles • Design evaluation • Implementation evaluation
Usability Basics • Usability means allowing users to achieve a goal with efficiency, effectiveness and satisfaction • Utility is the functionality of a system • A system can have utility without usability, but not vice versa • Such a system is worthy, but unhelpful • We have paradigms of good usability, e.g. the GUI • We also need theory to know why something is usable • What we really want are principles to guide developers – engineering rather than craft
Execution and Evaluation • [Diagram: the interaction framework – User → articulation → Input → performance → System → presentation → Output → observation → User]
Execution & Evaluation (2) • Presentation: how the system renders its state and allows the user to evaluate that state and alterations to it • Observation: what the user notices of the presentation; can they see what they need to? • Articulation: the expression of a user's execution plan • Performance: the system's execution of that plan, the results of which are presented to the user (see the sketch below)
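The four translations are easiest to see as one pass around the cycle. Below is a minimal, hypothetical Python sketch – the post-code lookup and all function names are illustrative, not from the lecture – showing articulation, performance, presentation and observation in order:

```python
# A minimal sketch of the execution/evaluation cycle: the user articulates
# a goal as input, the system performs it, presents the result, and the
# user observes and evaluates whether the goal was met.
# All names and data here are invented for illustration.

def articulation(goal):
    """User translates a goal into the system's input language."""
    return {"command": "search", "post_code": goal}

def performance(user_input):
    """System executes the articulated plan."""
    records = {"M13 9PL": "Kilburn Building"}
    return records.get(user_input["post_code"], "no match")

def presentation(result):
    """System renders its state so the user can observe it."""
    return f"Result: {result}"

def observation(rendered):
    """User notices what the presentation shows and evaluates it."""
    return "no match" not in rendered

goal = "M13 9PL"
rendered = presentation(performance(articulation(goal)))
print(rendered, "- goal met?", observation(rendered))
```

A breakdown at any one of the four translations – an unclear input language, a hidden result, a missed message – is what the evaluation methods in the rest of the lecture try to detect.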
Usability Principles • Visibility of system status: the system should always keep users informed • Match between system and the real world: the system should speak the user's language • User control and freedom: functions chosen by mistake need a clear 'emergency exit' • Consistency and standards: avoid ambiguity • Error prevention • Recognition rather than recall • Flexibility and efficiency of use • Aesthetic and minimalist design • Help users recognise, diagnose and recover from errors • Help and documentation
What is Evaluation? • Do the design and implementation behave as we expect and fulfil the user's requirements? • Not just an add-on at the end! • Assess the design at various times during the life cycle • Assess implementation prototypes, alpha and beta versions • Evaluation saves time and money • There are many types of evaluation and the trick is to choose the appropriate one • The purpose is to uncover usability problems
Usability Thoughts • Recall and recognition • Making a system easier to use makes it more powerful • Humans can switch topics fast and think of more than one thing at once • A computer system should be able to do the same • Complex syntax often hides the task – we need directness of interaction
Styles of Evaluation • Design Evaluation • Cognitive walkthrough • Heuristic evaluation • Review-based evaluation • The use of models • Implementation Evaluation • Empirical • Observational • Query
Evaluation Styles (2) • It is cheaper to evaluate a design before the expense of implementation • Design evaluation tends not to involve end-users, except as consultants • Evaluation of an implementation does involve end-users • Design evaluation techniques can also be used to evaluate an implementation • The former are often paper-based and involve experts • The latter are time-consuming, difficult and expensive, and can involve large numbers of end-users
Types of User • Not all users are computer scientists • Different users have different needs • Remember managers, system administrators and trainers • Use end-users where possible and appropriate • It is important to have evaluation participants who are representative of the end-users • Balance between under-use and over-use: users need a reward for their time
Hawthorne Effect • Users like to please the evaluator • People respond well to having someone interested in them • Simply by evaluating an artefact, the experience of using that artefact improves • An investigation of light levels in factories showed the investigation itself was the most important factor • Not much can be done about it – just be aware of it
Goals of Evaluation • Does the system have the correct functionality? Does it match the user's task? • A clerk used to searching by post-code should be able to search by post-code • Can the functionality be used? What is the effect on the user? • What are the problems with the system? • The last is part of the other two, but it draws out the negative aspects
Laboratory Techniques • A usability lab: one-way mirror; video and audio recorders • Logging of the system • Lacks context; unnatural for end-users and makes natural collaborative work difficult • Does allow close study, particularly of a specialist task or a particular UI notion • Good for single-user tasks
Field Techniques • See the user in context • Allow a user to interact with all the people, objects and actions involved in a task • Collaborative work can take place • Noisy, difficult to record, etc. • Can lack the detail possible in the laboratory
Cognitive Walkthrough • Brings psychology theory into an informal and subjective walkthrough • Needs a design: not necessarily complete, but location and wording of interface elements are helpful • A description of the task: should be representative • A list of actions the user makes to perform the task • A description of the users and the experience expected of them • These are given to experts, who step through the actions and make an assessment of usability (see the sketch below) • Are the users performing the task described by the action? • Can the users see the object of interaction (button etc.)? • Can the user tell that it is the right action? • Once it is performed, does the user get appropriate feedback? • This is the end of the execution & evaluation cycle
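A walkthrough is usually recorded as one assessment per action against the questions above. The sketch below is a hypothetical way of structuring that record in Python; the field names and example actions are invented for illustration:

```python
# A minimal sketch of recording a cognitive walkthrough: for each action
# in the task, an expert answers the usability questions from the slide.
# Structure and data are illustrative, not prescribed by the lecture.

from dataclasses import dataclass

@dataclass
class WalkthroughStep:
    action: str                 # e.g. "Click the 'Search' button"
    right_task: bool            # is the user trying to achieve this effect?
    action_visible: bool        # can the user see the object of interaction?
    action_recognisable: bool   # can the user tell it is the right action?
    feedback_adequate: bool     # does the user get appropriate feedback?
    notes: str = ""

steps = [
    WalkthroughStep("Type post-code into the search field",
                    True, True, True, True),
    WalkthroughStep("Press the unlabelled magnifier icon",
                    True, True, False, True,
                    notes="Icon meaning unclear to novice clerks"),
]

# Any step with a 'False' answer is a candidate usability problem.
problems = [s for s in steps if not all(
    (s.right_task, s.action_visible, s.action_recognisable, s.feedback_adequate))]
for p in problems:
    print("Problem at:", p.action, "-", p.notes)
```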
Heuristic Evaluation • A set of heuristics (rules of thumb) developed by Jakob Nielsen and Rolf Molich • Each heuristic is used to critique an interface • A set of independent experts apply the heuristics • The problems found follow a Poisson model – about 5 experts find roughly 75% of the problems (see the sketch below) • Usability questions are used to guide and stimulate • Essentially a checklist
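The "about 5 experts" rule of thumb comes from modelling problem discovery as independent detections: if each evaluator finds a fraction λ of the problems, n evaluators are expected to find 1 - (1 - λ)^n of them (the Nielsen–Landauer curve). The sketch below assumes an illustrative λ = 0.25; the actual rate varies by system and by evaluator:

```python
# Expected proportion of usability problems found by n evaluators,
# assuming each independently finds a fraction lam of the problems.
# lam = 0.25 is an illustrative assumption, not a figure from the lecture.

lam = 0.25  # assumed per-evaluator detection rate

for n in range(1, 11):
    found = 1 - (1 - lam) ** n
    print(f"{n} evaluators: ~{found:.0%} of problems found")
```

With λ around 0.25, five evaluators find roughly three quarters of the problems, and the curve flattens quickly – which is why a handful of independent experts is usually considered enough.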
Review-Based Evaluation • Principles from the experimental psychology and HCI literature are used to provide evaluation criteria • E.g. menu design, naming of items, icon design, language design and memory attributes • Cheaper than performing the experiment yourself, but beware of the context in which a study was performed • Like all expert-based methods, it is all about stimulating basic questions to be asked • Try to ensure the independence of the experts • Performance should be assessed using rating scales and comment fields
Empirical Evaluation • Evaluating the implementation (Design Evaluation methods can also be used here) • Empirical studies concentrate on end-users, rather than experts • The controlled experiment technique • Measure some attribute, while controlling the other attributes of the system • Various experimental conditions, which differ only in the value of some variable • Independent (manipulated) and dependent (measured) variables • Differences in behaviour are attributed to the different values of the independent variable that provide the different conditions (interface style, pointing device, wording, etc.) • The dependent variable must be measurable in some way – speed, mouse clicks, satisfaction, etc. • Use both subjective and objective measures
Empirical Techniques (2) • A hypothesis is framed in terms of the variables: a change in the independent variable causes a change in the dependent variable • The experiment attempts to support this relationship • This is achieved by rejecting the null hypothesis, i.e. that there is no relationship between the variables • Use statistics to show that any differences seen are unlikely to have happened by chance • Experimental design: between groups and within groups (see the sketch below) • Between groups: subjects are assigned to experimental and control groups; the latter ensures it is the independent variable that counts • Each subject does only one condition, avoiding learning effects, but the design is prone to variation between subjects • Within groups: each subject performs in all conditions; vary the condition order to avoid learning effects
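As a concrete illustration of the two designs, the sketch below compares task-completion times under two interface conditions using SciPy; the numbers and condition names are invented. A between-groups design calls for an independent-samples t-test, a within-groups design for a paired t-test:

```python
# Task-completion times (seconds) under two interface conditions.
# Data are invented purely to illustrate the analysis.

from scipy import stats

menu_ui   = [41.2, 38.5, 45.0, 39.8, 43.1, 40.6]  # condition A times
search_ui = [33.4, 36.1, 31.9, 35.2, 34.7, 32.8]  # condition B times

# Between-groups design: different participants in each condition,
# so the two samples are independent.
t_between, p_between = stats.ttest_ind(menu_ui, search_ui)

# Within-groups design: the same six participants in both conditions,
# so the samples are paired by participant.
t_within, p_within = stats.ttest_rel(menu_ui, search_ui)

print(f"between groups: t={t_between:.2f}, p={p_between:.3f}")
print(f"within groups:  t={t_within:.2f}, p={p_within:.3f}")
```

A p-value below the chosen significance level (conventionally 0.05) lets us reject the null hypothesis that interface style makes no difference to completion time.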
Empirical Evaluation (3) • Good for evaluating individual design decisions: colour, dialogue, wording, etc. • Less good for overall usability – systems and humans are too complex for a controlled experiment • Difficult to design • Expensive in time, money and users
Observational Techniques • Think-aloud & co-operative evaluation • Observing the user's actions in the work context – the whole task • Usually pre-determined, representative tasks; users explain what they are doing (think aloud) • The experimenter interacts with the participant (subject) to elicit more information • Everything is recorded (notes, system log, audio, video) – see the sketch below • The protocols are then analysed • Post-experiment walkthrough
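In practice, "everything recorded" means a timestamped protocol that can later be lined up with the audio and video. The sketch below is a hypothetical, minimal system/observer log in Python; the file name, columns and events are invented for illustration:

```python
# A minimal sketch of the kind of log kept alongside audio/video and the
# experimenter's notes during a think-aloud session: every system event,
# user comment and observer note is timestamped so the protocols can be
# aligned and analysed afterwards. Format and contents are illustrative.

import csv
import time

def log_event(writer, source, event):
    writer.writerow([time.strftime("%H:%M:%S"), source, event])

with open("session_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["time", "source", "event"])
    log_event(writer, "system",   "search screen opened")
    log_event(writer, "user",     "thinks aloud: 'where do I type the post-code?'")
    log_event(writer, "system",   "search button clicked")
    log_event(writer, "observer", "participant hesitates over icon")
```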
Query-Based Techniques • Asking the user can be very informative • Simple, but highly subjective • Interviews and questionnaires (see earlier lectures) • Good for large numbers and high-level questions • Good for exploring alternative strategies, particularly in context • Less systematic, more subjective
Summary • Need to test the appropriateness of the functionality • Also that the functionality can be used • Efficiency, effectiveness and satisfaction • Evaluation of a design and its implementation • Choose your users with care • HCI: Dix, Finlay, Abowd & Beale; Chapter 11