
User Interface Evaluation in the Real World: A Comparison of Four Techniques


Presentation Transcript


  1. User Interface Evaluation in the Real World: A Comparison of Four Techniques By Ashley Karr, M.S., AUXP www.ashleykarr.com (c) 2013 Ashley Karr

  2. The Four Techniques Robin Jeffries, James R. Miller, Cathleen Wharton, Kathy M. Uyeda Hewlett-Packard Laboratories http://www.miramontes.com/writing/uievaluation/ January 1991 (Proceedings of CHI'91, New Orleans, April 28 - May 3, 1991.) (c) 2013 Ashley Karr

  3. Abstract • A software product user interface (UI) was evaluated prior to release by 4 groups, each using one of 4 interface evaluation techniques • Heuristic evaluation • Software guidelines • Cognitive walkthroughs • Usability testing • Results: • Heuristic evaluation by several UI specialists found the most serious problems with the least amount of effort • However, the heuristic evaluators also reported many low-priority problems • Advantages/disadvantages of each technique • Suggestions to improve the techniques (c) 2013 Ashley Karr

  4. Interface Evaluation Techniques Requiring UI Expertise • Heuristic Evaluation • UI specialists study the interface in depth • Look for properties that they know, from experience, will lead to usability problems • Usability Testing • Interface studied under real-world or controlled conditions • Evaluators gather data on problems that arise during use • Observes how well the interface supports the users' work environment (c) 2013 Ashley Karr

  5. Interface Evaluation Techniques Requiring UI Expertise • Heuristic & Usability Testing Limiting Factors • People with adequate UI experience are scarce • Techniques are difficult to apply before an interface exists • Recommendations come at a late stage in development, often too late for substantive changes • UI specialists may not be aware of the design’s technical limitations or why certain decisions were made • Technical and organizational gulfs can arise between the development team and the UI specialists • Usability testing is generally expensive and time-consuming (c) 2013 Ashley Karr

  6. Alternative Means of Evaluating Interfaces • Guidelines • Provide evaluators with specific recommendations about interface design • Cognitive walkthrough • Combines software walkthroughs with a cognitive model of learning by exploration • Interface developers walk through the interface in context of core tasks users must accomplish • Interface actions and feedback are compared to user goals and knowledge • Discrepancies between user expectations & steps required in interface noted (c) 2013 Ashley Karr

  7. Alternative Means of Evaluating Interfaces • Benefits • Interface developers can evaluate the interface • Potentially increase the number of people who can do evaluations • Avoid the limitations mentioned earlier • Limitations • Little is known about • How well they work, especially in comparison to one another • The types of interface problems they are best-suited to detect • Whether non-UI specialists can actually use them • Their costs and benefits (c) 2013 Ashley Karr

  8. The Experiment • Studied the effects of interface evaluation type on the number, severity, benefit/cost ratio, & content of the interface problems found • Between-subjects design • 1 IV: Evaluation type • 4 levels • Heuristic Evaluation • Used researchers in HP Labs who perform heuristic evaluations • Usability Test • Used HP’s usability tests for this product • Software Guidelines • Cognitive Walkthrough (c) 2013 Ashley Karr

  9. Results – Data Refinement • DV: Reported problems found in the interface • Reported on a common form • Numbers and kinds of problems detected by the different groups noted • Results later compared by raters • 268 problem report forms • 4 categories of problems • Underlying system • Problems caused by conventions or requirements of one of the systems HP-VUE is built on: UNIX, X Windows, and Motif • Evaluator errors • Misunderstandings on the part of the evaluator • Non-repeatable/system-dependent • Problems that could not be reproduced or were due to aspects of a particular hardware configuration • Other • Reports that did not refer to usability defects (c) 2013 Ashley Karr

  10. The Interface • Beta-test of HP-VUE • Visual interface to the UNIX operating system • Provides graphical tools for • Manipulating files • Starting and stopping applications • Requesting and browsing help • Controlling the appearance of the screen, etc. (c) 2013 Ashley Karr

  11. Total Problems Found by Problem Type & Evaluation Technique (c) 2013 Ashley Karr

  12. Results – Problem Identification • More than 50% of the total problems were found by the heuristic evaluators • All problems found by the heuristic evaluators were found by the heuristic evaluation itself, not as a side effect • Few problems were found by side effect during cognitive walkthrough and usability testing • Problems found by guidelines fell equally into all 3 categories • May indicate that the guidelines-based approach is valuable • Forces a careful examination of the interface • The large number of problems found by the guidelines evaluators may be because 2 of the 3 evaluators had worked with HP-VUE prior to the evaluation (c) 2013 Ashley Karr

  13. Core Problems Found by Technique & How Problem Found (c) 2013 Ashley Karr

  14. Results – Severity Analysis • 7 raters • 4 UI specialists & 3 people with moderate HCI experience • 206 core problems rated • Scale: 1 (trivial) to 9 (critical), considering • Impact of the problem • Frequency with which it would be encountered • Relative number of users affected • Mean ratings varied significantly across techniques • F(3,18) = 5.86, p < .01 • Problems ordered by mean rated severity & split into thirds (see the sketch below) • Most severe: 3.86 or more • Least severe: 2.86 or less • Note: 1/3 of the most severe problems can be credited to the heuristic evaluators, but so can 2/3 of the least severe (c) 2013 Ashley Karr
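
A minimal sketch in Python of how a severity split like this could be computed. The problem IDs and ratings below are made up for illustration; only the 1–9 scale and the tercile cut points come from the slide.

```python
from statistics import mean

# Hypothetical ratings from the 7 raters for a few made-up problems
# (scale: 1 = trivial, 9 = critical). Not the study's data.
ratings = {
    "P01": [5, 6, 4, 7, 5, 6, 5],
    "P02": [2, 3, 2, 1, 3, 2, 2],
    "P03": [3, 4, 3, 3, 4, 3, 3],
}

def severity_band(mean_rating: float) -> str:
    """Assign a problem to a severity third using the slide's cut points."""
    if mean_rating >= 3.86:
        return "most severe third"
    if mean_rating <= 2.86:
        return "least severe third"
    return "middle third"

for pid, scores in ratings.items():
    m = mean(scores)
    print(f"{pid}: mean severity {m:.2f} -> {severity_band(m)}")
```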

  15. Number of Problems Found By Technique & Severity (c) 2013 Ashley Karr

  16. Results – Benefit/Cost Analysis • Value = summed severity scores of the core problems from each evaluation • Cost = # of person-hours spent by the evaluators for each technique • 3 parts • Time spent on the analysis itself • Time spent learning the technique • Time spent becoming familiar with HP-VUE • Computations (sketched below) • 1st set of ratios = summed severity / sum of all times noted • Heuristic evaluation has a 4-to-1 advantage • 2nd set of ratios = summed severity / time, with HP-VUE familiarization time excluded for the cognitive walkthrough & guidelines groups • Heuristic evaluation still has a 2-to-1 advantage (c) 2013 Ashley Karr
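
A minimal sketch of the two benefit/cost computations, assuming hypothetical severity sums and person-hour figures (the real values are reported in the paper and are not reproduced here).

```python
# Hypothetical inputs; only the structure of the computation follows the slide.
summed_severity = {"heuristic": 150.0, "usability test": 120.0,
                   "guidelines": 80.0, "cognitive walkthrough": 70.0}
hours = {  # (analysis, learning the technique, HP-VUE familiarization)
    "heuristic": (20, 0, 0),
    "usability test": (120, 0, 0),
    "guidelines": (15, 5, 10),
    "cognitive walkthrough": (25, 10, 10),
}

for technique, severity in summed_severity.items():
    analysis, learning, familiarization = hours[technique]
    total = analysis + learning + familiarization
    ratio_all = severity / total                            # 1st set of ratios
    ratio_adjusted = severity / (total - familiarization)   # 2nd set: familiarization excluded
    print(f"{technique}: {ratio_all:.2f} (all time), {ratio_adjusted:.2f} (excl. familiarization)")
```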

  17. Benefit/Cost Ratios (c) 2013 Ashley Karr

  18. Results – Content Analysis • 3 analyses carried out to understand content of problem reports • Consistency: Did the problem claim that an aspect of HP-VUE was in conflict with some other portion of the system? • 25% of the problems raised consistency issues • 6% of the problems identified by usability testing were consistency problems • Recurring: Is this problem one that only interferes with the interaction the first time it is encountered, or is it always a problem? • 70% found by guidelines and usability testing were recurring • 50% found by heuristic evaluation and cognitive walkthroughs were recurring • General: Did this problem point out a general flaw that affects several parts of the interface, or was the problem specific to a single part? • 40% overall were general • 60% found by guidelines were general • Usability testing found equal numbers of both types • Heuristic evaluation and cognitive walkthroughs found a greater number of specific problems than general (c) 2013 Ashley Karr
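
A minimal sketch of how a content analysis like this could be tallied. The report records below are invented; only the three dimensions (consistency, recurring, general) come from the slide.

```python
from collections import defaultdict

# (technique, raises_consistency_issue, is_recurring, is_general) -- hypothetical records
reports = [
    ("heuristic evaluation", True, False, False),
    ("heuristic evaluation", False, True, False),
    ("usability testing", False, True, True),
    ("guidelines", True, True, True),
    ("cognitive walkthrough", False, False, False),
]

totals = defaultdict(lambda: {"n": 0, "consistency": 0, "recurring": 0, "general": 0})
for technique, consistency, recurring, general in reports:
    t = totals[technique]
    t["n"] += 1
    t["consistency"] += consistency
    t["recurring"] += recurring
    t["general"] += general

for technique, t in totals.items():
    percentages = {k: round(100 * t[k] / t["n"]) for k in ("consistency", "recurring", "general")}
    print(technique, percentages)
```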

  19. Discussion • Heuristic evaluation produced the best overall results • +Found the most problems • +Found the most serious problems • +Had the lowest cost • -UI specialists are scarce & more than one is needed to replicate the results of this study • -No individual heuristic evaluator found more than 42 core problems • -Large number of specific, one-time, and low-priority problems found and reported • Usability testing • +Found serious problems • +Good at finding recurring and general problems • +Avoided low-priority problems • -Most expensive • -Failed to find many serious problems • Guidelines evaluation • +Best at finding recurring and general problems • +A well-designed set of guidelines is beneficial • +Provides focus • +Evaluators take a broad look at the interface • +Developers can also follow guidelines • -Missed many severe problems • Cognitive walkthrough • Roughly comparable in performance to guidelines • Problems found were less general & less recurring than those found by other techniques (c) 2013 Ashley Karr

  20. Discussion • Guidelines & cognitive walkthroughs • Can be used by software engineers to identify usability problems when UI specialists are not available • Heuristic evaluation & usability testing • Great advantages over the other techniques • Draw much of their strength from the skilled UI professionals who use them • Find the most severe problems • Can run the opposite risk of finding too many irrelevant “problems” • To decide which technique to use, consider • Goals of the evaluation • Kinds of insights sought • Resources available (c) 2013 Ashley Karr

  21. Summary of Findings (c) 2013 Ashley Karr

  22. Discussion Questions? (c) 2013 Ashley Karr

  23. The Techniques & The Evaluators • Heuristic Evaluation • 4 heuristic evaluators • Members of an HCI research group • Backgrounds in behavioral science and experience in providing usability feedback to product groups • Technique • 2-week period for the evaluation (evaluators had other job-related tasks) • Spent whatever amount of time they chose within that period • Reported the time spent at the conclusion of their evaluation • Usability Tests • Conducted by a human factors professional for whom product usability testing is a regular part of the job • Six subjects took part in the test • Regular PC users not familiar with UNIX • Spent about three hours learning HP-VUE • 2 hours doing a set of 10 user tasks defined by the usability testing team (c) 2013 Ashley Karr

  24. The Techniques & The Evaluators • Guidelines and Cognitive Walkthroughs • Could not use the ACTUAL developers of HP-VUE • Used teams of 3 software engineers • Researchers at HP Laboratories with product experience • Substantial familiarity with UNIX and X Windows (the computational platform for HP-VUE) • Each had designed and implemented at least one graphical UI • All of the evaluators spent time familiarizing themselves with HP-VUE before doing the evaluation (c) 2013 Ashley Karr

  25. The Techniques & The Evaluators • Guidelines • Used a set of 62 internal HP-developed guidelines • Based on established human factors principles and sources • Can be applied across a wide range of computer and instrument systems • Meant to be used by software developers and evaluators. (c) 2013 Ashley Karr

  26. The Techniques & The Evaluators • Cognitive Walkthrough • Task-based method • The experimenters selected the walkthrough tasks and provided them to the evaluators • Pilot cognitive walkthrough experiment • Refined procedure and tasks prior to the actual experiment (c) 2013 Ashley Karr

  27. The Problem Report Form • Special problem report form • Standardized the reporting of UI problems across techniques • Every evaluator/team reported every usability problem on the form • The usability test evaluator and each heuristic evaluator submitted separate forms • The guidelines and cognitive walkthrough groups submitted team reports • A usability problem was defined as “anything that impacts ease of use” • Evaluators were asked to briefly describe the problem • Encouraged to report problems found even if the technique being used did not lead to the problem • Noted how they found the problem (see the form sketch below) (c) 2013 Ashley Karr
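
A minimal sketch of a data structure for such a report. The field names are assumptions based on the bullets above, not the actual HP form.

```python
from dataclasses import dataclass

@dataclass
class ProblemReport:
    evaluator: str    # individual evaluator or team submitting the form
    technique: str    # heuristic evaluation, usability test, guidelines, or cognitive walkthrough
    description: str  # brief description of the problem ("anything that impacts ease of use")
    how_found: str    # how the problem was found, e.g. via the technique itself or as a side effect

# Hypothetical example entry
example = ProblemReport(
    evaluator="heuristic evaluator 2",
    technique="heuristic evaluation",
    description="Icon label is ambiguous (made-up example)",
    how_found="by applying the technique",
)
print(example)
```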

  28. Results – Data Refinement • 3 raters categorized the problem reports • Worked independently • Reconciled differences as a group • Overall, 45 (17%) of the problems were eliminated • Spread approximately equally across groups • 223 core problems remained • These addressed usability issues in HP-VUE • The 3 raters then looked for duplicate problems within evaluation groups • Conflicts were resolved in conference • 17 sets of within-group duplicates were found • The analysis that follows is based on the 206 problems that remained after categorization, reconciliation, and duplicate removal (see the sketch below) (c) 2013 Ashley Karr
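
A minimal sketch of this refinement pipeline, using invented reports and assuming that the four categories listed on slide 9 are the ones eliminated; only the sequence (categorize, drop non-core reports, collapse within-group duplicates) follows the slides.

```python
# Categories assumed to be set aside during refinement (see slide 9);
# everything else counts as a core usability problem in HP-VUE.
NON_CORE = {"underlying system", "evaluator error",
            "non-repeatable/system-dependent", "other"}

# (evaluation group, category, problem key agreed on by the raters) -- hypothetical
reports = [
    ("heuristic", "core", "ambiguous icon label"),
    ("heuristic", "core", "ambiguous icon label"),        # within-group duplicate
    ("guidelines", "evaluator error", "misread dialog"),  # eliminated
    ("usability test", "core", "help search too slow"),
]

core = [r for r in reports if r[1] not in NON_CORE]
deduplicated = {(group, key) for group, _, key in core}   # one entry per group + problem
print(f"{len(reports)} reports -> {len(core)} core -> {len(deduplicated)} after de-duplication")
```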
