The evolution of evaluation

Explore the history and changes in evaluation methods in Human-Computer Interaction (HCI), from the perspectives of engineers, computer scientists, experimental psychologists, cognitive scientists, HCI professionals, and CSCW researchers. This talk also considers evaluation for experience-focused HCI.

Presentation Transcript


  1. The evolution of evaluation Joseph ‘Jofish’ Kaye Microsoft Research, Cambridge Cornell University, Ithaca, NY jofish @ cornell.edu

  2. What is evaluation? Something you do at the end of a project to show it works… … so you can publish it. A tradition in a field A way of defining a field A process that changes over time A reason papers get rejected

  3. HCI Evaluation: Validity “Methods for establishing validity vary depending on the nature of the contribution. They may involve empirical work in the laboratory or the field, the description of rationales for design decisions and approaches, applications of analytical techniques, or ‘proof of concept’ system implementations” CHI 2007 Website

  4. So… How did we get to where we are today? Why did we end up with the system(s) we use today? How can our current approaches to evaluation deal with novel concepts of HCI, such as experience-focused (rather than task focused) HCI?

  5. Experience focused HCI (a question to think about during this talk) What does it mean when this is your evaluation method?

  6. A Brief History and plan for the talk • Evaluation by Engineers • Evaluation by Computer Scientists • Evaluation by Experimental Psychologists & Cognitive Scientists • Evaluation by HCI Professionals • Evaluation in CSCW • Evaluation for Experience

  7. A Brief History and plan for the talk • Evaluation by Engineers • Evaluation by Computer Scientists • Evaluation by Experimental Psychologists & Cognitive Scientists • Case study: Evaluation of Text Editors • Evaluation by HCI Professionals • Case Study: The Damaged Merchandise Debate • Evaluation in CSCW • Evaluation for Experience

  8. 3 Questions to ask about an era Who are the users? Who are the evaluators? What are the limiting factors?

  9. Evaluation by Engineers Users are engineers & mathematicians Evaluators are engineers The limiting factor is reliability

  10. Evaluation by Computer Scientists Users are programmers Evaluators are programmers The speed of the machine is the limiting factor

  11. Evaluation by Experimental Psychologists & Cognitive Scientists Users are users: the computer is a tool, not an end result Evaluators are cognitive scientists and experimental psychologists: they’re used to measuring things through experiment The limiting factor is what the human can do

  12. Evaluation by Experimental Psychologists & Cognitive Scientists Perceptual issues such as print legibility and motor issues arose in designing displays, keyboards and other input devices… [new interface developments] created opportunities for cognitive psychologists to contribute in such areas as motor learning, concept formation, semantic memory and action. In a sense, this marks the emergence of the distinct discipline of human-computer interaction. (Grudin 2006)

  13. Case Study: Text Editors Roberts & Moran, 1982, 1983. Their methodology for evaluating text editors had three criteria: objectivity thoroughness ease-of-use

  14. Case Study: Text Editors objectivity “implies that the methodology not be biased in favor of any particular editor’s conceptual structure” thoroughness “implies that multiple aspects of editor use be considered” ease-of-use (of the method, not the editor itself) “the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”

  15. Case Study: Text Editors objectivity “implies that the methodology not be biased in favor of any particular editor’s conceptual structure” thoroughness “implies that multiple aspects of editor use be considered” ease-of-use (of the method, not the editor itself) “the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”

  16. Case Study: Text Editors Text editors are the white rats of HCI Thomas Green, 1984, in Grudin, 1990.

  17. Case Study: Text Editors Text editors are the white rats of HCI Thomas Green, 1984, in Grudin, 1990. …which tells us more about HCI than it does about text editors.

  18. Evaluation by HCI Professionals Usability professionals They believe in expertise (e.g. Nielsen 1984) They’ve made a decision to focus on better results, regardless of whether those results are experimentally provable.

  19. Case Study: The Damaged Merchandise Debate

  20. Damaged Merchandise Setup Early eighties: usability evaluation methods (UEMs) - heuristics (Nielsen) - cognitive walkthrough - GOMS - …

  21. Damaged Merchandise Comparison Studies Jeffries, Miller, Wharton and Uyeda (1991) Karat, Campbell and Fiegel (1992) Nielsen (1992) Desurvire, Kondziela, and Atwood (1992) Nielsen and Phillips (1993)

  22. Damaged Merchandise Panel Wayne D. Gray, Panel at CHI’95 Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise

  23. Damaged Merchandise Paper Wayne D. Gray & Marilyn Salzman Special issue of HCI: Experimental Comparisons of Usability Evaluation Methods

  24. Damaged Merchandise Response Commentary on Damaged Merchandise Karat: experiment in context Jeffries & Miller: real-world Lund & McClelland: practical John: case studies Monk: broad questions Oviatt: field-wide science Mackay: triangulate Newman: simulation & modelling

  25. Damaged Merchandise What’s going on? Gray & Salzman, p19 There is a tradition in the human factors literature of providing advice to practitioners on issues related to, but not investigated in, an experiment. This tradition includes the clear and explicit separation of experiment-based claims from experience-based advice. Our complaint is not against experimenters who attempt to offer good advice… the advice may be understood as research findings rather than the researcher’s opinion.

  27. Damaged Merchandise Clash of Paradigms Experimental Psychologists & Cognitive Scientists (who believe in experimentation) vs. HCI Professionals (who believe in experience and expertise, even if ‘unprovable’) (and who were trying to present their work in the terms of the dominant paradigm of the field.)

  28. Evaluation in CSCW A story I’m not telling CSCW vs. HCI Not just groups, but philosophy (ideology!) Member-created, dynamic, not cognitive, modelable Follows failure of ‘workplace studies’ to characterize i.e. Plans and Situated Actions vs. The Psychology of Human-Computer Interaction

  29. Evaluation of Experience Focused HCI • A possibly emerging sub-field: • Gaver et al. • Isbister et al. • Höök et al. • Sengers et al. • Etc. • How to evaluate?

  30. Epistemology How does a field know what it knows? How does a field know that it knows it? Science: experiment… But literature? Anthropology? Sociology? Therapy? Art? Theatre? Design?

  31. Epistemology Formally: The aim of this work is to recognize the ways in which multiple epistemologies, not just the experimental paradigm of science, can and do inform the hybrid discipline of human-computer interaction.

  32. Shouts To My Homies Maria Håkansson Lars Erik Holmquist Alex Taylor & MS Research Phoebe Sengers & CEmCom Cornell S&TS Department Many discussions over the last year… and this one to come.
