
The Evolution of Evaluation

The Evolution of Evaluation. CHI 2007 alt.chi, 30 April 2007. Joseph ‘Jofish’ Kaye and Phoebe Sengers, Cornell University, Ithaca NY. jofish@cornell.edu, sengers@cs.cornell.edu.


Presentation Transcript


  1. The Evolution of Evaluation CHI 2007 alt.chi 30 April 2007 Joseph ‘Jofish’ Kaye Phoebe Sengers Cornell University, Ithaca NY jofish@cornell.edu sengers@cs.cornell.edu

  2. What is evaluation? • Part of the practice of HCI • Part of the design-build-evaluate iterative design cycle • A comparison of ‘built’ to ‘planned’ • A place to reflect on both this and the next design • And…. • A way of defining a field • The space where a discipline validates the knowledge it creates.

  3. What is evaluation? • Something you do at the end of a project to show it works… • … so you can publish it. • A reason papers get rejected Which, again, are other ways of saying: • A way of defining a field • The space where a discipline validates the knowledge it creates.

  4. HCI Evaluation: Validity “Methods for establishing validity vary depending on the nature of the contribution. They may involve empirical work in the laboratory or the field, the description of rationales for design decisions and approaches, applications of analytical techniques, or ‘proof of concept’ system implementations” CHI 2007 Website

  5. So… How and why did we end up with the system(s) we use for HCI evaluation today? How can our current approaches to evaluation deal with novel concepts of HCI, such as third-wave/paradigm or experience-focused (rather than task focused) HCI? And in particular…

  6. The Virtual Intimate Object (VIO) • A device for couples in long distance relationships to communicate intimacy • When one partner clicks, the other’s circle lights up, and then fades over time. www.intimateobjects.org Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.

  7. Evaluation of the VIO • It’s about the experience; it’s not about the task • How can we measure intimacy and the transmission thereof? Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005. Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.

  8. Understanding how we got to where we are today • Evaluation by Engineers • Evaluation by Computer Scientists • Evaluation by Experimental Psychologists & Cognitive Scientists • Evaluation by HCI Professionals • Evaluation for Experience

  9. (with case studies) • Evaluation by Engineers • Evaluation by Computer Scientists • Evaluation by Experimental Psychologists & Cognitive Scientists • Evaluation of Text Editors • Evaluation by HCI Professionals • Damaged Merchandise • Evaluation for Experience

  10. Why does evaluation evolve? Evolution is adaptation to fit changing conditions. What changes? • Who are the users? • Who are the evaluators? • What are the limiting factors? p.s. note historical chunking and simplification

  11. Evaluation by Engineers Users are engineers & mathematicians Evaluators are engineers The limiting factor is reliability

  12. Evaluation by Computer Scientists Users are programmers Evaluators are programmers The speed of the machine is the limiting factor

  13. Evaluation by Computer Scientists • First uses of… • Human-computer interaction • “It seems that when a system encourages close human-computer interaction, it also encourages close human-human and human-computer-human interaction” (Schwartz 1965) • Computer-human interaction • “PLANIT A Flexible Language Designed for Computer-Human Interaction” (Feingold 1967)

  14. Evaluation by Experimental Psychologists & Cognitive Scientists Users are users: the computer is a tool; often in offices. Evaluators are cognitive scientists and experimental psychologists: they’re used to measuring things through experiment The limiting factor is what the human can do

  15. Case Study of ExPsych / CogSci Evaluation: Text Editors Roberts & Moran, 1982, 1983. Their methodology for evaluating text editors had three criteria: objectivity thoroughness ease-of-use

  16. Case Study: Text Editors objectivity “implies that the methodology not be biased in favor of any particular editor’s conceptual structure” thoroughness “implies that multiple aspects of editor use be considered” ease-of-use (of the method, not the editor itself) “the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”

  17. Case Study: Text Editors objectivity “implies that the methodology not be biased in favor of any particular editor’s conceptual structure” thoroughness “implies that multiple aspects of editor use be considered” ease-of-use (of the method, not the editor itself) “the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”

  18. Case Study: Text Editors Text editors are the white rats of HCI Thomas Green, 1984, in Grudin, 1990.

  19. Evaluation by HCI Professionals They believe in expertise over experiment (Nielsen 1984) They’ve decided to focus on better results, regardless of whether those results were experimentally provable.

  20. Evaluation by HCI Professionals Evaluators are usability professionals (often with Exp.Psych/CogSci backgrounds) Users are (often) white collar, using computers to accomplish their jobs The limiting factor is the time of the worker accomplishing their job

  21. Case Study: The Damaged Merchandise Debate

  22. Damaged Merchandise Setup Early eighties: usability evaluation methods (UEMs) - heuristics (Nielsen) - cognitive walkthrough - GOMS - …

  23. Damaged Merchandise Comparison Studies Jeffries, Miller, Wharton and Uyeda (1991) Karat, Campbell and Fiegel (1992) Nielsen (1992) Desurvire, Kondziela, and Atwood (1992) Nielsen and Phillips (1993)

  24. Damaged Merchandise Panel Wayne D. Gray, Panel at CHI’95 Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise

  25. Damaged Merchandise Paper Wayne D. Gray & Marilyn Salzman Special issue of HCI: Experimental Comparisons of Usability Evaluation Methods

  26. Damaged Merchandise Response Commentary on Damaged Merchandise Karat: experiment in context Jeffries & Miller: real-world Lund & McClelland: practical John: case studies Monk: broad questions Oviatt: field-wide science MacKay: triangulate Newman: simulation & modelling

  27. Damaged Merchandise Clash of Paradigms Experimental Psychologists & Cognitive Scientists (who believe in experimentation) vs. HCI Professionals (who believe in experience and expertise, even if ‘unprovable’) (and who were trying to present their work in the terms of the dominant paradigm of the field.) Kuhn (1962) The Structure of Scientific Revolutions

  28. Damaged Merchandise Clash of Paradigms In this particular work, we’re not talking about who’s right It’s about recognizing what paradigm clashes look like in HCI It’s about the need to present work in the terms of the dominant paradigm of the field It’s thinking about how to recognize and re-think our own approaches to knowing and doing HCI: an HCI that recognizes how it knows what it knows

  29. Experience Focused HCI A possibly emerging sub-field, drawing from traditions and disciplines outside the field Emphasis on the experience, not [just] the task Thinking about technology as more like… a car than a text editor Wright & McCarthy, Gaver, Blythe, Höök, Taylor & Swan, Bødker, Petersen, Isbister…

  30. Experience Focused HCI • For example… • How can you evaluate a car? • Why do you drive what you drive? • Grad-student-chic? • Eco-chic? • Machismo? Safety? Gay? Speed? • For users, ‘HCI’ is cultural as well as technological • We’ll fail if we evaluate purely on task

  31. Experience Focused HCI The users are people choosing to use technology for the joy of it, & to do what they want in everyday life. The evaluators are us… and ethnographers and designers and documentary filmmakers and writers and playwrights The limiting factor might be how to express oneself, how to be and be seen (or not)

  32. Why the evolution of evaluation matters New paradigms require new ways of knowing and new ways of evaluation Difficulties come when one paradigm tries to present work in the manner of another paradigm We need to actively recognize and call attention to when this happens, both as researchers and reviewers

  33. An evolving discussion SIG: Evaluation of Experience-focused HCI Thursday, 9am, Room C4 Joseph ‘Jofish’ Kaye jofish@cornell.edu (paper & talk at jofish.com) Phoebe Sengers sengers@cs.cornell.edu Research sponsored in part by the NSF and Microsoft Research Cambridge Thanks to the Culturally Embedded Computing Group, BostonCHI, Alex Taylor, Ken Wood, Richard Harper, Abi Sellen, Shahram Izadi, Lorna Brown & the CMLG, Microsoft Cambridge, Apala Lahiri Chavan & Eric Schaffer, HFI, CHI Bangalore, CHI Mumbai, BostonCHI, the Cornell S&TS Department, Maria Håkansson & IT University Göteborg, Louise Barkhuus, Barry Brown & University of Glasgow, Mark Blythe & University of York, Andy Warr & the Oxford E-Research Center, Susanne Bødker, Marianne Graves Petersen & The University of Aarhus, Terry Winograd, Wendy Ju, Scott Klemmer & The Stanford HCI Seminar, Jonathan Grudin, Liam Bannon, Gilbert Cockton, William Newman, Kirsten Boehner, Jeff Hancock, Bill Gaver, Janet Vertesi, Kia Höök, Jarmo Laaksolahti, Anna Ståhl, Helen Jeffries, Paul Dourish, Jen Rode, Peter Wright, Ryan Aipperspach, Bill Buxton, Michael Lynch, Seth ‘Beemer’ McGinnis & Katherine Isbister.
