Evaluating ubicomp applications in the field. Gregory D. Abowd, Distinguished Professor, School of Interactive Computing and GVU Center, Georgia Tech
Ubicomp evaluation in the field • Weiser (CACM 93): “Applications are of course the whole point of ubiquitous computing.” • Applications are about the real world • Design: finding appropriate ubicomp solutions for real-world problems. • Evaluation: Demonstrating that a solution works for its intended purpose. • But these activities are intertwined. Evaluation in the field, Gregory D. Abowd, Ubicomp 2011 Tutorial
Defining evaluation terms • Formative and Summative • Who is involved in evaluation • Empirical, Quantitative, and Qualitative Evidence • Approaches to evaluation
Formative and Summative Formative • assess a system being designed • gather input to inform design Summative • assess an existing system • Summary judgement of success criteria • Which to use? • Depends on • maturity of system • how evaluation results will be used • Same technique can be used for either
Who is involved with evaluation? • End users, or other stakeholders • The designers or HCI experts
Form of data gathered Empirical: based on evidence from real users Quantitative: objective measurement of behavior Qualitative: subjective recording of experience Mixed methods: a combination of quant and qual
Approach • Predictive modeling • Controlled experiment • Naturalistic study
Predictive Modeling • Try to predict usage before real users are involved • Conserve resources (quick & low cost) • Model based • Calculate properties of interaction • Fitts’ Law, Keystroke Level Model • Review based • HCI experts (not real users) interact with system, find potential problems, and give prescriptive feedback • Best if they: • Haven’t used earlier prototype • Familiar with domain or task • Understand user perspectives
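As an illustration of the model-based bullet, here is a minimal sketch of the two models named on the slide. It uses the Shannon formulation of Fitts' Law, MT = a + b * log2(D/W + 1), and a Keystroke Level Model estimate that sums per-operator times. The constants `a` and `b` and the operator durations are assumptions for illustration (the operator times are commonly cited textbook values); in practice they would be calibrated for the device and user population. The function names are hypothetical and do not come from the tutorial.

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time in seconds (Shannon formulation).

    distance: distance to the target centre; width: target width (same unit).
    a, b: device-specific constants; the defaults are placeholders, not measured values.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# Commonly cited KLM operator times in seconds (assumed here, not from the slides):
# K = keystroke, P = point with mouse, H = home hands on device, M = mental preparation.
KLM_OPERATOR_SECONDS = {"K": 0.20, "P": 1.10, "H": 0.40, "M": 1.35}

def klm_task_time(operators):
    """Sum operator times for a sequence such as 'MHPK'."""
    return sum(KLM_OPERATOR_SECONDS[op] for op in operators)

if __name__ == "__main__":
    # A far, small target should take longer than a near, large one.
    print(fitts_movement_time(distance=400, width=20))
    print(fitts_movement_time(distance=100, width=80))
    # Think, move hand to mouse, point, click (treated as one keystroke).
    print(klm_task_time("MHPK"))
```

Estimates like these are cheap to compute before any user is recruited, which is the point of predictive modeling, but they say nothing about use in a real setting.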
Controlled experimentation Lab studies, quantitative results • Typically in a closed, lab setting • Manipulate independent variables to see effect on dependent variables • Replicable • Expensive, requires real users and lab • Can use follow-up interviews for qualitative results
Naturalistic Study Or the field study • Observation occurs in “real life” setting • Watch process over time • “Ecologically valid” contends with controlled and “scientific” • What is observed can vary tremendously We will focus on this form of evaluation
Why is this so hard?
Some examples of ubicomp technologies in the field Historical overview
Examples of field studies • Xerox PARC • PARCTab • Liveboard • Georgia Tech • Classroom 2000 • Digital Family Portrait • Personal Audio Loop • Abaris, CareLog and BabySteps • Cellphone proximity • SMS for asthma
Xerox PARC • Computing at different scales • Inch • Foot • Yard • They were pretty successful at inch and yard scales with two very different approaches to evaluation
Xerox PARC inch scale • The PARCTab • Location-aware thin client • Deployed in CSL building • Built devices and programmable context-awareness • Gave it to community to see what they would produce
Xerox PARC yard scale • The Liveboard • Pen-based electronic whiteboard • Deployed in one meeting room • Designed solution for IP meetings • Supporting work of one individual
Georgia Tech Classroom 2000 • The Liveboard in an educational setting • Establishing theme of automated capture • Room takes notes on behalf of student • 4-year study of impact on teaching and learning experience. • The living laboratory
Georgia Tech Aware Home • What value is it to have a home that knows where its occupants are and what they are doing? • A living laboratory? • Great for feasibility studies and focus groups • Never anyone’s “home”
Georgia Tech Digital Family Portrait • Great example of a formative study done in the wild • Sensing replaced by phone calls • Similar to Intel Research CareNet Display
Georgia Tech Technology and Autism • Ubicomp got very personal for me in 2002 • Improving data collection
From whimsical to inspirational…Dec. 1998 Aidan, 18 months
From whimsical to inspirational…July 1999 Aidan, 26 months
Detailed Scoring Manual Calculations Hand Plotting
Abaris: Embedding Capture (Julie Kientz, Ph.D.) • Speech detection to timestamp beginning of trial • Record handwriting using Anoto digital pen to collect grades and timestamp end of trial • Leverages basic therapy protocol to minimize intrusion
Abaris: Embedding Access Julie Kientz
Abaris: Study • 4-month real-use deployment study • Case Study: Therapy team for one child • 52 therapy sessions (50+ hours of video) • 6 team meetings • Data collected • Video coding and analysis of team decisions during sampled meetings • Meetings without Abaris: 39 decision points across 3 meetings • Meetings with Abaris: 42 decision points across 3 meetings • Interviews with team members • Software logging of Abaris Full study details: Chapter 5 Julie A. Kientz, Georgia Tech
Results: Easier Data Capture • Therapists were able to learn to use the Abaris system with very quick training • Therapists spent less time processing paperwork
Results: Access to Data Percentage of decision points in which a given artifact was used. Oftentimes, therapists used multiple artifacts. * p < .01 ** p < .05 For full details, see Chapter 5
Results: Improving Collaboration p < .01 • Analysis of decision points in team meetings indicated an increase in collaboration • Interviews with therapists after meeting confirmed these numbers For full details, see Chapter 5
Gillian Hayes (with Gillian Hayes (GT), Juane Heflin (Georgia State), Cobb County Special Ed. and Behavior Imaging Solutions, Inc.) Collecting rich behavioral data in the unstructured natural environment Retroactively saving important video Conscious selection of relevant video episodes
Examples of field studies • Xerox PARC • PARCTab • Liveboard • Georgia Tech • Classroom 2000 • Digital Family Portrait • Personal Audio Loop • Abaris, CareLog and BabySteps • Cellphone proximity • SMS for asthma
Others • Intel Research • CareNet Display • Reno • UbiFit, UbiGarden • EQUATOR • Mixed reality games
Lessons from our ancestors • Evaluation takes time • The experience of ubicomp does not come overnight • The “abowd” unit of evaluation; nothing substitutes for real use • Bleeding edge technology means people must keep system afloat • Users cannot be expected to handle, or be bothered with, installation or maintenance (C2K)
More lessons • Sometimes you want to evaluate in the field without any working system • The idea of experience prototypes (or paratypes), or humans as sensors
3 Types of Field Studies (Ch. 4 by Brush) • Or, why do the study? • In somewhat chronological order: • Understand current behavior • Proof of concept • Experience using a prototype
Understanding Behavior Insight into current practice, baseline of behaviors • To inform new designs • To use as comparison at some later date • Brush and Inkpen (2007) • Patel et al. (2006)
Proof of Concept Bleeding-edge prototypes but not in the lab. • Context-Aware Power Management (2007) • TeamAwear (2007)
Experience Prolonged use that is not about feasibility of the technology, but about the impact on the everyday experience. • CareNet (2004) • Advanced User Resource Annotation
Study Design • Study Design • Data collection techniques • Surveys, interviews, field notes, logging/instrumentation, experience sampling, diaries • How long should the study be? • The “abowd” as a unit of time for a field study.
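To make one of the data collection techniques concrete, the following is a minimal sketch of a signal-contingent experience sampling schedule: prompts at random times inside a daily window, kept a minimum distance apart. The function name, the 9:00 to 21:00 window, the six prompts per day, and the 30-minute gap are all assumptions chosen for illustration, not parameters from the tutorial or from any particular study.

```python
import random
from datetime import datetime, timedelta

def esm_schedule(day, n_prompts=6, start_hour=9, end_hour=21,
                 min_gap_minutes=30, seed=None):
    """Return sorted prompt times for one day, at least min_gap_minutes apart."""
    rng = random.Random(seed)
    window_start = day.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    total_minutes = (end_hour - start_hour) * 60
    # Minutes left over once the mandatory gaps between prompts are reserved.
    slack = total_minutes - (n_prompts - 1) * min_gap_minutes
    if n_prompts < 1 or slack < 0:
        raise ValueError("window too short for the requested prompts and gap")
    # Draw sorted offsets inside the slack, then add the reserved gaps back,
    # which guarantees the minimum spacing without rejection sampling.
    offsets = sorted(rng.randint(0, slack) for _ in range(n_prompts))
    return [window_start + timedelta(minutes=o + i * min_gap_minutes)
            for i, o in enumerate(offsets)]

if __name__ == "__main__":
    for t in esm_schedule(datetime(2011, 9, 17), seed=1):
        print(t.strftime("%H:%M"))
```

Randomizing the times reduces the chance that participants anticipate a prompt, while the minimum gap limits how burdensome the study feels over a long deployment.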
Participants • Ethics • Selection of the right participants • Number of participants • Compensation
Analysis of data • Quantitative data • Relevant statistical methods based on data collection • Qualitative data • Often unstructured and must be processed (or coded) to be understood and analyzed further
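When qualitative data is coded by more than one person, a common follow-up step is to check how well the coders agree before analyzing the codes further. The sketch below computes Cohen's kappa for two coders who each assigned one label per segment. It is offered as one possible check, not as the method used in any study mentioned here, and the function name and labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders' labels.

    coder_a, coder_b: equal-length sequences, one label per coded segment.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Proportion of segments on which the coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    a = ["barrier", "workaround", "barrier", "praise", "workaround"]
    b = ["barrier", "workaround", "praise", "praise", "workaround"]
    print(cohens_kappa(a, b))
```

A value near 1 suggests the coding scheme is being applied consistently; a value near 0 suggests agreement no better than chance, which usually means the code definitions need refining.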
Let’s reflect on the other readings… • Provide summary based on: • Type of study (Brush taxonomy) • What was its purpose? • How study was designed • Data to be collected, conditions of study, number and kind of participants, length of study • How results were analyzed • Did the study meet its purpose?