Controlled User studies HCI - 4163/6610 Winter 2013
Usability Experiments • Predict the relationship between two or more variables. • Independent variable is manipulated by the researcher. • Dependent variable depends on the independent variable. • Typical experimental designs have one or two independent variables. • Validated statistically & replicable.
True Experiment • Experimental control • Control as many potential threats to validity as possible • Random assignment of participants/data to conditions • Could be within-subjects or between-subjects
Control • True experiment = complete control over the subject assignment to conditions and the presentation of conditions to subjects • Control over the who, what, when, where, how • Control of the who => random assignment to conditions • Only by chance can other variables be confounded with IV • Control of the what/when/where/how => control over the way the experiment is conducted
Quasi-Experiment • When you can’t achieve complete control • Lack of complete control over conditions • Subjects for different conditions come from potentially non-random pre-existing groups (smokers vs nonsmokers)
It’s a matter of control
True Experiment • Random assignment of subjects to condition • Manipulate the IV • Control allows ruling out of alternative hypotheses
Quasi-Experiment • Selection of subjects for the conditions • Observe categories of subjects • If the subject variable is the IV, it’s a quasi-experiment • Don’t know whether differences are caused by the IV or differences in the subjects
Other features • In some instances cannot completely control the what, when, where, and how • Need to collect data at a certain time or not at all • Practical limitations to data collection, experimental protocol
Validity • In a quasi-experiment, internal validity is reduced due to the presence of uncontrolled/confounded variables • But the study is not necessarily invalid • It’s important for the researcher to evaluate the likelihood that there are alternative hypotheses for observed differences • Need to convince self and audience of the validity
External validity • If the experimental setting more closely replicates the setting of interest, external validity can be higher than that of a true experiment run in a controlled lab setting • Often comes down to what is most important for the research question • Control or ecological validity?
Terminology • Factors: Independent Variables (IVs) of an experiment • Level: particular value of an IV • Condition: a group or treatment (technique) • e.g., Condition 1: old system, Condition 2: new system • Treatment: a condition of an experiment • Subject: participant (can also think more broadly of data sets that are ‘subjected’ to a treatment)
Factors to Treatments • At least 1 Factor (IV) has to vary to have an experiment • e.g., effect of screen size and input technique on performance (speed, accuracy) • An IV must always have at least 2 levels • Condition refers to a particular way that subjects are treated • Between subjects: experimental conditions are the same as the groups • Within subjects: only 1 group, which experiences every condition (can be many conditions in an experiment)
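As a rough illustration of how factors and levels turn into conditions, the sketch below crosses every level of every factor. The factor names come from the slide's example; the specific levels ("small", "large", "touch", and so on) are made up for illustration.

```python
from itertools import product

# Hypothetical levels for the two example factors (IVs); not from the slides.
factors = {
    "screen_size": ["small", "large"],
    "input_technique": ["touch", "stylus", "mouse"],
}

def enumerate_conditions(factors):
    """Cross every level of every factor to list the experimental conditions."""
    for name, levels in factors.items():
        if len(levels) < 2:
            # An IV with a single level does not vary, so it is not a factor.
            raise ValueError(f"factor {name!r} needs at least 2 levels")
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

all_conditions = enumerate_conditions(factors)
for condition in all_conditions:
    print(condition)
```

With 2 levels of one factor and 3 of the other, a fully crossed design has 2 x 3 = 6 conditions.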
Good Experimental Design • Two-Group, Post-Test Design • Two conditions • Two groups: • Between subjects: random allocation • Treatment • Post-test: measure the DV • What’s really important?
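One possible way to analyse a two-group, post-test design is a permutation test on the difference in group means. This is a technique chosen here for illustration, not one named in the slides, and the completion times below are invented.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_resamples=10_000, seed=None):
    """Two-sided permutation test on the difference in means of the DV."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Re-deal the pooled scores into two groups, as random allocation would.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (extreme + 1) / (n_resamples + 1)

# Made-up post-test measurements (task completion time in seconds).
old_system = [41.2, 38.5, 44.0, 39.9, 42.7, 40.3]
new_system = [35.1, 36.8, 33.9, 37.2, 34.5, 36.0]
difference, p_value = permutation_test(old_system, new_system, seed=1)
print(f"mean difference = {difference:.2f} s, p = {p_value:.4f}")
```

The test leans on random allocation: if the treatment had no effect, any re-dealing of the scores into two groups would be equally likely.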
Experimental designs • Between subjects: Different participants - a single group of participants is allocated randomly to the experimental conditions. • Within subjects: Same participants - all participants appear in both conditions. • Matched participants - participants are matched in pairs, e.g., based on expertise, gender, etc.
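Random allocation for a between-subjects design could be sketched as follows. The participant IDs are hypothetical, and dealing participants out round-robin after a shuffle is just one simple way to keep group sizes equal.

```python
import random

def allocate_between_subjects(participants, conditions, seed=None):
    """Randomly allocate participants to conditions with near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    # Deal the shuffled participants round-robin so sizes differ by at most 1.
    for i, participant in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(participant)
    return groups

participants = [f"P{i:02d}" for i in range(1, 13)]  # hypothetical IDs
groups = allocate_between_subjects(participants, ["old system", "new system"], seed=42)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Passing a seed makes the allocation reproducible, which helps when the assignment needs to be documented.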
Within-subjects • Similar to the one-group pre-test-post-test design • Solves the individual-differences issue • But raises other problems: • Need to look at the impact of experiencing the two conditions • Will they get tired? Gain practice? Learn what is expected? • Need to control for order and sequence effects
Order Effects • Changes in performance resulting from the (ordinal) position in which a condition appears in an experiment (always first?) • Arise from warm-up, learning, fatigue, etc. • Effect can be averaged and removed if all possible orders are presented in the experiment and there has been random assignment to orders
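A minimal sketch of presenting all possible orders with random assignment to orders (full counterbalancing), assuming the number of participants is a multiple of the number of orders. The condition labels and participant IDs are placeholders.

```python
import random
from itertools import permutations

def assign_orders(participants, conditions, seed=None):
    """Give every possible condition order to equally many participants, at random."""
    rng = random.Random(seed)
    orders = list(permutations(conditions))
    if len(participants) % len(orders) != 0:
        raise ValueError("participant count should be a multiple of the number of orders")
    # Repeat each order equally often, then shuffle who receives which order.
    pool = orders * (len(participants) // len(orders))
    rng.shuffle(pool)
    return dict(zip(participants, pool))

participants = [f"P{i:02d}" for i in range(1, 13)]  # hypothetical IDs
schedule = assign_orders(participants, ["A", "B", "C"], seed=7)
for participant, order in schedule.items():
    print(participant, " -> ".join(order))
```

Three conditions give 3! = 6 orders, so 12 participants see each order twice. The number of orders grows factorially with the number of conditions, which is why full counterbalancing quickly becomes impractical.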
Sequence effects • Changes in performance resulting from interactions among conditions (e.g., if done first, condition 1 has an impact on performance in condition 2) • Effects viewed may not be main effects of the IV, but interaction effects • Can be controlled by arranging each condition to follow every other condition equally often
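One common way to make each condition follow every other condition equally often is a balanced Latin square (often called a Williams design). The sketch below is an assumption about how this could be done, not a method given in the slides; conditions are represented by the integers 0 to n-1, and only immediate (first-order) sequences are balanced.

```python
from collections import Counter

def balanced_latin_square(n):
    """Return condition orders (rows) balanced for immediate sequence effects."""
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    for j in range(1, n):
        first.append((j + 1) // 2 if j % 2 else n - j // 2)
    # Each later row shifts every entry of the first row by 1 (mod n).
    rows = [[(x + shift) % n for x in first] for shift in range(n)]
    if n % 2:
        # For odd n a single square is not balanced; add the mirrored rows.
        rows += [row[::-1] for row in rows]
    return rows

def follow_counts(rows):
    """Count how often condition b immediately follows condition a."""
    return Counter((a, b) for row in rows for a, b in zip(row, row[1:]))

square = balanced_latin_square(4)
for row in square:
    print(row)
print(follow_counts(square))
```

For an even number of conditions this needs n orders; for an odd number it needs 2n, so the participant count should be a multiple of that.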
Counterbalancing • Controlling order and sequence effects by arranging subjects to experience the various conditions (levels of the IV) in different orders • Self-directed learning: investigate the different counterbalancing methods • Randomization • Block Randomization • Reverse counterbalancing • Latin squares and Graeco-Latin squares (when you can’t fully counterbalance) • http://www.experiment-resources.com/counterbalanced-measures-design.html
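As a starting point for exploring these methods, here are minimal sketches of three of them under their usual textbook definitions: a cyclic Latin square (each condition appears once in each ordinal position), block randomization (each block contains every condition once, in random order), and reverse counterbalancing (ABBA). Function names and condition labels are illustrative only.

```python
import random

def latin_square(conditions):
    """Cyclic Latin square: each condition appears once per row and once per position."""
    n = len(conditions)
    return [[conditions[(r + c) % n] for c in range(n)] for r in range(n)]

def block_randomized(conditions, n_blocks, seed=None):
    """Concatenate n_blocks blocks, each a fresh random order of all conditions."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence

def reverse_counterbalanced(conditions):
    """Present the conditions in one order, then again in reverse (e.g., ABBA)."""
    return list(conditions) + list(reversed(conditions))

print(latin_square(["A", "B", "C"]))
print(block_randomized(["A", "B", "C"], n_blocks=2, seed=3))
print(reverse_counterbalanced(["A", "B"]))
```

A cyclic Latin square controls ordinal position but not sequence: in every row B directly follows A, so it does not balance which condition precedes which.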
Key points 1 • Usability testing is done in controlled conditions. • Usability testing is an adapted form of experimentation. • Experiments aim to test hypotheses by manipulating certain variables while keeping others constant. • The experimenter controls the independent variable(s) but not the dependent variable(s).