Learn how to analyze and present usability test results effectively. Understand terms like qualitative and quantitative data, statistical tests, t-tests, and significance levels. Explore examples and techniques for reporting findings accurately.
Day 10 Analysing usability test results
Objectives • To learn more about how to understand and report quantitative test results • To learn about some basic statistical terms • To learn about t-tests • To learn whether obtained results are “significant”
Analysing and presenting results - qualitative • Qualitative data • You could group comments from participants that seem to go together and report how many participants had each problem • For example, • 4 stated that they did not realise you would click on “see it” to find the price of an item • 2 stated that they searched the whole site and didn’t find the price of the item • In your report, you could give quotations to back up what you are saying (this shows that a finding comes from the participants and is not just your subjective feeling … it makes the report more objective)
Analysing and presenting results - quantitative • Quantitative data • You might want to report things like: • how many clicks people took on a task compared to an expert • how much time people took on a task with design A compared with design B • how many errors novice users made compared with another group who are experienced users of certain types of software • But, just reporting the numbers is not enough!
The garden.com study from yesterday • http://www.cit.gu.edu.au/~mf/uidweek10/ergosoft.pdf • For each task, they reported • the means (calculated for the participants) • whether the mean for the participants differed from an expert (Y/N) • The Standard Deviation (a measure of how much the data is dispersed around the mean … how consistent the data is) • But, this is not enough!
Statistical tests • Notice that they also said they did statistical tests “to determine whether real differences exist” between the participants and the expert • They should have given more details • what statistical test • the values obtained from the statistical tests
Statistical tests – some background • The normal curve • The standard deviation • Types of “experiments” • p values • t-tests • single sample, with hypothesised mean • independent samples • correlated samples
The standard deviation • A measure of the variability of the data about the mean • A large standard deviation means the values obtained from the subjects vary a lot from the mean • A small standard deviation means the values obtained from the subjects vary little from the mean
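To make the idea concrete, here is a minimal sketch using Python's standard `statistics` module. The task times are made-up numbers (they are not from the garden.com study), chosen so that both tasks have the same mean but very different spread.

```python
import statistics

# Hypothetical task times in seconds (illustrative only, not real study data)
task_a = [50, 52, 48, 51, 49]   # participants performed very consistently
task_b = [20, 80, 35, 65, 50]   # participants varied a lot

mean_a = statistics.mean(task_a)
mean_b = statistics.mean(task_b)

# stdev() is the sample standard deviation (divides by n - 1)
sd_a = statistics.stdev(task_a)
sd_b = statistics.stdev(task_b)

print(f"Task A: mean = {mean_a}, SD = {sd_a:.2f}")
print(f"Task B: mean = {mean_b}, SD = {sd_b:.2f}")
```

Both means are 50, so the means alone would hide the fact that Task B was far less consistent across participants.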
Why is the standard deviation important? • Table 7 of the garden.com study http://www.cit.gu.edu.au/~mf/uidweek8/ergosoft.pdf • Compare task 1 and task 3 • Statistical tests can take this SD difference into account • The appropriate statistical test is the t-test
The t-test • The t-test will tell you whether one mean is really different from another • That is, it is a statistical test to compare means • There are 3 kinds of t-tests • Single sample • when you are comparing the participants' mean with a single value, such as an expert's score (we’ll call this the hypothesised mean) • Independent samples • when you are comparing performance by two different groups • Correlated samples • when you are comparing one group tested in two different situations
Single sample test • Where you have one group of subjects and test them against one mean • for example • one value is obtained from one expert, which is then assumed to be the mean for some expert group • or it could be more like a benchmark, and you compare the means of the participants to some benchmark
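A single sample t-test can be sketched with the standard library alone. The function name `one_sample_t` and the click counts below are hypothetical, introduced only for illustration. The statistic is the difference between the participants' mean and the hypothesised mean, divided by the standard error of the mean.

```python
import math
import statistics

def one_sample_t(values, hypothesised_mean):
    """Return (t, degrees of freedom) for a single sample t-test."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD (n - 1)
    standard_error = sd / math.sqrt(n)     # SD of the mean
    t = (mean - hypothesised_mean) / standard_error
    return t, n - 1

# Hypothetical data: clicks taken by 5 participants, and by one expert
participant_clicks = [12, 15, 11, 14, 13]
expert_clicks = 10   # treated as the hypothesised mean

t, df = one_sample_t(participant_clicks, expert_clicks)
print(f"t = {t:.3f}, df = {df}")
```

The larger the absolute value of t, the less plausible it is that the participants' mean matches the expert value. Turning t into a p-value needs the t distribution, via a table or a statistics tool.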
Independent samples • [Diagram: two conditions, one with Group 1 members and the other with Group 2 members] • This is where you have data from different groups of subjects • for example • you have novices and experienced users and you are comparing the means of the two groups
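The independent samples case can be sketched as follows. This is the classic pooled-variance version of the test, which assumes the two groups have roughly equal variances. The function name `independent_t` and the data are hypothetical.

```python
import math
import statistics

def independent_t(group_1, group_2):
    """Return (t, degrees of freedom) for a pooled-variance independent samples t-test."""
    n1, n2 = len(group_1), len(group_2)
    var1 = statistics.variance(group_1)    # sample variances (n - 1)
    var2 = statistics.variance(group_2)
    df = n1 + n2 - 2
    # Pool the two variances, weighting each by its degrees of freedom
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_1) - statistics.mean(group_2)) / standard_error
    return t, df

# Hypothetical click counts for two different groups of people
novices = [12, 15, 11, 14, 13]
experienced = [9, 11, 8, 10, 12]

t, df = independent_t(novices, experienced)
print(f"t = {t:.3f}, df = {df}")
```

Note that the two lists hold different people, so they do not need to be the same length or in any particular order.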
Correlated samples • [Diagram: the same group members tested under Condition 1 and Condition 2] • This is where you use the same subjects for two different measures and want to compare them • for example • you give subjects 2 tasks and see if they found one harder than the other
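For correlated samples, the usual approach is to work on each subject's difference between the two measures, which turns the problem into a single sample test of those differences against zero. A minimal sketch, with a hypothetical function name `correlated_t` and made-up task times:

```python
import math
import statistics

def correlated_t(measure_1, measure_2):
    """Return (t, degrees of freedom) for a correlated (paired) samples t-test."""
    if len(measure_1) != len(measure_2):
        raise ValueError("Each subject needs a value in both conditions")
    # One difference per subject; the order of subjects must match in both lists
    differences = [b - a for a, b in zip(measure_1, measure_2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical times (seconds) for the same 5 subjects on two tasks
task_1_times = [30, 42, 35, 50, 38]
task_2_times = [36, 45, 43, 54, 45]

t, df = correlated_t(task_1_times, task_2_times)
print(f"t = {t:.3f}, df = {df}")
```

Because each subject acts as their own comparison, differences between subjects (some people are just slower overall) do not inflate the error term.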
The p-value • When you run a statistical test, you get a p-value • p-value stands for probability value • It is the probability of getting results at least as extreme as yours if chance alone were at work (that is, if there were no real difference) • The aim of statistical tests is to determine whether the results could plausibly have occurred by chance • If it is very unlikely that certain results occurred by chance then there is probably some other reason; for example, maybe novice users get more confused than expert users
The importance of p < .05 • Usually, if results like yours would be obtained by chance fewer than 5 times in a hundred, we say the results are significant • When you do a statistical test, you will get a p-value expressed as a decimal; for example p = .04 (results like these would occur by chance just 4 times in 100) • Any p < .05 is conventionally treated as significant: you can conclude that your observed differences are unlikely to be due to chance alone
One- and two-tailed t-tests • one-tailed test: used when you have predicted the direction of the difference; for example, novices will use more clicks than experts • two-tailed test: used when you have predicted a difference, but have not stated the direction of the difference; for example, there will be a difference in performance between males and females
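The relationship between a t value, its p-value and the number of tails can be sketched without any extra libraries by numerically integrating the t distribution. Everything here is illustrative: the function names are hypothetical, the values t = 3.0 and df = 8 are made up, and in practice you would normally read the p-value from a statistics tool. The one-tailed figure is valid only if the observed difference is in the predicted direction.

```python
import math

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=10000):
    """Area in one tail beyond |t|, using Simpson's rule (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_t = total * h / 3
    # The distribution is symmetric, so half the area lies above zero
    return 0.5 - area_0_to_t

def two_tailed_p(t, df):
    """Area in both tails beyond |t|."""
    return 2 * one_tailed_p(t, df)

# Hypothetical result from a t-test
t_value, df = 3.0, 8
p_one = one_tailed_p(t_value, df)
p_two = two_tailed_p(t_value, df)

print(f"one-tailed p = {p_one:.4f}, significant at .05: {p_one < 0.05}")
print(f"two-tailed p = {p_two:.4f}, significant at .05: {p_two < 0.05}")
```

For the same t value, the two-tailed p is twice the one-tailed p, so a result can be significant one-tailed and not two-tailed. This is why the direction of the difference has to be predicted before looking at the data.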
Today’s lab • We will run some t-tests on some fake data • http://faculty.vassar.edu/lowry/VassarStats.html