Beginners statistics Assoc Prof Terry Haines
5 simple steps • Understand the type of measurement you are dealing with • Understand the type of question you are asking • Select a test (focus today on tests of difference) • Check assumptions where relevant • Run the test
Measurement • Assigning numerals to variables • Nominal • Ordinal • Interval • Ratio • Count
Nominal • Categories without order • Gender • Male / Female • Diagnosis • Orthopaedic / neurological / cardiorespiratory
Nominal • Entering categorical data on a spreadsheet • Binary / dichotomous data • Eg. gender • One column (female=0, male=1) • Polytomous data • Eg. Diagnosis • Can have one column (ortho=0, neuro=1, cardio=2) • Risk that the numeric values will be misused as if they had an order or size • Can have three “dummy” variables / columns • Ortho (no=0, yes=1) • Neuro (no=0, yes=1) • Cardio (no=0, yes=1)
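The dummy-variable layout above can be sketched in a few lines of Python. This is only an illustration: the patient list is made up and `dummy_code` is a hypothetical helper name, not something from the session.

```python
# Hypothetical diagnosis column; the category labels follow the slide.
diagnoses = ["ortho", "neuro", "cardio", "neuro", "ortho"]
categories = ["ortho", "neuro", "cardio"]


def dummy_code(values, categories):
    """Return one 0/1 column per category (no=0, yes=1)."""
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


columns = dummy_code(diagnoses, categories)
for name, column in columns.items():
    print(name, column)
```

Each row has exactly one 1 across the three columns, so no column implies that one diagnosis is "bigger" than another.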
Ordinal • Categories with order, but we don’t know how much better one place is than another • Finishing order in a race • 1st, 2nd, 3rd • Likert scaled surveys • Strongly agree, agree, undecided, disagree, strongly disagree • Entering data • One column – make sure you record what numbers mean
Interval • Equal intervals between numbers, but not a true zero • Eg. Degrees centigrade, IQ test scores, calendar years AD • Entering data • Input the number
Ratio • Equal intervals between numbers, a true zero • Eg. Distance, age, time, weight • Entering data • Input the number
Count • Whole, non-negative numbers indicating the frequency of an event • Eg. Number of falls, number of steps, number of therapy sessions
Manipulating data • Can turn a higher level of measurement into a lower level, but not vice versa • Eg. IQ scores • 0-50 below average • 51-100 average • 101-150 above average • This converts interval data to ordinal • This leads to a “loss” of data and can conceal the true relationship between two variables
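A small sketch of the banding step shows where the information is lost. It assumes the bands do not overlap (so a score of 100 falls in the middle band); the scores and the function name `iq_band` are hypothetical.

```python
def iq_band(score):
    """Map an interval-scale score to an ordinal band (1 < 2 < 3)."""
    if score <= 50:
        return 1  # below average
    if score <= 100:
        return 2  # average
    return 3      # above average


scores = [52, 99, 101, 149]  # hypothetical scores
bands = [iq_band(s) for s in scores]
print(bands)  # [2, 2, 3, 3]
```

After banding, 52 and 99 look identical while 99 and 101 look a whole category apart, and the original scores cannot be recovered from the bands.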
Measurement • Nominal, ordinal, interval, ratio, count • Can manipulate data down this scale but not up • Be careful in doing this • Loss of data • Would need a really good reason to do so • Questions on measurement scales?
What sort of question is being asked? • Difference • Is A≠B? • Is A>B? • Is A<B? • Agreement / reliability / prediction • Is A=B? • Correlation • Is A~B?
Difference • [Figure: three panels, each showing A and B]
Difference • The confusing thing is that we test a null hypothesis. • If there were no difference in the broader population, what is the probability of seeing a difference at least as large as the one in our sample? • For the one null hypothesis, there are three alternative hypotheses possible • Is A≠B? • Is A>B? • Is A<B? • The magnitude of difference can also be measured
Agreement / reliability / prediction • To what extent do two variables tell us exactly the same thing, or can one variable predict a later variable? • [Figure: two panels, each showing A and B]
Agreement / reliability / prediction • The statistical procedures of agreement / reliability / prediction test a null hypothesis • If there were no real agreement / reliability / prediction, what is the probability of observing at least this much by chance? • The magnitude of agreement can also be described
Correlation • To what extent do two variables co-relate to each other? • They do not have to agree in order to co-relate • The statistical procedures of correlation test a null hypothesis • If there were no real association, what is the probability of observing at least this much association by chance? • The magnitude of correlation can also be described
Understand the question • Any questions on • Difference • Agreement / reliability / prediction • Correlation
Statistical testing • Why do it? • Eg. The average height of men in this room is 179 cm, the average height of women is 163 cm. • I know the men in this room are taller by 16 cm, so why do a test?
Statistical testing • We normally want to extrapolate the results from our sample to a broader population • The relationship between A and B in the broader population is of greater interest than what is going on just inside this room
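One way to see what a test adds to the raw 16 cm gap is an exact permutation test. This is a swapped-in illustration, not a test named in the session, and the six heights are invented so that the group means match the 179 cm and 163 cm quoted above. The sketch asks: if the labels "man" and "woman" had nothing to do with height, how often would splitting these six people into two groups of three give a gap of at least 16 cm?

```python
from itertools import combinations

men = [172, 179, 186]    # hypothetical heights in cm, mean 179
women = [158, 163, 168]  # hypothetical heights in cm, mean 163


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-9:
            extreme += 1
    return extreme / total


p = permutation_p_value(men, women)
print(p)
```

With only three people per group, just 2 of the 20 possible splits are as extreme as the observed one, so the p-value cannot fall below 0.1 however large the gap looks. A difference inside the room is therefore not automatically strong evidence about the broader population.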
Select a test • Tests will vary depending on • Measurement scale of variable A and variable B • The type of question being asked • Whether there are repeated measures or correlated samples involved
Selecting a test: Correlation • First check visually, then • Pearson’s r • Can also use linear regression for further description of the correlation
Correlation • Height vs weight • Pearson’s r
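For readers who want to see what sits behind Pearson's r, here is a minimal sketch of the standard formula. The weights and heights are hypothetical (assumed to be in kg and cm), not the data shown in the session, and `pearson_r` is just an illustrative name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by their own variation."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


weight = [55, 62, 70, 78, 85]       # hypothetical, kg
height = [158, 165, 170, 176, 181]  # hypothetical, cm
r = pearson_r(weight, height)
print(round(r, 3))
```

The result always lies between -1 and 1. Values near 1 mean that heavier people in the sample tend to be taller, values near -1 mean the reverse, and values near 0 mean no straight-line relationship.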
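Y = bX + c • Height = 0.87*weight + 104.04 • What do these numbers mean? • b (0.87) is the slope: for each one-unit increase in weight, predicted height increases by 0.87 units • c (104.04) is the intercept: the predicted height when weight is 0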
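The slope and intercept can be sketched as follows. The first part fits a least-squares line to hypothetical data (assumed kg and cm, not the session's data set); the second part simply plugs a weight into the equation quoted on the slide. The names `fit_line` and `predict_height_slide` are illustrative.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = b*X + c; returns (b, c)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b = sxy / sxx
    c = mean_y - b * mean_x
    return b, c


weight = [55, 62, 70, 78, 85]       # hypothetical, kg
height = [158, 165, 170, 176, 181]  # hypothetical, cm
b, c = fit_line(weight, height)
print(round(b, 2), round(c, 2))


def predict_height_slide(weight_value):
    """Apply the equation quoted on the slide."""
    return 0.87 * weight_value + 104.04


# One extra unit of weight changes the prediction by exactly the slope.
step = predict_height_slide(71) - predict_height_slide(70)
print(round(step, 2))  # 0.87
```

The fitted numbers for the made-up data differ from the slide's 0.87 and 104.04, as expected, but they are read the same way.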
Does this work when one variable is dichotomous? • Height = 13.3*gender(0,1) + 166.7
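With a 0/1 predictor, a least-squares line has a simple reading: the intercept is the mean of the group coded 0 and the slope is the difference between the two group means. The sketch below checks this on hypothetical heights, assuming the female=0, male=1 coding used earlier; `fit_line` is an illustrative name.

```python
def fit_line(x, y):
    """Ordinary least squares for Y = b*X + c; returns (b, c)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b = sxy / sxx
    return b, mean_y - b * mean_x


gender = [0, 0, 0, 1, 1, 1]                # assumed coding: female=0, male=1
height = [158, 163, 168, 172, 179, 186]    # hypothetical heights in cm

slope, intercept = fit_line(gender, height)

female_mean = sum(h for g, h in zip(gender, height) if g == 0) / 3
male_mean = sum(h for g, h in zip(gender, height) if g == 1) / 3

print(slope, male_mean - female_mean)  # both are the gap between group means
print(intercept, female_mean)          # both are the mean of the group coded 0
```

Read the same way, the slide's equation says the group coded 0 averages 166.7 and the group coded 1 averages 13.3 more.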
Some tricky questions • Can we have: • A is different to B, but A correlates with B? • A agrees with B, and A correlates with B? • A is not different to B, and A does not correlate with B?
A and B are different, but highly correlated • Confidence intervals so narrow and p-value so low they can’t be calculated
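A made-up extreme case shows how both statements can hold at once: if B is always exactly 20 units above A, the two clearly differ, yet they move together perfectly. The numbers and the helper `pearson_r` are illustrative assumptions.

```python
from math import sqrt

a = [10, 12, 15, 19, 24]         # hypothetical measurements
b = [value + 20 for value in a]  # B is always 20 units higher than A

mean_difference = sum(y - x for x, y in zip(a, b)) / len(a)


def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(a, b)
print(mean_difference, r)
```

The mean difference is 20 while r is 1, so a test of difference and a test of correlation answer separate questions about the same data.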
But is there really no relationship here? • Linear regression only looks for linear (straight line) relationships. • Data transformations or other forms of regression are needed here.
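A deliberately artificial example makes the point concrete. Here Y is exactly X squared, so Y is completely determined by X, yet Pearson's r between them is zero. Correlating Y with the transformed variable X² recovers the relationship. The data and the helper `pearson_r` are illustrative assumptions, not the example plotted in the session.

```python
from math import sqrt


def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]  # perfect, but curved, relationship

r_raw = pearson_r(x, y)                                   # straight-line view
r_transformed = pearson_r([value ** 2 for value in x], y)  # after transforming X

print(r_raw, r_transformed)
```

A value of r near zero therefore only rules out a straight-line relationship, which is why plotting the data first matters.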
Checking assumptions • Many assumptions surround most statistical tests • Need to check to make sure you are doing the right thing by your data • There are specific tests to check assumptions • When in doubt, use visual examination of your data
Run the tests • Can use Excel for some tests • Gives you a single number output • We have been using Stata today • Lots more output to help you interpret your data
Any questions? • Next month – 31st March • Starting small and research question development • Dr Elizabeth Skinner