Review of yesterday • Nice biological thinking! • Don’t FOCUS too much on error sources! • Your study may be correct in finding no difference or no relationship… • Use introduction for hypothesis about biology, not theory of statistics. • English. Do spell check. Help each other. • web resources?
Review of yesterday • Do read the basic manuals • Open and Save in R • Alternative 1: xlsReadWrite, read.xls() • Alternative 2: copy, then read.delim() • Save graphs • Ctrl + C (in graph), Ctrl + V (in Word): bitmap • Ctrl + W (in graph), Ctrl + V (in Word): Windows metafile
Continuous variables • If possible, import from Excel with xlsReadWrite • If not, beware of 1,5 or 1.5 • comma: use read.delim2("clipboard") • period: use read.delim("clipboard") • Also: which is your response and which is your explanatory variable? • Avoid column names like Length..in.mm. Write length
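The decimal comma problem can be tried out without the clipboard. The sketch below is only an illustration: the data in `raw` are made up, and `text =` stands in for `"clipboard"` so that the example runs anywhere.

```r
# Made-up tab-separated data with decimal commas (hypothetical example)
raw <- "Length..in.mm.\tsite\n1,5\tclean\n2,25\tpolluted\n3,0\tclean\n"

# read.delim2() expects a decimal comma
dat <- read.delim2(text = raw)

# read.delim() expects a decimal period, so the column is not read as numbers
wrong <- read.delim(text = raw)
is.numeric(wrong[[1]])   # FALSE

# Replace the clumsy column name with a short one
names(dat)[names(dat) == "Length..in.mm."] <- "length"

str(dat)
mean(dat$length)
```

If `str(dat)` shows `length` as `num`, the import worked and the column can be used as a continuous variable.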
Copy paste statistics • Only change the green bold code! • plot(y~x,xlab="Seed size")
R tricks • Ctrl + X: copy AND paste • Arrow up: last line • history(Inf): all commands, without the > prompt • plot(y~x)
Word pricks • col="red" becomes col=“red” (Word's curly quotes break R code) • Change this • Computer tricks manual • Or try Notepad++
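[Figure: choice of method by type of response and explanatory variable] • Continuous response, continuous explanatory: Regression • Continuous response, categoric explanatory: Anova • Categoric response, continuous explanatory: Logistic regression • Categoric response, categoric explanatory: 2 × 2 tables • Example panels: seed size for red ants and black ants; probability of choosing Melica (rather than Luzula) against ant size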
[Figure: the same overview, continuous response only] • Continuous explanatory: Regression • Categoric explanatory: Anova • Example panel: seed size for red ants and black ants
R tricks • plot(y~x,cex=2,cex.lab=1.5,cex.axis=2) • Change the size of the graph window. • Want an extra graph window? x11()
Birch: reproductive cost • Change the size of the graph window in R, not in Word.
Outliers • Causes: • Typing errors. • Data points affected by unwanted factors. • Biologically relevant data points • But perhaps given a disproportionately large effect on the result… • Tell the reader if you have removed any!
Lichens Standardised study design. Only lichens between 0.5 and 1.5 meters? Only on trunk? Only…
Can you really trust your studies? • Risk of chance effects. • Maybe you just happened to get those special individuals by chance…
Permutations • Decouple response AND explanatory variable: • Take all x-values. • Put them in a box and shake. • Pour them back into the x-column. • Now there should be no relationship or no difference. Right? • But how large a difference can you get by chance (with 95 % probability)?
Lego shrimp I • Does shrimp size depend on water quality? • Red piece = shrimp size (y, response) • Blue or green piece = clean or polluted water (x, explanatory)
Lego shrimp I • Does shrimp size depend on water quality? • Red piece = shrimp size (y, response) • Blue or green piece = clean or polluted water (x, explanatory) • If we shuffle the x variable (blue or green pieces), what difference may we get by chance? How large?
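The shuffling can also be done in R instead of with Lego. This is a minimal sketch with made-up shrimp sizes; the names `size`, `water` and `mean_diff` are only chosen for the illustration.

```r
set.seed(1)  # only so that the sketch gives the same result every time

# Made-up data: shrimp size (response) and water quality (explanatory)
size  <- c(12, 14, 15, 13, 16, 9, 10, 11, 8, 12)
water <- rep(c("clean", "polluted"), each = 5)

# Difference between the group means: clean minus polluted
mean_diff <- function(y, x) mean(y[x == "clean"]) - mean(y[x == "polluted"])

observed <- mean_diff(size, water)

# "Shake the box": shuffle x many times and note the difference each time
chance <- replicate(5000, mean_diff(size, sample(water)))

# 95 % of the chance differences fall between these two limits
quantile(chance, c(0.025, 0.975))
observed

hist(chance, xlab = "Difference", ylab = "No. random samples", main = "")
```

`sample(water)` returns the same labels in random order, which is the box-shaking step. If `observed` lies outside the two limits, a difference that large is rare when x and y are decoupled.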
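Area under the curve • [Histogram: Difference on the x-axis, No. random samples on the y-axis; 95 % in the middle, 2.5 % in each tail]
Risk of getting it by chance only < 5 % • [Same histogram: 95 % in the middle, 2.5 % in each tail]
Risk by chance = 1.4 + 1.4 = 2.8 % • [Same histogram: 1.4 % in each tail beyond the observed difference]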
p-value • 2.8 % is the probability (p) of getting a difference at least this large by chance alone, if there is NO real difference. • p = 0.028 means that if the groups did not really differ, there would be only a 2.8 % chance of getting data points like the ones we collected. • 2.8 % probability is ridiculously small! We don't believe in that! • If it is not just due to chance, it must depend on something else… • … e.g., water quality. The habitats differ significantly. • p < 0.05 counts as ridiculously small
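A p-value of this kind can be read straight off a set of shuffles: it is the share of random samples that end up in the two tails. The sketch below assumes made-up shrimp data and hypothetical names (`size`, `water`, `mean_diff`, `p_perm`); it is one way to do it, not the only one.

```r
set.seed(2)  # only so that the sketch gives the same result every time

# Made-up data: shrimp size (response) and water quality (explanatory)
size  <- c(12, 14, 15, 13, 16, 9, 10, 11, 8, 12)
water <- rep(c("clean", "polluted"), each = 5)

mean_diff <- function(y, x) mean(y[x == "clean"]) - mean(y[x == "polluted"])
observed  <- mean_diff(size, water)

# Differences that arise by chance when x is shuffled
chance <- replicate(5000, mean_diff(size, sample(water)))

# Share of shuffles at least as extreme as the observed difference,
# counting both tails (large positive AND large negative differences)
p_perm <- mean(abs(chance) >= abs(observed))
p_perm
```

A `p_perm` below 0.05 means that fewer than 5 % of the shuffles gave a difference as large as the one actually observed.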
What will affect the p-value? • The difference between groups • (… between their means) • The variation within groups • The sample size • (unreliability of group means) • Variation • Sample size
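The three influences can be tried one at a time. The sketch below builds small made-up data sets and uses R's `t.test()` to get the p-value; the helper `p_for()` and its arguments are hypothetical names for this illustration only.

```r
# p-value for two made-up groups that differ by `diff` between their means.
# `spread` stretches the variation within groups; `copies` enlarges the sample.
p_for <- function(diff, spread, copies) {
  dev <- rep(c(-2, -1, 0, 1, 2), times = copies) * spread
  y   <- c(10 + dev, 10 + diff + dev)
  x   <- rep(c("a", "b"), each = length(dev))
  t.test(y ~ x, var.equal = TRUE)$p.value
}

p_base   <- p_for(diff = 2, spread = 1, copies = 1)  # starting point
p_diff   <- p_for(diff = 4, spread = 1, copies = 1)  # larger difference
p_spread <- p_for(diff = 2, spread = 2, copies = 1)  # more variation
p_n      <- p_for(diff = 2, spread = 1, copies = 3)  # larger sample

c(base = p_base, larger_diff = p_diff, more_variation = p_spread, larger_n = p_n)
```

Compared with the starting point, a larger difference and a larger sample both lower the p-value, while more variation within groups raises it.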
t-test in R • t.test(y~x,var.equal=TRUE)
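As a sketch of what the call gives back, here it is on made-up shrimp data (the values and the names `size` and `water` are assumptions for the illustration). With `var.equal=TRUE` R runs the classic t-test with a pooled variance; leaving it out gives the Welch version instead.

```r
# Made-up data: shrimp size (response) and water quality (explanatory)
size  <- c(12, 14, 15, 13, 16, 9, 10, 11, 8, 12)
water <- rep(c("clean", "polluted"), each = 5)

res <- t.test(size ~ water, var.equal = TRUE)

res$estimate   # the two group means
res$statistic  # t-value
res$parameter  # degrees of freedom: n1 + n2 - 2
res$p.value    # probability of such a difference by chance alone
```

Printing `res` itself shows the same numbers in one summary.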
Under the hood • Competent Drivers vs. Mechanics
Difference between means • Female mean = 9 • Male mean = 13 • Difference = 13 - 9 = 4 • Soft! • But the unreliability?
Measure of variation? • Variation ≈ red lines! • Mean red line length? • Nope! • Absolute values are hard to work with • Instead: • ≈ Mean squared red lines!
Variance • ≈ Mean squared red lines!
Degrees of freedom • For a group variance the df = n-1 • Why? • To calculate a variance the mean is required! (y-mean(y))^2 • But given a mean, only n-1 data point deviations (y-mean(y)) can freely change and be used to estimate the variance. • If we independently change n-1 deviations, the last one can't be independent. • It must sum up with the rest to zero. • Its "freedom" is locked, used, by the mean!
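Both points, that the deviations sum to zero and that the variance divides by n-1, can be checked by hand. A minimal sketch with made-up numbers (`y`, `dev` and `by_hand` are just names for the illustration):

```r
y <- c(7, 8, 9, 10, 11)   # made-up data

dev <- y - mean(y)        # deviations from the mean (the "red lines")
dev
sum(dev)                  # always 0, so the last deviation is fixed by the others

n <- length(y)
by_hand <- sum(dev^2) / (n - 1)   # squared deviations divided by the df

by_hand
var(y)                    # R's own variance gives the same number
```

Because the plain deviations cancel out to zero, they are squared before averaging, and the division is by the n-1 deviations that are free to vary.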
Variance & Standard deviation • Standard deviation = SD = square root of the variance • Sometimes used to show variability in graphs • ±1 SD ≈ 68% of data points • ±1.96 SD ≈ 95% of data points (for normally distributed data) • var(y) = 3.3 • sd(y) = 1.8
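The numbers on this slide can be checked in R. The sketch assumes normally distributed data for the 68 % and 95 % figures; `sd_from_var`, `within_1_sd` and `within_196_sd` are names made up for the illustration.

```r
# SD is the square root of the variance: 3.3 on the slide gives about 1.8
sd_from_var <- sqrt(3.3)
round(sd_from_var, 1)

# Share of a normal distribution within 1 SD and within 1.96 SD of the mean
within_1_sd   <- pnorm(1)    - pnorm(-1)
within_196_sd <- pnorm(1.96) - pnorm(-1.96)

round(within_1_sd, 2)
round(within_196_sd, 2)
```

`pnorm()` gives the area under the normal curve to the left of a value, so the difference between two calls is the area between them.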
How much is 3.1? • [Figure: distribution with tail areas of 2.5 % and 1 % marked on each side]