
Cold shitty weather. Sorry! Get warm with Glögg!


Presentation Transcript


  1. Cold shitty weather. Sorry! Get warm with Glögg!

  2. Power recap

  3. Power recap • It is good to fake data • BUT • the p-value from a single fake data set is crap!

  4. Power recap • A power analysis based on 1000 simulations is not crap! • BUT • Power depends on: • Sample size • Effect size • Variation
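A minimal sketch of such a simulated power analysis, assuming a two-group comparison with a t-test. The helper `sim.power` and all numbers (group sizes, effect, standard deviation, seed) are made up for illustration and are not from the course data.

```r
# Power by simulation: the share of simulated experiments with p < 0.05.
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: two groups with n observations each, a true
# difference `effect` between the means and within-group sd `sd`.
sim.power <- function(n, effect, sd, n.sim = 1000) {
  p <- replicate(n.sim, {
    a <- rnorm(n, mean = 0, sd = sd)
    b <- rnorm(n, mean = effect, sd = sd)
    t.test(a, b)$p.value
  })
  mean(p < 0.05)
}

power.small <- sim.power(n = 10, effect = 1, sd = 2)  # small sample
power.large <- sim.power(n = 50, effect = 1, sd = 2)  # larger sample
power.noisy <- sim.power(n = 50, effect = 1, sd = 4)  # more variation
```

Comparing the three values shows the dependence listed on the slide: power goes up with sample size and down with variation. Changing `effect` shows the third dependence.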

  5. Project considerations I • Make graphs • Check for outliers • Check assumptions • Decide if you want to transform y and/or x • Check VIF • Are your assumptions still f**k*d up? → Well, that’s for today.

  6. Project considerations II • Interpret interactions first! • If they are significant: Are main effects still interpretable? • Distinguish between: y ~ x1 and y ~ x1 given x2 • Simplify your models!
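A small sketch of the difference between y ~ x1 and y ~ x1 given x2, using simulated data in which y depends only on x2 while x1 is merely correlated with x2. All values are invented for illustration.

```r
set.seed(2)  # arbitrary seed
n  <- 500
x2 <- rnorm(n)
x1 <- x2 + rnorm(n, sd = 0.5)  # x1 is correlated with x2
y  <- 2 * x2 + rnorm(n)        # y depends on x2 only

m.alone <- lm(y ~ x1)       # y ~ x1
m.given <- lm(y ~ x1 + x2)  # y ~ x1 given x2

slope.alone <- coef(m.alone)["x1"]  # clearly positive
slope.given <- coef(m.given)["x1"]  # close to zero once x2 is in the model
```

Here x1 looks important on its own, but its effect largely disappears once x2 is accounted for.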

  7. [Figure: which analysis to use, by type of response variable and explanatory variable. Continuous response: Regression (continuous explanatory), Anova (categoric explanatory; example: seed size for red ants vs black ants). Categoric response: Logistic regression (continuous explanatory; example: prob. of choosing Melica rather than Luzula against ant size), 2 × 2 tables (categoric explanatory).]

  8. [Figure: the same overview, continuous response only: Regression (continuous explanatory) and Anova (categoric explanatory; example: seed size for red ants vs black ants).]

  9. Assumptions for parametric tests with continuous response (i.e., also linear models!!) • About the same variation in all groups, or along a continuous variable, or along fitted values • Pretty normal residuals (= noise)

  10. The residuals… • … are the noise that is not explained by the explanatory variable(s) • In a regression the residuals are the distance from the data points to the regression line • In an Anova the residuals are the distance to the group mean • In a linear model the residuals are the distance from the data points to the fitted values.
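The definitions above can be checked by hand in R. This is a sketch on simulated data; the variable names and values are made up.

```r
set.seed(3)  # arbitrary seed
x <- runif(30, 0, 10)
g <- factor(rep(c("a", "b", "c"), each = 10))
y <- 1 + 0.5 * x + rnorm(30)

# Regression: residual = data point minus the regression line
m.reg   <- lm(y ~ x)
res.reg <- y - fitted(m.reg)

# Anova: residual = data point minus its group mean
m.aov      <- lm(y ~ g)
group.mean <- ave(y, g)  # ave() returns the group mean for each observation
res.aov    <- y - group.mean
```

Both hand-made vectors match what `residuals()` returns for the same models. For an assumption check, `plot(fitted(m.reg), res.reg)` shows whether the variation is about the same along the fitted values, and `qqnorm(res.reg)` shows whether the residuals are pretty normal.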

  11. Residuals

  12. Residuals

  13. Assumption check

  14. Assumption check

  15. Assumption check

  16. Solutions • Poisson for counts (generalized linear model) • Non-parametric tests • Resampling methods • Permutation • Bootstrap • Binarize your response

  17. quasipoisson fit on Xanthoria

  18. Log Xanthoria apothecia

  19. Poisson distribution • Response = counts (not truly continuous) • Examples • Are there more maple seedlings close to a maple? • Response = number per square • m1 <- glm(number ~ distance, family = poisson)
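A runnable sketch of the maple seedling example. The data are simulated, since the real counts are not part of this transcript, and the intercept, slope and distance range are assumptions. Note that the family is spelled `poisson` in lower case.

```r
set.seed(4)  # arbitrary seed
distance <- runif(100, 0, 20)  # made-up distances to the nearest maple
number   <- rpois(100, lambda = exp(2 - 0.15 * distance))  # seedlings per square

m1 <- glm(number ~ distance, family = poisson)
slope <- coef(m1)["distance"]  # on the log scale; negative = fewer seedlings far away
change.per.unit <- exp(slope)  # multiplicative change in the expected count

# quasipoisson gives the same estimates but adjusts the standard errors
# for over- or underdispersion
m2 <- glm(number ~ distance, family = quasipoisson)
```

`summary(m1)` and `summary(m2)` then show the tests for the slope.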

  20. Poisson distribution • Usually log(y) also works fine. • Poisson excels with: • small means • many zeroes • Many zeroes → Hurdle models

  21. break?

  22. Non-parametric tests • Based on ranked values instead of actual data.

  23. Non-parametric tests • Still often in use. • Questionable with modern computers. → In principle: permutations of ranked values → But worse than "real" permutations, because information about actual data values is discarded.

  24. Non-parametric tests • Still often in use. • Questionable with modern computers. → In principle: permutations of ranked values → But worse (than "real" permutations) because information about actual data values is discarded. BENEFIT: Calms down outliers!

  25. [Figure: parametric tests for a continuous response: Regression (continuous explanatory) and Anova (categoric explanatory; 2 groups: also t-test). Example: seed size for red ants vs black ants.]

  26. [Figure: non-parametric counterparts for a continuous response: Kendall rank correlation, also Spearman rank (continuous explanatory); Kruskal-Wallis, also Mann-Whitney U-test (categoric explanatory); paired: Sign test (= binomial).]
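The rank-based tests named here are all in base R. This is a sketch on simulated data with invented variable names and values; `wilcox.test` is R's name for the Mann-Whitney U-test.

```r
set.seed(5)  # arbitrary seed
x <- runif(40, 4, 8)  # a continuous explanatory variable (made up)
y <- x + rnorm(40)
g <- factor(rep(c("red", "black"), each = 20))
size <- c(rnorm(20, mean = 10, sd = 2), rnorm(20, mean = 14, sd = 2))

p.kendall  <- cor.test(x, y, method = "kendall")$p.value
p.spearman <- cor.test(x, y, method = "spearman")$p.value
p.kruskal  <- kruskal.test(size ~ g)$p.value
p.mannwhit <- wilcox.test(size ~ g)$p.value

# Only the ranks matter: a monotonic transformation such as log()
# leaves the result unchanged.
p.kruskal.log <- kruskal.test(log(size) ~ g)$p.value
```

That the log-transformed test gives the same p-value is the flip side of discarding information: an extreme value counts only as "the largest", however far away it lies.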

  27. Permutations • Does not require normal distribution • BUT, does require the distributions to be equal if your hypothesis is not true (i.e., if there is no difference). → Example: • If the lichens are equally large in the city as they are at campus, they must also have the same variation and, e.g., skewness (cf. non-par!) • In principle a test of whether the distributions differ.
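A sketch of a permutation test for the lichen example, assuming a difference in means as the test statistic. The lichen sizes are simulated and all numbers are made up. Adding 1 to the numerator and the denominator of the p-value is a common convention, not something stated on the slide.

```r
set.seed(6)  # arbitrary seed
city   <- rnorm(15, mean = 20, sd = 5)  # made-up lichen sizes
campus <- rnorm(15, mean = 30, sd = 5)
size  <- c(city, campus)
place <- rep(c("city", "campus"), each = 15)

# Observed difference between the group means
obs.diff <- mean(size[place == "campus"]) - mean(size[place == "city"])

# Shuffle the group labels many times: this is what the difference
# looks like if place does not matter
perm.diff <- replicate(5000, {
  shuffled <- sample(place)
  mean(size[shuffled == "campus"]) - mean(size[shuffled == "city"])
})

# Two-sided p-value: how often is a shuffled difference at least as
# extreme as the observed one?
p.perm <- (sum(abs(perm.diff) >= abs(obs.diff)) + 1) / (length(perm.diff) + 1)
```

`hist(perm.diff)` with `abline(v = obs.diff)` shows where the observed difference falls among the shuffled ones.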

  28. Ash seed dispersal

  29. Acer twigs – plasticity

  30. Birch – cost of reproduction

  31. break?

  32. bootstrap • to pull oneself up by one's bootstraps • to succeed only on one's own effort or abilities.

  33. shrimp-booting...

  34. [Figure: two panels, Rumex crispus and Rumex longifolius; axis tick values omitted.]

  35. Confidence intervals • …show how sure we are of a group mean. • A 95 % confidence interval will contain the "true" mean 95 % of the time. • The larger our sample size, the more sure (= confident!) we are of our sample mean → the confidence interval decreases • And (of course…), the more variation within groups, the less sure we get → the confidence interval increases
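A sketch of a bootstrapped 95 % confidence interval for a mean, assuming the simple percentile method. The helpers `bootMeanCI` and `width` are hypothetical names, and the data are simulated.

```r
set.seed(7)  # arbitrary seed

# Hypothetical helper: percentile bootstrap interval for the mean of y
bootMeanCI <- function(y, n.boot = 2000) {
  boot.means <- replicate(n.boot, mean(sample(y, replace = TRUE)))
  quantile(boot.means, c(0.025, 0.975))
}

small <- rnorm(10,  mean = 50, sd = 10)  # few observations
large <- rnorm(200, mean = 50, sd = 10)  # many observations
noisy <- rnorm(200, mean = 50, sd = 30)  # many observations, more variation

ci.small <- bootMeanCI(small)
ci.large <- bootMeanCI(large)
ci.noisy <- bootMeanCI(noisy)

# Width of an interval: upper limit minus lower limit
width <- function(ci) unname(ci[2] - ci[1])
```

Comparing `width(ci.small)`, `width(ci.large)` and `width(ci.noisy)` shows both statements on the slide: the interval shrinks with a larger sample and grows with more variation.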

  36. Bootstrap for tests [Figure: histogram of boot.difference (x-axis) against No. boot-samples (y-axis).]

  37. Bootstrap • Does not require normal distribution of residuals. • Does not require the same variation. • The only requirement is that what you bootstrap (e.g., means) is the same if your hypothesis is not correct. • And, in practice, a large, representative sample

  38. moss.shoot ~ forest type [Figure: histogram of the bootstrapped difference in moss shoot length.]

  39. Bootstrap • We use the function sample(row.names(d), replace = TRUE) • More advanced (and better): library(boot), ?boot, ?boot.ci
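A sketch of the row-resampling approach with `sample(row.names(d), replace = TRUE)`, applied to a difference between two group means. The data frame `d`, the forest type labels and all numbers are invented; only the column name moss.shoot is taken from the slides.

```r
set.seed(8)  # arbitrary seed
d <- data.frame(
  forest     = rep(c("old", "young"), each = 25),  # hypothetical forest types
  moss.shoot = c(rnorm(25, mean = 32, sd = 6), rnorm(25, mean = 22, sd = 6))
)

# Draw rows with replacement, then recompute the difference in means
boot.difference <- replicate(2000, {
  rows <- sample(row.names(d), replace = TRUE)
  b <- d[rows, ]
  mean(b$moss.shoot[b$forest == "old"]) - mean(b$moss.shoot[b$forest == "young"])
})

# Percentile interval: if it does not include 0, the groups differ
ci <- quantile(boot.difference, c(0.025, 0.975))
```

`hist(boot.difference)` gives the kind of histogram shown for the moss shoots. The `boot` package wraps the same idea in `boot()` and `boot.ci()`.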

  40. Binarize your response • If all other efforts suck: • Binarize your response • Nothing vs Something • Above the median vs Below the median • bin.y <- ifelse(y < median(y), 0, 1) • bin.y <- factor(bin.y) → Then do a logistic regression, 2 × 2, or a generalized linear model
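A sketch of the binarizing recipe followed by a logistic regression. The skewed response is simulated and its relationship to x is an assumption made for illustration.

```r
set.seed(9)  # arbitrary seed
x <- runif(200, 0, 10)
y <- rexp(200, rate = 1 / (1 + x))  # skewed response whose mean rises with x

# Below the median = 0, at or above the median = 1
bin.y <- ifelse(y < median(y), 0, 1)
bin.y <- factor(bin.y)

# Logistic regression: probability of being above the median against x
m.bin <- glm(bin.y ~ x, family = binomial)
slope <- coef(m.bin)["x"]
```

With a categoric explanatory variable, `table(bin.y, group)` gives the 2 × 2 table instead, where `group` stands for whatever grouping factor you have.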

  41. Friday Morning 08.00

  42. Friday Lunch 10.30

  43. Friday Afternoon 14.00

  44. Mail me and your "opponent"! • Your handout/abstract • Before 14.30. • One page. Uno. Odjin. • Mail me your Powerpoint before 17.00 or bring it on USB memory stick. • Compress images to reduce size!

  45. Max 15 min presentation Before you make your powerpoint: Watch this film: http://www.davidairey.com/how-not-to-use-powerpoint/

  46. Mail me your data! • excel file • Help option booking list

  47. Computer exercise • Use your own data (if cont resp!). • Or old data. • Use either a continuous or categorical explanatory. • Possible also for many explanatories? • Non-parametric → Well, usually not • Permutation → Yes, but hard • Bootstrap → Yes, easy • Binarizing → Yes, easy

  48. Exam • Read Learning goals • Read book in relation to learning goals • E.g., no GAM, Survival, Bayesian • Check lecture powerpoints in relation to learning goals • Practice on understanding the exercises (they ARE in the learning goals)

  49. Lunch? or
