Modeling Wim Buysse RUFORUM 1 December 2006

Presentation Transcript


  1. Modeling. Wim Buysse, RUFORUM, 1 December 2006. Research Methods Group

  2. Part 1. General Linear Models

  3. General Linear Models Dataset from

  4. General Linear Models Dataset from p. 89 - 95

  5. General Linear Models Effects of three levels of sorbic acid (Sorbic) and six levels of water activity (Water) on survival of Salmonella typhimurium (Density). Density = log(density/ml)

  6. General Linear Models ANOVA approach

  7. General Linear Models Results

  8. General Linear Models The same data, but each treatment is presented as a ‘dummy variable’. (Warning: for educational purposes only.)

  9. General Linear Models Regression with a first independent variable.

  10. General Linear Models We add a second independent variable.

  11. General Linear Models We add a third one.

  12. General Linear Models We add a fourth one.

  13. General Linear Models We continue to construct the model.

  14. General Linear Models Finally, the results.

  15. General Linear Models Comparison of the two approaches.

  16. General Linear Models
     • Comparison of the two approaches:
       • They give the same results (in terms of SS).
       • The approach to choose depends on what you want to know.
       • The regression approach still works when the ANOVA approach is no longer possible (for instance when there are missing values).
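
The claim that both approaches give the same sums of squares can be checked numerically. The sketch below does not use the Salmonella dataset from the slides; it assumes a small hypothetical one-way layout (three groups, three observations each) and hypothetical names throughout. It computes the treatment SS from the group means (ANOVA approach) and then the regression SS from a least-squares fit on 0/1 dummy variables (regression approach).

```python
# Hypothetical data (NOT the dataset from the slides): three groups of three.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [5.0, 7.0, 9.0],
}


def mean(values):
    return sum(values) / len(values)


# ANOVA approach: treatment SS from the group means.
all_y = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_y)
ss_treatment_anova = sum(
    len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values()
)

# Regression approach: intercept plus one 0/1 dummy per non-reference group.
labels = list(groups)
others = labels[1:]  # the first group acts as the reference level
X, y = [], []
for label, ys in groups.items():
    for value in ys:
        X.append([1.0] + [1.0 if label == other else 0.0 for other in others])
        y.append(value)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Normal equations (X'X) beta = X'y give the least-squares coefficients.
k = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)

fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
ss_regression = sum((f - grand_mean) ** 2 for f in fitted)

print(ss_treatment_anova, ss_regression)
```

With this coding the intercept is the mean of the reference group and each dummy coefficient is the difference between a group mean and the reference mean, which is why the fitted values, and hence the SS, match the ANOVA calculation.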

  17. Example: modelling approach with normally distributed data. Protocol and dataset.

  18. Example: modelling approach with normally distributed data. Data: screening of suitable species for three-year fallow (file: Fallow N.xls). Protocol: p. 13

  19. Example: modelling approach with normally distributed data. The analysis approach is described in chapter 19 of ‘Good statistical practice for natural resources research’

  20. Modelling approach: general
     • 5 steps:
       • (Visual) exploration to discover trends and relationships
       • Choose a possible model, based on:
         • The trend you see
         • Knowledge of the experimental design
         • Biological/scientific knowledge of the process
       • Fitting = estimation of parameters
       • Check = assessing the ‘fit’
       • Interpretation to answer the objectives

  21. Expanding the model
     • ANOVA and regression
       • Same calculations
       • Data
         • = pattern + noise
         • = systematic component + random component
       • Same assumptions
         • Systematic components are additive
         • Variability of the groups is similar
         • The random component is (rather) normally distributed. The random variability of “y” around the systematic component is not affected by this systematic component.

  22. GENERAL LINEAR MODELS

  23. GENERAL LINEAR MODELS

  24. GENERAL LINEAR MODELS Data = pattern + noise
     Pattern: is explained by a linear combination of the independent variables (Data ≈ N(m,v) and the variance is rather constant across the different groups)
     Noise: N(0,1) and the variance is rather constant across the different groups

  25. Expanding the model
     • If the data are not normally distributed or if the variance of the different groups is not similar:
       • Possible approach = transformation of the data = « linearising » the model
       • Problems:
         • You no longer work on a scale that has a biological meaning.
         • Standard errors can no longer be back-transformed to the original scale.
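
A small numerical illustration of why the transformed scale is awkward to interpret, assuming hypothetical data and a log10 transformation (neither comes from the slides): the mean computed on the log scale and then back-transformed is the geometric mean, which is not the ordinary mean of the original data.

```python
import math

# Hypothetical measurements on the original scale (assumption, for illustration).
data = [1.0, 10.0, 100.0]

# Ordinary mean on the original scale.
arithmetic_mean = sum(data) / len(data)

# Mean on the log10 scale, then back-transformed to the original scale.
mean_of_logs = sum(math.log10(x) for x in data) / len(data)
back_transformed_mean = 10 ** mean_of_logs

print(arithmetic_mean, back_transformed_mean)
```

Here the back-transformed value is the geometric mean of the data, well below the arithmetic mean, so a summary computed on the transformed scale does not translate directly into the quantity of biological interest.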

  26. Expanding the model Better solution: GENERAL LINEAR MODELS => GENERALIZED LINEAR MODELS
     • Fewer restrictions; two essential differences:
       • Data can be distributed according to the exponential family of distributions = Normal, Binomial, Poisson, Gamma, Negative binomial
       • Link function: E(Y) itself no longer has to be a linear combination of the independent variables. Instead, a function of E(Y) can be a linear combination of the independent variables. (We don’t transform the dependent variable but include the transformation in the model.)
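
The difference between the two model classes can be written compactly. The notation below is introduced here for illustration and does not appear on the slides: \mu_i = E(Y_i) is the mean of observation i, x_{ij} are the p independent variables, \beta_j the coefficients, \eta_i the linear combination and g the link function. The three links shown are the usual default choices for Normal, Binomial and Poisson data.

```latex
% General linear model: the mean itself is the linear combination.
\[
  E(Y_i) = \mu_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
\]
% Generalized linear model: a function g of the mean is the linear combination.
\[
  g(\mu_i) = \eta_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}
\]
% Common link functions: identity (Normal), logit (Binomial), log (Poisson).
\[
  g(\mu) = \mu, \qquad
  g(\mu) = \log\frac{\mu}{1-\mu}, \qquad
  g(\mu) = \log \mu
\]
```

With the identity link and a Normal distribution, the generalized linear model reduces to the general linear model of Part 1.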

  27. Expanding the model Better solution: GENERAL LINEAR MODELS => GENERALIZED LINEAR MODELS
     • Also:
       • The systematic component (linear combination of independent variables) can include both continuous and categorical variables, and even polynomials
     • But still:
       • The variance is constant across the different groups (or has become constant because of the transformation through the link function)

  28. Generalised linear models Statistical theory is more difficult, but the menus in GenStat and the way you interpret the output are very similar to what we know from ANOVA and regression.

  30. Example 1. Logistic regression Example: cardiovascular disease according to age (file: age and chd.xls)

  31. Example 1. Logistic regression Example: same data but according to age group

  32. Example 1. Logistic regression Example: linear regression is not an appropriate model, and the predictions at the extremes will not be correct

  33. Example 1. Logistic regression Example: χ2 test: limited information

  34. Example 1. Logistic regression
     • Bernoulli process: an (independent) event that can have two possible outcomes (1 – 0, success-failure, …), with a given probability of success
       • Tossing a coin: head or tail; p = 0.5
       • Rolling a 6 with a die (success) compared to rolling any other number; p = 1/6
       • Conducting a survey: is the head of the household male or female? Calculate p from the proportion found in the collected data
       • Screening for cardiovascular disease: p(disease) = 43 out of 100 individuals = 0.43

  35. Example 1. Logistic regression • In GenStat

  36. Example 1. Logistic regression • Logistic function

  37. Example 1. Logistic regression
     • Logistic function
       • Sigmoid form
       • Linear in the middle
       • The probability is restricted between 0 and 1
       • Small values: flattens towards 0; large values: flattens towards 1
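
The properties listed for the logistic function can be verified with a few lines of code. This is a minimal sketch using the standard form 1 / (1 + exp(-eta)); the function name `logistic` and the argument name `eta` are chosen here for illustration.

```python
import math


def logistic(eta):
    """Map a value on the linear-predictor scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Evaluate at a few points to see the sigmoid shape.
for eta in (-10, -2, 0, 2, 10):
    print(eta, round(logistic(eta), 4))
```

The output stays strictly between 0 and 1, equals 0.5 at eta = 0, and flattens towards 0 and 1 at the extremes, while being close to a straight line around the middle.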

  38. Example 1. Logistic regression • GenStat output • Similar, but ‘deviance’ instead of ‘variance’ and χ2 test instead of F test

  39. Example 1. Logistic regression • GenStat output • model • Logit(CHD) = -5.31 + 0.1109 AGE

  40. Example 1. Logistic regression • Logit(CHD) = -5.31 + 0.1109 AGE
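
To read the fitted model as probabilities, the logit has to be inverted. The sketch below takes the two coefficients from the GenStat output quoted on the slides; the function and variable names are hypothetical, and the ages printed are arbitrary illustration values rather than results reported in the presentation.

```python
import math

# Coefficients of the fitted model: Logit(CHD) = -5.31 + 0.1109 AGE.
INTERCEPT = -5.31
SLOPE = 0.1109


def chd_probability(age):
    """Predicted probability of CHD at a given age, by inverting the logit."""
    logit = INTERCEPT + SLOPE * age
    return 1.0 / (1.0 + math.exp(-logit))


# Age at which the predicted probability is exactly 0.5 (logit = 0).
age_at_half = -INTERCEPT / SLOPE

# Multiplicative change in the odds of CHD per additional year of age.
odds_ratio_per_year = math.exp(SLOPE)

for age in (30, 50, 70):
    print(age, round(chd_probability(age), 3))
print(round(age_at_half, 1), round(odds_ratio_per_year, 3))
```

Under this model the predicted probability crosses 0.5 at roughly 48 years, and each additional year of age multiplies the odds of CHD by about 1.12.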

  41. Example 1. Logistic regression

  42. Example 1. Logistic regression • Binomial distribution: when we repeat the Bernoulli process, the order of success or failure can change • Example: head of household in a survey

  43. Example 1. Logistic regression • Calculation of probabilities if success = female-headed household with p = 0.2

  44. Example 1. Logistic regression
     • Calculated probabilities for obtaining success
     • We can now construct a frequency distribution of obtaining success
     • Probability = long-run frequency = frequency when there are very many data
     • = binomial distribution
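
The calculation behind these slides can be sketched as follows. The success probability p = 0.2 (female-headed household) comes from the slides; the number of households surveyed, n = 5, is an assumption made here for illustration because the transcript does not state it. The binomial coefficient counts the different orders in which the successes can occur.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = 0.2  # probability of a female-headed household (from the slides)
n = 5    # number of households surveyed (hypothetical, not from the slides)

# Probability of 0, 1, ..., n successes: the binomial distribution.
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(distribution):
    print(k, round(prob, 5))
```

The probabilities over all possible counts sum to 1, and with these values one success in five households is the most likely outcome.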

  45. Example 1. Logistic regression
     • Binomial distribution
     • Counts of a categorical variable
     • Example: experiment on survival of trees from different provenances
     • File: survival trees.xls

  46. Example 1. Logistic regression • Several approaches possible 1

  47. Example 1. Logistic regression • Several approaches possible 1

  48. Example 1. Logistic regression • Several approaches possible 2

  49. Example 1. Logistic regression • Several approaches possible 2

  50. Example 1. Logistic regression • Several approaches possible 3
