Review of last week


Variables
• Response variable – y
• Explanatory variable – x
• [today one of each]
• Continuous variables
• Categorical variables (binary…)

Response variable    Explanatory variable: continuous    Explanatory variable: categoric
Categoric            Logistic regression                 2 × 2 tables
Continuous           Regression                          Anova

[Figures: probability of choosing Melica rather than Luzula against ant size; seed size for red ants and black ants]

One continuous response variable & one or more explanatory variables: general linear model

Response variable    Explanatory variable: continuous    Explanatory variable: categoric
Continuous           Regression                          Anova

[Figure: seed size for red ants and black ants]

General linear models with:
• Many continuous explanatories are usually called multiple regression.
• Many categorical explanatories are usually called multiway ANOVA.
• One continuous explanatory and one (or sometimes many) categorical explanatories are usually called ANCOVA.
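The three names above correspond to the same lm() call with different kinds of explanatory variables. The sketch below only illustrates this: the data are simulated and every variable name (x1, x2, g1, g2, y) is made up for the example.

```r
# Hypothetical, simulated data: names and values are invented for illustration.
set.seed(1)
n <- 40
d <- data.frame(
  x1 = runif(n),                                  # continuous explanatory
  x2 = runif(n),                                  # continuous explanatory
  g1 = factor(rep(c("a", "b"), each = n / 2)),    # categorical explanatory
  g2 = factor(rep(c("u", "v"), times = n / 2))    # categorical explanatory
)
d$y <- 1 + 2 * d$x1 + rnorm(n)                    # continuous response

# Many continuous explanatories: multiple regression
multiple.regression <- lm(y ~ x1 + x2, data = d)

# Many categorical explanatories: multiway ANOVA (here with interaction)
multiway.anova <- lm(y ~ g1 * g2, data = d)

# One continuous and one categorical explanatory: ANCOVA
ancova <- lm(y ~ x1 + g1, data = d)

anova(ancova)
```

All three are fitted and tested with the same tools; only the formula changes.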

Test tools
• t-test
• F-test (anova)
• Chi-2

Mail me the data, please.

Anova table on pollination
Anova(lm(Seed.number~colour*poll.treat))
            Sum Sq Df F value    Pr(>F)
colour      3122.3  1 19.8440 2.846e-05 ***
poll.treat  1693.1  1 10.7604  0.001567 **
col:pol      829.8  1  5.2737  0.024406 *
Residuals  11958.0 76

Assumptions for parametric tests with a continuous response, i.e., also linear models!!
• About the same variation in all groups, or along a continuous variable, or along fitted values
• Pretty normal residuals (= noise)

About the same variation?
[Figure: Forest and Meadow compared]

Pretty normal residuals
[Figures: histogram of the response variable (no. of species against seed size in mm, Forest and Meadow); histogram of residuals (no. of species against distance in mm from the respective group mean)]

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    2    3    4    5    6    7
[2,]    3    4    5    6    7    8
[3,]    4    5    6    7    8    9
[4,]    5    6    7    8    9   10
[5,]    6    7    8    9   10   11
[6,]    7    8    9   10   11   12

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    5    6
[2,]    2    4    6    8   10   12
[3,]    3    6    9   12   15   18
[4,]    4    8   12   16   20   24
[5,]    5   10   15   20   25   30
[6,]    6   12   18   24   30   36
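The two tables above look like an additive and a multiplicative combination of the numbers 1 to 6. Assuming that reading, a minimal sketch of how they can be produced, and of why a log transformation turns the multiplicative table into an additive one:

```r
a <- 1:6

# Additive table: entry [i, j] is i + j
additive <- outer(a, a, "+")

# Multiplicative table: entry [i, j] is i * j
multiplicative <- outer(a, a, "*")

# Moving one column to the right adds the same amount in every row
# of the additive table ...
diff(additive[1, ])
diff(additive[6, ])

# ... but not in the multiplicative table
diff(multiplicative[1, ])
diff(multiplicative[6, ])

# On the log scale the multiplicative table becomes additive:
# log(i * j) = log(i) + log(j)
all.equal(log(multiplicative), outer(log(a), log(a), "+"))
```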


Plus effect or percent effect
• Plus effect
  From 0-5: 100 + 100 = 200
  From 5-10: 200 + 100 = 300
• Percent effect
  From 0-5: 80 × 2.5 = 200
  From 5-10: 200 × 2.5 = 500
[Figure: no. of aphids (y) against weeks (x)]

Non transformed / Log transformed
[Figures: no. of aphids against weeks, with a non-transformed and a log-transformed y-axis]
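The aphid numbers from the plus effect and percent effect example can be reproduced in a few lines. This is only a sketch of the arithmetic on the slide; the variable names are made up here.

```r
weeks <- c(0, 5, 10)

# Plus effect: start at 100 and add 100 every 5 weeks
plus.effect <- 100 + 100 * weeks / 5

# Percent effect: start at 80 and multiply by 2.5 every 5 weeks
percent.effect <- 80 * 2.5^(weeks / 5)

# Steps between time points on the original scale:
diff(plus.effect)       # equal steps
diff(percent.effect)    # growing steps

# Steps on the log scale: the percent effect now has equal steps,
# which is why it becomes a straight line on a log-transformed y-axis
diff(log10(percent.effect))
```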

Plus effect or percent per percent
• Plus effect
  From 0-5: 100 + 100 = 200
  From 5-10: 200 + 100 = 300
• Percent per percent
  From 2.5 to 5 = 200%: 100 × 200% = 200
  From 5 to 10 = 200%: 200 × 200% = 400
[Figure: seed weight in μg (y) against leaf length in cm (x)]

Non transformed / Log-log transformed
[Figures: seed weight in μg against leaf length in cm, with non-transformed axes and with both axes log-transformed]
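For the percent per percent case, the slide's numbers can be checked the same way. A minimal sketch, with made-up variable names: when both axes are logged, equal percent steps in x give equal percent steps in y, so the slope on the log-log scale is constant.

```r
leaf.length <- c(2.5, 5, 10)    # cm, doubles at each step
seed.weight <- c(100, 200, 400) # micrograms, also doubles at each step

# Ratios between consecutive points: 200% in both variables
leaf.length[-1] / leaf.length[-3]
seed.weight[-1] / seed.weight[-3]

# Slope on the log-log scale: change in log(y) per change in log(x)
slope <- diff(log10(seed.weight)) / diff(log10(leaf.length))
slope
```

In this particular example the log-log slope is 1, meaning seed weight is proportional to leaf length; other percent per percent relationships give other constant slopes.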

A lichen size study

5 possible models
• Lichen size only depends on the total mean.
• Lichen size depends on the site where the lichen grows (city vs university).
• Lichen size depends on the tree size (≈ age?).
• Lichen size depends both on site AND tree size.
• Lichen size depends on tree size, but the relationship between tree size and lichen size differs between the sites (city / univ).

Check your data import
> names(d)
[1] "tree.circum" "lich.diam"   "tree.spec"   "site"
> is.numeric(lich.diam)
[1] TRUE
> is.numeric(tree.circum)
[1] TRUE
> levels(site)
[1] "city" "uni"

Check your data import
> names(d)
[1] "tree.circum" "lich.diam"   "tree.spec"   "site"
> is.numeric(lich.diam)
[1] TRUE
> is.numeric(tree.circum)
[1] FALSE
> levels(site)
[1] "city" "ciyt" "uni"
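The second import check shows two typical problems: a column that should be numeric is not, and a misspelled factor level ("ciyt"). One way to repair both inside R is sketched below. The data frame is a made-up stand-in for the real file, and the decimal comma is only an assumed cause of the non-numeric column; the slides do not say what went wrong.

```r
# Hypothetical stand-in for the imported data; all values are invented.
d <- data.frame(
  tree.circum = c("120", "95,5", "210", "88"),  # read as text (assumed cause: decimal comma)
  lich.diam   = c(12.1, 8.4, 20.3, 7.9),
  site        = c("city", "ciyt", "uni", "uni"),
  stringsAsFactors = TRUE
)

is.numeric(d$tree.circum)
levels(d$site)

# Merge the misspelled level into the correct one
levels(d$site)[levels(d$site) == "ciyt"] <- "city"

# Turn the text column into numbers, replacing the decimal comma first
d$tree.circum <- as.numeric(sub(",", ".", as.character(d$tree.circum), fixed = TRUE))

# Check again
is.numeric(d$tree.circum)
levels(d$site)
```

Correcting the original data file and importing again is often the safer fix, since the correction then stays with the data.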

Assumption plots

Should you log your lichen sizes?
• Does it look so bad that your test may be incorrect?
• Does a log transformation improve the model assumptions?
  – Constant variation is most important.
• Does it make biological sense that the explanatory variables affect the percent increase in lichen size rather than the increase in mm?

Assumption plots on logged lichen sizes

Should you log your lichen sizes?
• Does it look so bad that your test may be incorrect?
  – Naa, probably not.
• Does a log transformation improve the model assumptions?
  – YES!
• Does it make biological sense with a percent increase?
  – Well, I guess so.
• OK, let’s use the logged values!
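The assumption plots behind these answers are not reproduced in this transcript. A rough sketch of how such plots could be drawn, before and after logging, is given below. The data are simulated with multiplicative noise purely for illustration; they are not the lichen data, and the plots used in the lecture may have been made differently.

```r
# Simulated stand-in data (NOT the real lichen measurements)
set.seed(42)
n <- 60
tree.circum <- runif(n, 50, 250)
site <- factor(rep(c("city", "uni"), each = n / 2))
lich.diam <- 10^(1 + 0.002 * tree.circum + 0.1 * (site == "uni") +
                 rnorm(n, sd = 0.25))

raw.mod <- lm(lich.diam ~ tree.circum + site)
log.mod <- lm(log10(lich.diam) ~ tree.circum + site)

op <- par(mfrow = c(2, 2))
# About the same variation along the fitted values?
plot(fitted(raw.mod), resid(raw.mod), main = "Raw: residuals vs fitted")
# Pretty normal residuals?
hist(resid(raw.mod), main = "Raw: residuals")
plot(fitted(log.mod), resid(log.mod), main = "Logged: residuals vs fitted")
hist(resid(log.mod), main = "Logged: residuals")
par(op)
```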

Logged lichen size

The lichen size study


5 possible models
• Log lichen size only depends on the total mean.
• Log lichen size depends on the site where the lichen grows (city vs university).
• Log lichen size depends on the tree size (≈ age?).
• Log lichen size depends both on site AND tree size.
• Log lichen size depends on tree size, but the relationship between tree size and log lichen size differs between the sites (city / univ).

Models
log.lich.diam <- log10(lich.diam)
log.mod.int <- lm(log.lich.diam ~ tree.circum + site + tree.circum:site)
log.mod.both <- lm(log.lich.diam ~ tree.circum + site)
log.mod.tree.circum <- lm(log.lich.diam ~ tree.circum)
log.mod.site <- lm(log.lich.diam ~ site)
log.mod.null <- lm(log.lich.diam ~ 1)

Anova table on logged lichens
Anova(lm(log.lich.diam~tree.circum+site+tree.circum:site)) = Anova(log.mod.int)
Response: log.lich.diam
                 Sum Sq Df F value   Pr(>F)
tree.circum      0.5808  1  9.7584 0.002826 **
site             0.5431  1  9.1238 0.003797 **
tree.circum:site 0.0047  1  0.0784 0.780444
Residuals        3.3332 56

Model comparison

Test interaction!
anova(log.mod.int,log.mod.both)
Model 1: log.lich.diam ~ tree.circum + site + tree.circum:site
Model 2: log.lich.diam ~ tree.circum + site
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     56 3.3332
2     57 3.3378 -1   -0.0047 0.0784 0.7804

Test site!
anova(log.mod.both,log.mod.tree.circum)
Model 1: log.lich.diam ~ tree.circum + site
Model 2: log.lich.diam ~ tree.circum
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     57 3.3378
2     58 3.8809 -1   -0.5431 9.2737 0.003516 **

Test tree circumference!
anova(log.mod.both,log.mod.site)
Model 1: log.lich.diam ~ tree.circum + site
Model 2: log.lich.diam ~ site
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     57 3.3378
2     58 3.9187 -1   -0.5808 9.9188 0.002605 **
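Each of the three comparisons above is an F-test on the change in residual sum of squares (RSS) when one term is dropped. As a rough check, the F and p values can be recomputed by hand from the Res.Df and RSS columns. The helper f.test below is a made-up name for this sketch, not part of R, and because the printed RSS values are rounded the results only approximately match the tables.

```r
# Hypothetical helper: F-test for two nested models from their RSS and
# residual degrees of freedom ("small" = the model with the term dropped).
f.test <- function(rss.small, df.small, rss.big, df.big) {
  df.term <- df.small - df.big
  f <- ((rss.small - rss.big) / df.term) / (rss.big / df.big)
  p <- pf(f, df.term, df.big, lower.tail = FALSE)
  c(F = f, p = p)
}

# Numbers copied from the three model comparisons above
interaction.test <- f.test(3.3378, 57, 3.3332, 56)
site.test        <- f.test(3.8809, 58, 3.3378, 57)
tree.test        <- f.test(3.9187, 58, 3.3378, 57)

round(rbind(interaction.test, site.test, tree.test), 4)
```

The site and tree circumference tests land close to the reported F values of about 9.27 and 9.92, and the interaction test stays far from significance.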

Conclusion:
• Log lichen size depends both on site AND tree size.
• Lichens are larger at the University than in the city (p = 0.0035 given the effect of tree size).
• Lichen size decreases with increasing tree size (p = 0.0026 given the effect of site).
