Blocks and pseudoreplication
This lecture will cover: • Blocks • Experimental units (replicates) • Pseudoreplication • Degrees of freedom
Good options for increasing sample size: • More replicates • More blocks
False options for increasing sample size: • More “repeated measurements” • Pseudoreplication
Ecological rule #1: the world is not uniform! [Figure: a field containing a poor patch, a medium patch and a good patch]
3 options in assigning treatments: • Randomly assign • Systematic • Randomized block
1. Randomly assign • Pro: statistically robust • Con: with small n, there is a chance that all replicates of one treatment land in a bad patch
1. Randomly assign • What’s the chance of total spatial segregation of treatments?
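The question above can be worked out for a simple case. This is a minimal sketch, assuming two treatments with equal replication laid out along a single line of plots (the layout and the function name `prob_total_segregation` are illustrative assumptions, not part of the lecture). Total segregation then means all plots of one treatment sit on one side.

```python
from math import comb

def prob_total_segregation(n_per_treatment: int) -> float:
    """Chance that a fully random assignment puts all plots of one
    treatment on one side of a line of 2 * n_per_treatment plots."""
    total_layouts = comb(2 * n_per_treatment, n_per_treatment)
    segregated_layouts = 2  # AAA...BBB or BBB...AAA
    return segregated_layouts / total_layouts

# The risk drops quickly as replication increases.
for n in (2, 3, 4, 5):
    print(n, round(prob_total_segregation(n), 4))
```

With n = 2 per treatment the chance is 2/6, about one in three, which is why random assignment is risky at small n.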
2. Systematic • Pro: no clumping possible • Con: violates the random-assignment assumption of statistics… but is this so bad?
3. Randomized block [Figure: the field divided into Block A, Block B and Block C across the poor, medium and good patches]
3. Randomized block Notes: • You do not have to know whether patches differ in quality • All treatment combinations must be represented in each block • If you WANT to test the treatment x block interaction, you need replication within blocks
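A randomized block layout can be sketched in a few lines: every block receives every treatment once, and the order is randomized separately within each block. The function name `randomized_block_design` and the treatment labels are hypothetical, chosen only for illustration.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Return {block: treatments in random order}, with every
    treatment appearing exactly once in every block."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomize independently within each block
        layout[block] = order
    return layout

layout = randomized_block_design(["A", "B", "C"], ["+F", "-F"], seed=1)
for block, order in layout.items():
    print(block, order)
```

Because randomization happens inside each block, no treatment can end up confined to the poor patch, whether or not the patches really differ.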
How to analyze a blocked design in JMP (Method 1) 1. Basic stats > Oneway. 2. Add response variable, treatment (“grouping”) and block. 3. Click OK.
How to analyze a blocked design in JMP (Method 2) 1. Open fit model tab. Enter y-variable. 2. Add treatment, block and, if desired, treatment x block to “effects”. 3. Click on block in effects box and change attributes to random. 4. Change Method option to EMS (not REML).
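For readers without JMP, the arithmetic behind a blocked analysis can be sketched by hand. This is not JMP's implementation; it is a minimal sketch of a randomized complete block ANOVA with one value per treatment in each block (so no treatment x block term can be tested). The function name `blocked_anova` and the data values are made up for illustration.

```python
def blocked_anova(data):
    """data[block][treatment] -> one response value per cell.
    Returns sums of squares, df and the F ratio for treatment."""
    blocks = sorted(data)
    treatments = sorted(next(iter(data.values())))
    b, t = len(blocks), len(treatments)
    cells = [data[bl][tr] for bl in blocks for tr in treatments]
    grand = sum(cells) / (b * t)
    trt_mean = {tr: sum(data[bl][tr] for bl in blocks) / b for tr in treatments}
    blk_mean = {bl: sum(data[bl][tr] for tr in treatments) / t for bl in blocks}
    ss_trt = b * sum((m - grand) ** 2 for m in trt_mean.values())
    ss_blk = t * sum((m - grand) ** 2 for m in blk_mean.values())
    ss_tot = sum((y - grand) ** 2 for y in cells)
    ss_err = ss_tot - ss_trt - ss_blk  # what is left after block and treatment
    df_trt, df_blk = t - 1, b - 1
    df_err = df_trt * df_blk
    f_trt = (ss_trt / df_trt) / (ss_err / df_err)
    return {"ss_treatment": ss_trt, "ss_block": ss_blk, "ss_error": ss_err,
            "df_treatment": df_trt, "df_block": df_blk, "df_error": df_err,
            "F_treatment": f_trt}

# Hypothetical growth data: blocks A-C, fertilized (+F) vs unfertilized (-F).
example = {"A": {"+F": 12, "-F": 8},
           "B": {"+F": 15, "-F": 10},
           "C": {"+F": 20, "-F": 16}}
result = blocked_anova(example)
print(result)
```

Removing the block sum of squares from the error term is what makes the treatment test sensitive when blocks differ a lot, as they do in this made-up data.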
Good options for increasing sample size: • More replicates • More blocks
False options for increasing sample size: • More “repeated measurements” • Pseudoreplication
Experimental unit Scale at which independent applications of the same treatment occur Also called “replicate”, represented by “n” in statistics
Experimental unit Example: Effect of fertilization on caterpillar growth
Experimental unit: What is our per-treatment sample size (treatment n)? [Figure: two +F units and two −F units] n = 2
Experimental unit: What is our per-treatment sample size? [Figure: one +F unit and one −F unit] n = 1
Pseudoreplication Misidentifying the scale of the experimental unit; Assuming there are more experimental units (replicates, “n”) than there actually are
Example 1. Hypothesis: Insect abundance is higher in shallow lakes
Example 1. Experiment: Sample insect abundance every 100 m along the shoreline of a shallow and a deep lake
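Example 1. What’s the problem? Spatial autocorrelation: samples taken along the shoreline of the same lake are not independent, so there is only one shallow lake and one deep lake (n = 1 per lake type)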
Example 2. Hypothesis: Two species of plants have different growth rates
Example 2. Experiment: • Mark 10 individuals of sp. A and 10 of sp. B in a field. • Follow growth rate over time. If the researcher declares n=10, could this still be pseudoreplicated?
Example 2. [Figure: measurements plotted over time]
Temporal pseudoreplication: Multiple measurements on the SAME individual, treated as independent data points
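One common way to avoid temporal pseudoreplication is to collapse the repeated measurements into a single value per individual before analysis. The sketch below does this with a least-squares growth rate (slope of size against time); the function name `growth_rate`, the individual labels and the sizes are all hypothetical.

```python
def growth_rate(times, sizes):
    """Least-squares slope of size against time for ONE individual."""
    n = len(times)
    t_bar = sum(times) / n
    s_bar = sum(sizes) / n
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, sizes))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx

times = [0, 1, 2, 3]
# Made-up repeated measurements: four sizes per marked plant.
measurements = {
    "A1": [1.0, 2.0, 3.0, 4.0],
    "A2": [2.0, 2.5, 3.0, 3.5],
    "B1": [1.0, 1.2, 1.4, 1.6],
}
rates = {plant: growth_rate(times, sizes) for plant, sizes in measurements.items()}
print(rates)
print("data points for the analysis:", len(rates))  # one per individual, not 12
```

The analysis then uses one growth rate per plant, so n is the number of individuals rather than the number of measurements.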
Spotting pseudoreplication • Inspect spatial (temporal) layout of the experiment • Examine degrees of freedom in analysis
Degrees of freedom (df): Number of independent terms used to estimate the parameter = total number of data points − number of parameters estimated from the data
Example: Variance If we have 3 data points with a mean value of 10, what’s the df for the variance estimate? Independent term method: • Can the first data point be any number? Yes, say 8 • Can the second data point be any number? Yes, say 12 • Can the third data point be any number? No, as the mean is fixed it must be 10! Variance is $s^2 = \sum (y_i - \bar{y})^2 / (n-1)$
Example: Variance Independent term method: therefore 2 independent terms (df = 2)
Example: Variance Subtraction method: • Total number of data points? 3 • Number of estimates from the data? 1 (the mean) • df = 3 − 1 = 2
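The variance example can be checked numerically. In this short sketch the first two values (8 and 12) are free choices taken from the slide, and the third is forced by the fixed mean; the variable names are illustrative.

```python
from statistics import variance

n, fixed_mean = 3, 10
free_values = [8, 12]  # the two values we were free to choose

# The third value is not free: the total must equal n * mean.
forced_value = n * fixed_mean - sum(free_values)
data = free_values + [forced_value]

df = n - 1  # 3 data points minus 1 estimated parameter (the mean)
by_hand = sum((y - fixed_mean) ** 2 for y in data) / df

print("forced third value:", forced_value)
print("df:", df)
print("variance by hand:", by_hand, "| statistics.variance:", variance(data))
```

Both methods on the slides agree: two free terms, or 3 data points minus 1 estimated mean.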
Example: Linear regression Y = mx + b • 2 parameters (slope m and intercept b) are estimated simultaneously from the data, therefore df = n − 2
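The same counting applies to the residual variance of a regression. A minimal sketch with made-up data: fit m and b by least squares, then divide the residual sum of squares by n − 2 because two parameters were estimated.

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates of the two parameters in Y = mx + b.
m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b = y_bar - m * x_bar

residual_ss = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
residual_df = n - 2  # n data points minus 2 estimated parameters
residual_ms = residual_ss / residual_df

print("slope:", m, "intercept:", b)
print("residual df:", residual_df, "residual mean square:", residual_ms)
```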
Example: Analysis of variance (ANOVA) What is n for each level?
A    B    C
a1   b1   c1
a2   b2   c2
a3   b3   c3
a4   b4   c4
Example: Analysis of variance (ANOVA) How many df for each variance estimate? n = 4 for each level, so df = 3 for A, df = 3 for B and df = 3 for C
Example: Analysis of variance (ANOVA) What’s the within-treatment df for an ANOVA? Within-treatment df = 3 + 3 + 3 = 9
Example: Analysis of variance (ANOVA) If an ANOVA has k levels and n data points per level, what’s a simple formula for within-treatment df? df = k(n-1)
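The within-treatment df can be counted directly from the table and compared with the k(n-1) formula. A short sketch; the function name `within_treatment_df` is an illustrative choice, and the labels a1…c4 stand in for data values.

```python
def within_treatment_df(groups):
    """Sum of (n - 1) over all treatment levels."""
    return sum(len(values) - 1 for values in groups.values())

groups = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3", "c4"],
}

k = len(groups)          # number of levels
n = len(groups["A"])     # data points per level (balanced design)

print("counted:", within_treatment_df(groups))
print("formula k(n-1):", k * (n - 1))
```

Summing per level also works for unbalanced designs, where the simple formula does not apply.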
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA (within-treatment MS). Is there pseudoreplication?
Spotting pseudoreplication An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot. The researcher reports df=98 for the ANOVA. Is there pseudoreplication? Yes! The plot is the experimental unit, so k=2 and n=10, giving df = 2(10-1) = 18
Spotting pseudoreplication What mistake did the researcher make? The researcher treated each plant as a replicate and assumed n=50: 2(50-1) = 98
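The df check on these slides can be written as a small helper. This is a sketch under the slide's assumptions (balanced design, treatment applied at the plot level); the function names and the wording of the returned messages are hypothetical.

```python
def within_treatment_df(k, n):
    """Within-treatment df for k levels with n replicates each."""
    return k * (n - 1)

def diagnose(reported_df, k, units_per_treatment, subsamples_per_unit):
    """Compare a reported df with the df implied by the true
    experimental units and by the subsamples."""
    correct = within_treatment_df(k, units_per_treatment)
    inflated = within_treatment_df(k, units_per_treatment * subsamples_per_unit)
    if reported_df == correct:
        return "consistent with the experimental unit"
    if reported_df == inflated:
        return "pseudoreplication: subsamples counted as replicates"
    return "df matches neither count; check the design"

# 2 treatments, 10 plots each, 5 plants per plot, reported df = 98.
print(within_treatment_df(2, 10))      # plots as replicates
print(within_treatment_df(2, 10 * 5))  # plants as replicates
print(diagnose(98, 2, 10, 5))
```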
Why is pseudoreplication a problem? Hint: think about what we use df for!
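One way to explore the hint is a small simulation of the fertilizer design (10 plots per treatment, 5 plants per plot) with NO true treatment effect. This is a sketch under stated assumptions: plants in the same plot share a random plot effect, both standard deviations are set to 1 purely for illustration, and the critical t values are the standard two-sided 5% table values for df = 18 and df = 98.

```python
import random
from statistics import mean, variance

T_CRIT_DF18 = 2.101  # two-sided 5% critical t, df = 18 (plots as replicates)
T_CRIT_DF98 = 1.984  # two-sided 5% critical t, df = 98 (plants as replicates)

def t_statistic(a, b):
    """Pooled two-sample t statistic for two equal-sized groups."""
    n = len(a)
    pooled_var = (variance(a) + variance(b)) / 2
    return (mean(a) - mean(b)) / (2 * pooled_var / n) ** 0.5

def false_positive_rates(n_sims=1000, plots=10, plants=5, seed=0):
    rng = random.Random(seed)
    hits_plants = hits_plots = 0
    for _ in range(n_sims):
        groups = []
        for _treatment in range(2):  # both treatments have the same true mean
            plot_data = []
            for _plot in range(plots):
                plot_effect = rng.gauss(0, 1)  # shared by all plants in a plot
                plot_data.append(
                    [plot_effect + rng.gauss(0, 1) for _ in range(plants)]
                )
            groups.append(plot_data)
        # Pseudoreplicated analysis: every plant treated as a replicate.
        plants_a = [y for plot in groups[0] for y in plot]
        plants_b = [y for plot in groups[1] for y in plot]
        if abs(t_statistic(plants_a, plants_b)) > T_CRIT_DF98:
            hits_plants += 1
        # Analysis at the scale of the experimental unit: plot means.
        means_a = [mean(plot) for plot in groups[0]]
        means_b = [mean(plot) for plot in groups[1]]
        if abs(t_statistic(means_a, means_b)) > T_CRIT_DF18:
            hits_plots += 1
    return hits_plants / n_sims, hits_plots / n_sims

rate_plants, rate_plots = false_positive_rates()
print("false-positive rate, plants as replicates:", rate_plants)
print("false-positive rate, plots as replicates:", rate_plots)
```

Inflated df makes the standard error look smaller and the critical value lower than they should be, so a test that claims a 5% error rate rejects a true null hypothesis far more often.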
How prevalent? • Hurlbert (1984): 48% of papers • Heffner et al. (1996): 12 to 14% of papers