Friday: Lab 3 & A3 due
• Mon Oct 1: Exam I, this room, 12 pm. Please, no computers or smartphones
• Mon Oct 1: No grad seminar. Next grad seminar: Wednesday, Oct 10
Today: Type II error & Power
Table 7.1 Generic recipe for decision making with statistics
1. State population, conditions for taking sample
2. State the model or measure of pattern (ST)
3. State null hypothesis about population (H0)
4. State alternative hypothesis (HA)
5. State tolerance for Type I error (α)
6. State frequency distribution that gives probability of outcomes when the null hypothesis is true. Choices:
   • Permutations: distributions of all possible outcomes
   • Empirical distribution obtained by random sampling of all possible outcomes when H0 is true
   • Cumulative distribution function (cdf) that applies when H0 is true. State assumptions when using a cdf such as Normal, F, t or chi-square
7. Calculate the statistic. This is the observed outcome
8. Calculate p-value for observed outcome relative to distribution of outcomes when H0 is true
9. If p is less than α, reject H0 in favour of HA. If p is greater than α, do not reject H0
10. Report statistic, p-value, sample size. Declare decision
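Steps 6 to 9 of the recipe can be sketched with the permutation option. This is a minimal illustration, not the course's own code: the two samples are hypothetical, the statistic (difference between group means) and α = 0.05 are assumptions made for the example.

```python
from itertools import combinations

def permutation_test(group_a, group_b):
    """Two-sided exact permutation test on the difference in means.

    Enumerates every way of splitting the pooled values into groups of
    the original sizes (the distribution of outcomes when H0 is true).
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = mean_diff(sum(group_a))  # step 7: the observed statistic
    n_splits = 0
    n_extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        diff = mean_diff(sum(pooled[i] for i in idx))
        # small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            n_extreme += 1
    return observed, n_extreme / n_splits  # step 8: the p-value

# Hypothetical data, for illustration only
group_a = [8, 9, 7, 10]
group_b = [5, 4, 6, 5]
alpha = 0.05  # assumed tolerance for Type I error (step 5)

statistic, p_value = permutation_test(group_a, group_b)
decision = "reject H0" if p_value < alpha else "do not reject H0"  # step 9
print(statistic, p_value, len(group_a) + len(group_b), decision)  # step 10
```

Full enumeration is only practical for small samples; for larger ones the recipe's second option (random sampling of outcomes) would replace the loop over all splits.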
Table 7.2 Key for choosing a frequency distribution (FD) of a statistic
• Statistic of the population is a mean
  • If data are normal or cluster around a central value
    • If sample size is large (n > 30): Normal distribution
    • If sample size is small (n < 30): t distribution
  • If data are Poisson: Poisson distribution
  • If data are Binomial: Binomial distribution
  • If data do not cluster around a central value, examine residuals
    • If residuals are normal or cluster around a central value
      • If sample size is large (n > 30): Normal distribution
      • If sample size is small (n < 30): t distribution
    • If residuals are not normal: Empirical distribution
• Statistic of the population is a variance
  • If data are normal or cluster around a central value: Chi-square distribution
  • If data do not cluster around a central value
    • If sample size is large (n > 30): Chi-square distribution
    • If sample size is small (n < 30): Empirical distribution
Table 7.2 Key for choosing a FD of a statistic - continued
• Statistic of the population is a ratio of 2 variances (ANOVA tables)
  • If data are normal or cluster around a central value: F distribution
  • If data do not cluster around a central value, calculate residuals
    • If residuals are normal or cluster around a central value: F distribution
    • If residuals do not cluster around a central value
      • If sample size is large (n > 30): F distribution
      • If sample size is small (n < 30): Empirical distribution
• Statistic is none of the above
  • Search the statistical literature for an appropriate distribution, or confer with a statistician
  • If it is not in the literature or cannot be found: Empirical distribution
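The key in Table 7.2 reads like a decision tree, so it can be written as a small function. This is a sketch only: the function name, argument names and string labels are hypothetical, and treating n = 30 as "small" is an assumption, since the key itself only says n > 30 and n < 30.

```python
def choose_fd(statistic, n, data_cluster, residuals_cluster=False, data_type=None):
    """Return the frequency distribution suggested by the key in Table 7.2.

    statistic         -- "mean", "variance" or "variance ratio"
    n                 -- sample size
    data_cluster      -- True if data are normal or cluster around a central value
    residuals_cluster -- True if residuals are normal or cluster around a central value
    data_type         -- "poisson" or "binomial" when the data are known counts
    """
    large = n > 30  # assumption: n = 30 is treated as small
    if statistic == "mean":
        if data_type == "poisson":
            return "Poisson"
        if data_type == "binomial":
            return "Binomial"
        if data_cluster or residuals_cluster:
            return "Normal" if large else "t"
        return "Empirical"
    if statistic == "variance":
        if data_cluster or large:
            return "Chi-square"
        return "Empirical"
    if statistic == "variance ratio":
        if data_cluster or residuals_cluster or large:
            return "F"
        return "Empirical"
    # None of the above: the key says to search the literature first
    return "Search literature, else Empirical"

print(choose_fd("mean", n=10, data_cluster=True))            # small sample mean
print(choose_fd("variance ratio", n=10, data_cluster=True))  # ratio of 2 variances
```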
Example: jackal bones - revisited
• Steps 1-5 of the generic recipe (Table 7.1)
• Step 6: choose the frequency distribution with the key (Table 7.2)
• Step 7: calculate the statistic
• Step 8: calculate p from the t distribution
Example: jackal bones - revisited
• Are your data normal?
• It really does not matter! The assumption is that the residuals follow a normal distribution
Example: jackal bones - revisited Are your residuals normal?
Example: roach survival
Data: survival time (Ts), in days, of the roach Blattella vaga when kept without food or water
• Females: n = 10, mean(Ts) = 8.5 days, var(Ts) = 3.6 days²
• Males: n = 10, mean(Ts) = 4.8 days, var(Ts) = 0.9 days²
Is the variation in survival time equal between male and female roaches?
Data from Sokal & Rohlf 1995, p 189
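Example: roach survival
• Steps 1-5 of the generic recipe (Table 7.1)
• Step 6: choose the frequency distribution with the key (Table 7.2)
• Steps 7-8: calculate the statistic and its p-value
• Steps 9-10: make the decision and report it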
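The roach example can be worked numerically from the summary values on the data slide. The sketch below assumes the statistic is the ratio of the two variances (larger over smaller), that the F distribution from the key applies, and that the test is two-tailed with α = 0.05; none of these choices is confirmed by the surviving slide text. The helper names are hypothetical, and the tail area is obtained by simple numerical integration of the F density rather than from a statistics library.

```python
import math

def f_pdf(x, d1, d2):
    """Probability density of the F distribution with d1 and d2 degrees of freedom."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    coef = math.exp((d1 / 2) * math.log(d1 / d2) - log_beta)
    return coef * x ** (d1 / 2 - 1) * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)

def f_upper_tail(f_obs, d1, d2, n_intervals=20000):
    """P(F >= f_obs) by Simpson's rule on [0, f_obs]; assumes d1 > 2 and an even n_intervals."""
    h = f_obs / n_intervals
    total = f_pdf(0.0, d1, d2) + f_pdf(f_obs, d1, d2)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

# Summary values from the data slide
var_female, n_female = 3.6, 10
var_male, n_male = 0.9, 10

f_ratio = var_female / var_male               # larger variance on top
df1, df2 = n_female - 1, n_male - 1           # 9 and 9 degrees of freedom
p_upper = f_upper_tail(f_ratio, df1, df2)     # one-tailed area
p_two_tailed = min(1.0, 2 * p_upper)          # assumption: two-tailed test

alpha = 0.05  # assumed tolerance for Type I error
decision = "reject H0" if p_two_tailed < alpha else "do not reject H0"
print(f_ratio, p_upper, p_two_tailed, decision)
```

Under these assumptions the two-tailed p-value falls just above 0.05, so the decision is sensitive to the choice of α and of a one- or two-tailed alternative.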
Parameters
• Formal models (equations) consist of variable quantities and parameters
• Parameters have a fixed value in a particular situation
• Parameters are found in
  • functional expressions of causal relations
  • statistical or empirical functions
  • theoretical frequency distributions
• Parameters are obtained from data by estimation
Parameters - examples
• Functional relationship. Scallop density
  • $M_{scal} = k_1$ if R = 5 or 6
  • $M_{scal} = k_2$ if R is not equal to 5 or 6
• $M_{scal}$ = kg caught per unit area of seafloor
• R = sediment roughness from 1 (sand) to 100 (cobble)
• k = mean scallop catch
Slide colour key: red for parameters, blue for variables
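Parameters - examples
• Statistical relationship. Morphoedaphic equation
  • $M_{fish} = 1.38\,\mathrm{MEI}^{0.4661}$
• $M_{fish}$ = fish caught per year from lake, in kg ha$^{-1}$ yr$^{-1}$
• MEI = dissolved organics / lake depth, in ppm m$^{-1}$
• Parameters:
  • 0.4661 (exponent, dimensionless)
  • 1.38 kg ha$^{-1}$ yr$^{-1}$ ppm$^{-0.4661}$ m$^{0.4661}$
Slide colour key: red for parameters, blue for variables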
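To make the distinction between parameters and variables concrete, the morphoedaphic equation can be written as a function in which the two parameters are fixed defaults and MEI is the variable. The function and argument names are hypothetical, and the MEI values in the usage lines are arbitrary illustrations, not data.

```python
def fish_yield(mei, coefficient=1.38, exponent=0.4661):
    """Predicted fish yield (kg per ha per yr) from the morphoedaphic index.

    mei         -- variable: morphoedaphic index (ppm per m)
    coefficient -- parameter from the slide (1.38)
    exponent    -- parameter from the slide (0.4661)
    """
    return coefficient * mei ** exponent

# Arbitrary MEI values, for illustration only
for mei in (1.0, 10.0, 100.0):
    print(mei, fish_yield(mei))
```

The parameters stay fixed from lake to lake; only MEI, and therefore the predicted yield, changes.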
Parameters - examples
• Frequency distribution. Normal distribution
• [Figure: normal curve, Y plotted against X]
• μ = mean
• σ = standard deviation
Slide colour key: red for parameters, blue for variables
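For reference, the standard density of the normal distribution shows where the two parameters enter. The slide text does not preserve a formula, so the notation here (x for the variable on the horizontal axis, f(x) for the height of the curve) is an assumption.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```

Here μ sets the location of the peak and σ sets the spread; x is the variable.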
Parameter estimates
• Scallop density
  • $M_{scal} = \mu_1$ if R = 5 or 6
  • $M_{scal} = \mu_2$ if R is not equal to 5 or 6
• Theoretical model to calculate $\mu_1$ and $\mu_2$?
  • Non-existent
  • Estimate from data recorded in 28 tows
  • $M_{scal} = \mu_1 = \mathrm{mean}(M_{R = 5,6})$, n = 13
  • $M_{scal} = \mu_2 = \mathrm{mean}(M_{R \neq 5,6})$, n = 15
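Parameter estimates
• Ryder's morphoedaphic equation
  • $pM = \alpha\,\mathrm{MEI}^{\beta}$
  • $\ln(pM) = \ln(\alpha) + \beta \ln(\mathrm{MEI})$ (population)
  • $\ln(pM) = \ln(\hat{\alpha}) + \hat{\beta} \ln(\mathrm{MEI})$ (sample)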
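Taking logarithms turns the power law into a straight line, so both parameters can be estimated with the usual least-squares formulas for slope and intercept. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the "data" are not measurements but noise-free values generated from the equation on the earlier slide (1.38 and 0.4661), so the fit should simply recover those two numbers.

```python
import math

def fit_power_law(x_values, y_values):
    """Estimate a and b in y = a * x**b by least squares on the log-log scale."""
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
    b_hat = s_xy / s_xx                          # slope = exponent
    a_hat = math.exp(mean_y - b_hat * mean_x)    # back-transformed intercept
    return a_hat, b_hat

# Generated values, not field data: arbitrary MEI values pushed through the equation
mei = [1.0, 2.0, 5.0, 10.0, 20.0]
yield_values = [1.38 * m ** 0.4661 for m in mei]

a_hat, b_hat = fit_power_law(mei, yield_values)
print(a_hat, b_hat)
```

With real data the points scatter around the line, and the estimates differ from the population parameters.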
Statistical Inference
• Two categories:
  1. Hypothesis testing
     • Make decisions about an unknown population parameter
  2. Estimation
     • Estimate specific values of an unknown population parameter
Parameters
• Estimation:
  1. Analytic formula
     • e.g. slope, mean
  2. Iterative methods
     • criterion: maximize the likelihood of the parameter
     • common ways to measure the likelihood:
       • sums of squared deviations of data from model
       • G-statistic (Poisson, binomial)
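The two routes to an estimate can be compared on the simplest case, a single mean. The sketch below uses hypothetical data and takes the sum of squared deviations as the criterion; the iterative search (a ternary search, chosen here only for brevity) should land on approximately the same value as the analytic formula.

```python
def sum_of_squares(data, m):
    """Sum of squared deviations of the data from a candidate value m."""
    return sum((y - m) ** 2 for y in data)

def analytic_mean(data):
    """Analytic formula: the arithmetic mean minimizes the sum of squares."""
    return sum(data) / len(data)

def iterative_mean(data, tol=1e-9):
    """Iterative method: narrow an interval around the minimum of the sum of squares."""
    lo, hi = min(data), max(data)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if sum_of_squares(data, m1) < sum_of_squares(data, m2):
            hi = m2  # minimum lies to the left of m2
        else:
            lo = m1  # minimum lies to the right of m1
    return (lo + hi) / 2

# Hypothetical data, for illustration only
data = [7.0, 9.0, 8.0, 10.0, 6.0]
estimate_analytic = analytic_mean(data)
estimate_iterative = iterative_mean(data)
print(estimate_analytic, estimate_iterative)
```

For a mean or a slope the analytic formula is the obvious choice; iterative methods matter when no closed-form solution exists.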
Parameters
• Uncertainty:
  • Confidence limits: two values between which we have a specified level of confidence (e.g. 95%) that the population parameter lies
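As a worked illustration, 95% confidence limits for a mean can be computed from the female roach summary values given earlier (n = 10, mean 8.5 days, variance 3.6 days²). This sketch assumes the t distribution applies (small sample, residuals roughly normal, as in the key) and takes the two-tailed 5% critical value for 9 degrees of freedom, 2.262, from standard t tables; the function name is hypothetical.

```python
import math

def confidence_limits(mean, variance, n, t_crit):
    """Lower and upper confidence limits for a mean, given a critical t value."""
    standard_error = math.sqrt(variance / n)
    half_width = t_crit * standard_error
    return mean - half_width, mean + half_width

# Two-tailed 5% critical value of t with 9 degrees of freedom (standard tables)
T_CRIT_95_DF9 = 2.262

lower, upper = confidence_limits(mean=8.5, variance=3.6, n=10, t_crit=T_CRIT_95_DF9)
print(lower, upper)
```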