Statistics continued CPS 807
$2^k r$ Factorial Designs with Replications
• $r$ replications of $2^k$ experiments
• $2^k r$ observations
• Allows estimation of experimental errors
• Model (for $k = 2$): $y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B + e$
• $e$ = experimental error
Computation of Effects
• Simply use the means of the $r$ measurements in each cell
• Effects: $q_0 = 41$, $q_A = 21.5$, $q_B = 9.5$, $q_{AB} = 5$
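The sign-table computation behind these effects can be sketched as follows. This is a minimal illustration, not the slide's own table. The four cell means are an assumption: they are not listed here and are simply the values implied by the stated effects. The helper name `effect` is hypothetical.

```python
# Sign-table method for a 2^2 design: each effect is the average of the
# cell means weighted by the corresponding column of +1/-1 signs.

# ASSUMPTION: cell means implied by the stated effects, ordered as
# (xA, xB) = (-1,-1), (+1,-1), (-1,+1), (+1,+1).
cell_means = [15.0, 48.0, 24.0, 77.0]

x_a = [-1, 1, -1, 1]                      # column A of the sign table
x_b = [-1, -1, 1, 1]                      # column B of the sign table
x_ab = [a * b for a, b in zip(x_a, x_b)]  # interaction column AB
ones = [1, 1, 1, 1]                       # column I (for q0)


def effect(signs, means):
    """Hypothetical helper: signed sum of the cell means divided by 2^2."""
    return sum(s * m for s, m in zip(signs, means)) / len(means)


q0 = effect(ones, cell_means)
q_a = effect(x_a, cell_means)
q_b = effect(x_b, cell_means)
q_ab = effect(x_ab, cell_means)

print(q0, q_a, q_b, q_ab)
```

With replications, only the cell means enter the effect computation; the individual observations matter for the error estimate.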
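Estimation of Experimental Errors
• Estimated response: $\hat{y}_i = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi}$
• Experimental error = Measured − Estimated: $e_{ij} = y_{ij} - \hat{y}_i = y_{ij} - q_0 - q_A x_{Ai} - q_B x_{Bi} - q_{AB} x_{Ai} x_{Bi}$
• Sum of squared errors: $SSE = \sum_{i=1}^{2^2} \sum_{j=1}^{r} e_{ij}^2$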
Experimental Errors: Example
• Estimated response: $\hat{y}_1 = q_0 - q_A - q_B + q_{AB} = 41 - 21.5 - 9.5 + 5 = 15$
• Experimental error: $e_{11} = y_{11} - \hat{y}_1 = 15 - 15 = 0$
• $SSE = 0^2 + 3^2 + (-3)^2 + (-3)^2 + \dots + 4^2 = 102$
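The error computation can be sketched in a few lines. The effects are the ones stated above. The twelve observations are an assumption: they do not appear in these slides and were chosen so that they reproduce the stated effects and the quoted SSE of 102.

```python
# Effects as stated in the slides.
q0, q_a, q_b, q_ab = 41.0, 21.5, 9.5, 5.0

# Factor levels (xA, xB) of the four cells of the 2^2 design.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

# ASSUMPTION: replicated observations y_ij (r = 3 per cell); not given in
# the slides, chosen to be consistent with the stated effects and SSE.
observations = [[15, 18, 12], [45, 48, 51], [25, 28, 19], [75, 75, 81]]

# Estimated response of each cell from the model.
y_hat = [q0 + q_a * xa + q_b * xb + q_ab * xa * xb for xa, xb in design]

# Experimental errors: measured minus estimated.
errors = [[y - yh for y in cell] for cell, yh in zip(observations, y_hat)]

# Sum of squared errors over all cells and replications.
sse = sum(e ** 2 for cell in errors for e in cell)

print(y_hat)
print(errors[0])
print(sse)
```

Because the model contains all four effects, each estimated response equals the mean of its cell, so the errors within every cell sum to zero.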
Allocation of Variation
• Total variation or total sum of squares: $SST = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2$
• $SST = SSA + SSB + SSAB + SSE$
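Derivation
• Model: $y_{ij} = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi} + e_{ij}$
• Since the x's, their products, and all errors add to zero: $\sum_{i,j} y_{ij} = 2^2 r \, q_0$
Derivation (cont’d)
• Mean response: $\bar{y}_{..} = \frac{1}{2^2 r} \sum_{i,j} y_{ij} = q_0$
• Squaring both sides of the model and ignoring cross-product terms (which sum to zero): $\sum_{i,j} y_{ij}^2 = 2^2 r \, q_0^2 + 2^2 r \, q_A^2 + 2^2 r \, q_B^2 + 2^2 r \, q_{AB}^2 + \sum_{i,j} e_{ij}^2$
• $SSY = SS0 + SSA + SSB + SSAB + SSE$
Derivation (cont’d)
• Total variation: $SST = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = \sum_{i,j} y_{ij}^2 - 2^2 r \, \bar{y}_{..}^2 = SSY - SS0 = SSA + SSB + SSAB + SSE$
• One way to compute SSE: $SSE = SSY - 2^2 r \, (q_0^2 + q_A^2 + q_B^2 + q_{AB}^2)$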
Example: Memory-Cache Study (cont’d)
• $SSA + SSB + SSAB + SSE = 5547 + 1083 + 300 + 102 = 7032 = SST$
• Factor A explains 5547/7032, or 78.88%, of the variation.
• Factor B explains 15.40%.
• Interaction AB explains 4.27%.
• The remaining 1.45% is unexplained and is attributed to errors.
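The allocation above can be reproduced from the stated effects and SSE alone, since each effect's sum of squares is $2^2 r \, q^2$. The sketch below assumes $r = 3$ replications, which is inferred from the 8 error degrees of freedom used later in the example rather than stated outright.

```python
# ASSUMPTION: r = 3 replications, inferred from 2^2 * (r - 1) = 8 error
# degrees of freedom used later in the slides.
r = 3
n_runs = 2 ** 2 * r

# Effects and SSE as stated in the slides.
effects = {"A": 21.5, "B": 9.5, "AB": 5.0}
sse = 102.0

# Sum of squares due to each effect: 2^2 * r * q^2.
ss = {name: n_runs * q ** 2 for name, q in effects.items()}
ss["error"] = sse

# Total variation and the percentage explained by each component.
sst = sum(ss.values())
percent = {name: 100.0 * value / sst for name, value in ss.items()}

for name in ss:
    print(f"{name:5s} SS = {ss[name]:7.1f}  explains {percent[name]:5.2f}%")
print(f"SST = {sst:.1f}")
```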
Review: Confidence Interval for the Mean
• Problem: How to get a single estimate of the population mean from k sample estimates?
• Answer: Get probabilistic bounds.
• E.g., two bounds, $C_1$ and $C_2$, such that there is a high probability, $1 - \alpha$, that the population mean $\mu$ is in the interval $(C_1, C_2)$:
• $\Pr\{C_1 \le \mu \le C_2\} = 1 - \alpha$
• Confidence interval: $(C_1, C_2)$
• Significance level: $\alpha$
• Confidence level: $100(1 - \alpha)$
• Confidence coefficient: $1 - \alpha$
Confidence Interval for the Mean (cont’d)
• Note: The confidence level is traditionally expressed as a percentage (near 100%), whereas the significance level, $\alpha$, is expressed as a fraction and is typically near zero, e.g., 0.05 or 0.01.
Confidence Interval for the Mean (cont’d)
• Example: Given a sample with:
• mean $\bar{x} = 3.90$
• standard deviation $s = 0.95$
• $n = 32$
• A 90% CI for the mean $= 3.90 \pm (1.645)(0.95)/\sqrt{32} = (3.62, 4.18)$, using the central limit theorem.
• Note: A 90% CI means we can state with 90% confidence that the population mean is between 3.62 and 4.18. The chance of error in this statement is 10%.
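As a quick check of the arithmetic, here is a minimal sketch of the large-sample interval $\bar{x} \pm z \, s / \sqrt{n}$. The quantile 1.645 is taken from the example itself; the variable names are illustrative.

```python
import math

# Sample statistics from the example.
mean = 3.90
s = 0.95
n = 32

# 0.95-quantile of the unit normal, as used in the example (90% CI).
z = 1.645

# Half-width of the interval and the interval itself.
half_width = z * s / math.sqrt(n)
ci = (mean - half_width, mean + half_width)

print(f"half-width = {half_width:.4f}")
print(f"90% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The normal quantile is appropriate here because the sample is reasonably large (n of 30 or more); for small samples the t quantile is used instead.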
Testing for a Zero Mean
• The difference in processor times of two different implementations of the same algorithm was measured on 7 similar workloads. The differences are:
• {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}
• Can we say with 99% confidence that one implementation is superior to the other?
Testing for a Zero Mean (cont’d)
• Sample size $n = 7$
• mean $\bar{x} = 1.03$
• sample variance $s^2 = 2.57$
• sample standard deviation $s = 1.60$
• CI $= 1.03 \pm t \times 1.60/\sqrt{7} = 1.03 \pm 0.605\,t$
• $100(1 - \alpha) = 99$, $\alpha = 0.01$, $1 - \alpha/2 = 0.995$
• From the table, the t value at six degrees of freedom is $t_{[0.995; 6]} = 3.707$, and the 99% CI = (-1.21, 3.27).
• Since the CI includes zero, we cannot say with 99% confidence that the mean difference is significantly different from zero.
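The test can be sketched with the standard library alone. The t quantile 3.707 is copied from the slide's table lookup rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Measured differences in processor time on the 7 workloads.
diffs = [1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4]

n = len(diffs)
mean = statistics.mean(diffs)
s = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)

# t[0.995; 6] as quoted from the table in the slides.
t_crit = 3.707

half_width = t_crit * s / math.sqrt(n)
ci = (mean - half_width, mean + half_width)

# The mean is significantly different from zero only if the interval
# excludes zero.
includes_zero = ci[0] <= 0.0 <= ci[1]

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"99% CI = ({ci[0]:.2f}, {ci[1]:.2f}), includes zero: {includes_zero}")
```

Carrying full precision through the computation moves the endpoints by about 0.01 relative to the hand calculation, which rounds the mean and standard deviation first. The conclusion is unchanged.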
Type I & Type II Errors
• In testing a null hypothesis, the level of significance is the probability of rejecting a true hypothesis.
• Decision to accept: correct if the hypothesis is actually true; error (Type II) if it is actually false.
• Decision to reject: error (Type I) if the hypothesis is actually true; correct if it is actually false.
• Note: The letters $\alpha$ and $\beta$ denote the probabilities of Type I and Type II errors, respectively.
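Confidence Intervals For Effects
• Effects are random variables.
• Errors $\sim N(0, \sigma_e^2)$ $\Rightarrow$ $y \sim N(\bar{y}_{..}, \sigma_e^2)$
• Since $q_0 = \frac{1}{2^2 r} \sum_{i,j} y_{ij}$ is a linear combination of normal variables, $q_0$ is normal with variance $\sigma_e^2 / (2^2 r)$.
• Variance of errors: $s_e^2 = \frac{1}{2^2 (r - 1)} \sum_{i,j} e_{ij}^2 = \frac{SSE}{2^2 (r - 1)}$
Confidence Intervals For Effects (cont’d)
• Denominator $= 2^2 (r - 1)$ = number of independent terms in SSE $\Rightarrow$ SSE has $2^2 (r - 1)$ degrees of freedom.
• Estimated variance of $q_0$: $s_{q_0}^2 = s_e^2 / (2^2 r)$
• Similarly, $s_{q_A} = s_{q_B} = s_{q_{AB}} = s_e / \sqrt{2^2 r}$
• Confidence intervals (CI) for the effects: $q_i \pm t_{[1 - \alpha/2;\, 2^2 (r - 1)]} \, s_{q_i}$
• CI does not include zero $\Rightarrow$ the effect is significant.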
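Example
• For the memory-cache study, with $r = 3$ replications and therefore $2^2 (r - 1) = 8$ degrees of freedom:
• Standard deviation of errors: $s_e = \sqrt{SSE / (2^2 (r - 1))} = \sqrt{102/8} = 3.57$
• Standard deviation of effects: $s_{q_i} = s_e / \sqrt{2^2 r} = 3.57 / \sqrt{12} = 1.03$
• For 90% confidence: $t_{[0.95; 8]} = 1.86$
Example (cont’d)
• Confidence intervals: $q_i \pm (1.86)(1.03) = q_i \pm 1.92$
• $q_0$: (39.08, 42.92); $q_A$: (19.58, 23.42); $q_B$: (7.58, 11.42); $q_{AB}$: (3.08, 6.92)
• No zero crossing $\Rightarrow$ all effects are significant.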
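A minimal sketch of the interval computation for the effects follows. It uses the SSE and effects stated earlier and the quantile $t_{[0.95; 8]} = 1.86$ quoted in the slides. The value $r = 3$ is an assumption inferred from the 8 degrees of freedom.

```python
import math

# ASSUMPTION: r = 3 replications, inferred from the 8 error degrees of freedom.
r = 3
sse = 102.0                                   # from the slides
effects = {"q0": 41.0, "qA": 21.5, "qB": 9.5, "qAB": 5.0}
t_crit = 1.86                                 # t[0.95; 8], from the slides

dof = 2 ** 2 * (r - 1)                        # degrees of freedom of SSE
s_e = math.sqrt(sse / dof)                    # standard deviation of errors
s_q = s_e / math.sqrt(2 ** 2 * r)             # standard deviation of each effect

half_width = t_crit * s_q
cis = {name: (q - half_width, q + half_width) for name, q in effects.items()}

# An effect is significant if its interval does not contain zero.
significant = {name: not (lo <= 0.0 <= hi) for name, (lo, hi) in cis.items()}

for name, (lo, hi) in cis.items():
    print(f"{name}: ({lo:.2f}, {hi:.2f}) significant={significant[name]}")
```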
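Confidence Intervals for Contrasts
• Contrast: a linear combination $\sum h_i q_i$ whose coefficients sum to zero, $\sum h_i = 0$
• Variance of $\sum h_i q_i$: $s_{\sum h_i q_i}^2 = \frac{s_e^2 \sum h_i^2}{2^2 r}$
• For a $100(1 - \alpha)\%$ confidence interval, use $t_{[1 - \alpha/2;\, 2^2 (r - 1)]}$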
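Example: Memory-cache study
• $u = q_A + q_B - 2 q_{AB}$
• Coefficients = 0, 1, 1, and -2; they sum to zero $\Rightarrow$ contrast
• Mean: $u = 21.5 + 9.5 - 2 \times 5 = 21$
• Variance: $s_u^2 = \frac{s_e^2 \times 6}{2^2 \times 3} = 6.375$
• Standard deviation: $s_u = \sqrt{6.375} = 2.52$
• $t_{[0.95; 8]} = 1.86$
• 90% confidence interval for $u$: $21 \pm (1.86)(2.52) = (16.31, 25.69)$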
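The contrast interval can be checked with a short sketch. It takes the effects, SSE, and t quantile from the slides and assumes $r = 3$ (inferred from the 8 degrees of freedom). The dictionary keys are illustrative names.

```python
import math

# ASSUMPTION: r = 3 replications, inferred from the 8 error degrees of freedom.
r = 3
sse = 102.0                                   # from the slides
t_crit = 1.86                                 # t[0.95; 8], from the slides

effects = {"q0": 41.0, "qA": 21.5, "qB": 9.5, "qAB": 5.0}
coeffs = {"q0": 0, "qA": 1, "qB": 1, "qAB": -2}   # h_i for u = qA + qB - 2 qAB

# A contrast requires the coefficients to sum to zero.
is_contrast = sum(coeffs.values()) == 0

# Mean of the contrast.
u = sum(coeffs[name] * effects[name] for name in effects)

# Variance of the contrast: s_e^2 * sum(h_i^2) / (2^2 * r).
s_e_sq = sse / (2 ** 2 * (r - 1))
var_u = s_e_sq * sum(h ** 2 for h in coeffs.values()) / (2 ** 2 * r)
s_u = math.sqrt(var_u)

ci = (u - t_crit * s_u, u + t_crit * s_u)
print(f"u = {u}, s_u = {s_u:.3f}, 90% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Working at full precision gives endpoints within about 0.01 of the hand calculation, which rounds the standard deviation to 2.52 first.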
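CI for Predicted Response
• Mean response $\hat{y}$: $\hat{y} = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$
• The standard deviation of the mean of $m$ responses: $s_{\hat{y}_m} = s_e \left( \frac{1}{n_{\mathrm{eff}}} + \frac{1}{m} \right)^{1/2}$
• $n_{\mathrm{eff}}$ = effective degrees of freedom = (total number of runs) / (1 + sum of DFs of the parameters used in $\hat{y}$)
CI for Predicted Response (cont’d)
• $100(1 - \alpha)\%$ confidence interval: $\hat{y} \pm t_{[1 - \alpha/2;\, 2^2 (r - 1)]} \, s_{\hat{y}_m}$
• A single run ($m = 1$): $s_{\hat{y}_1} = s_e \left( \frac{5}{2^2 r} + 1 \right)^{1/2}$
• Population mean ($m = \infty$): $s_{\hat{y}} = s_e \left( \frac{5}{2^2 r} \right)^{1/2}$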
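Example: Memory-cache Study
• For $x_A = -1$ and $x_B = -1$:
• A single confirmation experiment: $\hat{y}_1 = q_0 - q_A - q_B + q_{AB} = 41 - 21.5 - 9.5 + 5 = 15$
• Standard deviation of the prediction: $s_{\hat{y}_1} = s_e \left( \frac{5}{2^2 r} + 1 \right)^{1/2} = 3.57 \sqrt{\frac{5}{12} + 1} = 4.25$
• Using $t_{[0.95; 8]} = 1.86$, the 90% confidence interval is: $15 \pm 1.86 \times 4.25 = (7.09, 22.91)$
Example: Memory-cache Study (cont’d)
• Mean response for 5 experiments in future: $s_{\hat{y}_5} = 3.57 \sqrt{\frac{5}{12} + \frac{1}{5}} = 2.80$
• The 90% confidence interval is: $15 \pm 1.86 \times 2.80 = (9.79, 20.21)$
• Mean response for a large number of experiments in future: $s_{\hat{y}} = 3.57 \sqrt{\frac{5}{12}} = 2.30$
• The 90% confidence interval is: $15 \pm 1.86 \times 2.30 \approx (10.71, 19.29)$
Example: Memory-cache Study (cont’d)
• Current mean response: not for the future.
• Use the formula for contrasts: $s_{\hat{y}} = \sqrt{\frac{s_e^2 \sum h_i^2}{2^2 r}} = \sqrt{\frac{12.75 \times 4}{12}} = 2.06$
• 90% confidence interval: $15 \pm 1.86 \times 2.06 = (11.17, 18.83)$
• Notice: The confidence intervals become narrower from the single run to the current mean response.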
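The four prediction intervals can be compared side by side with a short sketch. It assumes $r = 3$ (inferred from the 8 degrees of freedom) and $n_{\mathrm{eff}} = 12/5$, i.e. one degree of freedom for each of the four parameters in $\hat{y}$. The helper `prediction_sd` is a hypothetical name.

```python
import math

# ASSUMPTION: r = 3 replications, inferred from the 8 error degrees of freedom.
r = 3
n_runs = 2 ** 2 * r
sse = 102.0                                   # from the slides
t_crit = 1.86                                 # t[0.95; 8], from the slides
y_hat = 41.0 - 21.5 - 9.5 + 5.0               # prediction at xA = xB = -1

s_e = math.sqrt(sse / (2 ** 2 * (r - 1)))

# ASSUMPTION: four parameters (q0, qA, qB, qAB), one DF each.
n_eff = n_runs / (1 + 4)


def prediction_sd(m):
    """Hypothetical helper: sd of the mean of m future responses."""
    inv_m = 0.0 if math.isinf(m) else 1.0 / m
    return s_e * math.sqrt(1.0 / n_eff + inv_m)


def interval(sd):
    return (y_hat - t_crit * sd, y_hat + t_crit * sd)


ci_single = interval(prediction_sd(1))
ci_five = interval(prediction_sd(5))
ci_many = interval(prediction_sd(math.inf))

# Current mean response: treated as a contrast with h = (1, -1, -1, 1).
sd_current = s_e * math.sqrt(4 / n_runs)
ci_current = interval(sd_current)

for label, ci in [("m = 1", ci_single), ("m = 5", ci_five),
                  ("m = inf", ci_many), ("current", ci_current)]:
    print(f"{label:8s} ({ci[0]:.2f}, {ci[1]:.2f}) width = {ci[1] - ci[0]:.2f}")
```

The widths shrink in the order listed, because averaging over more future runs removes more of the run-to-run noise.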
Assumptions
1. Errors are statistically independent.
2. Errors are additive.
3. Errors are normally distributed.
4. Errors have a constant standard deviation $\sigma_e$.
5. Effects of factors are additive.
$\Rightarrow$ Observations are independent and normally distributed with constant variance.