Data Analysis
Basic Problem
• There is a population whose properties we are interested in and wish to quantify statistically: mean, standard deviation, distribution, etc.
• The Question: Given a sample, what was the random system that generated its statistics?
Central Limit Theorem
• If one takes random samples of size n from a population with mean $\mu$ and standard deviation $\sigma$, then as n gets large, the distribution of the sample mean $\bar{x}$ approaches the normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$
• $\sigma$ is generally unknown and is often replaced by the sample standard deviation $s$, resulting in $s/\sqrt{n}$, which is termed the Standard Error (SE) of the sample mean
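A minimal sketch of the standard error and of the theorem itself, using only the Python standard library. The uniform(0, 1) population, the sample size of 30, and the function names are illustrative assumptions, not part of the original slides.

```python
import math
import random
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def spread_of_sample_means(n, trials, rng):
    """Standard deviation of `trials` sample means, each from n uniform(0, 1) draws."""
    means = [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30
sigma = math.sqrt(1.0 / 12.0)  # population standard deviation of uniform(0, 1)

observed = spread_of_sample_means(n, trials=2000, rng=rng)
predicted = sigma / math.sqrt(n)  # what the Central Limit Theorem predicts
print(f"observed {observed:.4f} vs predicted {predicted:.4f}")
```

The two printed values should agree to within a few percent; the agreement tightens as the number of trials grows.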
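Confidence Interval for Mean (small sample size, t-distribution)
$\bar{x} - t_{\alpha/2,\,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\, s/\sqrt{n}$
OR
$\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$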
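A small-sample confidence interval for the mean can be sketched as the sample mean plus or minus a critical t value times the standard error. The concentrations below are hypothetical, and the critical value is passed in from a t table rather than computed (2.262 is the tabulated two-sided 95% value for 9 degrees of freedom).

```python
import math
import statistics


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for the mean, given a critical t value for n - 1 df."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical concentrations (mg/L); not data from the slides.
concentrations = [12.1, 9.8, 11.4, 10.6, 13.0, 9.5, 10.9, 11.7, 12.3, 10.2]
T_CRIT_95_DF9 = 2.262  # two-sided 95% critical value for 9 degrees of freedom

lower, upper = t_confidence_interval(concentrations, T_CRIT_95_DF9)
print(f"95% CI for the mean: {lower:.2f} to {upper:.2f} mg/L")
```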
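Comparing Population Means
• Unequal Variance: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu} \sqrt{s_1^2/n_1 + s_2^2/n_2}$, with $\nu$ the Welch-Satterthwaite degrees of freedom
• Pooled Variance: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p \sqrt{1/n_1 + 1/n_2}$, where $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$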
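The two ways of comparing means differ only in how the standard error of the difference and the degrees of freedom are computed. The sketch below assumes the usual textbook forms (Welch-Satterthwaite for unequal variance, pooled variance otherwise); the function names and the watershed data are hypothetical.

```python
import math
import statistics


def welch_se_and_df(a, b):
    """Unequal-variance standard error and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    se = math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return se, df


def pooled_se_and_df(a, b):
    """Pooled-variance standard error and n1 + n2 - 2 degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    sp2 = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / df
    return math.sqrt(sp2 * (1.0 / na + 1.0 / nb)), df


def t_statistic(a, b, se):
    """(estimated difference) / SE."""
    return (statistics.fmean(a) - statistics.fmean(b)) / se


# Hypothetical concentrations (mg/L) from two watersheds.
watershed_a = [1.0, 2.0, 3.0, 4.0, 5.0]
watershed_b = [2.0, 3.0, 4.0, 5.0, 6.0]

for label, (se, df) in (("unequal", welch_se_and_df(watershed_a, watershed_b)),
                        ("pooled", pooled_se_and_df(watershed_a, watershed_b))):
    t = t_statistic(watershed_a, watershed_b, se)
    print(f"{label}: SE={se:.3f}, df={df:.1f}, t={t:.3f}")
```

With equal sample sizes and equal sample variances, as in this made-up data, both approaches give the same standard error and degrees of freedom; they diverge as the variances or sample sizes differ.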
Hypothesis Testing (t-test)
• Null Hypothesis: differences between two samples occurred purely by chance
• t statistic = (estimated difference)/SE
• Test returns a “p” value: the probability of observing a difference at least this large if the null hypothesis were true (i.e., if both samples came from populations with the same mean)
• Samples may be either independent or paired
Tails
• One tailed test: hypothesis is directional, i.e., that one sample is less than, greater than, taller than, etc., the other
• Two tailed test: hypothesis is that one sample is different (either higher or lower) than the other
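To make the difference between the tails concrete, the sketch below converts a t statistic and its degrees of freedom into a p value by numerically integrating the Student t density with Simpson's rule. This is a standard-library illustration under stated assumptions (hypothetical function names and `tail` labels), suited to small degrees of freedom; a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on the density from 0 to |t| (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def p_value(t, df, tail="two-sided"):
    """p value for a t statistic; `tail` is 'two-sided', 'greater', or 'less'."""
    if tail == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    if tail == "greater":
        return 1 - t_cdf(t, df)
    if tail == "less":
        return t_cdf(t, df)
    raise ValueError(f"unknown tail: {tail!r}")


# 2.262 is the tabulated two-sided 95% critical value for 9 df,
# so the two-sided p value is close to 0.05 and the one-sided value close to 0.025.
print(p_value(2.262, 9), p_value(2.262, 9, tail="greater"))
```

When the observed difference lies in the hypothesized direction, the one-tailed p value is half the two-tailed one, which is why the choice of tails should be made before looking at the data.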
Paired Test
• Samples are not independent
• More sensitive test for detecting differences, since variables other than the one of interest are controlled by the pairing
• Analysis is performed on the differences of the paired values
• Equivalent to a confidence interval for the mean of the differences
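Because the paired test works on the differences, it reduces to a one-sample calculation: the mean difference divided by its standard error, with n - 1 degrees of freedom. A minimal sketch, with hypothetical paired influent and effluent values:

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.fmean(diffs) / se, n - 1


# Hypothetical paired measurements (mg/L) from the same storm events.
influent = [10.0, 12.0, 14.0, 16.0]
effluent = [8.0, 11.0, 11.0, 12.0]

t, df = paired_t(influent, effluent)
print(f"t = {t:.3f} with {df} degrees of freedom")
```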
BMP Performance Comparison
• Commonly expressed as a % reduction in concentration or load
• % reduction is highly dependent on influent concentration
• A concentration-based % reduction potentially ignores reduction in volume (and therefore load)
• May lead to very large differences in pollutant reduction estimates
• Preferable to compare discharge concentrations
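The arithmetic behind these points can be sketched as follows. All numbers are made up for illustration: two BMPs discharging the same effluent concentration show very different percent reductions because their influent concentrations differ, and a concentration-based figure understates performance when the BMP also reduces runoff volume.

```python
def percent_reduction(inflow, outflow):
    """Percent reduction from inflow to outflow (concentration or load)."""
    return 100.0 * (inflow - outflow) / inflow


def load(concentration_mg_per_l, volume_l):
    """Pollutant load in mg: concentration times volume."""
    return concentration_mg_per_l * volume_l


# Same effluent concentration (20 mg/L), different influent concentrations.
print(percent_reduction(100.0, 20.0))  # BMP receiving dirty influent
print(percent_reduction(40.0, 20.0))   # BMP receiving cleaner influent

# Concentration drops 100 -> 80 mg/L while volume drops 1000 -> 500 L.
conc_based = percent_reduction(100.0, 80.0)
load_based = percent_reduction(load(100.0, 1000.0), load(80.0, 500.0))
print(conc_based, load_based)
```

In this example the concentration-based reduction is 20% while the load-based reduction is 60%, which illustrates why the two measures should not be compared directly.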
Exercise
• Calculate average concentrations for each constituent for the two watersheds
• Determine whether any concentrations are significantly different; report the p value for the null hypothesis
• Calculate average effluent concentrations for the two BMPs and determine whether they are different from the influent concentrations; report p values
• Compare effluent concentrations for the two BMPs and determine whether one BMP is better than the other for a particular constituent