Elementary Statistics. Professor K. Leppel. Introduction and Data Collection.
Definitions • Population: All observations of interest in a given context • Sample: A subset of a population
Example 1: Suppose you are the president of Widener University. • Population: All Widener students. • Sample: All Widener students taking classes in the School of Business Administration.
Example 2: Suppose you are the Dean of the School of Business Administration (SBA). • Population: All Widener students taking SBA classes. • Sample: All Widener students taking Professor Leppel’s classes.
More Definitions • Population parameter (or simply parameter): a numerical characteristic of a population • Sample statistic: a numerical characteristic of a sample
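The distinction between a parameter and a statistic can be illustrated with a minimal sketch. The population here (credit hours carried by 1,000 students) is made-up data generated for illustration only, and the sample size of 50 is an arbitrary choice; neither comes from the slides.

```python
import random
from statistics import mean

# Hypothetical population: credit hours carried by 1,000 students (made-up data).
rng = random.Random(0)
population = [rng.randint(3, 18) for _ in range(1000)]

# Parameter: a numerical characteristic computed from the entire population.
population_mean = mean(population)

# Statistic: the same characteristic computed from a sample only
# (here, 50 students drawn without replacement).
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers will usually be close but not identical. The parameter is fixed, while the statistic changes from one sample to the next.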
Deductive vs. Inductive Reasoning • Deductive: • population → sample • general → specific • Probability • Example: Suppose you have a bowl with 2 red marbles & 3 green ones. If you pick one at random, what is the probability that the marble is green?
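The marble example can be worked through directly, assuming each marble is equally likely to be picked. This is a small sketch of the counting argument, not part of the original slides.

```python
from fractions import Fraction

# The bowl from the example: 2 red marbles and 3 green ones.
bowl = ["red"] * 2 + ["green"] * 3

# With equally likely picks, P(green) = (number of green) / (total number).
p_green = Fraction(bowl.count("green"), len(bowl))

print(p_green)         # 3/5
print(float(p_green))  # 0.6
```

Because the contents of the bowl (the population) are fully known, the probability follows by deduction; no data need to be collected.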
Deductive vs. Inductive Reasoning • Inductive: • sample → population • specific → general • Statistics • Example: If you take a poll & note the voting preferences of this sample, you will be able to draw some conclusions about the votes of the population.
Sampling with & without Replacement • Sampling without replacement: once an element of a population has been selected as part of a sample, it cannot be selected again. • Sampling with replacement: an element of a population that has been selected as part of a sample can be selected again.
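The two sampling schemes can be sketched with Python's standard `random` module. The five-element population and the sample sizes are hypothetical, chosen only to make the difference visible.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
population = ["A", "B", "C", "D", "E"]  # hypothetical population

# Without replacement: random.sample never returns the same element twice,
# so the sample size cannot exceed the population size.
without_replacement = rng.sample(population, k=3)

# With replacement: random.choices may return the same element repeatedly,
# so the sample size can exceed the population size.
with_replacement = rng.choices(population, k=8)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

Drawing 8 elements with replacement from a population of 5 guarantees at least one repeat, which would be impossible without replacement.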
Random sampling vs. non-random sampling • Random sampling or probability sampling: sampling in which the probability of inclusion of each element in the population is known. • Non-random sampling or judgment sampling: sampling in which judgment is exercised in deciding which elements of the population to include in the sample.
Simple Random Sample • A sample of n elements is a simple random sample if sampling is performed such that every combination of n elements has an equal chance of being the sample selected.
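To make the definition concrete, the sketch below lists every possible sample of size n = 2 from a hypothetical five-element population (sampling without replacement). Under simple random sampling, each of these combinations has the same chance of being the sample selected.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical population, N = 5
n = 2                                   # sample size

# Every possible combination of n elements.
all_samples = list(combinations(population, n))

# Under simple random sampling, each combination is equally likely.
prob_each = Fraction(1, len(all_samples))

print(len(all_samples), "possible samples; C(5, 2) =", comb(5, 2))
print("probability of each:", prob_each)
for s in all_samples:
    print(s)
```

There are 10 possible samples here, so each has probability 1/10 of being selected.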
Two Types of Studies • 1. Observational or comparative studies • The analyst examines historical relationships among variables of interest. • Problem: Deriving cause & effect relationships from historical data is difficult because important environmental factors are generally not controlled & not stable.
2. Direct experimentation or controlled studies • The investigator directly manipulates factors that affect a variable of interest.
Control Group • To understand the effect of a “treatment,” we need to compare a group that received a treatment with a group that received no treatment. The “no-treatment” group is the control group.
Two types of errors • 1. Systematic errors or bias: • These errors cause measurement to be incorrect in some systematic way. • They are caused by inaccuracies or deficiencies in the measuring instrument. • Systematic errors persist even when the sample size is increased.
2. Random error or sampling error: • These errors arise from a large number of uncontrolled factors - chance. • Random errors decrease on average as the sample size is increased.
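The contrast between the two types of errors can be illustrated with a small simulation. All numbers below (the true value, the size of the bias, the spread of the random noise, and the number of repeated samples) are assumptions chosen for illustration; they do not come from the slides.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_VALUE = 100.0      # hypothetical true quantity being measured
BIAS = 2.0              # hypothetical systematic error: instrument reads 2 units high
NOISE_SD = 10.0         # hypothetical spread of the random error

def sample_mean(n):
    """Mean of n measurements, each with the same bias plus independent random noise."""
    return mean(TRUE_VALUE + BIAS + rng.gauss(0, NOISE_SD) for _ in range(n))

def average_errors(n, trials=200):
    """Over repeated samples of size n, return
    (average signed error of the sample mean,
     average absolute scatter of the sample mean around the biased value)."""
    means = [sample_mean(n) for _ in range(trials)]
    systematic = mean(m - TRUE_VALUE for m in means)
    random_part = mean(abs(m - (TRUE_VALUE + BIAS)) for m in means)
    return systematic, random_part

for n in (10, 100, 1000):
    systematic, random_part = average_errors(n)
    print(f"n={n:5d}  systematic error ~ {systematic:5.2f}  random error ~ {random_part:5.2f}")
```

In a run of this sketch, the systematic error should stay near the assumed bias of 2 at every sample size, while the random error should shrink as n grows.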
Some of the variables with which you will work are qualitative and some are quantitative. Qualitative variables are categorical and can be subdivided into nominal and ordinal measures. Quantitative variables are numerical and can be subdivided into interval and ratio measures.
Qualitative (categorical) variables that are nominal have no order to them. Example 1: U.S. citizenship (yes, no) Example 2: On what continent were you born? (N. America, S. America, Africa, Antarctica, Asia, Australia, Europe) Sex (male, female) is sometimes considered as a nominal variable. However, if you take into consideration intersex individuals, who can have any of a variety of anatomical conditions that don’t fit the typical definitions of female or male, you no longer have a simple nominal measure.
Qualitative (categorical) variables that are ordinal have an implied ranking of a characteristic. Example 1: student class (freshman, sophomore, junior, senior) Example 2: customer service satisfaction (very dissatisfied, somewhat dissatisfied, neither satisfied nor dissatisfied, somewhat satisfied, very satisfied)
Switching to quantitative (numerical) variables, interval variables are a bit tricky. They are measured on an ordered scale in which the difference between measurements is meaningful. However, there is no true zero point where there is none of a specific characteristic. Also, if the measure is twice as large, that does not imply that there is twice as much of the characteristic. Example 1: Intelligence. A person with an IQ of 150 is much more intelligent than a person with an IQ of 100, while a person with an IQ of 140 is somewhat more intelligent than a person with an IQ of 125. However, what would an IQ of 0 mean? And a person with an IQ of 200 is not twice as smart as one with an IQ of 100. Example 2: SAT scores
Quantitative (numerical) variables that are ratio variables have true zero points, and ratios work in the expected way. Example 1: Income. A person with zero income has no earnings or other source of money. And someone who has income of $100,000 has twice as much money coming in as someone who has income of $50,000. Example 2: Age