1. Sociology 601 (Martin) Lecture 1 - 2: September 2 - 4, 2008. Syllabus
Some basic terms
Statistical inference (Chapter 1.1 to 1.2)
Variables and scales (Chapter 2.1)
Randomization (2.2)
Sampling and nonsampling variability (2.3)
Sampling techniques (2.4)
Patterson, “The last sociologist”
Jacobs, D., J.T. Carmichael, and S.L. Kent (2005): “Vigilantism, Current Racial Threat, and Death Sentences.” American Sociological Review 70(4): 656 – 677.
2. Chapter 1.1 – 1.2 definitions (pages 3-5) Descriptive Statistics: summary descriptions of a collection of data
Statistical Inferences: predictions or generalizations made from data
Population: total set of subjects of interest in a study
Sample: subset of the population on which the study collects data
Statistic: a numerical summary of sample data
Parameter: a numerical summary of a population
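A minimal sketch of the statistic/parameter distinction, assuming a made-up population of 10,000 values (the data, the seed, and the variable names are illustrative, not from the course materials). The parameter is computed from every subject; the statistic is computed from a sample of 100.

```python
import random
import statistics

rng = random.Random(601)

# Hypothetical population: one value (0-7) for each of 10,000 subjects.
population = [rng.randint(0, 7) for _ in range(10_000)]

# Parameter: a numerical summary of the whole population.
parameter = statistics.mean(population)

# Sample: a subset of the population on which data are collected.
sample = rng.sample(population, 100)

# Statistic: a numerical summary of the sample data.
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In a real study the parameter is unknown; statistical inference uses the statistic to make a statement about it.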
3. Choosing a Population: an example The student government at the University of Wisconsin conducts a study about alcohol abuse among students. One hundred of the 40,000 members of the student body are sampled and asked to complete a questionnaire. One question is “On how many days in the past week did you consume at least one alcoholic drink?”
Q: What is the population of interest?
4. Chapter 2.1 definitions (pages 12-17) variable: a characteristic that can vary among subjects in a population
race, age, sex, educational attainment
Q: is a characteristic a variable if it is fixed for an individual? (e.g. race)
constant: a characteristic that cannot vary among subjects in a population
One constant is the quality of being a member of the population (duh! - yet statistically important)
5. Chapter 2.1, more definitions (pages 12-17) Scale: the possible values of a variable
Nominal scale: unordered, discrete categories
(religious affiliation)
Ordinal scale: naturally ordered, discrete categories
(social class -- upper, middle, lower)
Interval scale: variables whose values have a specific distance from one another.
(income is a continuous variable with an interval scale)
(number of times married is a discrete variable with an interval scale)
6. Chapter 2.1, more definitions (pages 12-17) Qualitative variables: = nominal scale of measurement.
Quantitative variables: = interval scale of measurement.
Categorical variables: includes all variables with nominal or ordinal scales
some variables with an interval scale are also technically categorical variables.
Dichotomous variable: = any variable with a scale consisting of two categories. Very flexible because that scale is simultaneously nominal, ordinal, and interval.
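One way to see why a dichotomous variable can be treated as interval: if the two categories are coded 0 and 1, the mean of the coded values equals the proportion of cases in the category coded 1. The responses below are hypothetical.

```python
import statistics

# Hypothetical yes/no responses (a dichotomous variable).
responses = ["yes", "no", "yes", "yes", "no"]

# Treated as nominal: count the share of cases in one category.
proportion_yes = responses.count("yes") / len(responses)

# Treated as interval: code the categories 0/1 and take the mean.
coded = [1 if r == "yes" else 0 for r in responses]
mean_coded = statistics.mean(coded)

print(proportion_yes, mean_coded)
```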
7. Practice with variable scales
8. 2.2: Definitions for sampling methods (pp. 17 – 20) Probability sampling methods: methods that can specify the probability that a given sample will be selected.
Randomization: a technique for ensuring that any member of a population has an equal chance of appearing in a sample.
With randomization, sample statistics will on average have the same values as the population parameters.
Simple random sample: each possible sample of a given size has the same likelihood of being selected.
9. How to select a simple random sample 1.) list all the subjects in a population
2.) assign a number to each subject
3.) pick numbers from a list of random numbers
4.) put the corresponding subjects in the sample.
Cost and feasibility can be problems, especially at steps 1 and 4.
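The four steps above can be sketched in a few lines, assuming the full list of subjects is already available in memory (in practice, that is the hard part). The function name `simple_random_sample` and the roster are hypothetical; a software random number generator stands in for the printed list of random numbers.

```python
import random

def simple_random_sample(subjects, n, seed=None):
    """Draw n subjects so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    # Steps 1-2: list the subjects and number them by list position.
    numbered = list(enumerate(subjects))
    # Step 3: pick n distinct numbers at random.
    chosen_numbers = rng.sample(range(len(numbered)), n)
    # Step 4: put the corresponding subjects in the sample.
    return [numbered[i][1] for i in chosen_numbers]

# Hypothetical roster of 40,000 students.
roster = [f"student_{i:05d}" for i in range(1, 40_001)]
sample = simple_random_sample(roster, 100, seed=2008)
print(sample[:5])
```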
10. Nonprobability sampling (pp. 20 – 21) Nonprobability sampling methods cannot specify the probability that a given sample will be selected.
Example: snowball sampling methods (Edin and Lein)
Why use such methods?
They are often inexpensive
They can provide information about groups that are difficult to sample.
Some social variables and their relationships are universal, which makes sampling method irrelevant!
This is assumed for many psychology studies and medical studies.
11. Common quantitative research designs (pp. 21 – 22) Experimental design
Subjects are randomly assigned to treatments (=variables) by the researcher
Random sampling from the population less important
Observational design
Subjects are not randomly assigned to variables; assignment is by outside processes or self-selection.
Random sampling is important.
Many observational studies “look like” experiments.
Quasi-experimental design
Observational studies where self-selection is unlikely.
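A minimal sketch of random assignment in an experimental design, assuming the researcher controls who receives which treatment. The function name `randomly_assign`, the subject IDs, and the treatment labels are hypothetical.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # Dealing in turn keeps group sizes within one of each other.
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

subjects = list(range(1, 41))  # hypothetical subject IDs
groups = randomly_assign(subjects, ["treatment", "control"], seed=1)
print({name: len(members) for name, members in groups.items()})
```

Because chance alone decides group membership, the groups should differ only randomly on every other characteristic. In an observational design, the assignment step is done by outside processes or by the subjects themselves, so that guarantee is lost.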
12. Critiquing a research design Refer to: Jacobs, D., J.T. Carmichael, and S.L. Kent (2005): “Vigilantism, Current Racial Threat, and Death Sentences.” American Sociological Review 70(4): 656 – 677.
Thinking about this paper as an experiment…
What is a “treatment”?
What is an outcome?
How far from random assignment are the treatments?
Thinking about this paper as an observational analysis
What is the sample?
What is a population of interest?
Define each variable as what is being measured
Define each variable as a concept
13. 2.3: Sampling and nonsampling variability (pp. 22 – 24)
We ideally like sample statistics to be as close as possible to population parameters, but several factors can cause variability:
Sampling error: the difference between a sample statistic and a corresponding population parameter.
Random sampling protects us against systematic bias in the sampling error, and allows us to estimate the typical size of the sampling error.
Nonsampling error: comes from other sources, can be systematically biased, and is difficult to estimate.
Sources of nonsampling error include undercoverage, nonresponse, and response bias.
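A small simulation can illustrate the claim about random sampling, assuming a made-up, right-skewed population of 10,000 incomes (the distribution, seed, and sample sizes are illustrative choices). It draws many simple random samples and records each one's sampling error.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 incomes (made-up values).
population = [rng.lognormvariate(10.5, 0.8) for _ in range(10_000)]
mu = statistics.mean(population)  # population parameter

# Draw 2,000 simple random samples of size 100 and record each sampling error.
errors = []
for _ in range(2_000):
    sample = rng.sample(population, 100)
    errors.append(statistics.mean(sample) - mu)

mean_error = statistics.mean(errors)        # close to 0: no systematic bias
typical_error = statistics.pstdev(errors)   # typical size of the sampling error

print(f"average sampling error: {mean_error:.1f}")
print(f"typical sampling error: {typical_error:.1f}")
```

The individual errors are sometimes positive and sometimes negative, but they average out near zero, and their spread can be estimated. Nonsampling error such as nonresponse would not average out this way.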
14. Examples of sampling error in the General Social Survey (GSS) Source: SDA archive http://sda.berkeley.edu/archive.htm
Variables to examine in codebook:
VOTE04 (compare to actual turnout using http://elections.gmu.edu/turnout_rates_graph.htm)
RINCOM06
Variables to crosstabulate
DIVORCE and YEAR, then restrict to AGE(40-44)
15. 2.4: Other probability sampling methods (pp. 25 – 28) Systematic random sample:
pick a random case from the first k cases of the sampling frame (the list of the population)
select every kth case after that one
Stratified random sample:
divide a population into groups, then select a simple random sample from each stratum
Cluster sampling:
divide the population into groups called clusters
take a random sample of the clusters
Multistage sampling:
combine stratified and cluster sampling techniques, often including several levels of clusters
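The first three methods above can be sketched as follows, assuming a hypothetical sampling frame of 1,000 subject IDs. The function names, strata, and clusters are invented for illustration. The cluster sketch also assumes that every member of a chosen cluster is kept.

```python
import random

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k cases, then take every kth case."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Take a simple random sample of n_per_stratum from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters; keep all members of the chosen ones (assumption)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(7)
frame = list(range(1, 1001))  # hypothetical sampling frame of 1,000 subject IDs

sys_sample = systematic_sample(frame, k=10, rng=rng)

# Hypothetical strata: every stratum contributes to the sample.
strata = {"lower": frame[:600], "upper": frame[600:]}
strat_sample = stratified_sample(strata, n_per_stratum=20, rng=rng)

# Hypothetical clusters: only a random few clusters contribute.
clusters = {f"dorm_{d}": frame[d * 50:(d + 1) * 50] for d in range(20)}
clus_sample = cluster_sample(clusters, n_clusters=3, rng=rng)

print(len(sys_sample), {s: len(m) for s, m in strat_sample.items()}, sorted(clus_sample))
```

The key contrast: stratified sampling draws from every group, while cluster sampling draws from only some groups. A multistage design would sample again within the chosen clusters.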
16. Examples of sampling in typical surveys National Longitudinal Survey of Youth (NLSY)
12,686 men and women ages 14-22 in 1979.
includes a multistage sample designed to be nationally representative.
includes oversamples of Hispanic women and men, Black non-Hispanic women and men, poor white women and men, plus military subsamples, along with sampling weights.
A reinterview every two years loses some respondents (nonrandomly) to attrition.
Current Population Survey: http://www.census.gov/prod/2000pubs/tp63.pdf, section 14, especially Table 14-5 for the design effect (DEFF)