Soc 3155 Review Terms from Day 1 Descriptive Statistics
Review I • Variable = any trait that can change values from case to case. Its attributes must be: • Exhaustive: the variable should include all possible values/attributes • Mutually exclusive: no case should be able to have 2 attributes simultaneously • Attribute = specific value on a variable • The variable “sex” has two attributes (female and male) • Independent (X) and dependent (Y) variables • X (poverty) → Y (child abuse)
Review II • Levels of Measurement • Nominal • Only mutually exclusive & exhaustive (categories cannot be ordered) • Sex, type of religion, city of residence, etc. • Ordinal • Ability to rank categories (attributes) • Anything using Likert-type questions (e.g., strongly agree, agree, disagree, strongly disagree) • Interval/ratio • Equal distance between categories of variable • Age in years, months living in current house, number of siblings, population of Duluth… • This level permits all mathematical operations (e.g., someone who is 34 is twice as old as someone who is 17)
In-Class Assignment • Break into groups: 2-3 per group • Put names on top of assignment and write legibly • Develop six survey items that could be included as part of a general survey of UMD students. • Must include 2 examples of each type of variable: • Nominal (NOT gender or race/ethnicity) • Ordinal • Interval-ratio (NOT age) • Make sure to include all attributes of each • Remember: mutually exclusive & exhaustive
Review III • Types of Statistics • Descriptive Statistics • Data reduction (Univariate) • Measures of Association (Bivariate) • Inferential Statistics • Are relationships found in sample likely true in population? • Trick is finding correct statistic for particular data (level of measurement issues)
Basic Descriptive Statistics • All about data reduction and simplification • Organizing, graphing, describing…quantitative information • Researchers often use descriptive statistics to describe sample prior to more complex statistics • Proportions/percentages • Ratios and Rates • Percentage change • Frequency distributions • Cumulative frequency/percentage • Charts/Graphs
Data Reduction • Unavoidably: Information is lost • Example: Study of textbooks • 2 hypotheses: • Textbook prices are rising faster than inflation. • Textbooks are getting bigger (& heavier!) with time • Still, useful & necessary: • To make sense of data & • To answer questions/test hypotheses
Descriptive Statistics • Percentages & proportions: • Most common ways to standardize raw data • Provide a frame of reference for reporting results • Easier to read than frequencies • Formulas • Proportion(p) = (f/N) • Percentage (%) = (f/N) x 100
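The two formulas above can be checked with a short Python sketch. The function names are hypothetical, and the counts are made up for illustration; they are not taken from any table in these slides.

```python
def proportion(f, n):
    """Proportion p = f / N (base of 1)."""
    if n <= 0:
        raise ValueError("N must be positive")
    return f / n


def percentage(f, n):
    """Percentage = (f / N) x 100 (base of 100)."""
    return proportion(f, n) * 100


# Hypothetical counts, for illustration only: 25 cases in one category, N = 200.
f, n = 25, 200
print(proportion(f, n))   # 0.125
print(percentage(f, n))   # 12.5
```

Both numbers describe the same share of cases; only the base (1 versus 100) differs.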
Descriptive Statistics • Example: Prisoners Under Sentence of Death, by Region, 2006
Descriptive Statistics • Example (continued): Prisoners Under Sentence of Death, by Region, 2006 • Proportions use a base of 1; percentages use a base of 100
Comparisons between distributions are simpler with percentages • Example: Distribution of violent crimes in 2 different cities
Descriptive Statistics • Misconceptions arise with misuse of summary stats: • Example: A town of 90,000 experienced 2 homicides in 2000 and 4 homicides in 2001 • This is a 100% increase in homicides in just one year! • …But, the difference in raw numbers is only 2!
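The homicide example can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the formula used, ((new - old) / old) x 100, is the usual definition of percentage change, which the slide implies but does not write out.

```python
def percent_change(old, new):
    """Percentage change = ((new - old) / old) x 100."""
    if old == 0:
        raise ValueError("percentage change is undefined when the starting value is 0")
    return (new - old) / old * 100


# Figures from the slide: 2 homicides in 2000, 4 homicides in 2001.
homicides_2000, homicides_2001 = 2, 4

print(percent_change(homicides_2000, homicides_2001))  # 100.0
print(homicides_2001 - homicides_2000)                 # 2
```

Reporting the raw difference next to the percentage makes it harder to misread a change that rests on very small counts.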
Descriptive Statistics • Ratio – precise measure of the relative frequency of one category per unit of the other category • Ratio = f1 / f2 • Ratios are good for showing the relative predominance of 2 categories
Example: ratio of prisoners on death row, South compared to Midwest • 1,750 / 276 = 6.34 = roughly 6:1 or “six to one”
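As a quick check of the arithmetic, the same ratio can be computed in Python. The function name is hypothetical; the two counts are the ones given on the slide.

```python
def ratio(f1, f2):
    """Ratio = f1 / f2: cases in the first category per one case in the second."""
    if f2 == 0:
        raise ValueError("the comparison category must have at least one case")
    return f1 / f2


# Counts from the slide: death-row prisoners in the South and in the Midwest.
south, midwest = 1750, 276
print(round(ratio(south, midwest), 2))  # 6.34
```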
Making Your Argument w/Stats… • Example 2: Suppose that… • Company A increased its sales volume from one year to the next from $10M to $20M • Company B increased its sales from $40M to $70M • You could make two comparisons of sales progress (based on above info): • A increased its sales by $10M & B increased its sales by $30M, 3 times that of A (a ratio of 3:1!). • A increased its sales by 100%. B increased its sales by 75%, three-fourths the increase of A. Which is correct?
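Both comparisons can be worked through in a short Python sketch (function names are hypothetical; sales figures are in millions of dollars, as on the slide). The arithmetic behind each claim checks out, which suggests the two statements answer different questions rather than one of them being wrong.

```python
def absolute_change(before, after):
    """Raw difference between the two years."""
    return after - before


def percent_change(before, after):
    """Change relative to the starting value, as a percentage."""
    return (after - before) / before * 100


# Sales in $M, from the slide.
a_before, a_after = 10, 20
b_before, b_after = 40, 70

# Comparison 1: raw dollar increases.
a_abs = absolute_change(a_before, a_after)   # 10
b_abs = absolute_change(b_before, b_after)   # 30
print(b_abs / a_abs)                         # 3.0 (B's increase is 3 times A's)

# Comparison 2: percentage increases.
a_pct = percent_change(a_before, a_after)    # 100.0
b_pct = percent_change(b_before, b_after)    # 75.0
print(b_pct / a_pct)                         # 0.75 (B's increase is three-fourths of A's)
```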
Descriptive Statistics • Rate – proportion (p) multiplied by a useful “base” number that is a power of 10 (e.g., 1,000 or 100,000) • Example: As of the end of 2007: • MN had 9,468 prisoners • WI had 23,743 • TX had 171,790 • TX rate per 100,000 = (171,790 / 23,904,380) x 100,000 = 719 • MN and WI rate per 100,000? • MN population = 5,263,610 • WI population = 5,641,581
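The rate calculation can be sketched in Python as follows. The function name and the dictionary layout are hypothetical; the prisoner counts and populations are the ones on the slide, and the results are rounded to whole numbers to match the TX figure.

```python
def rate(count, population, base=100_000):
    """Rate = (count / population) x base."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * base


# Figures from the slide (end of 2007).
prisoners = {"TX": 171_790, "MN": 9_468, "WI": 23_743}
population = {"TX": 23_904_380, "MN": 5_263_610, "WI": 5_641_581}

for state in ("TX", "MN", "WI"):
    print(state, round(rate(prisoners[state], population[state])))
# TX 719
# MN 180
# WI 421
```

Because the rates share a base of 100,000, the three states can be compared directly even though their populations differ widely.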
Descriptive Statistics • Frequency distributions: • Tables that summarize the distribution of a variable by reporting the number of cases contained in each category of that variable
Frequency distributions – Examples: nominal-level and ordinal-level • Valid Percent – percent if you exclude missing values • Cumulative Percent – what percentage of cases fall at or below a given value?
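To show how the columns of a frequency table relate to one another, here is a minimal Python sketch. It is not SPSS output: the function name, the responses, and the use of None for a missing value are all assumptions made for illustration. Cumulative percent is built from valid percent, and it is only meaningful when the categories can be ordered (ordinal level or higher).

```python
from collections import Counter


def frequency_table(values, order):
    """Return rows of (category, f, percent, valid percent, cumulative percent).

    `values` may contain None for missing cases; `order` lists the
    categories from lowest to highest.
    """
    counts = Counter(v for v in values if v is not None)
    n_total = len(values)            # all cases, including missing
    n_valid = sum(counts.values())   # cases with a non-missing value
    if n_valid == 0:
        raise ValueError("no valid (non-missing) cases")

    rows = []
    cumulative = 0.0
    for category in order:
        f = counts.get(category, 0)
        percent = f / n_total * 100         # base: all cases
        valid_percent = f / n_valid * 100   # base: non-missing cases only
        cumulative += valid_percent         # running total of valid percent
        rows.append((category, f, percent, valid_percent, cumulative))
    return rows


# Hypothetical Likert-type responses; None marks a missing answer.
responses = ["sa", "a", "a", "d", None, "sd", "a", "sa", "d", None]
table = frequency_table(responses, order=["sa", "a", "d", "sd"])

for category, f, pct, valid_pct, cum_pct in table:
    print(f"{category:>2}  f={f}  %={pct:.1f}  valid %={valid_pct:.1f}  cum %={cum_pct:.1f}")
```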
Descriptive Statistics • Example: Homogeneity of attributes – how much detail is too much? • TOO MUCH? (too many categories?)
Descriptive Statistics • Too little?
Descriptive Statistics • Just right:
Homework #1 (Group Assignment) • Groups of 2 to 3 • Due next Wednesday (2/1) • Assignment has an SPSS component • Also involves searching for table of data on the Web • This will be the ONLY ASSIGNMENT where you turn in the same paper for a group
Interpreting Tables (Part B of HW) • Locating tables • Sourcebook of Criminal Justice Statistics • “Minnesota Milestones” Page • Addressing questions the HW asks • Contents of table: • Who collected data? What population does it represent? How many cases is the table based on? • Who might be interested in this information? What relevance might it have to policy? • Description of variables: Name each variable & its level of measurement.
SPSS (for Part C of HW) • Obtain copy of the 2010 GSS data set in SPSS format… • Go to: • Soc 3155 Homepage • In SPSS, under Edit → Options, click on “Display Names” & “Alphabetical” • SPSS procedures we’re covering today: • Running a frequency (getting a frequency distribution) • Recoding a variable
Recoding Exercise • From class survey data (off web site) • From the “nfl” variable, create the variable “packer” • Variable label = whether or not a person is a packer fan • Values: • 1 = Yes • 0 = No • From the “sibs” variable create the variable “large fam” • Variable label = whether or not a person has large family (3 or more siblings) • Values • 1 = Yes • 0 = No
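The logic of the two recodes can be sketched outside SPSS. The Python below is an illustration only, not the SPSS procedure used in class. The function names are hypothetical, None stands in for a missing value, and the label "Packers" is an assumed value of the “nfl” variable; the actual coding should be checked in the class survey data.

```python
def recode_large_fam(sibs):
    """1 = Yes (3 or more siblings), 0 = No; missing stays missing."""
    if sibs is None:
        return None
    return 1 if sibs >= 3 else 0


def recode_packer(nfl, packer_value="Packers"):
    """1 = Yes (Packer fan), 0 = No; missing stays missing.

    packer_value is an assumption about how the class survey codes "nfl".
    """
    if nfl is None:
        return None
    return 1 if nfl == packer_value else 0


# Hypothetical cases, for illustration only.
sibs = [0, 1, 3, 5, None, 2]
nfl = ["Packers", "Vikings", None, "Packers", "Bears", "Vikings"]

large_fam = [recode_large_fam(s) for s in sibs]
packer = [recode_packer(team) for team in nfl]

print(large_fam)  # [0, 0, 1, 1, None, 0]
print(packer)     # [1, 0, None, 1, 0, 0]
```

Keeping missing values missing, rather than letting them fall into the 0 = No category, avoids miscounting non-respondents as non-fans or as people with small families.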