
Sociology 601 (Martin) Lecture for week 2: September 9 - 11

Learn about measures of central tendency, variation, and creating charts in sociology research using STATA. Understand how to interpret data distributions and calculate statistical indicators.

pwilkerson


Presentation Transcript


  1. Sociology 601 (Martin) Lecture for week 2: September 9 - 11 • Chapter 3.1: • Making Charts • Chapter 3.2 – 3.5 (if time permits) • Measures of central tendency • Measures of variation • Walk-through of the STATA graphical user interface.

  2. Definitions for charts • frequency distribution: a graph listing intervals of possible values for a variable (on the x-axis), and number of observations in each interval (on the y-axis). • relative frequency distribution: as above, but the y-axis has the percent or proportion of observations in each interval. • bar graph: the variable is ordinal or nominal scale. • The bars should not touch • histogram: the variable is interval scale. • The bars should touch
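The two distribution definitions above can be sketched in a few lines of Python (chosen here only for illustration; the course itself uses STATA). The values in `rates` are made up and are not the 1993 state murder rates from the text, and the interval width of 3 is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical illustrative values (NOT the 1993 state murder rates from the text).
rates = [1.6, 2.9, 3.4, 3.9, 4.4, 5.0, 6.3, 6.8, 8.2, 9.0, 10.1, 11.4]
width = 3.0  # assumed equal-width intervals: [0, 3), [3, 6), [6, 9), [9, 12)

def interval_index(value, width):
    """Index k of the half-open interval [k*width, (k+1)*width) containing value."""
    return int(value // width)

# Each observation lands in exactly one interval.
counts = Counter(interval_index(r, width) for r in rates)
n = len(rates)

frequency = {}           # number of observations per interval
relative_frequency = {}  # proportion of observations per interval
for k in sorted(counts):
    label = f"{k * width:g}-{(k + 1) * width:g}"
    frequency[label] = counts[k]
    relative_frequency[label] = counts[k] / n

for label in frequency:
    print(f"{label:>5}  count={frequency[label]}  proportion={relative_frequency[label]:.3f}")
```

Half-open intervals are one way to guarantee that a value sitting exactly on a boundary (such as 9.0) is counted once and only once. Intervals with no observations are simply omitted in this sketch.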

  3. General Rules for Relative Frequency Distributions • Whether you are making a bar graph or histogram: • Make sure each observation is in one and only one category. • Use categories of equal width. • Choose an appealing number of categories. • Decide whether to provide labels. • Double-check your graph. • If you use fewer bars to describe the distribution of a variable, you lose information but gain clarity.

  4. Example from Text, p. 36 • Murders per 100,000 population, by State for 1993

  5. Frequency Distribution • Murders per 100,000 population for 1993, by State • What have we lost? What have we gained?

  6. Relative Frequency Distribution • Murders per 100,000 population, by State

  7. Collapsed Relative Frequency Distribution • Murders per 100,000 population, by State • What have we lost? What have we gained?
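As a rough Python illustration of the "lost versus gained" question, the sketch below collapses four narrow intervals into two wider ones. The counts are hypothetical and are not taken from the textbook's table.

```python
# Hypothetical counts per narrow interval (illustrative only, not the textbook's data).
narrow = {"0-3": 2, "3-6": 4, "6-9": 3, "9-12": 3}

# Collapse adjacent pairs of intervals into wider intervals of equal width.
collapsed = {
    "0-6": narrow["0-3"] + narrow["3-6"],
    "6-12": narrow["6-9"] + narrow["9-12"],
}

n = sum(narrow.values())
collapsed_relative = {label: count / n for label, count in collapsed.items()}
print(collapsed_relative)
```

With these made-up counts the collapsed chart shows two equal bars. That is simpler to read, but it hides the fact that the 3-6 interval was the most common one.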

  8. 3.2: Measuring central tendency - mean • Mean: sum of measurements divided by number of measurements. • Equation for the mean of a sample: $\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$ • or, if you don’t have an equation editor, Ybar = SUM(Yi) / n, where Ybar is the sample mean, Yi is a measurement of Y for case i, and n is the number of cases in the sample.
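A minimal Python sketch of Ybar = SUM(Yi) / n, using a small hypothetical sample (the numbers are invented for illustration):

```python
from statistics import mean

y = [2, 4, 4, 5, 10]  # hypothetical sample of n = 5 measurements
n = len(y)

ybar = sum(y) / n     # Ybar = SUM(Yi) / n
print(ybar)           # 5.0

# The standard-library helper gives the same answer.
assert ybar == mean(y)
```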

  9. Weighted means • Weighted sample mean: the sum of each measurement multiplied by its weight (the number of cases it represents), divided by the sum of the weights • Example: we could weight the state murder rates by the number of persons in each state in 1993 to get the mean murder rate for persons in the US • If n = 2 the equation for the weighted mean is $\bar{Y}_w = \frac{w_1 Y_1 + w_2 Y_2}{w_1 + w_2}$, where w1 and w2 are the weights for observations 1 and 2
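The two-observation case can be sketched in Python as follows. The rates and population weights are hypothetical, and the names `rates` and `weights` are illustrative choices rather than anything from the course data.

```python
# Hypothetical values for two states (illustrative only).
rates = [4.0, 10.0]    # Y1, Y2: murders per 100,000 population
weights = [3.0, 1.0]   # w1, w2: state populations in millions

# Weighted mean: sum of weight * measurement, divided by the sum of the weights.
weighted_mean = sum(w * y for w, y in zip(weights, rates)) / sum(weights)

# Unweighted mean treats both states as equally important.
unweighted_mean = sum(rates) / len(rates)

print(weighted_mean, unweighted_mean)  # 5.5 7.0
```

The weighted mean is pulled toward the rate of the larger state, which is why the mean rate for persons can differ from the mean rate for states.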

  10. 3.3 Other measures of central tendency • Median: the measurement that falls in the middle of an ordered sample • the median is the value of the 50th percentile • Percentile: the number such that p% of scores fall below it and (100-p)% of scores fall above it • Mode: the value that occurs most frequently
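A short Python sketch of the three definitions above, on a hypothetical sample. Percentile conventions differ between textbooks and software; the `"inclusive"` interpolation method used here is just one of them, so other tools may report slightly different percentiles for small samples.

```python
from statistics import median, mode, quantiles

y = [1, 2, 2, 3, 4, 7, 9]  # hypothetical ordered sample, n = 7

middle = median(y)         # value in the middle of the ordered sample
most_common = mode(y)      # value that occurs most frequently

# quantiles(..., n=100) returns the 99 cut points for percentiles 1 through 99.
percentiles = quantiles(y, n=100, method="inclusive")
p50 = percentiles[49]      # the 50th percentile

print(middle, most_common, p50)
```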

  11. 3.4: Measures of variation • range: the difference between the largest and smallest observations • interquartile range: the difference between the 75th and 25th percentiles • deviation: for any observation, the difference between that observation and the sample mean, Di = Yi - Ybar (one averaged measure of variation for a sample would be to take the mean of the absolute values of all the deviations for the sample)
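These measures can be sketched in Python on a hypothetical sample. The quartiles below use the `"inclusive"` interpolation method, which is an assumption; other percentile conventions can give a slightly different interquartile range.

```python
from statistics import mean, quantiles

y = [2, 4, 4, 5, 10]  # hypothetical sample
ybar = mean(y)

value_range = max(y) - min(y)                       # largest minus smallest

q1, q2, q3 = quantiles(y, n=4, method="inclusive")  # 25th, 50th, 75th percentiles
iqr = q3 - q1                                       # interquartile range

deviations = [yi - ybar for yi in y]                # Di = Yi - Ybar
mean_abs_dev = mean([abs(d) for d in deviations])   # mean of the absolute deviations

print(value_range, iqr, deviations, mean_abs_dev)
```

The raw deviations always sum to zero, which is why the sketch averages their absolute values rather than the deviations themselves.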

  12. Variance and standard deviation: the most common measures of variation • variance: the mean of the squared deviations for a sample, labeled s^2. • standard deviation: the square root of the variance, or the root mean squared deviation, labeled s.
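A Python sketch of the calculation on a hypothetical sample follows. One assumption to flag: the code divides the sum of squared deviations by n - 1, the usual sample formula (and the one Python's `statistics.variance` uses), whereas a literal "mean of the squared deviations" would divide by n.

```python
from math import sqrt
from statistics import stdev, variance

y = [2, 4, 4, 5, 10]  # hypothetical sample
n = len(y)
ybar = sum(y) / n

squared_deviations = [(yi - ybar) ** 2 for yi in y]

# Assumption: n - 1 denominator (sample variance), not n.
s2 = sum(squared_deviations) / (n - 1)
s = sqrt(s2)

print(s2, s)  # 9.0 3.0

# Cross-check against the standard library, which also divides by n - 1.
assert abs(variance(y) - s2) < 1e-12
assert abs(stdev(y) - s) < 1e-12
```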

  13. Practice: Calculate the mean, variance, and standard deviation.

  14. Interpreting the standard deviation. • s is (formally) the root mean squared deviation. • s is one version of the typical distance of an observation from the sample mean. • Because s accounts for squared deviations, it is affected by extreme scores. • Is this a desirable property? • Compare these samples: (-3,-3,+3,+3) vs (-2,-2,-2,+6) • Generally, for a continuous quantitative variable Y with a roughly bell-shaped distribution, about 68% of scores fall between Ybar - s and Ybar + s.
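The comparison of the two samples on this slide can be worked through in Python. The helper `mean_abs_deviation` is a name introduced here for illustration, and `stdev` uses the n - 1 denominator.

```python
from statistics import mean, stdev

a = [-3, -3, 3, 3]
b = [-2, -2, -2, 6]

def mean_abs_deviation(values):
    """Mean of the absolute deviations from the sample mean."""
    center = mean(values)
    return mean([abs(v - center) for v in values])

for sample in (a, b):
    print(mean(sample), mean_abs_deviation(sample), round(stdev(sample), 3))
# prints: 0 3 3.464
#         0 3 4.0
```

Both samples have mean 0 and a mean absolute deviation of 3, but the second has the larger s because squaring gives the single extreme score (+6) extra influence.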

  15. Interpreting sample statistics. • Recall that… • A statistic is a single number estimated from a sample. • A parameter is a single number that summarizes some quality of a variable in a population. • For means: • the population mean is μ (mu) • The sample mean Ybar is an estimator of μ. • For standard deviations: • the population standard deviation is σ (sigma), • The sample standard deviation s is an estimator of σ.
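To make the statistic/parameter distinction concrete, here is a small Python sketch. The "population" is a hypothetical list of 100 values, and the seed and sample size are arbitrary choices; nothing here comes from the course data.

```python
import random
from statistics import mean, pstdev, stdev

population = list(range(1, 101))  # hypothetical population of 100 values

mu = mean(population)             # parameter: population mean
sigma = pstdev(population)        # parameter: population standard deviation

rng = random.Random(601)          # fixed seed so the run is repeatable
sample = rng.sample(population, 25)

ybar = mean(sample)               # statistic: estimator of mu
s = stdev(sample)                 # statistic: estimator of sigma

print(f"mu={mu}  sigma={sigma:.2f}  Ybar={ybar}  s={s:.2f}")
```

Drawing a different sample (for example, by changing the seed) changes Ybar and s, while mu and sigma stay fixed.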

  16. A conceptual map of STATA

  17. The STATA windows environment - icons • Open (use) • Save • Print Results • Begin Log • Start viewer • Bring results window to front • Bring graph window to front • Do-file editor • Data editor • Data browser • Clear • Break

  18. The .do file: interface of choice for social research • Icons within the .do file: • New • Open • Save • Print • Find • Cut • Copy • Paste • Undo • Do current file • Run current file

  19. Sample commands in a .do file
      use "I:\601Fall08\socy601data.dta", clear
      summarize AGE
      summarize AGE [weight=ADULTS]
      tabulate AGE
      tabulate AGE [weight=ADULTS]
      clear

  20. How to create a log file • One approach is to use the log icon to start and stop a log. • Another approach is to type the log-starting command into a .do file:
      log using I:\601Fall08\week01hmwk.txt, replace
      *. . . (your work here) . . .
      log close
