
MGMT 276: Statistical Inference in Management Room 103 CESL Fall, 2011.


masao

Presentation Transcript


  1. MGMT 276: Statistical Inference in Management. Room 103 CESL. Fall, 2011. Welcome. Remember to hold onto homework until we have a chance to cover it. http://www.youtube.com/watch?v=Ahg6qcgoay4&watch_response http://www.youtube.com/watch?v=oSQJP40PcGI
  2. Review of Homework Worksheet
  3. Use this as your study guide. By the end of lecture today (9/8/11): Field observation/naturalistic research; Time series design vs. cross-sectional design; Simple versus systematic random sampling; Sample frame and randomization; Stratified sampling, cluster sampling, judgment sampling; Snowball sampling, convenience sampling; Questionnaire design and evaluation; Dot plots; Frequency distributions and frequency histograms; Frequency, cumulative frequency; Relative frequency, cumulative relative frequency; Guidelines for constructing frequency distributions; Three primary types of “measures of central tendency”: mean, median, mode.
  4. No homework due September 13th. Please double check: all cell phones and other electronic devices are turned off and stowed away.
  5. Time series versus cross-sectional comparisons: trends over time versus a snapshot comparison. Time series design: Each observation represents a measurement at some point in time. Repeated measurements allow us to see trends. This is similar to longitudinal design. Cross-sectional design: Each observation represents a measurement at some point in time. Comparing across groups allows us to see differences. Please note: Any one piece of data can often (not always) be used in either a time series comparison or a cross-sectional comparison. It depends how you set up your question. Example, traffic accidents: Does Tucson have more traffic accidents as the year ends and winter approaches? (time series) Does Tucson or Albuquerque have more traffic accidents (they have similar population sizes)? (cross-sectional)
  6. Time series versus cross-sectional comparisons: trends over time versus a snapshot comparison. Time series design: Each observation represents a measurement at some point in time. Repeated measurements allow us to see trends. Cross-sectional design: Each observation represents a measurement at some point in time. Comparing across groups allows us to see differences. Unemployment rate. Is there an increase in workers calling in sick as the summer months approach? (time series) Do more young workers call in sick than older workers? (cross-sectional) Grade point average (GPA): Does GPA tend to go up or down as students move from freshman to sophomore to junior to senior? (time series) Does GPA tend to be higher or lower when you compare Mr. Chen’s and Mr. Frank’s Freshman English classes? (cross-sectional)
  7. Naturalistic Research Naturalistic research: Descriptive method in which observations are made in a natural social setting. Also called field observation. Survey is a series of self-report measures administered either through an interview or a written questionnaire Behavioral data is a measurement of observable actions in natural setting
  8. Population (census) versus sample; parameter versus statistic. Parameter: measurement or characteristic of the population. Usually unknown (only estimated). Usually represented by Greek letters (µ, pronounced “mu” or “mew”). Statistic: numerical value calculated from a sample. Usually represented by Roman letters (x̄, pronounced “x bar”). How is a census different from a sample?
  9. Simple random sampling: each person from the population has an equal probability of being included. Sample frame = how you define the population. Let’s take a sample … a random sample. Question: What is the average weight of a U of A football player? Sample frame: the population of the U of A football team. Random number table: a list of random numbers (for example, pick the 24th name on the list). Or you can use Excel to provide a number for a random sample: =RANDBETWEEN(1,110). If it returns 64, pick the 64th name on the list (64 is just an example here).
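
The Excel draw on this slide can also be sketched in Python. This is a minimal illustration, not part of the original lecture: the roster size of 110 comes from the `=RANDBETWEEN(1,110)` formula, while the sample size of 10 and the fixed seed are assumptions made only so the example is concrete and repeatable.

```python
import random

ROSTER_SIZE = 110   # upper bound taken from =RANDBETWEEN(1,110) on the slide
SAMPLE_SIZE = 10    # assumption: the slide does not say how many players to draw

rng = random.Random(276)  # arbitrary fixed seed so the draw is repeatable

# One draw, like a single =RANDBETWEEN(1,110): a position on the roster list.
single_pick = rng.randint(1, ROSTER_SIZE)

# A simple random sample without replacement: every position is equally
# likely to be chosen, and no player is picked twice.
positions = rng.sample(range(1, ROSTER_SIZE + 1), SAMPLE_SIZE)

print(single_pick, sorted(positions))
```

Calling RANDBETWEEN repeatedly can return the same position twice; `sample` avoids that by drawing without replacement.
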
  10. Systematic random sampling: A probability sampling technique that involves selecting every kth person from a sampling frame You pick the number Other examples of systematic random sampling 1) check every 2000th light bulb 2) survey every 10th voter
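
Selecting every kth person can be sketched as a list slice. The function name `systematic_sample`, the numbered voter list, and the random starting offset are illustrative assumptions; the slide only says to select every kth person from the sampling frame.

```python
import random

def systematic_sample(frame, k, start=None):
    """Return every k-th element of frame, beginning at offset `start`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if start is None:
        # Assumption: begin at a random position among the first k entries.
        start = random.randrange(k)
    return frame[start::k]

voters = list(range(1, 101))  # hypothetical frame: voters numbered 1..100

# Survey every 10th voter, starting with the 10th (index 9).
sample = systematic_sample(voters, k=10, start=9)
print(sample)  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```
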
  11. Stratified sampling: sampling technique that involves dividing a population into subgroups (or strata) and then selecting samples from each of these groups. The technique can maintain ratios for the different groups. Example, average number of speeding tickets: 12% of sample is from California, 7% from Texas, 6% from Florida, 6% from New York, 4% from Illinois, 4% from Ohio, 4% from Pennsylvania, 3% from Michigan, etc. Example, average cost of textbooks for a semester: 17.7% of sample are Pre-business majors, 4.6% are Psychology majors, 2.8% are Biology majors, 2.4% are Architecture majors, etc.
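
One way to keep the ratios for the different groups is proportional allocation, sketched below. The population of 1,000 students, the "Other" stratum, and the function name `stratified_sample` are hypothetical; only the four percentages for the named majors come from the slide.

```python
import random

def stratified_sample(strata, total_n, rng):
    """Draw from each stratum in proportion to its share of the population."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; because of rounding, the stratum sizes
        # are not guaranteed to add up to exactly total_n.
        n = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical population of 1,000 students, grouped by major.
strata = {
    "Pre-business": [f"PB{i}" for i in range(177)],   # 17.7%
    "Psychology": [f"PS{i}" for i in range(46)],      # 4.6%
    "Biology": [f"BI{i}" for i in range(28)],         # 2.8%
    "Architecture": [f"AR{i}" for i in range(24)],    # 2.4%
    "Other": [f"OT{i}" for i in range(725)],          # assumed remainder
}

picked = stratified_sample(strata, total_n=100, rng=random.Random(0))
print({major: len(people) for major, people in picked.items()})
```
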
  12. Cluster sampling: sampling technique that divides a population into subgroups (or clusters) by region or physical space. Can either measure everyone or select samples from each cluster. Examples: Textbook prices: Southwest schools, Midwest schools, Northwest schools, etc. Average student income, surveyed by area: Old Main area, near McClelland, around Main Gate, etc. Patient satisfaction for a hospital: 7th floor (near maternity ward), 5th floor (near physical rehab), 2nd floor (near trauma center), etc.
  13. Convenience sampling: sampling technique that involves sampling people nearby. A non-random sample and vulnerable to bias Snowball sampling: a non-random technique in which one or more members of a population are located and used to lead the researcher to other members of the population Used when we don’t have any other way of finding them Also vulnerable to biases
  14. Judgment sampling: sampling technique that involves sampling people who an expert says would be useful. A non-random sample and vulnerable to bias Focus group: members can be randomly or not randomly selected. Mediator gathers opinion and information from group. Information can be qualitative or quantitative
  15. Describing Data Visually. Lists of numbers make it too hard to see patterns: 14 17 20 25 21 29 16 25 27 18 16 13 11 21 19 24 20 11 20 28 16 13 17 14 14 16 8 17 17 11 11 14 17 19 24 8 16 12 25 9 20 17 11 14 16 18 22 14 18 23 12 15 10 13 15 11 11. Organizing the numbers helps: 8 11 14 17 19 24 8 12 14 17 20 25 9 12 15 17 20 25 10 13 15 17 20 25 11 13 16 17 20 27 11 13 16 17 21 28 11 14 16 18 21 29 11 14 16 18 22 11 14 16 18 23 11 14 16 19 24. A graphical representation is even more clear. This is a dot plot.
  16. Describing Data Visually. 8 12 14 17 19 24 8 12 14 17 20 25 9 13 15 17 20 25 10 13 15 17 20 25 11 13 16 17 20 27 11 13 16 17 21 28 11 14 16 18 21 29 11 14 16 18 22 11 14 16 18 23 11 14 16 19 24. Measuring the “frequency of occurrence”: We’ve got to put these data into groups (“bins”). Then figure the “frequency of occurrence” for the bins.
  17. Frequency distributions: an organized list of observations and their frequency of occurrence. How many kids are in your family? What is the most common family size?
  18. Another example: How many kids in your family? Number of kids in family: 1 3 1 4 2 4 2 8 2 14
  19. Frequency distributions. Number of kids in family: 1 3 1 4 2 4 2 8 2 14. How many kids are in your family? What is the most common family size? Crucial guidelines for constructing frequency distributions: 1. Classes should be mutually exclusive: each observation should be represented only once (no overlap between classes). Wrong: 0 - 5, 5 - 10, 10 - 15. Correct: 0 - 4, 5 - 9, 10 - 14. Correct: 0 - under 5, 5 - under 10, 10 - under 15. 2. Set of classes should be exhaustive: should include all possible data values (no data points should fall outside the range). Correct: 0 - 3, 4 - 7, 8 - 11, 12 - 15. Wrong: 0 - 3, 4 - 7, 8 - 11 (no place for our family of 14!).
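
The two guidelines on this slide can be checked mechanically. The sketch below assumes classes are given as inclusive integer limits in ascending order; `check_classes` is a hypothetical helper name, not something from the lecture.

```python
def check_classes(classes, data):
    """classes: list of inclusive (low, high) limits in ascending order.

    Returns (overlaps, unplaced): pairs of adjacent classes that share a
    value, and data values that fall in no class at all.
    """
    overlaps = [
        (a, b) for a, b in zip(classes, classes[1:]) if b[0] <= a[1]
    ]
    unplaced = [
        x for x in data if not any(lo <= x <= hi for lo, hi in classes)
    ]
    return overlaps, unplaced

kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

overlapping = check_classes([(0, 5), (5, 10), (10, 15)], kids)  # not mutually exclusive
too_short = check_classes([(0, 3), (4, 7), (8, 11)], kids)      # not exhaustive
good = check_classes([(0, 4), (5, 9), (10, 14)], kids)          # passes both checks

print(overlapping)
print(too_short)
print(good)
```
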
  20. Frequency distributions. Number of kids in family: 1 3 1 4 2 4 2 8 2 14. How many kids are in your family? What is the most common family size? Crucial guidelines for constructing frequency distributions: 3. All classes should have equal intervals (even if the frequency for that class is zero). Correct: 0 - 4, 5 - 9, 10 - 14. Correct: 0 - under 5, 5 - under 10, 10 - under 15. Wrong: 0 - 4, 8 - 12, 14 - 19 (missing space for families of 5, 6, or 7). 4. Class width should be round (easy) numbers. Round numbers: 5, 10, 15, 20, etc. or 3, 6, 9, 12, etc. Remember: This is all about helping readers understand quickly and clearly. 5. Lower boundary can be a multiple of interval size. Clear & easy: 8 - 11, 12 - 15, 16 - 19, 20 - 23, 24 - 27, 28 - 31. 6. Try to avoid open-ended classes, for example: 10 and above; greater than 100; less than 50.
  21. Let’s do one. Scores on an exam. Step 1: List scores: 82 58 64 80 75 72 87 73 88 94 84 78 93 69 70 60 53 84 76 87 84 61 89 95 87 91 75 99. Step 2: List scores in order: 53 58 60 61 64 69 70 72 73 75 75 76 78 80 82 84 84 84 87 87 87 88 89 91 93 94 95 99. Step 3: Decide whether grouped or ungrouped. If fewer than 10 groups, “ungrouped” is fine. If more than 10 groups, “grouped” might be better. How to figure how many values: largest number - smallest number + 1. Here, 99 - 53 + 1 = 47. Step 4: Generate number and size of intervals (or size of bins). With 10 bins and an interval of 5 (Score: Frequency): 95 - 99: 2; 90 - 94: 3; 85 - 89: 5; 80 - 84: 5; 75 - 79: 4; 70 - 74: 3; 65 - 69: 1; 60 - 64: 3; 55 - 59: 1; 50 - 54: 1. With 6 bins and an interval of 8 (Score: Frequency): 93 - 100: 4; 85 - 92: 6; 77 - 84: 6; 69 - 76: 7; 61 - 68: 2; 53 - 60: 3.
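
The steps on this slide can be sketched in a few lines of Python. The scores and both bin layouts come from the slide; the helper name `grouped_frequencies` and its parameters are assumptions made for illustration.

```python
from collections import Counter

scores = [82, 58, 64, 80, 75, 72, 87, 73, 88, 94, 84, 78, 93, 69,
          70, 60, 53, 84, 76, 87, 84, 61, 89, 95, 87, 91, 75, 99]

def grouped_frequencies(data, lowest, width, n_bins):
    """Return (low, high, frequency) for n_bins equal-width integer classes.

    Values below `lowest` or above the last class are not counted, so the
    classes must be chosen to be exhaustive.
    """
    counts = Counter((x - lowest) // width for x in data)
    table = []
    for i in range(n_bins):
        low = lowest + i * width
        table.append((low, low + width - 1, counts.get(i, 0)))
    return table

# How many possible values: largest - smallest + 1
span = max(scores) - min(scores) + 1  # 99 - 53 + 1 = 47

five_wide = grouped_frequencies(scores, lowest=50, width=5, n_bins=10)
eight_wide = grouped_frequencies(scores, lowest=53, width=8, n_bins=6)

# Print highest class first, as on the slide.
for low, high, freq in reversed(five_wide):
    print(f"{low} - {high}: {freq}")
```

Both layouts account for all 28 scores; the interval of 5 starts at a round lower boundary (50), which is what the earlier guidelines recommend.
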
  22. Scores on an exam: 82 58 64 80 75 72 87 73 88 94 84 78 93 69 70 60 53 84 76 87 84 61 89 95 87 91 75 99. Where are we?
      Score     Frequency  Relative Frequency  Cumulative Frequency  Cumulative Rel. Freq.
      95 - 99   2          .0714               28                    1.0000
      90 - 94   3          .1071               26                    .9286
      85 - 89   5          .1786               23                    .8214
      80 - 84   5          .1786               18                    .6429
      75 - 79   4          .1429               13                    .4643
      70 - 74   3          .1071               9                     .3214
      65 - 69   1          .0357               6                     .2143
      60 - 64   3          .1071               5                     .1786
      55 - 59   1          .0357               2                     .0714
      50 - 54   1          .0357               1                     .0357
      [Figures: Cumulative Frequency Data; Cumulative Frequency Histogram]
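
The columns of this table can be derived from the frequencies alone. The sketch below uses the class frequencies from the slide, lowest class first. Dividing the cumulative counts by n, rather than adding up already-rounded relative frequencies, is a choice made here to avoid accumulated rounding, so the last digit can differ slightly from a table built by hand.

```python
from itertools import accumulate

# (class, frequency) pairs from the slide, lowest class first
freq_table = [("50 - 54", 1), ("55 - 59", 1), ("60 - 64", 3), ("65 - 69", 1),
              ("70 - 74", 3), ("75 - 79", 4), ("80 - 84", 5), ("85 - 89", 5),
              ("90 - 94", 3), ("95 - 99", 2)]

freqs = [f for _, f in freq_table]
n = sum(freqs)                              # total number of scores

cum_freq = list(accumulate(freqs))          # running total of frequencies
rel_freq = [f / n for f in freqs]           # share of all scores in each class
cum_rel_freq = [c / n for c in cum_freq]    # share of scores at or below each class

for (label, f), cf, rf, crf in zip(freq_table, cum_freq, rel_freq, cum_rel_freq):
    print(f"{label}  {f:2d}  {cf:2d}  {rf:.4f}  {crf:.4f}")
```
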
  23. Pareto Chart: Categories are displayed in descending order of frequency
  24. Stacked Bar Chart: Bar Height is the sum of several subtotals
  25. Simple Line Charts: Often used for time series data (continuous data)(the space between data points implies a continuous flow) Note: For multiple variables lines can be better than bar graph Note: Fewer grid lines can be more effective Note: Can use a two-scale chart with caution
  26. Pie Charts: General idea of data that must sum to a total (these are problematic and overused; use with much caution). Exploded 3-D pie charts look cool but a simple 2-D chart may be more clear. Bar charts can often be more effective.
  27. Overview: Frequency distributions; the normal curve. Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency, 2) dispersion, or 3) shape: Mean, Median, Mode, Trimmed Mean; Standard deviation, Variance, Range, Mean Absolute Deviation; Skewed right, skewed left, unimodal, bimodal, symmetric.
  28. Another example: How many kids in your family? Number of kids in family: 1 4 3 2 1 8 4 2 2 14
  29. Measures of Central Tendency (measures of location): the mean, median and mode. Measures of “location”: where on the number line the scores tend to cluster. Mean: The balance point of a distribution. Found by adding up all observations and then dividing by the number of observations. Mean for a sample: x̄ = Σx / n. Mean for a population: µ (mu) = ΣX / N. Note: Σ = add up; x or X = scores; n or N = number of scores.
  30. Measures of Central Tendency (measures of location): the mean, median and mode. Mean: The balance point of a distribution. Found by adding up all observations and then dividing by the number of observations. Number of kids in family: 1 4 3 2 1 8 4 2 2 14. Mean for a sample: x̄ = Σx / n = 41 / 10 = 4.1. Note: Σ = add up; x or X = scores; n or N = number of scores.
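
The arithmetic on this slide can be reproduced directly. This is a small check using the family-size data from the slide; the variable names are illustrative.

```python
kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]

total = sum(kids)   # Σx: add up all observations
n = len(kids)       # n: number of observations
mean = total / n    # sample mean, x̄ = Σx / n

print(total, n, mean)  # 41 10 4.1
```
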
  31. Number of kids in family: 1 3 1 4 2 4 2 8 2 14. How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least). Unordered: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14. Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14.
  32. Number of kids in family: 1 3 1 4 2 4 2 8 2 14. How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least). Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14. The two middle values are 2 and 3, so median = (2 + 3) / 2 = 2.5. If there appear to be two medians, take the mean of the two. Median always has a percentile rank of 50% regardless of shape of distribution.
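
The median rule above (order the observations, then take the middle value or the mean of the two middle values) can be sketched as follows, with Python's standard `statistics.median` used only as a cross-check.

```python
import statistics

kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

ordered = sorted(kids)          # [1, 1, 2, 2, 2, 3, 4, 4, 8, 14]
mid = len(ordered) // 2

if len(ordered) % 2 == 0:
    # Even number of observations: average the two middle values (2 and 3).
    median = (ordered[mid - 1] + ordered[mid]) / 2
else:
    # Odd number of observations: the single middle value.
    median = ordered[mid]

print(ordered, median)
```
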
  33. Number of kids in family 1 3 1 4 2 4 2 8 2 14 How many kids are in your family? What is the most common family size? Median: The middle value when observations are ordered from least to most (or most to least)
  34. Mode: The value of the most frequent observation. Number of kids in family: 1 3 1 4 2 4 2 8 2 14. Score (f): 1 (2), 2 (3), 3 (1), 4 (2), 5 (0), 6 (0), 7 (0), 8 (1), 9 (0), 10 (0), 11 (0), 12 (0), 13 (0), 14 (1). Please note: The mode is “2” because it is the most frequently occurring score. It occurs “3” times. “3” is not the mode, it is just the frequency for the value that is the mode. Bimodal distribution: If there are two most frequent observations.
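
The distinction between the mode and its frequency can be made explicit in code. `modes_of` is a hypothetical helper that returns both, and it reports more than one value when the data are bimodal; the second data set is made up for illustration.

```python
from collections import Counter

def modes_of(data):
    """Return (list of most frequent values, their shared frequency)."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, c in counts.items() if c == top), top

kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

modes, frequency = modes_of(kids)
print(modes, frequency)          # the mode is 2; it occurs 3 times

# Made-up data with two most frequent values (bimodal).
print(modes_of([1, 1, 2, 2, 3]))
```
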
  35. What about central tendency for qualitative data? Mode is good for nominal or ordinal data Median can be used with ordinal data Mean can be used with interval or ratio data
  36. Overview: Frequency distributions; the normal curve. Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency, 2) dispersion, or 3) shape: Mean, Median, Mode, Trimmed Mean; Standard deviation, Variance, Range, Mean Absolute Deviation; Skewed right, skewed left, unimodal, bimodal, symmetric.
  37. A little more about frequency distributions An example of a normal distribution
  38. A little more about frequency distributions An example of a normal distribution
  39. A little more about frequency distributions An example of a normal distribution
  40. A little more about frequency distributions An example of a normal distribution
  41. A little more about frequency distributions An example of a normal distribution
  42. Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Normal distribution In all distributions: mode = tallest point median = middle score mean = balance point In a normal distribution: mode = mean = median
  43. Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Positively skewed distribution In all distributions: mode = tallest point median = middle score mean = balance point In a positively skewed distribution: mode < median < mean Note: mean is most affected by outliers or skewed distributions
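
The family-size data from the earlier slides has a long right tail (one family of 14), so it can serve as a rough check of the ordering stated here and of the note about outliers. This is an illustration on one small data set; the ordering mode < median < mean is the typical pattern for positive skew, not a guarantee for every data set.

```python
import statistics

kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]
without_outlier = [x for x in kids if x != 14]  # drop the family of 14

def summary(data):
    """Return (mode, median, mean) for a list of numbers."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

mode_all, median_all, mean_all = summary(kids)
mode_cut, median_cut, mean_cut = summary(without_outlier)

print(mode_all, median_all, mean_all)   # with the outlier
print(mode_cut, median_cut, mean_cut)   # without the outlier
```

Removing the single largest value moves the mean more than it moves the median, and leaves the mode unchanged.
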
  44. Measure of central tendency: describes how scores tend to cluster toward the center of the distribution Negatively skewed distribution In all distributions: mode = tallest point median = middle score mean = balance point In a negatively skewed distribution: mean < median < mode Note: mean is most affected by outliers or skewed distributions
  45. Mode: The value of the most frequent observation. Bimodal distribution: Distribution with two most frequent observations (2 peaks). Example: Ian coaches two boys’ baseball teams. One team is made up of 10-year-olds and the other is made up of 16-year-olds. When he measured the height of all of his players he found a bimodal distribution.
  46. Remember… [Figure: frequency histogram; x-axis “Score on Exam” (10 to 100), y-axis “Frequency”] Note: Label and numbers. Note: Always “frequency”.
  47. Examples of data that would produce three of these shapes
  48. Thank you! See you next time!!