Economic Reasoning Using Statistics

Presentation Transcript


  1. Economic Reasoning Using Statistics Econ 138 Dr. Adrienne Ohler

  2. How you will learn. • Textbook: Stats: Data and Models, 2nd Ed., by Richard D. De Veaux, Paul E. Velleman, and David E. Bock • Homework: MyStatLab, brought to you by www.coursecompass.com

  3. The rest of this class • Attendance Policy • Cellphone Policy • Homeworks (10 out of 12) • Due Sundays by 11:59pm • Quizzes (5 out of 6) • Exams • Oct. 10th • Nov. 28 • Cumulative Optional Final • Data Project

  4. Help for this Class • READ THE BOOK • Come to class prepared and awake • READ THE BOOK • Office Hours: T, H 9-11am and by Appointment • READ THE BOOK • Get a tutor at the Visor Center

  5. Economic reasoning using statistics • What is economics? • The study of scarcity, incentives, and choices. • The branch of knowledge concerned with the production, consumption, and transfer of wealth. (google) • Wealth • The health, happiness, and fortunes of a person or group. (google) • What is/are statistics? • Statistics (the discipline) is a way of reasoning, a collection of tools and methods, designed to help us understand the world. • Statistics (plural) are particular calculations made from data. • Data are values with a context.

  6. Statistics • Statistics (the discipline) is a way of reasoning, a collection of tools and methods, designed to help us understand the world. • Will the sun rise tomorrow?

  7. What is Statistics Really About? • A statistic is a number calculated from data that summarizes a characteristic of those data (e.g., average, standard deviation, maximum, minimum, range). • Statistics is about variation. • All measurements are imperfect, since there is variation that we cannot see. • Statistics helps us to understand the real, imperfect world in which we live and it helps us to get closer to the unveiled truth.

  8. The language of Statistics • Form of literacy • 4 cows in a field • 7 cows by the road • 4 cows in a field on the left • 3 cows in a field on the right • At a party • Average age is 18 • Average age is 22 • Average age is 75

  9. In this class • Observe the real world • Create a hypothesis • Collect data • Understand and classify our data • Graph our data • Standardize our data • Apply probability rules to our data • Test our hypothesis • Interpret our results

  10. Questioning a Statistic • ½ of all American children will witness the breakup of a parent’s marriage. Of these, close to ½ will also see the breakup of a parent’s second marriage. • (Furstenberg et al., American Sociological Review, 1983) • 66% of the total adult population in this country is currently overweight or obese. • (http://win.niddk.nih.gov/statistics/) • 28% of American adults have left the faith in which they were raised in favor of another religion, or no religion at all. • (http://religions.pewforum.org/reports)

  11. Chapter 2 - What Are Data? • Information • Data can be numbers, record names, or other labels. • Not all data represented by numbers are numerical data (e.g., 1=male, 2=female). • Data are useless without their context…

  12. The “W’s” • To provide context we need the W’s • Who • What (and in what units) • When • Where • Why (if possible) • and How of the data. • Note: the answers to “who” and “what” are essential.

  13. Who • The Who of the data tells us the individual cases about which (or whom) we have collected data. • Individuals who answer a survey are called respondents. • People on whom we experiment are called subjects or participants. • Animals, plants, and inanimate subjects are called experimental units. • Sometimes people just refer to data values as observations and are not clear about the Who. • But we need to know the Who of the data so we can learn what the data say.

  14. Identify the Who in the following dataset. • Are physically fit people less likely to die of cancer? • Suppose an article in a sports medicine journal reported results of a study that followed 22,563 men aged 30 to 87 for 5 years. • The physically fit men had a 57% lower risk of death from cancer than the least fit group.

  15. Who are they studying? • The cause of death for 22,563 men in the study • The fitness level of the 22,563 men in the study • The age of each of the 22,563 men in the study • The 22,563 men in the study

  16. What and Why • Variables are characteristics recorded about each individual. • Each variable should have a name that identifies What has been measured. • A categorical (or qualitative) variable names categories and answers questions about how cases fall into those categories. • Categorical examples: sex, race, ethnicity

  17. What and Why (cont.) • A quantitative variable is a measured variable (with units) that answers questions about the quantity of what is being measured. • Quantitative examples: income ($), height (inches), weight (pounds)

  18. What and Why (cont.) • Example: In a fitness evaluation, one question asked respondents to evaluate the statement “I consider myself physically fit” on the following scale: • 1 = Disagree Strongly; • 2 = Disagree; • 3 = Neutral; • 4 = Agree; • 5 = Agree Strongly. • Question: Is fitness categorical or quantitative?

  19. What and Why (cont.) • We sense an order to these ratings, but there are no natural units for the variable fitness. • Variables like fitness are often called ordinal variables. • With an ordinal variable, look at the Why of the study to decide whether to treat it as categorical or quantitative.
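
The choice between the two treatments of an ordinal variable can be sketched in Python. The responses below are hypothetical, made up only for illustration, and the variable names are assumptions rather than anything from the course materials.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses to "I consider myself physically fit" (1-5 scale).
responses = [4, 5, 3, 4, 2, 4, 5, 3]

labels = {1: "Disagree Strongly", 2: "Disagree", 3: "Neutral",
          4: "Agree", 5: "Agree Strongly"}

# Categorical treatment: count how many respondents chose each rating.
counts = Counter(labels[r] for r in responses)

# Quantitative treatment: average the numeric codes. This assumes the gaps
# between adjacent ratings are equal, which the scale does not guarantee.
average_rating = mean(responses)

print(counts)
print(average_rating)
```

Counting fits a question such as "how many people agree?", while averaging fits a comparison of overall ratings between groups; the Why of the study decides which is appropriate.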

  20. Are Fit People Less Likely to Die of Cancer? • Who is the population of interest? • All people • All men who exercise • All men who die of cancer • All men

  21. Identifying Identifiers • Identifier variables are categorical variables with exactly one individual in each category. • Examples: Social Security Number, ISBN, FedEx Tracking Number • Don’t be tempted to analyze identifier variables. • Be careful not to consider all variables with one case per category, like year, as identifier variables. • The Why will help you decide how to treat identifier variables.
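
A rough way to flag candidate identifier variables is to check whether every value occurs exactly once. The sketch below uses a hypothetical helper, `looks_like_identifier`, and made-up columns. It only flags candidates: a variable such as year can also have one case per category, so the Why still decides how to treat it.

```python
from collections import Counter

def looks_like_identifier(values):
    """Return True when the column is non-empty and every value occurs once."""
    counts = Counter(values)
    return len(values) > 0 and all(n == 1 for n in counts.values())

# Hypothetical columns from a small student dataset.
student_ids = ["A101", "A102", "A103", "A104"]
class_years = ["Freshman", "Junior", "Freshman", "Senior"]

print(looks_like_identifier(student_ids))   # every ID is unique
print(looks_like_identifier(class_years))   # "Freshman" repeats
```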

  22. Counts Count • When we count the cases in each category of a categorical variable, the counts are not the data, but something we summarize about the data. • The category labels are the What, and • the individuals counted are the Who.

  23. Where, When, and How • When and Where give us some nice information about the context. • Example: Values recorded at a large public university may mean something different than similar values recorded at a small private college.

  24. Where, When, and How • GPA of Econ 101 classes. • Class 1 – 2.56 • Class 2 – 3.34 • Where – Washington State University • When – during the fall and spring semesters

  25. Where, When, and How (cont.) • How the data are collected can make the difference between insight and nonsense. • Example: results from voluntary Internet surveys are often useless • Example: Data collection of ‘Who will win Republican Primary?’ • Survey ISU students on campus • Run a Facebook survey • Rasmussen Reports national telephone survey

  26. Why is statistics challenging? • Word problems… • Rules of statistics don’t change • Data is information • If you are struggling with a problem, always ask the W questions about the data collected. • Who • What • When • Where • Why

  27. Chapter 3 • Displaying and Describing • Categorical Data

  28. Methods of Displaying Data • Frequency Table • Relative Frequency table • Bar Chart • Relative Frequency bar chart • Pie Chart • Contingency table • Contingency tables and Conditional Distributions • Segmented Bar charts

  29. Data On Students

  30. Frequency Tables: Making Piles • We can “pile” the data by counting the number of data values in each category of interest. • We can organize these counts into a frequency table, which records the totals and the category names.

  31. Frequency Tables: Making Piles (cont.) • A relative frequency table is similar, but gives the percentages (instead of counts) for each category.
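
As a minimal sketch of "making piles", the Python below builds a frequency table and a relative frequency table from a list of category labels. The survey responses are hypothetical, and the function names are chosen here for illustration.

```python
from collections import Counter

# Hypothetical answers to "What year in school are you?" (illustrative only).
years = ["Freshman", "Sophomore", "Freshman", "Junior", "Senior",
         "Freshman", "Sophomore", "Junior", "Freshman", "Sophomore"]

def frequency_table(values):
    """Count the number of cases in each category."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Give each category's share of all cases, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

freq = frequency_table(years)
rel_freq = relative_frequency_table(years)
for category in freq:
    print(f"{category:<10} {freq[category]:>3} {rel_freq[category]:>6.1f}%")
```

The counts answer "how many?" and the percentages answer "what share?"; the percentages always sum to 100.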

  32. Bar Charts • A bar chart displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison. • A bar chart stays true to the area principle. • Thus, a better display for the ship data is:

  33. Bar Charts (cont.) • A relative frequency bar chart displays the relative proportion of counts for each category. • A relative frequency bar chart also stays true to the area principle. • Replacing counts with percentages in the ship data:

  34. What year in school are you? • Freshman • Sophomore • Junior • Senior

  35. Pie Charts • When you are interested in parts of the whole, a pie chart might be your display of choice. • Pie charts show the whole group of cases as a circle. • They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.
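
The proportionality rule for pie slices can be written as a short calculation: each category gets a share of the 360 degrees equal to its fraction of the whole. The counts below are hypothetical and `pie_slice_angles` is an illustrative name.

```python
def pie_slice_angles(counts):
    """Return each category's slice angle in degrees (slices sum to 360)."""
    total = sum(counts.values())
    return {category: 360 * n / total for category, n in counts.items()}

# Hypothetical counts of students by year in school.
class_counts = {"Freshman": 4, "Sophomore": 3, "Junior": 2, "Senior": 1}
angles = pie_slice_angles(class_counts)

for category, angle in angles.items():
    print(f"{category:<10} {angle:6.1f} degrees")
```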

  36. Methods of Displaying Data • Frequency Table (How much?) • Relative Frequency table (What percentage?) • Bar Chart (How much?) • Relative Frequency bar chart (What percentage?) • Pie Chart (How much?) • Contingency table and Marginal Distributions • Contingency tables and Conditional Distributions

  37. Contingency Tables • A contingency table allows us to look at two categorical variables together. • It shows how individuals are distributed along each variable, contingent on the value of the other variable. • Example: we can examine the class of ticket and whether a person survived the Titanic:

  38. Contingency Table • The two variables in this contingency table are gender and class/section number.

  39. Contingency Tables (cont.) • The margins of the table, both on the right and on the bottom, give totals and the frequency distributions for each of the variables. • Each frequency distribution is called a marginal distribution of its respective variable.
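
The margins of a contingency table can be computed by summing the cell counts along each variable. The sketch below assumes hypothetical counts of students by gender and section; the numbers and the `marginal` helper are invented for illustration.

```python
from collections import Counter

# Hypothetical cell counts keyed by (gender, section).
cells = {
    ("Female", "Section 1"): 30,
    ("Female", "Section 2"): 20,
    ("Male", "Section 1"): 15,
    ("Male", "Section 2"): 35,
}

def marginal(cells, axis):
    """Sum cell counts over the other variable (axis 0 = gender, 1 = section)."""
    totals = Counter()
    for key, count in cells.items():
        totals[key[axis]] += count
    return dict(totals)

gender_margin = marginal(cells, 0)    # right-hand margin: totals per gender
section_margin = marginal(cells, 1)   # bottom margin: totals per section
grand_total = sum(cells.values())

print(gender_margin, section_margin, grand_total)
```

Each margin is the frequency distribution of one variable on its own, and both margins add up to the same grand total.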

  40. Conditional Distributions • A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable. • The following is the conditional distribution of ticket Class, conditional on having survived:

  41. Conditional Distributions (cont.) • The following is the conditional distribution of ticket Class, conditional on having perished:
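
A conditional distribution restricts attention to the cases that satisfy the condition and then computes proportions within that group only. In the sketch below the counts are hypothetical and are NOT the historical Titanic figures; `conditional_distribution` is an illustrative name.

```python
# Hypothetical (ticket class, outcome) counts; not the real Titanic data.
cells = {
    ("First", "Survived"): 40, ("First", "Perished"): 10,
    ("Second", "Survived"): 30, ("Second", "Perished"): 20,
    ("Third", "Survived"): 30, ("Third", "Perished"): 70,
}

def conditional_distribution(cells, outcome):
    """Distribution of ticket class among only the cases with this outcome."""
    selected = {cls: n for (cls, out), n in cells.items() if out == outcome}
    total = sum(selected.values())
    return {cls: n / total for cls, n in selected.items()}

survived = conditional_distribution(cells, "Survived")
perished = conditional_distribution(cells, "Perished")

print(survived)
print(perished)
```

Each conditional distribution sums to 1 because the denominator is the size of the conditioned group, not the grand total.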

  42. What Can Go Wrong? (cont.) • Don’t confuse similar-sounding percentages—pay particular attention to the wording of the context. • The percentage of students that are female & in ECO 138 Section 1 • (cell distribution) • The percentage of females that are in ECO 138 Section 1 • (conditioned upon females) • The percentage of ECO 138 Section 1 students that are females • (conditioned upon ECO 138 Section 1)
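
The three percentages share a numerator and differ only in the denominator, which a small worked example makes concrete. The counts below are hypothetical, chosen only so the three answers come out different.

```python
# Hypothetical cell counts keyed by (gender, section).
cells = {
    ("Female", "Section 1"): 30,
    ("Female", "Section 2"): 20,
    ("Male", "Section 1"): 15,
    ("Male", "Section 2"): 35,
}

female_in_s1 = cells[("Female", "Section 1")]
all_students = sum(cells.values())
females = sum(n for (gender, _), n in cells.items() if gender == "Female")
section_1 = sum(n for (_, section), n in cells.items() if section == "Section 1")

# Same numerator, three different denominators.
pct_of_all_students = 100 * female_in_s1 / all_students   # cell percentage
pct_of_females = 100 * female_in_s1 / females             # conditioned on females
pct_of_section_1 = 100 * female_in_s1 / section_1         # conditioned on Section 1

print(pct_of_all_students, pct_of_females, round(pct_of_section_1, 2))
```

Reading the wording for "of whom?" tells you which denominator the question is asking for.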

  43. Conditional Distributions (cont.) • The conditional distributions tell us that there is a difference in class for those who survived and those who perished. • This is better shown with pie charts of the two distributions:

  44. Segmented Bar Charts • A segmented bar chart displays the same information as a pie chart, but in the form of bars instead of circles. • Here is the segmented bar chart for ticket Class by Survival status:

  45. Conditional Distributions (cont.) • We see that the distribution of Class/Section for males is different from that for females. • This leads us to believe that Class/Section and Gender are associated, that is, they are not independent. • The variables would be considered independent when the distribution of one variable in a contingency table is the same for all categories of the other variable.
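
The definition of independence above can be checked mechanically: compute the conditional distribution for each category of one variable and see whether they all match. This is a sketch with hypothetical counts and illustrative function names. Real data rarely match exactly, so in practice the question is whether the differences are large enough to matter.

```python
from math import isclose

def row_distributions(cells):
    """Map each row category to its conditional distribution over columns."""
    row_totals = {}
    for (row, _col), n in cells.items():
        row_totals[row] = row_totals.get(row, 0) + n
    dists = {}
    for (row, col), n in cells.items():
        dists.setdefault(row, {})[col] = n / row_totals[row]
    return dists

def looks_independent(cells, tol=1e-9):
    """True when every row has the same conditional distribution."""
    dists = list(row_distributions(cells).values())
    first = dists[0]
    return all(
        isclose(d.get(col, 0.0), p, abs_tol=tol)
        for d in dists[1:]
        for col, p in first.items()
    )

# Hypothetical tables keyed by (gender, section).
associated = {("Female", "Section 1"): 30, ("Female", "Section 2"): 20,
              ("Male", "Section 1"): 15, ("Male", "Section 2"): 35}
independent = {("Female", "Section 1"): 20, ("Female", "Section 2"): 30,
               ("Male", "Section 1"): 10, ("Male", "Section 2"): 15}

print(looks_independent(associated))
print(looks_independent(independent))
```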
