
Did you sign up for My Stat Lab?

Presentation Transcript


  1. Did you sign up for My Stat Lab? • Yes • No

  2. Announcements • Homework #1 due Sunday at 10:00 pm • Quiz #1 in class August 28th • Part 1 of the Data Project due September 4th

  3. Data Project • Objective: Ask a question and try to answer it using statistics. • Step 1: DATA COLLECTION - Due Wednesday September 4th in class. • Step 2: DESCRIPTION OF DATA – Due Monday September 16th in class • Step 3: QUESTIONS – Due Monday October 28th in class • Step 4: FINAL DATA PROJECT – Due by Thursday December 5th 5PM

  4. Collect Data • Bureau of Labor Statistics (BLS): http://bls.gov/ • Energy Information Administration (EIA): http://www.eia.gov/ • Bureau of Economic Analysis (BEA): http://www.bea.gov/ • Environmental Protection Agency (EPA): http://epa.gov/ • U.S. Census Bureau: http://www.census.gov/ • Google Data http://www.google.com/publicdata/directory

  5. Review from Last Class • A categorical (or qualitative) variable names categories and answers questions about how cases fall into those categories. • A quantitative variable is a measured variable (with units) that answers questions about the quantity of what is being measured. • Quantitative examples: income ($), height (inches), weight (pounds)

  6. Review from Last Class • Ordinal variables have no natural units (for example, a rating of interest in teaching), but the order of the values reveals information. • Identifier variables are categorical variables with exactly one individual in each category.

  7. Homework Problem • We want to study the law of demand and whether it applies to hot dogs. • Compile a list of 20 hot dogs, giving the brand, price, size in ounces, type (beef, pork, turkey, vegetarian), and overall taste rating (good, fair, bad). • Implement the survey on Monday and Wednesday at 5 different grocery stores and also collect the daily sales.

  8. What type of variable is Brand? • Categorical • Quantitative • Ordinal • Identifier

  9. What type of variable is price? • Categorical • Quantitative • Ordinal • Identifier

  10. What type of variable is overall taste rating (good, fair, bad)? • Categorical • Quantitative • Ordinal • Identifier

  11. What type of variable is Daily Sales? • Categorical • Quantitative • Ordinal • Identifier

  12. Where, When, and How • When and Where give us some nice information about the context. • Example: Values recorded at a large public university may mean something different than similar values recorded at a small private college.

  13. Where, When, and How • Class Grade of Econ 101 classes. • Class 1 – 2.56 • Class 2 – 3.34 • Where – Washington State University • When – during the fall and spring semesters

  14. Where, When, and How (cont.) • How the data are collected can make the difference between insight and nonsense. • Example: results from voluntary Internet surveys are often useless • Example: Data collection of ‘Who will win Republican Primary?’ • Survey ISU students on campus • Run a Facebook survey • Rasmussen Reports national telephone survey

  15. Identify the Who in the following dataset? • Are physically fit people less likely to die of cancer? • Suppose an article in a sports medicine journal reported results of a study that followed 22,563 men aged 30 to 87 for 5 years. • The physically fit men had a 57% lower risk of death from cancer than the least fit group.

  16. Who are they studying? • The cause of death for 22,563 men in the study • The fitness level of the 22,563 men in the study • The age of each of the 22,563 men in the study • The 22,563 men in the study

  17. Are Fit People Less Likely to Die of Cancer? Who is the population of interest? • All people • All men who exercise • All men who die of cancer • All men

  18. Chapter 3 • Displaying and Describing • Categorical Data • Two datasets • Students currently in my class • Passengers on the Titanic.

  19. Methods of Displaying Data • Frequency Table • Relative Frequency table • Bar Chart • Relative Frequency bar chart • Pie Chart • Contingency table • Contingency tables and Conditional Distributions • Segmented Bar charts

  20. Data On Students

  21. Frequency Tables: Making Piles • We can “pile” the data by counting the number of data values in each category of interest. • We can organize these counts into a frequency table, which records the totals and the category names.

  22. Frequency Tables: Making Piles (cont.) • A relative frequency table is similar, but gives the percentages (instead of counts) for each category.
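
To make the "piles" idea concrete, here is a minimal sketch that builds a frequency table and a relative frequency table from a list of category labels. The responses are hypothetical (they are not the class data from slide 20), and the function names are illustrative only.

```python
from collections import Counter

# Hypothetical answers to "What year in school are you?" (made-up data).
responses = ["Freshman", "Sophomore", "Freshman", "Junior", "Senior",
             "Sophomore", "Freshman", "Junior", "Freshman", "Sophomore"]

def frequency_table(values):
    """Count how many data values fall in each category."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Give each category's share of the total, as a percentage."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

freq = frequency_table(responses)
rel_freq = relative_frequency_table(responses)

# Print both tables side by side: category, count, percentage.
for category in freq:
    print(f"{category:<10} {freq[category]:>3} {rel_freq[category]:>6.1f}%")
```

The counts always add up to the number of cases, and the percentages add up to 100.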

  23. Bar Charts • A bar chart displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison. • A bar chart stays true to the area principle. • Thus, a better display for the ship data is:

  24. Bar Charts (cont.) • A relative frequency bar chart displays the relative proportion of counts for each category. • A relative frequency bar chart also stays true to the area principle. • Replacing counts with percentages in the ship data:
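
One way to see the area principle at work is a plain-text bar chart: every bar has the same height (one line), so making its length proportional to the count keeps its area proportional too. This is only a sketch with hypothetical counts; `text_bar_chart` and its `width` parameter are names invented for the illustration.

```python
# Hypothetical counts per category (made-up data, not the ship data).
counts = {"Freshman": 4, "Sophomore": 3, "Junior": 2, "Senior": 1}

def text_bar_chart(counts, width=40):
    """Return one line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    lines = []
    for category, count in counts.items():
        bar = "#" * round(width * count / largest)
        lines.append(f"{category:<10} {bar} {count}")
    return lines

lines = text_bar_chart(counts)
for line in lines:
    print(line)
```

Passing percentages instead of counts gives the relative frequency version; the bars keep the same shape because only the scale changes.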

  25. What year in school are you? • Freshman • Sophomore • Junior • Senior

  26. Pie Charts • When you are interested in parts of the whole, a pie chart might be your display of choice. • Pie charts show the whole group of cases as a circle. • They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.

  27. Methods of Displaying Data • Frequency Table (How much?) • Relative Frequency table (What percentage?) • Bar Chart (How much?) • Relative Frequency bar chart (What percentage?) • Pie Chart (What percentage? Or How much?) • Contingency table and Marginal Distributions • Contingency tables and Conditional Distributions

  28. Contingency Tables • A contingency table allows us to look at two categorical variables together. • It shows how individuals are distributed along each variable, contingent on the value of the other variable. • Example: we can examine the class of ticket and whether a person survived the Titanic:

  29. Contingency Tables (cont.) • Each cell of the table gives the count for a combination of values of the two variables. • For example, the second cell in the crew column tells us that 673 crew members died when the Titanic sank.

  30. Contingency Tables The two variables in this contingency table are gender and class/section number.

  31. Contingency Tables (cont.) • The margins of the table, both on the right and on the bottom, give totals and the frequency distributions for each of the variables. • Each frequency distribution is called a marginal distribution of its respective variable.
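
The margins can be computed directly from the cells. The sketch below stores a contingency table as a nested dictionary and sums rows and columns to get the two marginal distributions. The counts are the ones commonly reported for the Titanic class-by-survival table in introductory statistics texts; treat them as an assumption, since only the 673 crew deaths appear in these slides. The function names are illustrative.

```python
# Assumed Titanic counts (only Crew/Dead = 673 is confirmed by the slides).
table = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

def row_totals(table):
    """Marginal distribution of the row variable (here: survival)."""
    return {row: sum(cells.values()) for row, cells in table.items()}

def column_totals(table):
    """Marginal distribution of the column variable (here: ticket class)."""
    totals = {}
    for cells in table.values():
        for column, count in cells.items():
            totals[column] = totals.get(column, 0) + count
    return totals

survival_margin = row_totals(table)
class_margin = column_totals(table)
grand_total = sum(survival_margin.values())

print("Survival margin:", survival_margin)
print("Class margin:   ", class_margin)
print("Grand total:    ", grand_total)
```

Both margins sum to the same grand total, which is a quick check that the table was entered correctly.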

  32. Marginal Distributions The two variables in this contingency table are gender and class/section number.

  33. Conditional Distributions • A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable. • The following is the conditional distribution of ticket Class, conditional on having survived:

  34. Conditional Distributions (cont.) • The following is the conditional distribution of ticket Class, conditional on having perished:
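
A conditional distribution is just one row (or column) of the contingency table divided by that row's own total. The sketch below computes the distribution of ticket class for survivors and for those who perished. As before, the counts are the commonly reported Titanic figures and are an assumption here (the slides confirm only the 673 crew deaths); `conditional_distribution` is an illustrative name.

```python
# Assumed Titanic counts (only Crew/Dead = 673 is confirmed by the slides).
table = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

def conditional_distribution(table, row):
    """Percentages of the column variable among cases in the given row only."""
    cells = table[row]
    total = sum(cells.values())
    return {column: 100 * count / total for column, count in cells.items()}

given_alive = conditional_distribution(table, "Alive")
given_dead = conditional_distribution(table, "Dead")

# Show the two conditional distributions side by side.
print(f"{'Class':<7} {'Alive':>7} {'Dead':>7}")
for ticket_class in given_alive:
    print(f"{ticket_class:<7} {given_alive[ticket_class]:6.1f}% {given_dead[ticket_class]:6.1f}%")
```

Each conditional distribution sums to 100% on its own, because the denominator is the size of the conditioned group rather than the grand total.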

  35. Conditional Distributions – Conditioned Upon Gender The two variables in this contingency table are gender and class/section number.

  36. Conditional Distributions – Conditioned Upon Gender The two variables in this contingency table are gender and class/section number.

  37. Conditional Distributions – Conditioned Upon Class The two variables in this contingency table are gender and class/section number.

  38. Conditional Distributions – Conditioned Upon Class The two variables in this contingency table are gender and class/section number.

  39. What Can Go Wrong? (cont.) • Don’t confuse similar-sounding percentages—pay particular attention to the wording of the context. • The percentage of students who are female & in ECO 138 Section 1 • (cell distribution) • The percentage of females who are in ECO 138 Section 1 • (conditioned upon females) • The percentage of ECO 138 Section 1 students who are female • (conditioned upon ECO 138 Section 1)
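
The three percentages use the same cell count but different denominators. The sketch below makes that explicit with a small gender-by-section table. The counts are hypothetical (not the class data), and the function names are illustrative.

```python
# Hypothetical gender-by-section counts (made-up data).
table = {
    "Female": {"Section 1": 20, "Section 2": 20},
    "Male":   {"Section 1": 30, "Section 2": 30},
}

def cell_percent(table, row, column):
    """Share of ALL students in this cell (denominator: grand total)."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    return 100 * table[row][column] / grand_total

def percent_within_row(table, row, column):
    """Share of the row group in this column (denominator: row total)."""
    return 100 * table[row][column] / sum(table[row].values())

def percent_within_column(table, row, column):
    """Share of the column group in this row (denominator: column total)."""
    column_total = sum(cells[column] for cells in table.values())
    return 100 * table[row][column] / column_total

female_and_s1 = cell_percent(table, "Female", "Section 1")
s1_among_females = percent_within_row(table, "Female", "Section 1")
females_among_s1 = percent_within_column(table, "Female", "Section 1")

print(f"Students who are female and in Section 1: {female_and_s1:.0f}%")
print(f"Females who are in Section 1:             {s1_among_females:.0f}%")
print(f"Section 1 students who are female:        {females_among_s1:.0f}%")
```

With these made-up counts the same 20 students yield three different answers, which is why the wording of the question matters.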

  40. Conditional Distributions (cont.) • The conditional distributions tell us that there is a difference in class for those who survived and those who perished. • This is better shown with pie charts of the two distributions:

  41. If you are male, what year in school are you? • Fr. • So. • Jr. • Sr.

  42. If you are female, what year in school are you? • Fr. • So. • Jr. • Sr.

  43. Conditional Distributions (cont.) • We see that the distribution of Class/Section for males is different from that for females. • This leads us to believe that Class/Section and Gender are associated; that is, they are not independent. • Variables are considered independent when the distribution of one variable in a contingency table is the same for all categories of the other variable.
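
The independence criterion on this slide can be checked mechanically: compute the conditional distribution within each row and see whether they all match. The sketch below does this for two hypothetical tables (made-up counts). The `tolerance` parameter is an assumption added for the illustration; the check only compares observed percentages and does not account for chance variation in a sample.

```python
def conditional_distributions(table):
    """Percent distribution of the column variable within each row."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {column: 100 * count / total for column, count in cells.items()}
    return result

def looks_independent(table, tolerance=1.0):
    """True if every row's conditional distribution matches the first row's
    to within `tolerance` percentage points (a descriptive check only)."""
    distributions = list(conditional_distributions(table).values())
    reference = distributions[0]
    return all(
        abs(dist[column] - reference[column]) <= tolerance
        for dist in distributions[1:]
        for column in reference
    )

# Hypothetical year-in-school counts by gender (made-up data).
same_shape = {
    "Male":   {"Fr.": 10, "So.": 20, "Jr.": 10},
    "Female": {"Fr.": 20, "So.": 40, "Jr.": 20},
}
different_shape = {
    "Male":   {"Fr.": 30, "So.": 5, "Jr.": 5},
    "Female": {"Fr.": 20, "So.": 40, "Jr.": 20},
}

print(looks_independent(same_shape))       # rows share one distribution
print(looks_independent(different_shape))  # rows differ, so the variables look associated
```

Note that the first table has twice as many females as males in every category, yet the percentages within each gender are identical; independence is about the shape of the distributions, not the raw counts.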

  44. Segmented Bar Charts • A segmented bar chart displays the same information as a pie chart, but in the form of bars instead of circles. • Here is the segmented bar chart for ticket Class by Survival status:
