GAISEing into the Statistics Common Core Day 2: Statistical Association

GAISEing into the Statistics Common Core Day 2: Statistical Association. June 27, 2013. Team.

Presentation Transcript


  1. GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

  2. Team • Dr. Stephanie Casey is an Assistant Prof. of MathEd at EMU. Her research focuses on teacher knowledge for teaching statistics at the middle and secondary levels, motivated by her experience of teaching secondary mathematics for fourteen years. • Dr. Andrew Ross is an Associate Prof. of Math at EMU, specializing in operations research. He was named the Michigan MAA Distinguished Teaching Awardee in 2011. • Dr. Brenda Gunderson is a Senior Lecturer in the Stats Dept. at the University of Michigan. She coordinates and teaches Statistics and Data Analysis, with approximately 1800 students each term. • Anamaria Kazanis, PStat, is a Senior Statistician at MSU. She is the current president of the Ann Arbor Chapter of ASA. • Karen Nielsen is a PhD student in the Stats Dept. at the University of Michigan. She has taught 2 years of undergraduate introductory Statistics labs and served as a mentor to other Graduate Student Instructors. As part of a cross-disciplinary team, she helped to bring online learning objects into large-enrollment gateway classes. • Mackenzie Fankell graduated from the U of M in 2009 with a degree in psychology. After graduating she worked as an English teacher in Chile for two years before returning to the US and working as a high school math teacher in Dearborn, MI. She began her master's in education at U of M in 2012 but transferred to a master's program in statistics later that year. She hopes to pursue research in education and the social sciences.

  3. Outline of Our Day • 9:00-10:30 a.m.: GAISE into the CCSS-M statistics standard(s) of the day: • The standard, • its learning trajectory, and • content • 10:30-10:40 a.m.: BREAK • 10:40 a.m.-12:10 p.m.: GAISE activities part 1 • activities that teach the standard through the GAISE process, • debrief on the experience and how to utilize the activity in their own classroom • 12:10-1:00 p.m.: LUNCH BREAK • 1:00-2:00 p.m.: GAISE activities part 2 • 2:00-2:30 p.m.: Interactive lecture on • knowledge of standard and students, • discussing what students are likely to think about and do as they progress through the learning trajectory for the standard; • common student conceptions, effective ways to support students as they move through the learning trajectory • 2:30-3:00 p.m.: Reflections on the day’s standard(s), share ideas, comments, concerns, etc. for teaching the standard(s)

  4. 9:00-10:30 a.m. • GAISE into the CCSS-M statistics standards of the day: • The standards • Learning trajectory • Content

  5. Standards, Grade 8 (part 1) • Investigate patterns of association in bivariate data. • CCSS.Math.Content.8.SP.A.1 Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association. • CCSS.Math.Content.8.SP.A.2 Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.

  6. Standards, Grade 8 (part 2) • Investigate patterns of association in bivariate data. • CCSS.Math.Content.8.SP.A.3 Use the equation of a linear model to solve problems in the context of bivariate measurement data, interpreting the slope and intercept. For example, in a linear model for a biology experiment, interpret a slope of 1.5 cm/hr as meaning that an additional hour of sunlight each day is associated with an additional 1.5 cm in mature plant height. • CCSS.Math.Content.8.SP.A.4 Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables. For example, collect data from students in your class on whether or not they have a curfew on school nights and whether or not they have assigned chores at home. Is there evidence that those who have a curfew also tend to have chores?
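
A minimal sketch of the row relative frequencies that 8.SP.A.4 asks students to compute, using the curfew/chores question. The counts below are hypothetical, invented only for illustration, and the function name `row_relative_frequencies` is an assumption rather than anything from the standard.

```python
# Hypothetical two-way table of counts (NOT real class data):
# rows = curfew status, columns = chores status.
counts = {
    "curfew": {"chores": 40, "no chores": 10},
    "no curfew": {"chores": 20, "no chores": 30},
}

def row_relative_frequencies(table):
    """Divide each cell by its row total (conditional on the row variable)."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: n / row_total for col, n in cells.items()}
    return result

rel = row_relative_frequencies(counts)
for row, cells in rel.items():
    print(row, cells)
```

If the share with chores differs noticeably between the two rows (0.8 versus 0.4 in this made-up table), that is evidence of a possible association between the two variables.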

  7. Standards, High School (part 1) • Summarize, represent, and interpret data on two categorical and quantitative variables • CCSS.Math.Content.HSS-ID.B.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data. • CCSS.Math.Content.HSS-ID.B.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related. • CCSS.Math.Content.HSS-ID.B.6a Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models. • CCSS.Math.Content.HSS-ID.B.6b Informally assess the fit of a function by plotting and analyzing residuals. • CCSS.Math.Content.HSS-ID.B.6c Fit a linear function for a scatter plot that suggests a linear association.

  8. Standards, High School (part 2) • Interpret linear models • CCSS.Math.Content.HSS-ID.C.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data. • CCSS.Math.Content.HSS-ID.C.8 Compute (using technology) and interpret the correlation coefficient of a linear fit. • CCSS.Math.Content.HSS-ID.C.9 Distinguish between correlation and causation.

  9. AP Statistics (part 1) • I. Exploring Data: Describing patterns and departures from patterns (20%–30%) Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays and summaries. • D. Exploring bivariate data • 1. Analyzing patterns in scatterplots • 2. Correlation and linearity • 3. Least-squares regression line • 4. Residual plots, outliers and influential points • 5. Transformations to achieve linearity: logarithmic and power transformations • E. Exploring categorical data • 1. Frequency tables and bar charts • 2. Marginal and joint frequencies for two-way tables • 3. Conditional relative frequencies and association • 4. Comparing distributions using bar charts

  10. AP Statistics (part 2) • IV. Statistical Inference: Estimating population parameters and testing hypotheses (30%–40%) Statistical inference guides the selection of appropriate models. • A. Estimation (point estimators and confidence intervals) • 8. Confidence interval for the slope of a least-squares regression line • B. Tests of significance • 6. Chi-square test for … homogeneity of proportions, and independence (…two-way tables) • 7. Test for the slope of a least-squares regression line

  11. Learning Trajectories/Progressions • TurnOnCCMath.net • Progressions for the Common Core State Standards in Mathematics • Project SET: http://project-set.com/ • http://project-set.com/presentations/121712-regressionlp-final-released/

  12. Turn On CC Math.net (up to 8th grade)

  13. Progressions for the Common Core State Standards in Mathematics • By The Common Core Standards Writing Team themselves

  14. GAISE Level A, assoc.-related • I. Formulate the Question • → Teachers help pose questions (questions in contexts of interest to the student). • II. Collect Data to Answer the Question • → Students conduct a census of the classroom. • → Students understand individual-to-individual natural variability. • → Students conduct simple experiments with nonrandom assignment of treatments. • III. Analyze the Data • → Students observe association between two variables • → Students use tools for exploring … association, including: • ▪ Scatterplot ▪ Tables (using counts) • IV. Interpret Results

  15. Example:

  16. GAISE Level B, assoc.-related • I. Formulate Questions • → Students begin to pose their own questions • III. Analyze Data • → Students quantify the strength of association between two variables, develop simple models for association between two numerical variables, and use expanded tools for exploring association, including: • ▪ Contingency tables for two categorical variables • ▪ Time series plots • ▪ The QCR (Quadrant Count Ratio) as a measure of strength of association • ▪ Simple lines for modeling association between two numerical variables • IV. Interpret Results • → Students understand basic interpretations of measures of association.
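
A short sketch of the Quadrant Count Ratio (QCR) named above: draw a vertical line at the mean of x and a horizontal line at the mean of y, then take (points in the upper-right and lower-left quadrants minus points in the other two) divided by the total number of points. The data are hypothetical, and the treatment of points that fall exactly on a mean line (counted in the total but in no quadrant) is an assumption of this sketch.

```python
from statistics import mean

def quadrant_count_ratio(xs, ys):
    """QCR = (count in quadrants I and III - count in II and IV) / n."""
    x_bar, y_bar = mean(xs), mean(ys)
    agree = disagree = 0
    for x, y in zip(xs, ys):
        product = (x - x_bar) * (y - y_bar)
        if product > 0:      # both above or both below their means
            agree += 1
        elif product < 0:    # one above its mean, the other below
            disagree += 1
        # product == 0: the point lies on a mean line; assumed to count in n only
    return (agree - disagree) / len(xs)

# Hypothetical illustrative data (not from the workshop).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]
qcr = quadrant_count_ratio(xs, ys)
print(qcr)
```

Like the correlation coefficient, the QCR lies between -1 and 1, but it only counts points and ignores how far each point is from the mean lines.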

  17. Example: favorite music

  18. GAISE Level C, assoc.-related • I. Formulate Questions • → Students should be able to formulate questions and determine how data can be collected and analyzed to provide an answer • III. Analyze Data • → Students should be able to recognize association between two categorical variables. • → Students should be able to recognize when the relationship between two numerical variables is reasonably linear, know that Pearson’s correlation coefficient is a measure of the strength of the linear relationship between two numerical variables, and understand the least squares criterion in line fitting

  19. Example

  20. Example: plotting residuals

  21. http://project-set.com (there are many other Project SETs) • Aimed at high school • Loop 1, golf ball drop, could be used in middle school • Informal lines of fit • Loop 2, vertical leap, is for HS: least-squares, residuals • And possibility of categorical association • Loop 2, used car prices, is for HS: least-squares, residuals • Loop 3, NFL QB salaries, is for HS: least-squares, r or R^2 • Loop 4&5, texting, just for AP Stat

  22. Loop 1: Informal Fit • Using Golf Ball Drop data • Please read the handout, use spaghetti to show your informally fitted line. • Not allowed to break spaghetti to connect individual dots! • Finish instructions on handout. • Also, what is wrong with the experimental plan?

  23. Lack of Replication! • When possible, should do at least 2 experiments under each experimental setting (drop height, in this case) • Helps quantify uncertainty at each x value • Can then use fancy tests for nonlinearity (post-AP-level stats) What if we had only done one trial at each dose? Might see just the diamonds, or just the Xs. Also, when designing, choose 3 or more X values, so we can detect nonlinearity.

  24. Show 3 Types of Scatterplots: • Designed experiment, with replication Don’t average the y values at each x value to “make it simpler”!

  25. Show 3 Types of Scatterplots: • Observational Study

  26. Show 3 Types of Scatterplots: • Time Series

  27. Common Suggestions for Informal Fits • Connect First and Last Points • Connect Lowest and Highest Points • Divide the data in half • Connect as many points as possible • And others we’ll get to later. • Before we go on, sketch graphs that show these ideas aren’t great.

  28. Common suggestions for informal fits:

  29. Another common suggestion

  30. Loop 2: Residuals = actual - predicted NOTE: Residuals are measured VERTICALLY, not horizontally and not perpendicular to the line of best fit. New ideas for informal fit?
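
To make "residual = actual - predicted" concrete, here is a minimal sketch. The line (slope 2, intercept 1) and the three data points are hypothetical; the point is only that each residual is a vertical difference in y at the same x.

```python
def residuals(xs, ys, slope, intercept):
    """Vertical distances: actual y minus the y predicted by the line at the same x."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and a hypothetical candidate line y = 1 + 2x.
xs = [1, 2, 3]
ys = [3, 5, 6]
res = residuals(xs, ys, slope=2, intercept=1)
print(res)  # [0, 0, -1]
```

A negative residual means the point lies below the line; a positive one means it lies above.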

  31. Usual student’s answer: sum the absolute residuals • Not a bad idea! • But, some bad points about it: • Historically, harder to do than what we’ll see next. • Sometimes the choice of line is not unique. • Advanced statistical theory supports a different choice. • Good points: • Modern software can do it. • It’s resistant to outliers.

  32. Usual statistician’s answer: sum the squared residuals • This applet shows the geometric squares of the residuals: http://www.geogebra.org/en/upload/files/mrfox001/line_of_best_fit.html • Does CCSSM require use or knowledge of formulas to find the line that minimizes the sum of squared residuals? • Standards aren’t so clear to me; the draft Progressions document seems to focus only on using technology to fit the line automatically.
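
The two criteria from the last two slides can be compared side by side. This sketch uses the standard closed-form least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) on hypothetical data with one outlier; the helper names are assumptions made for illustration.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def sum_abs_residuals(xs, ys, slope, intercept):
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))

def sum_sq_residuals(xs, ys, slope, intercept):
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: the first four points lie on y = 2x, the last is an outlier.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 20]

ls_slope, ls_intercept = least_squares_line(xs, ys)
print("least squares line:", ls_slope, ls_intercept)
print("squared, least squares line:", sum_sq_residuals(xs, ys, ls_slope, ls_intercept))
print("squared, y = 2x:            ", sum_sq_residuals(xs, ys, 2, 0))
print("absolute, least squares line:", sum_abs_residuals(xs, ys, ls_slope, ls_intercept))
print("absolute, y = 2x:            ", sum_abs_residuals(xs, ys, 2, 0))
```

On this made-up data the least-squares line has the smaller sum of squared residuals, while y = 2x (which ignores the outlier) has the smaller sum of absolute residuals, which illustrates the "resistant to outliers" point from the previous slide.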

  33. Standards, High School (part 1) • Summarize, represent, and interpret data on two categorical and quantitative variables • CCSS.Math.Content.HSS-ID.B.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data. • CCSS.Math.Content.HSS-ID.B.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related. • CCSS.Math.Content.HSS-ID.B.6a Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models. • CCSS.Math.Content.HSS-ID.B.6b Informally assess the fit of a function by plotting and analyzing residuals. • CCSS.Math.Content.HSS-ID.B.6c Fit a linear function for a scatter plot that suggests a linear association.

  34. Standards, High School (part 2) • Interpret linear models • CCSS.Math.Content.HSS-ID.C.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data. • CCSS.Math.Content.HSS-ID.C.8 Compute (using technology) and interpret the correlation coefficient of a linear fit. • CCSS.Math.Content.HSS-ID.C.9 Distinguish between correlation and causation.

  35. What is the length of this line?

  36. What is the length of this line?

  37. Is this a square? What is its Area?

  38. Popular drawing: sum of squared residuals But squares are actually coming “out of the page” at us; both base & depth are measured in $

  39. What is the danger lurking in the equation that it shows?

  40. http://xkcd.com/833/

  41. Label the axes! • It is very easy to get confused: is y=original data, or y=residuals? • Other, more advanced plots have: • X=predicted, y=actual • X=predicted, y=residual • X=run sequence of data (1st, 2nd, etc.), y=residual • Here are six recommended plots for examining the residuals: http://www.itl.nist.gov/div898/handbook/eda/section3/6plot.htm However, that page neglects another type that the handbook mentions elsewhere: a run-order or run-sequence plot.

  42. It is standard practice to graph the residuals!

  43. Timing data from yesterday Let’s try it on the TI calculators.

  44. Mackenzie 17 17 • Lori 21 27 • Paul 17 21 • ASK 24 22 • Katelyn 23 20 • Karin 24 33 • Allison 18 19 • Karen 20 20 • Jamie 22 19 • Andra 18 20 • Susan 24 25 • Sherita 53 45 • Susan 15 16 • Stephanie 23 27 • Jordan 27 26 • Ed 18 18 • Mila 25 27 • Wendy 24 23 • Claudia 28 27 • Steve 24 26 • Linda 25 25 • Karen 28 28 • Elizabeth 28 26 • Jeff 26 25 • Kim 38 26 • Jeannette 19 24 • Lisa 29 28 • Joanne 30 25 • Molly 31 38 • Laura 33 35

  45. With line of best fit: • What if we flip the x & y data before doing regression?
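
One way to explore the flip question without a TI calculator is sketched below, using the timing data from slide 44. Which measurement each column represents is not stated in the transcript, so the names `first` and `second` are placeholders, and the helper functions are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean

# Timing data from slide 44, in slide order: (first value, second value) per person.
first = [17, 21, 17, 24, 23, 24, 18, 20, 22, 18, 24, 53, 15, 23, 27,
         18, 25, 24, 28, 24, 25, 28, 28, 26, 38, 19, 29, 30, 31, 33]
second = [17, 27, 21, 22, 20, 33, 19, 20, 19, 20, 25, 45, 16, 27, 26,
          18, 27, 23, 27, 26, 25, 28, 26, 25, 26, 24, 28, 25, 38, 35]

def slope_and_intercept(xs, ys):
    """Least-squares line for predicting ys from xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def correlation(xs, ys):
    """Pearson's r; unchanged when x and y are swapped."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

b_yx, a_yx = slope_and_intercept(first, second)   # predict second from first
b_xy, a_xy = slope_and_intercept(second, first)   # flipped: predict first from second
r = correlation(first, second)

print("slope, second on first:", b_yx)
print("slope, first on second:", b_xy)
print("1 / flipped slope:     ", 1 / b_xy)
print("product of slopes:", b_yx * b_xy, " r^2:", r ** 2)
```

The flipped regression is not simply the original line solved for x: the two slopes multiply to r^2 rather than to 1, so the lines coincide only when the points are perfectly linear. This happens because least squares minimizes vertical residuals, and flipping the axes changes which direction counts as vertical.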

  46. It is standard practice to graph the residuals!

  47. What should residual graphs look like? • No patterns! • If there are any patterns, that means our original regression missed something. Which of these are okay/not okay? Each graph has x=original x data values, y=residuals
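
As a sketch of what a "pattern" in the residuals can look like, the example below fits a least-squares line to hypothetical data that are exactly quadratic (y = x^2). The data and helper name are assumptions chosen so the pattern is easy to see in printed output.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical curved data: y = x^2 for x = 0..6.
xs = list(range(7))
ys = [x ** 2 for x in xs]

slope, intercept = least_squares_line(xs, ys)
resids = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, resids):
    print(f"x={x}  residual={e:+.1f}")
```

The residuals run positive, then negative, then positive again as x increases. That U shape is the kind of pattern that signals the straight-line model missed something (here, curvature), even though the residuals still sum to zero.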
