Understanding Sources of Variation in Housing Price Predictions

Stat 513 – Day 3 Sources of Variation (Section 1.1)

Last Time – Review ideas?

Last Time
• Lab 1 – predicting housing prices
• Model 1: predicted price = $408,060
  • Residual SE = $240,380
• Model 2: predicted price = -59.37 + 0.2127 sqft
  • Residual SE = $185,070
• Model 3: predicted price = 86.76 + 0.22 sqft if lake front; 58.11 + 0.08 sqft if not lake front
  • Residual SE = $49,480
• Which model do you prefer?
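To make the comparison concrete, here is a minimal Python sketch of the Model 2 and Model 3 prediction equations. It assumes the coefficients are in thousands of dollars (the slide does not state units), and it reads the non-lake-front slope as 0.08, which is also an assumption. The function names are hypothetical and are not part of Lab 1.

```python
def predict_model2(sqft):
    """Model 2: a single line for all houses.

    Assumption: the result is in thousands of dollars.
    """
    return -59.37 + 0.2127 * sqft


def predict_model3(sqft, lake_front):
    """Model 3: separate lines for lake-front and non-lake-front houses.

    Assumption: the non-lake-front slope is 0.08 (thousands of dollars per sqft).
    """
    if lake_front:
        return 86.76 + 0.22 * sqft
    return 58.11 + 0.08 * sqft


if __name__ == "__main__":
    sqft = 2000  # illustrative house size, not taken from the lab data
    print(f"Model 2:                 {predict_model2(sqft):.2f}")
    print(f"Model 3 (lake front):    {predict_model3(sqft, True):.2f}")
    print(f"Model 3 (no lake front): {predict_model3(sqft, False):.2f}")
```

Model 3 gives different predictions for two houses of the same size, which is one way to see why accounting for lake frontage leaves less unexplained variation (a smaller residual SE) than Model 2.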

Definition • The study protocol outlines how the study will be conducted, providing enough detail, so that someone else could carry out the same study under identical conditions. It is important to consider the research question when evaluating whether the study protocol will be appropriate.

Data Collection • You will be given a slip of paper with 30 letters to memorize in order • You will have 20 seconds to look at the letters, then turn the paper over and write down as many as you can remember in order • Your score is the number of correct letters before the first mistake
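The scoring rule above can be sketched in Python as follows. The function name `memory_score` and the example letter sequences are hypothetical, and the sketch assumes letters are compared exactly, position by position.

```python
def memory_score(target, response):
    """Count the correct letters before the first mistake.

    target:   the letters on the slip of paper, in order
    response: the letters the participant wrote down, in order
    """
    score = 0
    for expected, written in zip(target, response):
        if expected != written:
            break  # stop at the first mistake
        score += 1
    return score


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(memory_score("QWXRTZPLMK", "QWXRZTPLMK"))
```

In the illustration, the first four letters match and the fifth does not, so the score is 4 even though later letters happen to match again.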

Definitions
• The explanatory variable deliberately manipulated in an experiment is often called a treatment variable or factor.
• For categorical explanatory variables, the different categories of the treatment variable are often called levels.
• In experiments, the objects (or “subjects”) that we are measuring are often called experimental units rather than observational units.
• The conditions we impose on the experimental units (here the levels of the treatment variable) are also called treatments. Each experimental unit is assigned to one treatment.

Key Idea • The goal of random assignment is to reduce the chances of there being any confounding variables in the study. By creating groups that are expected to be similar with respect to variables (other than the treatment variable of interest) that may impact the response, random assignment attempts to eliminate confounding. • See Example 1.1 for more discussion.
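One simple way to carry out random assignment is to shuffle the experimental units and deal them out to the treatments in turn. The Python sketch below illustrates that idea under stated assumptions: the function name `randomly_assign`, the treatment labels "A" and "B", and the unit names are all hypothetical, and this is not necessarily the procedure used in class.

```python
import random


def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them to the treatments in rotation.

    Each unit lands in exactly one treatment group, and group sizes
    differ by at most one.
    """
    rng = random.Random(seed)  # a seed makes the shuffle reproducible
    shuffled = list(units)     # copy so the caller's list is unchanged
    rng.shuffle(shuffled)

    groups = {treatment: [] for treatment in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


if __name__ == "__main__":
    students = [f"student_{k}" for k in range(1, 11)]  # hypothetical units
    for treatment, members in randomly_assign(students, ["A", "B"], seed=1).items():
        print(treatment, members)
```

Because chance alone decides who ends up in each group, other variables that may affect the response (such as memory ability) are expected to balance out across groups on average, although any single assignment can still be somewhat unbalanced.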

Definition • A study is double blind if (i) the subjects do not know which treatment condition they are in, and (ii) the person evaluating the response variable does not know which treatment condition the subject is in. If only one of the above conditions is true for a study, then the study is said to be single blind.

Definition • Inclusion criteria are the set of characteristics that individuals must have in order to participate in a study.

Study Conclusions
• Were the experimental units randomly selected from the population of interest?
  • If so, we can feel comfortable generalizing our conclusions to that population.
• Were the experimental units randomly assigned to the treatments?
  • If so, we can (potentially) feel comfortable drawing cause-and-effect conclusions.

For next time • Finish and submit Lab 1 • Uploading files • Be working on HW 1 • Begin reading Ch. 1
