STAT 231 MIDTERM 2
Introduction • Niall MacGillivray • 2B Actuarial Science
Agenda • 6:05 – 6:25 Likelihood Functions and MLEs • 6:25 – 6:35 Regression Model • 6:35 – 6:50 Gaussian, Chi Square, T RVs • 6:50 – 7:00 Sampling and Estimators • 7:00 – 7:30 Confidence Intervals • 7:30 – 8:00 Hypothesis Testing
Probability Models • Random Variables • Represent what we’re going to measure in our experiment • Realizations • Represent the actual data we’ve collected from our experiment
Binomial Model • Problem: what is π, the proportion of the target population that possesses a certain characteristic? • We will use our data to estimate π • Let X be a random variable that represents the number of people in your sample (of size n) that possess the characteristic • X ~ BIN(n, π) • A realization of X gives us the number of people in our sample that possess the characteristic.
Response Model • Problem: what is μ, the average variate of the target population? • We will use our collected data to estimate μ • Let Y be a random variable that represents the measured response variate • Y = μ + R, where R ~ G(0, σ) • Y ~ G(μ, σ) • A realization of Y is the measured attribute of one unit in the sample
Maximum Likelihood Estimation • Binomial: π̂ = x/n, where x = # of successes • Response: μ̂ = ȳ = (1/n) Σ yi, where yi is the ith realization • Maximum Likelihood Estimation • A procedure used to determine a parameter estimate given any model
Maximum Likelihood Estimation • First, we assume the data we collect follow a distribution • Before we collect the sample: random variables {Y1, Y2, …, Yn} • After we collect the sample: realizations {y1, y2, …, yn} • We know the distribution of Yi (with unknown parameters), hence we know the PDF/PMF
Likelihood Function • The Likelihood Function, defined for θ ∈ Ω: • Discrete (no sample): L(θ) = P(Y = y; θ) • Discrete (with sample): L(θ) = Π i=1..n P(Yi = yi; θ) • Continuous: L(θ) = Π i=1..n f(yi; θ) • Likelihood: the probability of observing the dataset you have • We want to choose an estimate of the parameter θ that gives the largest such probability • Ω is the parameter space, the set of possible values for θ
MLE Process • Step One: Define the likelihood function • Step Two: Define the log likelihood function l(θ) = ln[L(θ)] • Step Three: Take the derivative with respect to θ • If there are multiple parameters to estimate: take partial derivatives with respect to each parameter • Step Four: Set the derivative equal to zero and solve to arrive at the maximum likelihood estimate • Step Five: Plug in data values (if given) to arrive at a numerical maximum likelihood estimate (or values of other MLEs if multiple parameters)
Example 1 Derive the MLEs of the Gaussian distribution with parameters μ and σ, for a sample of data of size n.
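Below is a minimal numeric check of the Example 1 result, assuming the standard answer μ̂ = ȳ and σ̂² = (1/n) Σ (yi − μ̂)²; the data values are hypothetical.

```python
import numpy as np

# Hypothetical sample of size n = 6.
y = np.array([4.1, 5.3, 4.8, 5.0, 4.6, 5.2])
n = len(y)

# Gaussian MLEs: the sample mean, and the variance that divides by n
# (not n - 1; the variance MLE is biased).
mu_hat = y.mean()
sigma2_hat = ((y - mu_hat) ** 2).sum() / n

print(mu_hat, np.sqrt(sigma2_hat))
```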
Regression Model • Let Y|{X=x} be a random variable that represents the measured response variate, given a certain value of the explanatory variate • We can define a distribution Y|{X=x} ~ G(μ(x), σ) • Simple Linear Regression: Y|{X=x} = α + βx + R, where R ~ G(0, σ) • Y|{X=x} ~ G(α + βx, σ) • Response Model: μ is the average response variate of the target population • Regression Model: α + βx is the average response variate of the subgroup in the target population, as specified by the value of the explanatory variate x • Allows us to look at subgroups within the target population
Regression Model • Problem: what is α, the average response variate of the subgroup in the target population where the explanatory variate, x, is equal to 0? • Problem: what is β, the change in the average value of the response variate, given a one unit change in x, our explanatory variate? • We will use our collected data to estimate α and β using the MLE method
Example 2 Derive the MLEs of the simple linear regression model with parameters α, β, and σ given a sample of size n.
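A sketch of Example 2’s result in code, assuming the standard least-squares form of the MLEs (β̂ = Sxy/Sxx, α̂ = ȳ − β̂x̄, σ̂² = RSS/n); the paired data are hypothetical.

```python
import numpy as np

# Hypothetical paired data: x = explanatory, y = response variate.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

Sxx = ((x - x.mean()) ** 2).sum()
Sxy = ((x - x.mean()) * (y - y.mean())).sum()

beta_hat = Sxy / Sxx                         # MLE of the slope
alpha_hat = y.mean() - beta_hat * x.mean()   # MLE of the intercept

# The MLE of sigma^2 divides the residual sum of squares by n.
resid = y - (alpha_hat + beta_hat * x)
sigma2_hat = (resid ** 2).sum() / n
print(alpha_hat, beta_hat, sigma2_hat)
```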
Gaussian Distribution • f(y; μ, σ) = (1/(σ√(2π))) exp(−(y − μ)²/(2σ²)) • If Y ~ G(μ, σ), then Z = (Y − μ)/σ ~ G(0, 1) • If Y1, …, Yn are independent G(μ1, σ1), …, G(μn, σn): • a1Y1 + … + anYn ~ G(a1μ1 + … + anμn, √(a1²σ1² + … + an²σn²)) • If Y1, …, Yn are iid G(μ, σ): • Y1 + … + Yn ~ G(nμ, σ√n) • Ȳ = (1/n) Σ Yi ~ G(μ, σ/√n)
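A quick simulation of the last fact, that the sample mean of n iid G(μ, σ) draws behaves like G(μ, σ/√n); all the numbers here are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10.0, 2.0, 25

# 100,000 samples of size n; the means should have mean ~mu
# and standard deviation ~sigma/sqrt(n) = 0.4.
ybar = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
print(ybar.mean(), ybar.std())
```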
Chi-Squared • Y, a random variable following a χ² distribution, is defined as Y = Z1² + Z2² + … + Zn², Y > 0, where Z1, …, Zn are iid G(0, 1) • We say Y ~ χ²(n), where n = degrees of freedom • E(Y) = n • If X ~ χ²(m) and Y ~ χ²(n) are independent, X + Y ~ χ²(m + n) • Can use tables to get quantiles based on degrees of freedom
Example 3 • Prove that, if X ~ χ²(m) and Y ~ χ²(n) are independent, a) E(X) = m b) X + Y ~ χ²(m + n) • If m = 10, estimate P(X > 20)
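The tail probability in the last part can be checked against the course tables, for instance with scipy:

```python
from scipy.stats import chi2

# P(X > 20) for X ~ chi-squared with 10 degrees of freedom.
# 20 sits between the 0.95 and 0.975 quantiles (18.31 and 20.48),
# so the answer is just under 0.05 (about 0.029).
print(chi2.sf(20, df=10))
```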
t Distribution • T, a random variable following a t distribution, is defined as T = Z/√(S/m), where Z ~ G(0, 1), S ~ χ²(m), and Z and S are independent • We say T ~ t(m), where m is the degrees of freedom
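One way to see how t(m) compares with G(0, 1): its 97.5th percentile is larger for small m and approaches 1.96 as m grows.

```python
from scipy.stats import norm, t

for m in (2, 10, 30, 100):
    print(m, t.ppf(0.975, df=m))   # ~4.30, 2.23, 2.04, 1.98
print("normal", norm.ppf(0.975))   # 1.96
```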
Estimators • In the data stage, we assume our data follows a certain distribution • Population Distribution • From MLE in Example 1, we obtained an estimate μ̂ = ȳ • This is a number • The corresponding estimator is Ȳ = (1/n) Σ Yi • This is a random variable • Replace all instances of realizations (yi) with RVs (Yi) • If θ̂ = g(y1, y2, …, yn), then the estimator is g(Y1, Y2, …, Yn) • We can then look at the distribution of these parameter estimators, aka the sampling distribution of the estimator • We can make probability statements about the accuracy of our parameter estimates
Response Model Estimators • In the response model Y = μ + R ~ G(μ, σ): • For a sample y1, y2, …, yn, μ̂ = Σ yi / n • The corresponding estimator is Ȳ = Σ Yi / n • The distribution of Ȳ is G(μ, σ/√n) • The sampling error Ȳ − μ ~ G(0, σ/√n) • For a sample y1, y2, …, yn, s² = (1/(n−1)) Σ (yi − ȳ)² • The corresponding estimator is S² = (1/(n−1)) Σ (Yi − Ȳ)² • In this case, we call s² the sample variance
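These estimates are easy to compute directly; the sample below is hypothetical.

```python
import numpy as np

y = np.array([17.2, 18.1, 17.8, 18.4, 17.5, 18.0])
n = len(y)

ybar = y.mean()          # realization of the estimator Ybar
s = y.std(ddof=1)        # sample standard deviation (divides by n - 1)
se = s / np.sqrt(n)      # estimated standard deviation of Ybar
print(ybar, s, se)
```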
Confidence Intervals • In estimation problems, we use collected data to determine estimates for model parameters • Confidence intervals are statements about the true model parameter lying between two values, a and b • We can make a statement about our ‘confidence’ that μ is located somewhere between a and b • The confidence is measured by probability statements • We will use sampling distributions to make probability statements as a starting point in determining the end values of the confidence interval
Confidence Intervals • A confidence interval helps answer the question: “What is the probability that l(D) ≤ θ ≤ u(D)?” • C(θ) = P[l(D) ≤ θ ≤ u(D)] is the coverage probability • Confidence interval = [l(D), u(D)], where D is the data • Interpretation: the true value of the parameter θ will fall in the confidence interval [l(D), u(D)] in proportion C(θ) of all cases
Confidence Intervals for the Response Model • Sampling distribution for the response model: Ȳ ~ G(μ, σ/√n) • But we want a distribution we can work with (we want to use our probability tables), so standardizing gives (Ȳ − μ)/(σ/√n) ~ G(0, 1) • For now, we will assume the true value of σ (population standard deviation) is known
Confidence Intervals for the Response Model • Our goal: find (a, b) such that P(a ≤ μ ≤ b) = 0.95 • Method: construct a 95% interval estimator (coverage interval) such that P(−c ≤ (Ȳ − μ)/(σ/√n) ≤ c) = 0.95, or equivalently P(Ȳ − cσ/√n ≤ μ ≤ Ȳ + cσ/√n) = 0.95 • What are a and b? Use the realization ȳ of Ȳ to get a confidence interval
Example 4 • Coverage Interval: [Ȳ − 1.96 σ/√n, Ȳ + 1.96 σ/√n] • Confidence Interval: [ȳ − 1.96 σ/√n, ȳ + 1.96 σ/√n] • σ/√n is called the Standard Error
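A sketch of the known-σ interval in code; the data and σ are hypothetical.

```python
import numpy as np
from scipy.stats import norm

y = np.array([17.2, 18.1, 17.8, 18.4, 17.5, 18.0])
sigma, n = 0.5, len(y)

c = norm.ppf(0.975)        # 1.96 for a 95% interval
se = sigma / np.sqrt(n)    # standard error with sigma known
print(y.mean() - c * se, y.mean() + c * se)
```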
Confidence Intervals for the Response Model • Often, we don’t know the value of σ • So we need to use the sample standard deviation as an estimator for σ: (Ȳ − μ)/(σ/√n) becomes (Ȳ − μ)/(S/√n), where S² = (1/(n−1)) Σ (Yi − Ȳ)²
Confidence Intervals for the Response Model • (Ȳ − μ)/(S/√n) no longer follows a G(0,1) distribution! • (Ȳ − μ)/(S/√n) ~ t(n−1) • New 95% CI for μ: ȳ ± c s/√n, where c is the 97.5th percentile of t(n−1)
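The same interval with σ unknown, swapping 1.96 for a t(n−1) quantile; the data are hypothetical.

```python
import numpy as np
from scipy.stats import t

y = np.array([17.2, 18.1, 17.8, 18.4, 17.5, 18.0])
n = len(y)

c = t.ppf(0.975, df=n - 1)       # wider than 1.96 for small n
se = y.std(ddof=1) / np.sqrt(n)  # s / sqrt(n)
print(y.mean() - c * se, y.mean() + c * se)
```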
Confidence Intervals for the Binomial Model • Population Distribution: Y ~ BIN(n, π) • The parameter we want to estimate is π • Using MLE, we get an estimate of π̂ = y/n • This is a number • The corresponding estimator is Y/n • This is a random variable • What is the sampling distribution?
Confidence Intervals for the Binomial Model • To derive the sampling distribution for Y/n, consider the expectation and variance of Y: • E(Y) = nπ • Var(Y) = nπ(1 − π) • CLT tells us that, for large n, Y is well approximated as a Gaussian: Y ≈ G(nπ, √(nπ(1 − π))) • Then Y/n will also be a Gaussian: Y/n ≈ G(π, √(π(1 − π)/n))
Confidence Intervals for the Binomial Model • Standardizing gives (Y/n − π)/√(π(1 − π)/n) ≈ G(0, 1) • We will use the approximation √(π̂(1 − π̂)/n) instead of √(π(1 − π)/n) • Confidence Interval: π̂ ± 1.96 √(π̂(1 − π̂)/n)
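A sketch of the approximate binomial interval; the counts are hypothetical.

```python
import numpy as np
from scipy.stats import norm

y, n = 83, 200             # hypothetical: 83 successes in 200 trials
pi_hat = y / n

c = norm.ppf(0.975)
se = np.sqrt(pi_hat * (1 - pi_hat) / n)   # pi replaced by its estimate
print(pi_hat - c * se, pi_hat + c * se)
```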
Confidence Intervals for the Regression Model • Population Distribution: Y|{X=x} ~ G(α + βx, σ) • Using MLE, we obtain: β̂ = Sxy/Sxx and α̂ = ȳ − β̂x̄ • Your course notes simplify the algebra with the shorthand Sxx = Σ (xi − x̄)² and Sxy = Σ (xi − x̄)(yi − ȳ)
Confidence Intervals for the Regression Model • What is the sampling distribution of β̃, the estimator of β? • β̃ is a linear combination of independent Gaussians, and thus is Gaussian itself: β̃ ~ G(β, σ/√Sxx) • Standardizing gives (β̃ − β)/(σ/√Sxx) ~ G(0, 1) • If σ is unknown, then (β̃ − β)/(Se/√Sxx) ~ t(n−2), where Se² = (1/(n−2)) Σ (Yi − α̃ − β̃xi)²
Confidence Intervals for the Regression Model • Confidence Interval: β̂ ± c s_e/√Sxx • Assuming σ is unknown, we will get c from the t table with (n − 2) degrees of freedom
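Putting the pieces together for the slope, reusing the hypothetical data from the Example 2 sketch:

```python
import numpy as np
from scipy.stats import t

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

Sxx = ((x - x.mean()) ** 2).sum()
beta_hat = ((x - x.mean()) * (y - y.mean())).sum() / Sxx
alpha_hat = y.mean() - beta_hat * x.mean()

# s_e estimates sigma using n - 2 degrees of freedom.
resid = y - (alpha_hat + beta_hat * x)
s_e = np.sqrt((resid ** 2).sum() / (n - 2))

c = t.ppf(0.975, df=n - 2)
se = s_e / np.sqrt(Sxx)
print(beta_hat - c * se, beta_hat + c * se)
```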
Terminology • The random variables that we’ve used to construct confidence intervals are called pivotal quantities • Their distribution does not depend on the choice of parameters • Confidence intervals are often written in the form: Point Estimate ± c × Standard Error (SE) • Point Estimate: the MLE for the parameter • c: found using probability tables depending on the distribution of the pivotal quantity
Terminology • Standard Error (SE): square root of the variance of our sampling distribution (replace all unknown parameters, i.e. σ, with estimates) • Response (σ known): σ/√n • Response (σ unknown): s/√n • Binomial: √(π̂(1 − π̂)/n) • Regression: s_e/√Sxx
Confidence Interval Recap • Response Model (σ known): ȳ ± c σ/√n, c from G(0, 1) • Response Model (σ unknown): ȳ ± c s/√n, c from t(n−1) • Binomial Model: π̂ ± c √(π̂(1 − π̂)/n), c from G(0, 1) • Regression Model: β̂ ± c s_e/√Sxx, c from t(n−2)
Interpretation of the Confidence Interval • Does NOT mean there’s a 95% chance our true parameter will be between a and b • 95% confidence interval: after repeatedly collecting data and calculating many confidence intervals, around 95% of them will contain the actual parameter
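This interpretation can be checked by simulation: build many intervals from fresh samples and count how many contain the true μ (all values hypothetical).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu, sigma, n = 18.0, 0.5, 20
c = norm.ppf(0.975)
se = sigma / np.sqrt(n)

covered = 0
for _ in range(10_000):
    ybar = rng.normal(mu, sigma, n).mean()
    covered += (ybar - c * se <= mu <= ybar + c * se)
print(covered / 10_000)   # close to 0.95
```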
Hypothesis Testing 1) Define the null hypothesis and the alternative hypothesis 2) Define the test statistic, identify its distribution, calculate the observed value 3) Calculate the p-value 4) Make a conclusion about your hypothesis
Hypothesis Testing 1) Define the null hypothesis and the alternative hypothesis: e.g. H0: θ = θ0 versus HA: θ ≠ θ0 • The null hypothesis always contains an “=” sign!
Hypothesis Testing 2) Define the test statistic, identify its distribution, calculate the observed value Assume that H0 will be tested using some random data Test Statistic: a random variable, denoted D Distribution: of the test statistic, the standardized sampling distribution of the model based on H0 Observed Value: a realization of the test statistic from our data
Hypothesis Testing Test Statistics (these distributions only hold under the null hypothesis θ = θ0): • Response (σ known): D = (Ȳ − μ0)/(σ/√n) ~ G(0, 1) • Response (σ unknown): D = (Ȳ − μ0)/(S/√n) ~ t(n−1) • Binomial: D = (Y/n − π0)/√(π0(1 − π0)/n) ≈ G(0, 1)
Hypothesis Testing Calculate the observed value: • Response (σ known): d = (ȳ − μ0)/(σ/√n) • Response (σ unknown): d = (ȳ − μ0)/(s/√n) • Binomial: d = (π̂ − π0)/√(π0(1 − π0)/n)
Hypothesis Testing 3) Calculate the p-value p-value = P(|D| ≥ |d|) for HA: θ ≠ θ0 p-value = P(D ≥ d) for HA: θ > θ0 p-value = P(D ≤ d) for HA: θ < θ0 • p-value (aka observed significance level) is the tail probability of observing a dataset more extreme than our sample data, given H0 is true
Hypothesis Testing 4) Make a conclusion about your hypothesis General Rule of Thumb • If the p-value > 0.05, do not reject the null hypothesis • If the p-value < 0.05, reject the null hypothesis
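A sketch of the full two-sided test for a response model with σ unknown; the sample and μ0 = 18 are hypothetical.

```python
import numpy as np
from scipy.stats import t

# H0: mu = 18 versus HA: mu != 18.
y = np.array([17.2, 18.1, 17.8, 18.4, 17.5, 18.0])
n, mu0 = len(y), 18.0

d = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))
p_value = 2 * t.sf(abs(d), df=n - 1)   # P(|D| >= |d|)
print(d, p_value)                      # large p-value: do not reject H0
```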
Example 7 What if we want to test if the average weight is less than 18 ounces?
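For Example 7 the alternative is one-sided (HA: μ < 18), so the p-value becomes the lower-tail probability P(D ≤ d); continuing with the same hypothetical sample:

```python
import numpy as np
from scipy.stats import t

y = np.array([17.2, 18.1, 17.8, 18.4, 17.5, 18.0])
n, mu0 = len(y), 18.0

d = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))
print(t.cdf(d, df=n - 1))   # one-sided p-value, P(D <= d)
```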