Regression Variance-Bias Trade-off

ashton

Presentation Transcript


  1. Regression: Variance-Bias Trade-off

  2. Regression • We need a regression function h(x). • We need a loss function L(h(x), y). • We have a true distribution p(x, y). • Assume a quadratic loss, L(h(x), y) = (h(x) - y)²; the expected loss then splits into two terms, an estimation error and a noise error. • Note: y → t; h(x) → y(x).
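The split behind the "estimation error" and "noise error" labels on slide 2 can be sketched as follows. The symbol \bar{y}(x) for the conditional mean of y given x is an assumption introduced here for illustration; it does not appear in the slides.

```latex
\begin{aligned}
\mathbb{E}[L]
 &= \iint \bigl(h(x) - y\bigr)^2 \, p(x,y)\, dx\, dy \\
 &= \underbrace{\int \bigl(h(x) - \bar{y}(x)\bigr)^2 \, p(x)\, dx}_{\text{estimation error}}
  \;+\; \underbrace{\iint \bigl(\bar{y}(x) - y\bigr)^2 \, p(x,y)\, dx\, dy}_{\text{noise error}},
 \qquad \bar{y}(x) = \int y \, p(y \mid x)\, dy .
\end{aligned}
```

The cross term drops out because \int (\bar{y}(x) - y)\, p(y \mid x)\, dy = 0 for every x. Only the first term depends on the choice of h.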

  3. Regression: Learning • Assume h(x) is a parametric curve, e.g. h(x) = a f(x) + b. • Minimize the loss over the parameters (e.g. a, b), where p(x, y) is replaced with a sum over data-cases (called a “Monte Carlo sum”): E[L] ≈ (1/N) Σ_n (h(x_n) - y_n)². • That is, we solve: minimize Σ_n (a f(x_n) + b - y_n)² over a and b. • The same result follows from posing a Gaussian model q(y|x) for p(y|x) with mean h(x) and maximizing the probability of the data over the parameters. • (This approach is taken in 274; probabilistic learning).
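The minimization on slide 3 has a closed-form solution for h(x) = a f(x) + b. The following is a minimal sketch, not code from the slides: the names `fit_affine` and `monte_carlo_loss`, the choice f = sin, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def fit_affine(xs, ys, f):
    """Least-squares fit of h(x) = a*f(x) + b; returns (a, b)."""
    zs = [f(x) for x in xs]                      # transformed inputs f(x_n)
    n = len(zs)
    z_mean = sum(zs) / n
    y_mean = sum(ys) / n
    s_zz = sum((z - z_mean) ** 2 for z in zs)
    s_zy = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    a = s_zy / s_zz                              # slope from the normal equations
    b = y_mean - a * z_mean                      # intercept
    return a, b


def monte_carlo_loss(xs, ys, f, a, b):
    """Average quadratic loss over the data-cases (the Monte Carlo sum)."""
    return sum((a * f(x) + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2*sin(x) + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 3.0) for _ in range(200)]
ys = [2.0 * math.sin(x) + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

a, b = fit_affine(xs, ys, math.sin)
print("a =", a, "b =", b, "loss =", monte_carlo_loss(xs, ys, math.sin, a, b))
```

Because the loss is quadratic in a and b, any other parameter values give a Monte Carlo sum at least as large as the fitted ones.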

  4. Back to overfitting • More parameters lead to more flexible functions, which may lead to over-fitting. • Formalize this by imagining very many datasets D, all of size N. Call h(x, D) the regression function estimated from a dataset D of size N, i.e. a(D) f(x) + b(D); the expected loss then has the same two terms as before, with h(x, D) in place of h(x). • Next, average over p(D) = p(x_1) p(x_2) … p(x_N). • Only the first term depends on D; its cross term averages to 0, leaving variance + bias².
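The averaging step on slide 4 can be written out as follows. This is a sketch under assumed notation: \bar{y}(x) denotes the conditional mean of y given x and \bar{h}(x) = \mathbb{E}_D[h(x,D)] denotes the regression function averaged over datasets; neither symbol appears in the slides.

```latex
\begin{aligned}
\mathbb{E}_D\!\left[\bigl(h(x,D) - \bar{y}(x)\bigr)^2\right]
 &= \mathbb{E}_D\!\left[\bigl(h(x,D) - \bar{h}(x)\bigr)^2\right]
  + \underbrace{2\,\bigl(\bar{h}(x) - \bar{y}(x)\bigr)\,
      \mathbb{E}_D\!\left[h(x,D) - \bar{h}(x)\right]}_{=\,0}
  + \bigl(\bar{h}(x) - \bar{y}(x)\bigr)^2 \\
 &= \underbrace{\mathbb{E}_D\!\left[\bigl(h(x,D) - \bar{h}(x)\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\bigl(\bar{h}(x) - \bar{y}(x)\bigr)^2}_{\text{bias}^2}
\end{aligned}
```

The middle term vanishes because \mathbb{E}_D[h(x,D)] = \bar{h}(x) by definition. Integrating both sides against p(x) gives the dataset-averaged estimation error.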

  5. Bias/Variance Tradeoff • The expected loss splits into three terms, labeled A, B and C. • A: The label y fluctuates (label variance). • B: The estimate of h fluctuates across different datasets (estimation variance). • C: The average estimate of h does not fit the true curve well (squared estimation bias).
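The three terms A, B and C can be estimated by simulation: draw many datasets, fit a curve to each, and measure how the fits scatter. The sketch below is illustrative only. The true curve sin(x), the noise level, the straight-line model, and the function names are assumptions, and averaging over an even grid stands in for the integral over a uniform p(x).

```python
import math
import random

NOISE_STD = 0.3  # assumed label noise; term A is its square


def true_curve(x):
    """Hypothetical true regression curve E[y | x]."""
    return math.sin(x)


def fit_line(xs, ys):
    """Least-squares fit of h(x) = a*x + b; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = s_xy / s_xx
    return a, y_mean - a * x_mean


def bias_variance(num_datasets=500, n=20, seed=0):
    """Return (A, B, C, estimation_error), each averaged over a grid of x."""
    rng = random.Random(seed)
    grid = [0.1 * i for i in range(31)]          # evaluation points in [0, 3]
    preds = []                                   # preds[d][i] = h(grid[i], D_d)
    for _ in range(num_datasets):
        xs = [rng.uniform(0.0, 3.0) for _ in range(n)]
        ys = [true_curve(x) + rng.gauss(0.0, NOISE_STD) for x in xs]
        a, b = fit_line(xs, ys)
        preds.append([a * x + b for x in grid])

    variance = bias_sq = est_err = 0.0
    for i, x in enumerate(grid):
        column = [p[i] for p in preds]           # h(x, D) across datasets
        mean_h = sum(column) / num_datasets      # average estimate of h at x
        variance += sum((h - mean_h) ** 2 for h in column) / num_datasets
        bias_sq += (mean_h - true_curve(x)) ** 2
        est_err += sum((h - true_curve(x)) ** 2 for h in column) / num_datasets
    m = len(grid)
    return NOISE_STD ** 2, variance / m, bias_sq / m, est_err / m


A, B, C, est = bias_variance()
print("A (label variance)     =", A)
print("B (estimation variance) =", B)
print("C (squared bias)        =", C)
print("estimation error        =", est, "(equals B + C up to rounding)")
```

A straight line cannot follow sin(x) on [0, 3], so C stays well above zero however many datasets are drawn, while B shrinks as the dataset size n grows.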

  6. Bias/Variance Illustration [Figure with two parts, labeled "Variance" and "Bias".]

  7. Relation to Over-fitting • Training error measures bias but ignores variance. • Testing error / cross-validation error measures both bias and variance. • Increasing regularization gives less flexible models; decreasing regularization gives more flexible models.
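The contrast between training error and testing error can be seen by sweeping a regularization strength and recording both. The slides do not name a regularizer, so the sketch below assumes a ridge penalty lam * a² on the slope of h(x) = a x + b. The data-generating line, the sample sizes, and the function names are likewise hypothetical.

```python
import random


def fit_ridge_line(xs, ys, lam):
    """Minimize sum (a*x + b - y)^2 + lam * a^2; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = s_xy / (s_xx + lam)      # larger lam shrinks the slope toward 0
    b = y_mean - a * x_mean      # the intercept is not penalized
    return a, b


def mse(xs, ys, a, b):
    """Mean squared error of h(x) = a*x + b on the given data."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(rng, n):
    """Hypothetical data: y = 1.5*x plus unit-variance Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [1.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def sweep(lams, n_train=10, n_test=1000, seed=0):
    """Return (lam, training error, testing error) for each lam."""
    rng = random.Random(seed)
    x_tr, y_tr = sample(rng, n_train)
    x_te, y_te = sample(rng, n_test)
    rows = []
    for lam in lams:
        a, b = fit_ridge_line(x_tr, y_tr, lam)
        rows.append((lam, mse(x_tr, y_tr, a, b), mse(x_te, y_te, a, b)))
    return rows


for lam, train_err, test_err in sweep([0.0, 0.1, 1.0, 10.0, 100.0]):
    print("lam =", lam, "train =", train_err, "test =", test_err)
```

Training error can only rise as lam grows, since lam = 0 is the unconstrained least-squares fit. Testing error need not follow the same pattern, which is why it is the quantity to watch when choosing the regularization strength.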
