  1. Loss

  2. Minimum Expected Loss/Risk • If we want to consider more than the zero-one loss, we need to define a loss matrix with elements L_kj specifying the penalty associated with assigning a pattern belonging to class C_k to class C_j (read kj as k → j, i.e. "k classified as j"). • Example: classify medical images as 'cancer' or 'normal'. [Loss matrix table; axes labeled Decision and Truth.] • Then, to compute the minimum expected loss, we need to look at the concept of expected value.
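As an illustrative sketch, the cancer/normal example can be written as a 2x2 loss matrix; the penalty values below are assumptions for illustration, not from the slides:

import numpy as np

# Hypothetical loss matrix for the cancer/normal example.
# Rows = truth (true class C_k), columns = decision (assigned class C_j):
# L[k, j] is the penalty for classifying a pattern of class C_k as C_j.
classes = ["cancer", "normal"]
L = np.array([
    [0.0, 100.0],  # true cancer: correct = 0, cancer missed = 100 (assumed)
    [1.0,   0.0],  # true normal: false alarm = 1 (assumed), correct = 0
])
print(L[0, 1])  # penalty for "cancer classified as normal" -> 100.0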

  3. Expected Value • The expected value of a function f(x), where x has the probability density/mass p(x), is E[f] = Σ_x p(x) f(x) (discrete) or E[f] = ∫ p(x) f(x) dx (continuous). • For a finite set of data points x_1, . . . , x_N drawn from the distribution p(x), the expectation can be approximated by the average over the data points: E[f] ≈ (1/N) Σ_{n=1}^N f(x_n).
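A minimal sketch of the sample-average approximation; the distribution and the function f are chosen here only for illustration:

import numpy as np

rng = np.random.default_rng(0)

# Approximate E[f] by the average of f over N samples drawn from p(x).
# Here p(x) is the standard normal and f(x) = x**2, so the true
# expectation is 1.0 (the variance of x).
x = rng.standard_normal(100_000)
print(np.mean(x**2))  # close to 1.0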

  4. Reminder: Minimum Misclassification Rate • [Figure: illustration with more general distributions, showing the different error areas.]

  5. Minimum Expected Loss/Risk • For two classes, the expected loss is E[L] = ∫_R2 L_12 p(x, C_1) dx + ∫_R1 L_21 p(x, C_2) dx. • In general: E[L] = Σ_k Σ_j ∫_Rj L_kj p(x, C_k) dx. • The decision regions R_j are chosen to minimize this expected loss, i.e. each x is assigned to the class j for which Σ_k L_kj p(C_k|x) is smallest.
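A minimal sketch of this decision rule, assuming the posteriors p(C_k|x) are already available (the posterior values below are made up):

import numpy as np

# Same convention as before: L[k, j] = loss of deciding C_j when C_k is true.
L = np.array([[0.0, 100.0],
              [1.0,   0.0]])

# Assumed posterior probabilities p(C_k | x) for one input x.
posterior = np.array([0.3, 0.7])

# Expected loss of each decision j: sum over k of L[k, j] * p(C_k | x).
expected_loss = posterior @ L
print(expected_loss)                  # [ 0.7 30. ]
print(int(np.argmin(expected_loss)))  # 0: decide 'cancer' even though p = 0.3

The high penalty for a missed cancer pulls the decision toward 'cancer' even when its posterior is below 0.5, which is exactly the point of moving beyond zero-one loss.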

  6. Reject Option • When the largest posterior p(C_k|x) falls below some threshold θ, it can be better to withhold a decision (reject) than to risk a misclassification, e.g. deferring an ambiguous medical image to a human expert.
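A minimal sketch of the reject rule; the threshold value is an assumption for illustration:

import numpy as np

def classify_with_reject(posterior, theta=0.8):
    # Decide the most probable class only if its posterior reaches the
    # threshold theta; otherwise return None, meaning "reject".
    k = int(np.argmax(posterior))
    return k if posterior[k] >= theta else None

print(classify_with_reject(np.array([0.95, 0.05])))  # 0
print(classify_with_reject(np.array([0.55, 0.45])))  # None -> reject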

  7. Loss for Regression

  8. Regression • For regression, the problem is a bit more complicated, and we also need the concept of conditional expectation: E[t|x] = Σ_t t p(t|x) (discrete) or E[t|x] = ∫ t p(t|x) dt (continuous).

  9. Multivariable and Conditional Expectations • Remember the definition of the expectation of f(x), where x has the probability p(x): E[f] = Σ_x p(x) f(x). • Conditional expectation (discrete): E[t|x] = Σ_t t p(t|x).
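A minimal sketch of the discrete conditional expectation; the distribution values are made up:

import numpy as np

# Assumed discrete conditional distribution p(t | x) at one fixed x:
# possible target values t and their conditional probabilities.
t_values = np.array([0.0, 1.0, 2.0])
p_t_given_x = np.array([0.2, 0.5, 0.3])  # probabilities sum to 1

# E[t | x] = sum over t of t * p(t | x)
print(np.sum(t_values * p_t_given_x))  # 1.1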

  10. Decision Theory for Regression • Inference step: determine p(x, t). • Decision step: for a given x, make an optimal prediction, y(x). • Loss function: L(t, y(x)); the expected loss is E[L] = ∫∫ L(t, y(x)) p(x, t) dx dt.

  11. The Squared Loss Function • If we use the squared loss L(t, y(x)) = (y(x) − t)², the expected loss is E[L] = ∫∫ (y(x) − t)² p(x, t) dx dt. • Advanced: after some calculations (next slides...), we can show that E[L] = ∫ (y(x) − E[t|x])² p(x) dx + ∫∫ (E[t|x] − t)² p(x, t) dx dt.

  12. ADVANCED - Explanation: • Write y(x) − t = (y(x) − E[t|x]) + (E[t|x] − t) and expand the square, giving a squared first term, a cross term, and a squared second term. • Consider the first term inside the loss: ∫∫ (y(x) − E[t|x])² p(x, t) dx dt. • This is equal to ∫ (y(x) − E[t|x])² p(x) dx: since p(x, t) = p(t|x) p(x), and the factor (y(x) − E[t|x])² doesn't depend on t, we can move it out of the inner integral; then the integral ∫ p(t|x) dt amounts to 1, as we are summing probabilities over all possible t.

  13. Advanced: Explanation • Consider the second (cross) term inside the loss: 2 ∫∫ (y(x) − E[t|x]) (E[t|x] − t) p(x, t) dx dt. • This is equal to zero: since (y(x) − E[t|x]) doesn't depend on t, we can move it out of the inner integral, and the remaining integral ∫ (E[t|x] − t) p(t|x) dt is zero (next slide).

  14. ADVANCED: Explanation for the last step • ∫ (E[t|x] − t) p(t|x) dt = E[t|x] ∫ p(t|x) dt − ∫ t p(t|x) dt = E[t|x] − E[t|x] = 0. • E[t|x] does not vary with different values of t, so it can be moved out of the integral. • Notice that you could also see immediately that the expected value of the differences from the mean of the random variable t is 0 (first line of the formula).
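For reference, a compact restatement (in LaTeX) of the calculation on slides 12-14, combining the three steps into one display:

\begin{aligned}
\mathbb{E}[L] &= \iint \{y(x)-t\}^{2}\, p(x,t)\, dx\, dt \\
              &= \iint \{(y(x)-\mathbb{E}[t|x]) + (\mathbb{E}[t|x]-t)\}^{2}\, p(x,t)\, dx\, dt \\
              &= \int \{y(x)-\mathbb{E}[t|x]\}^{2}\, p(x)\, dx
               + \iint \{\mathbb{E}[t|x]-t\}^{2}\, p(x,t)\, dx\, dt,
\end{aligned}

since the cross term integrates to zero.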

  15. Important • Hence we have: E[L] = ∫ (y(x) − E[t|x])² p(x) dx + ∫∫ (E[t|x] − t)² p(x, t) dx dt. • The first term is minimized when we select y(x) = E[t|x]. • The second term is independent of y(x) and represents the intrinsic variability of the target; it is called the intrinsic error.
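A small numerical sketch of this decomposition; the data-generating process below is an assumption for illustration. With t = sin(x) + noise we have E[t|x] = sin(x), so predicting sin(x) should leave only the intrinsic error, while any other predictor pays an additional penalty on top of it:

import numpy as np

rng = np.random.default_rng(0)

# Assumed process: t = sin(x) + Gaussian noise with std 0.5,
# so E[t|x] = sin(x) and the intrinsic error is 0.5**2 = 0.25.
x = rng.uniform(0.0, 2.0 * np.pi, 100_000)
t = np.sin(x) + rng.normal(0.0, 0.5, size=x.shape)

print(np.mean((np.sin(x) - t) ** 2))        # ~0.25: intrinsic error only
print(np.mean((0.5 * np.sin(x) - t) ** 2))  # ~0.375: intrinsic + avoidable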

  16. Alternative approach/explanation • Using the squared error as the loss function: E[L] = ∫∫ (y(x) − t)² p(x, t) dx dt. • We want to choose y(x) to minimize the expected loss; setting the functional derivative with respect to y(x) to zero gives 2 ∫ (y(x) − t) p(x, t) dt = 0.

  17. Solving for y(x), we get: y(x) = ∫ t p(x, t) dt / p(x) = ∫ t p(t|x) dt = E[t|x].
