CS 189 Brian Chu brian.c@berkeley.edu Office Hours: Cory 246, 6-7p Mon. (hackerspace lounge) twitter: @brrrianchu brianchu.com
Agenda • Email me for slides • Questions? • Random / HW • Why logistic regression • Worksheet
Questions • Any grad students? • Thoughts on final project? • Who would be able to make my 12-1pm section? • Lecture / worksheet split section • Questions? Concerns? • Lecture pace / content / coverage?
Features • HOG, sklearn TF-IDF, bag of words, etc.
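As a sketch of the text features named above, bag of words and TF-IDF are both one-liners in sklearn (the tiny `docs` corpus here is made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat", "the dog sat", "the cat ran"]

# Bag of words: raw token counts, one column per vocabulary word
bow = CountVectorizer()
counts = bow.fit_transform(docs)

# TF-IDF: the same counts, reweighted so words common to every
# document (like "the") contribute less
tfidf = TfidfVectorizer()
weights = tfidf.fit_transform(docs)

print(sorted(bow.vocabulary_))      # the shared vocabulary
print(counts.shape, weights.shape)  # both are (n_docs, n_vocab)
```

Either matrix can be fed straight into a classifier such as logistic regression.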
Terminology • Shrinkage (regularization) • A variable with a hat (ŷ) is estimated/predicted • P(Y | X) ∝ P(X | Y) P(Y) • posterior ∝ likelihood × prior
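A tiny worked example of posterior ∝ likelihood × prior (all the spam-filter numbers here are made up for illustration):

```python
# Hypothetical numbers: does a message containing some word X belong
# to class "spam" or "ham"?
prior_spam, prior_ham = 0.2, 0.8   # P(Y)
like_spam, like_ham = 0.6, 0.1     # P(X | Y)

# Unnormalized posteriors: likelihood * prior
post_spam = like_spam * prior_spam   # 0.12
post_ham = like_ham * prior_ham      # 0.08

# Divide by the total so the posteriors sum to 1
p_spam = post_spam / (post_spam + post_ham)
print(p_spam)  # 0.6
```

The proportionality sign hides the normalizing constant P(X), which is why the last step just rescales the two products to sum to 1.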
Why logistic regression • Odds: a measure of relative confidence • P = .9998 → odds 4999:1 • P = .9999 → odds 9999:1 • Doubled confidence! • P = .5001 → .5002: odds 1.0004:1 → 1.0008:1 • (basically no change in confidence) • “A relative increase or decrease of a factor by one unit becomes more pronounced as the factor’s absolute difference increases.”
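The slide's odds calculations are just p / (1 − p); a quick sketch reproducing them:

```python
def odds(p):
    """Odds in favor of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

# Near certainty, a tiny probability change doubles the odds
print(odds(0.9998))  # ~4999
print(odds(0.9999))  # ~9999 -> roughly doubled

# Near 0.5, the same-sized probability change barely moves the odds
print(odds(0.5001))  # ~1.0004
print(odds(0.5002))  # ~1.0008
```

This is the asymmetry the slide is pointing at: raw probability hides how much harder confidence gains become near 1.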
Log-odds (calculations in base 10) • Maps (0, 1) → (-∞, ∞) • Symmetric about 0.5: .99 ≈ 2, .01 ≈ -2 • A fixed step in log-odds means the same change in confidence anywhere on the scale • 0.5 → 0.91 ≈ 0 → 1 • .999 → .9999 ≈ 3 → 4 • “Log-odds make it clear that increasing from 99.9% to 99.99% is just as hard as increasing from 50% to 91%” Credit: https://dl.dropboxusercontent.com/u/34547557/log-probability.pdf
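The slide's numbers can be checked directly with a base-10 log-odds (logit) function:

```python
import math

def log_odds(p):
    """Base-10 log-odds, matching the slide's calculations."""
    return math.log10(p / (1 - p))

# Symmetric about 0.5
print(round(log_odds(0.99), 1))   # 2.0
print(round(log_odds(0.01), 1))   # -2.0

# Equal one-unit steps in log-odds:
print(round(log_odds(0.5), 1), round(log_odds(0.91), 1))      # 0.0 1.0
print(round(log_odds(0.999), 1), round(log_odds(0.9999), 1))  # 3.0 4.0
```

So 0.5 → 0.91 and 0.999 → 0.9999 are each one unit of log-odds, which is the sense in which they are "just as hard."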
Logistic Regression • w · x = log[ P(Y=1|x) / (1 − P(Y=1|x)) ] • Intuition: some linear combination of the features tells us the log-odds that Y = 1 • Intuition: some linear combination of the features tells us the “confidence” that Y = 1
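Inverting the log-odds equation above gives the sigmoid, P(Y=1|x) = 1 / (1 + e^(−w·x)). A minimal sketch, using the natural log (the standard convention for logistic regression; the earlier slides used base 10 only for the worked examples) and made-up weights and features:

```python
import math

def sigmoid(z):
    """Invert the log-odds: turn w.x into P(Y=1|x)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned weights and one feature vector (illustration only)
w = [2.0, -1.0, 0.5]
x = [1.0, 1.0, 2.0]

z = sum(wi * xi for wi, xi in zip(w, x))  # w.x = log-odds that Y = 1
p = sigmoid(z)                            # "confidence" that Y = 1
print(z, p)  # 2.0, ~0.88
```

A positive w·x means odds above 1:1 (p > 0.5); each unit added to w·x adds one unit of log-odds, tying both intuition bullets together.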