
Presentation Transcript


  1. Midterm Exam: 02/28 (Thursday), take home; turn in by noon on 02/29 (Friday)

  2. Project
  • 03/14 (Phase 1): 10% of the training data is available for algorithm development
  • 04/04 (Phase 2): the full training data and the test examples are available
  • 04/17 (submission): submit your predictions before 11:59pm Apr. 20 (Wednesday)
  • 04/23 and 04/25: project presentations; competition results announced
  • 04/28: project report is due

  3. Logistic Regression (Rong Jin)

  4. Logistic Regression • Generative models often lead to a linear decision boundary • Logistic regression is a linear discriminative model: it models the linear decision boundary directly • w is the parameter to be learned
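
The slide's formula was lost in transcription. A reconstruction of the standard model (not verbatim from the deck), using the y ∈ {-1, +1} label convention of the later slides:

    p(y \mid x; w) = \sigma(y \, w^\top x) = \frac{1}{1 + \exp(-y \, w^\top x)}

The decision boundary w^\top x = 0 is linear, and w is fit directly from labeled examples.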

  5. Logistic Regression

  6. Logistic Regression • Learn the parameter w by Maximum Likelihood Estimation (MLE), given training data
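
The training data and likelihood expressions were lost; for D = {(x_1, y_1), …, (x_n, y_n)} and the model above, the standard log-likelihood to maximize is:

    l(w) = \sum_{i=1}^n \log p(y_i \mid x_i; w) = -\sum_{i=1}^n \log\bigl(1 + \exp(-y_i \, w^\top x_i)\bigr)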

  7. Logistic Regression • Convex objective function, so a global optimum is guaranteed • Gradient descent • Classification error
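
The gradient shown on this slide was also lost; deriving it from the log-likelihood above gives

    \nabla_w l(w) = \sum_{i=1}^n y_i \, x_i \, \sigma(-y_i \, w^\top x_i)

where \sigma(-y_i \, w^\top x_i) is exactly the model's probability of misclassifying x_i; this is presumably the "classification error" the slide names: each example pulls on w in proportion to how badly it is currently classified.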

  8. Logistic Regression • Convex objective function, so a global optimum is guaranteed • Gradient descent • Classification error

  9. Illustration of Gradient Descent
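
The illustration itself was not preserved. As a stand-in, here is a minimal NumPy sketch of gradient ascent on the log-likelihood above (equivalently, gradient descent on its negation, as the slides say); names such as fit_logistic and eta are illustrative, not from the slides:

    import numpy as np

    def fit_logistic(X, y, eta=0.1, n_iters=1000):
        # X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            margins = y * (X @ w)                # y_i * w.x_i
            err = 1.0 / (1.0 + np.exp(margins))  # sigma(-y_i w.x_i): misclassification prob.
            w += eta * (X.T @ (y * err)) / n     # ascend the mean log-likelihood
        return w

    # Toy usage: two Gaussian clusters on the real line.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 1, (50, 1)), rng.normal(2, 1, (50, 1))])
    y = np.hstack([-np.ones(50), np.ones(50)])
    print(fit_logistic(X, y))  # a positive weight separating the clusters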

  10. Example: Heart Disease • Input feature x: age group id (1: 25-29, 2: 30-34, 3: 35-39, 4: 40-44, 5: 45-49, 6: 50-54, 7: 55-59, 8: 60-64) • Output y: whether the subject has heart disease • y = +1: has heart disease • y = -1: no heart disease

  11. Example: Heart Disease

  12. Example: Text Categorization • Learn to classify text into two categories • Input d: a document, represented by a word histogram • Output y: +1 for a political document, -1 for a non-political document
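
As a concrete illustration of the word-histogram representation (invented for this transcript, not from the slides):

    import re
    from collections import Counter

    def word_histogram(text, vocab):
        # Count vector over a fixed vocabulary; words outside vocab are dropped.
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        return [counts[word] for word in vocab]

    vocab = ["election", "vote", "game", "score"]
    print(word_histogram("Vote in the election, then vote again.", vocab))
    # -> [1, 2, 0, 0]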

  13. Example: Text Categorization Training data

  14. Example 2: Text Classification • Dataset: Reuters-21578 • Classification accuracy • Naïve Bayes: 77% • Logistic regression: 88%

  15. Logistic Regression vs. Naïve Bayes • Both give linear decision boundaries • Naïve Bayes: weights follow from the estimated class-conditional probabilities • Logistic regression: weights learned by MLE • Both can be viewed as modeling p(d|y) • Naïve Bayes: independence assumption • Logistic regression: assumes an exponential-family distribution for p(d|y) (a broad assumption)

  16. Discriminative vs. Generative
  • Generative Models: model P(x|y)
    • Pros: usually fast convergence; cheap computation; robust to noisy data
    • Cons: usually worse performance
  • Discriminative Models: model P(y|x)
    • Pros: usually good performance
    • Cons: slow convergence; expensive computation; sensitive to noisy data

  17. Overfitting Problem • Consider text categorization • What is the weight for a word j that appears in only one training document d_k? (Under MLE that weight grows without bound: increasing it raises the likelihood of d_k and changes no other document's likelihood.)

  18. Overfitting Problem [Figure: classification accuracy on test data vs. training iteration, with and without regularization; without regularization, test accuracy decreases as training proceeds.]

  19. Solution: Regularization • Regularized log-likelihood • The effects of the regularizer: • Favor small weights • Guarantee a bounded norm of w • Guarantee a unique solution
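
The regularized log-likelihood itself did not survive transcription; the usual L2-penalized form (the penalty weight s > 0 is an assumed symbol) is:

    l_{reg}(w) = \sum_{i=1}^n \log p(y_i \mid x_i; w) - \frac{s}{2} \|w\|_2^2

Subtracting the quadratic penalty makes the objective strictly concave, which is what bounds the norm of w and makes the maximizer unique, matching the list of effects above.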

  20. Regularized Logistic Regression [Figure: classification performance with vs. without regularization, plotted against training iteration.]

  21. Regularization as Robust Optimization • Assume each data point is unknown but bounded in a sphere of radius δ centered at x_i
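
Writing the lost radius symbol as δ, the robust formulation this slide points at is plausibly the following (a standard construction, not verbatim from the deck):

    \min_w \; \max_{\|\Delta_i\|_2 \le \delta} \; \sum_{i=1}^n \log\Bigl(1 + \exp\bigl(-y_i \, w^\top (x_i + \Delta_i)\bigr)\Bigr)

The inner maximum is attained at \Delta_i = -\delta \, y_i \, w / \|w\|_2, which adds \delta \|w\|_2 inside each exponent, so the worst-case loss is the nominal loss with an extra norm penalty on w; robustness to bounded perturbations and norm regularization coincide.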

  22. Sparse Solution by Lasso Regularization • RCV1 collection: • 800K documents • 47K unique words
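
The lasso-regularized objective (formula lost; s > 0 is again an assumed penalty weight):

    \max_w \; \sum_{i=1}^n \log p(y_i \mid x_i; w) - s \, \|w\|_1

Unlike the L2 penalty, the L1 penalty drives many weights exactly to zero, which is what makes sparsity attainable at RCV1's scale of 47K candidate features.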

  23. Sparse Solution by Lasso Regularization • How to solve the optimization problem? • Subgradient descent • Minimax
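
A minimal sketch of the subgradient-descent option (names assumed; sign(w) is a valid subgradient of ||w||_1, with sign(0) = 0):

    import numpy as np

    def lasso_logistic(X, y, s=0.1, eta=0.01, n_iters=2000):
        # Subgradient descent on  mean(-log p(y_i|x_i; w)) + s * ||w||_1
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            margins = y * (X @ w)
            err = 1.0 / (1.0 + np.exp(margins))   # sigma(-y_i w.x_i)
            grad = -(X.T @ (y * err)) / n         # gradient of the smooth part
            w -= eta * (grad + s * np.sign(w))    # subgradient step for the L1 part
        return w

In practice plain subgradient descent converges slowly and does not produce exact zeros along the way; proximal methods (soft-thresholding) are the common refinement.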

  24. Bayesian Treatment • Compute the posterior distribution of w • Laplace approximation

  25. Bayesian Treatment • Laplace approximation
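
The formulas on these two slides were lost; the standard construction is: with a prior p(w), the posterior is p(w | D) ∝ p(w) ∏_{i=1}^n p(y_i | x_i; w), and the Laplace approximation replaces it with a Gaussian centered at the MAP estimate:

    p(w \mid D) \approx \mathcal{N}\bigl(w_{MAP}, \; H^{-1}\bigr), \qquad H = -\nabla^2 \log p(w \mid D)\big|_{w = w_{MAP}}

i.e. a second-order Taylor expansion of the log-posterior around its mode.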

  26. Multi-class Logistic Regression • How do we extend the logistic regression model to multi-class classification?

  27. Conditional Exponential Model • Let the classes be 1, …, K • Need to learn one weight vector per class • Normalization factor (partition function)
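
The model's equation was lost; with classes 1, …, K and one weight vector w_k per class (symbols assumed), the conditional exponential model is the softmax:

    p(y = k \mid x) = \frac{\exp(w_k^\top x)}{\sum_{s=1}^{K} \exp(w_s^\top x)}

The denominator is the normalization factor (partition function) the slide names.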

  28. Conditional Exponential Model • Learn the weights w_s by maximum likelihood estimation • Any problem?
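
The corresponding log-likelihood (reconstructed, not transcribed):

    l(w_1, \ldots, w_K) = \sum_{i=1}^n \Bigl( w_{y_i}^\top x_i - \log \sum_{s=1}^{K} \exp(w_s^\top x_i) \Bigr)

The likely "problem" being hinted at: the parameterization is redundant, since adding any fixed vector v to every w_s leaves p(y | x) unchanged, so the MLE is not unique; this presumably motivates the modified model on the next slide.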

  29. Modified Conditional Exponential Model
