1 / 8

Mathematical formulation

Mathematical formulation. XIAO LIYING. Mathematical formulation. A set of training examples (x1,y1),…..( xn,yn ) where xi R n and yi ( model parameters and intercept . ). Mathematical formulation.

Download Presentation

Mathematical formulation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Mathematical formulation XIAO LIYING

  2. Mathematical formulation • A set of training examples (x1,y1),…..(xn,yn) where xiRnand yi • (model parameters and intercept .)

  3. Mathematical formulation • A common choice to find the model parameters is by minimizing the regularized training error given by • L is a loss function that measure model fit. • R is a regularization term (aka penalty) that penalizes model complexity • is a non-negative hyperparameter

  4. Mathematical formulation • Different choices for L entail different classifiers such as Hinge: (soft-margin) Support Vector Machines. Log: Logistic Regression. Least-Squares: Ridge Regression. Epsilon-Insensitive: (soft-margin) Support Vector Regression. • All of the above loss functions can be regarded as an upper bound on the misclassification error (Zero-one loss) as shown in the Figure below.

  5. Mathematical formulation

  6. Mathematical formulation • Popular choices for the regularization term R include: • L2 norm: • L1 norm: , which leads to sparse solutions. • Elastic Net: , a convex combination of L2 and L1, where is given by 1 - l1_ratio. • The Figure below shows the contours of the different regularization terms in the parameter space when

  7. The Figure below shows the contours of the different regularization terms in the parameter space when .

  8. 1.3.6.1 SGD • Stochastic gradient descent is an optimization method for unconstrained optimization problems. In contrast to (batch) gradient descent, SGD approximates the true gradient of by considering a single training example at a time. • Where is the learning rate which controls the step-size in the parameter space. The intercept b is updated similarly but without regularization.

More Related