Logistic Regression Saed Sayad www.ismartsoft.com
Definition
Logistic Regression is a type of regression model in which the dependent variable (target) takes just two values, such as:
• 0, 1
• Y, N
• F, T
Sample Dataset
Linear Regression (Continuous Dependent Variable)
[Figure: Balance plotted against Months in Business]
Linear Regression (Binary Dependent Variable)
[Figure: Default plotted against Months in Business]
Linear Regression Model – Binary Target
• If the actual Y is a binary variable, the predicted Y can be less than 0 or greater than 1.
• If the actual Y is a binary variable, the errors are not normally distributed.
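The first point can be checked with a small sketch. The data below are hypothetical (they are not the sample dataset from the slides), and `fit_line` is an illustrative helper name: an ordinary least-squares line is fitted to a 0/1 target, and some of the fitted values fall below 0.

```python
# Hypothetical toy data (NOT from the slides): months in business vs. default flag.
months = [2, 5, 8, 12, 18, 24, 36, 48, 60, 72]
default = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = fit_line(months, default)

# Fitted "probabilities" from the straight line, one per observation.
fitted = [intercept + slope * m for m in months]

# With this toy data the line dips below 0 for the longest-running businesses.
print(min(fitted), max(fitted))
```

A straight line is unbounded, so extending the range of X far enough always pushes the prediction outside the interval from 0 to 1.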
Linear Regression Model
[Figure: fitted straight line, Y (0 to 1) against X]
Frequency Table
Frequency Plot
[Figure: Default Probability against Months in Business (bins)]
Logistic Function
Logistic Regression
• The logistic distribution constrains the estimated probabilities to lie between 0 and 1.
• Maximum Likelihood Estimation is a statistical method for estimating the coefficients of a model.
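As a minimal sketch of why the estimated probabilities stay between 0 and 1, the standard logistic function 1 / (1 + e^(-z)) can be written as below. The names `logistic`, `predict_probability`, `b0` and `b1` are illustrative and do not come from the slides.

```python
import math

def logistic(z):
    """Map any real number z to a value between 0 and 1."""
    # Two algebraically equivalent branches avoid overflow in math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(x, b0, b1):
    """Probability that the target equals 1 for a single predictor x.

    b0 (intercept) and b1 (slope) are hypothetical coefficient names.
    """
    return logistic(b0 + b1 * x)

# The linear part b0 + b1 * x can be any real number;
# the logistic function squeezes it into the interval from 0 to 1.
for z in (-5, 0, 5):
    print(z, logistic(z))
```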
Logistic Model
[Figure: logistic regression model curve and linear model line, Y (0 to 1) against X]
Maximum Likelihood Estimation (MLE)
• MLE maximizes the log likelihood (LL), which reflects how likely the observed values of the dependent variable are, given the independent variables and the coefficients.
• MLE is fitted with an iterative algorithm that starts from arbitrary initial values for the coefficients.
• After this initial function is estimated, the coefficients are updated and the process is repeated until LL no longer changes significantly.
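The iteration described above can be sketched as follows. The slides do not name a specific algorithm; this sketch assumes Newton-Raphson updates, a single predictor, and hypothetical toy data. The names `fit_logistic`, `tol` and `max_iter` are illustrative.

```python
import math

def logistic(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(x, y, b0, b1):
    """Sum of log probabilities of the observed 0/1 outcomes."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def fit_logistic(x, y, tol=1e-8, max_iter=50):
    """Newton-Raphson MLE for one predictor (an assumed choice of algorithm)."""
    b0, b1 = 0.0, 0.0                      # arbitrary starting values
    ll = log_likelihood(x, y, b0, b1)
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1 - p)
            g0 += yi - p                   # gradient of LL w.r.t. b0
            g1 += (yi - p) * xi            # gradient of LL w.r.t. b1
            h00 += w                       # entries of the negated Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negated Hessian) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        ll_new = log_likelihood(x, y, b0, b1)
        converged = abs(ll_new - ll) < tol  # stop when LL barely changes
        ll = ll_new
        if converged:
            break
    return b0, b1, ll

# Hypothetical toy data (NOT from the slides); the classes overlap,
# so the maximum likelihood estimate is finite.
months = [2, 5, 8, 12, 18, 24, 36, 48, 60, 72]
default = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
b0, b1, ll = fit_logistic(months, default)
print(b0, b1, ll)
```

If the two classes were perfectly separable by X, the coefficients would grow without bound and this loop would not settle, so a real implementation would need a safeguard for that case.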
Log Likelihood (LL)
• Likelihood is the probability of the observed values of the dependent variable, given the independent variables and the model coefficients.
• LL is maximized through iteration, using maximum likelihood estimation (MLE).
• Log likelihood is the basis for tests of a logistic model.
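To make the quantity concrete, here is a small sketch that computes LL from predicted probabilities. The outcomes and the two sets of probabilities are made up for illustration.

```python
import math

def log_likelihood(y, p):
    """LL = sum of y*log(p) + (1 - y)*log(1 - p) over all observations."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
        for yi, pi in zip(y, p)
    )

# Hypothetical outcomes and predicted probabilities (not from the slides).
y = [1, 1, 0, 1, 0, 0]
uninformative = [0.5] * 6                    # every case predicted at 0.5
informative = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]

ll_uninformative = log_likelihood(y, uninformative)
ll_informative = log_likelihood(y, informative)

# LL is never positive; values closer to 0 mean the predicted
# probabilities agree better with the observed outcomes.
print(ll_uninformative, ll_informative)
```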
Log Likelihood Test (-2LL)
• The log likelihood test is a test of the significance of the difference between -2LL for the baseline (reduced) model and -2LL for the full model.
• This difference is called the "model chi-square".
• Also called the Likelihood Ratio test.
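A minimal sketch of the calculation, assuming the full model has exactly one more coefficient than the baseline model, so the statistic is compared against a chi-square distribution with 1 degree of freedom. The function name and the example LL values are hypothetical.

```python
import math

def likelihood_ratio_test(ll_baseline, ll_full):
    """Return (model chi-square, p-value) for ONE extra coefficient.

    Assumes nested models, so ll_full >= ll_baseline.
    """
    chi_square = (-2.0 * ll_baseline) - (-2.0 * ll_full)
    chi_square = max(chi_square, 0.0)   # guard against tiny negative rounding
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# Hypothetical log likelihoods for a baseline and a full model.
chi_square, p_value = likelihood_ratio_test(-10.0, -8.07927)
print(chi_square, p_value)
```

With more than one extra coefficient, the degrees of freedom equal the number of added coefficients and the closed form used here no longer applies.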
Wald Test
• A Wald test is used to test the statistical significance of each coefficient (b) in the model.
• A Wald test calculates a Z statistic, which is: Z = b / SE(b), where SE(b) is the standard error of the coefficient.
• This Z value is then squared, yielding a Wald statistic with a chi-square distribution.
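The Wald calculation can be sketched as below, using the standard form in which Z is the coefficient divided by its standard error. The function name and the example inputs are hypothetical, and the p-value assumes a chi-square distribution with 1 degree of freedom.

```python
import math

def wald_test(b, se):
    """Return (Z, Wald statistic, p-value) for a single coefficient.

    b is the estimated coefficient and se its standard error.
    """
    z = b / se
    wald = z * z
    # Chi-square with 1 degree of freedom: P(X > wald) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, wald, p_value

# Hypothetical coefficient and standard error (not from the slides).
z, wald, p_value = wald_test(1.96, 1.0)
print(z, wald, p_value)
```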
Summary
• Logistic Regression is a classification method.
• It returns the probability that the binary dependent variable takes a given value, based on the independent variables.
• Maximum Likelihood Estimation is a statistical method for estimating the coefficients of the model.
• The Likelihood Ratio test is used to test the statistical significance of the difference between the full model and the simpler model.
• The Wald test is used to test the statistical significance of each coefficient in the model.
Questions?