ExcelR now offers a data science course in Chennai. Data science is a global field, and the course is comprehensive. ExcelR is considered the best data science institute, with 400 students placed in multinational companies. The course covers the complete data science life cycle.
AGENDA: Advanced Regression
• Negative Binomial
• Poisson Regression
• Zero-Inflated
• Multinomial Regression
© 2013 ExcelR Solutions. All Rights Reserved
Multinomial Regression
• Logistic regression (binomial distribution) is used when the output has 2 categories
• Multinomial regression (a classification model) is used when the output has more than 2 categories
• It is an extension of logistic regression
• There is no natural ordering of the categories

Mode of transport   Car    Carpool   Bus    Rail   All modes
Count               218    32        81     122    453
Probability         0.48   0.07      0.18   0.27   1

• The response variable has more than 2 categories, hence we apply the multinomial logit (multilogit) model
• Goal: understand the impact of cost and time on the various modes of transport
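The probability row of the table is each mode's count divided by the total. A minimal Python sketch of that arithmetic, using the counts from the slide (the variable names are illustrative and not part of the original deck):

```python
# Counts of commuters per mode of transport, as given on the slide.
counts = {"car": 218, "carpool": 32, "bus": 81, "rail": 122}

# Total number of observations ("All modes" column).
total = sum(counts.values())

# Empirical probability of each mode = count / total, rounded to 2 decimals.
probabilities = {mode: round(n / total, 2) for mode, n in counts.items()}

print(total)
print(probabilities)
```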
Multinomial Regression
For a categorical variable with 's' categories, whether it is the response 'Y' or a predictor 'X':
✓ The level lowest in numerical/lexicographical value is chosen as the baseline (reference)
✓ The level missing from the output is the baseline level
✓ We can choose a baseline level of our choice with the 'relevel' function in R
✓ The model formulates the relationship between the transformed (logit) Y and numerical X linearly
✓ Modeling quantitative variables linearly might not always be correct
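The default baseline rule and the effect of releveling can be sketched in a few lines. This is only an illustration in Python, not R's actual 'relevel' implementation; the helper names `choose_baseline` and `relevel` below are hypothetical.

```python
def choose_baseline(levels):
    """Return the default reference level: the lowest in sorted order."""
    return sorted(set(levels))[0]


def relevel(levels, ref):
    """Return the distinct levels with `ref` moved to the front.

    Loosely mimics the idea behind R's relevel: the first level
    is the one treated as the baseline.
    """
    distinct = sorted(set(levels))
    if ref not in distinct:
        raise ValueError(f"unknown level: {ref!r}")
    return [ref] + [lv for lv in distinct if lv != ref]


modes = ["car", "carpool", "bus", "rail"]

# With these lowercase labels, lexicographic order puts "bus" first.
default_baseline = choose_baseline(modes)

# Explicitly making "car" the reference level instead.
releveled = relevel(modes, "car")

print(default_baseline)
print(releveled)
```

With these particular labels the lexicographic default would be "bus", so a model that reports "car" as the baseline would need an explicit relevel or a different level ordering.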
Multinomial Regression - Output
Iteration History:
• An iterative procedure is used to compute the maximum likelihood estimates
• The number of iterations and the convergence status are provided
• -2 log L = 2 * negative log likelihood
• -2 log L (more precisely, the difference in -2 log L between nested models) approximately follows a χ2 distribution, which is used for hypothesis testing of goodness of fit
Number of parameters = 27
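To make the -2 log L quantity concrete, the sketch below computes it for an intercept-only multinomial model, in which every observation is assigned the observed class proportions from the transport table. This is an illustrative assumption; the 27-parameter model from the slide is not reproduced here.

```python
import math

# Counts per mode of transport, taken from the earlier slide.
counts = {"car": 218, "carpool": 32, "bus": 81, "rail": 122}
total = sum(counts.values())

# Intercept-only model: fitted probability of class k is n_k / total.
# log L = sum over classes of n_k * log(p_k)
log_likelihood = sum(n * math.log(n / total) for n in counts.values())

# -2 log L = 2 * (negative log likelihood)
neg2_log_l = -2.0 * log_likelihood

print(round(neg2_log_l, 2))
```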
Multinomial Regression - Output
'car' has been chosen as the baseline
• x = vector representing the values of all inputs
• log( P(choice = carpool | x) / P(choice = car | x) ) = β20 + β21*cost.car + β22*cost.carpool + …
• This equation models the log of the ratio of the probabilities of carpool to car (the log odds)
• The regression coefficient 0.636 indicates that for a 1-unit increase in 'cost.car', the log odds of 'carpool' to 'car' increases by 0.636
• The intercept value does not mean anything in this context
• If we also have a categorical X, say Gender (female = 0, male = 1), then its regression coefficient (say 0.22) indicates that, relative to females, males increase the log odds of 'carpool' to 'car' by 0.22
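The link between the log-odds equations and the class probabilities can be sketched as follows. Only the coefficient 0.636 comes from the slide; the linear predictor values in `eta` are hypothetical numbers chosen for illustration, and `multinomial_probs` is an illustrative helper, not a library function.

```python
import math


def multinomial_probs(linear_predictors, baseline):
    """Convert log odds versus the baseline into class probabilities.

    linear_predictors maps each non-baseline class to its log odds
    relative to the baseline class (whose log odds are 0 by definition).
    """
    exp_terms = {k: math.exp(v) for k, v in linear_predictors.items()}
    denom = 1.0 + sum(exp_terms.values())
    probs = {k: e / denom for k, e in exp_terms.items()}
    probs[baseline] = 1.0 / denom
    return probs


# Hypothetical log odds (versus 'car') for one input vector x.
eta = {"carpool": -2.0, "bus": -1.0, "rail": -0.5}
probs = multinomial_probs(eta, baseline="car")

# The log of the probability ratio recovers the linear predictor.
log_odds_carpool = math.log(probs["carpool"] / probs["car"])

# A 1-unit increase in cost.car adds 0.636 to the carpool log odds.
eta_shifted = dict(eta, carpool=eta["carpool"] + 0.636)
probs_shifted = multinomial_probs(eta_shifted, baseline="car")
shift = math.log(probs_shifted["carpool"] / probs_shifted["car"]) - log_odds_carpool

print(probs)
print(shift)
```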
Probability
• Let p = p(x|A) be the probability of an event (say, attrition) under condition A (say, gender = female)
Odds
• Then p(x|A) ÷ (1 - p(x|A)) is called the odds associated with the event
Odds Ratio
• If there are two conditions, A (gender = female) and B (gender = male), then the ratio [p(x|A) ÷ (1 - p(x|A))] / [p(x|B) ÷ (1 - p(x|B))] is called the odds ratio of A with respect to B
Relative Risk
• p(x|A) ÷ p(x|B) is called the relative risk
https://en.wikipedia.org/wiki/Relative_risk
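These three definitions translate directly into code. In the sketch below the attrition probabilities for the two groups are hypothetical values chosen for illustration, and the function names are not from the original deck.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds ratio of condition A with respect to condition B (p_b must be > 0)."""
    return odds(p_a) / odds(p_b)


def relative_risk(p_a, p_b):
    """Relative risk of condition A with respect to condition B (p_b must be > 0)."""
    return p_a / p_b


# Hypothetical attrition probabilities, for illustration only.
p_female = 0.20  # condition A
p_male = 0.10    # condition B

print(odds(p_female))
print(odds_ratio(p_female, p_male))
print(relative_risk(p_female, p_male))
```

Note that the odds ratio and the relative risk differ for the same pair of probabilities; they are close only when both probabilities are small.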
Odds Ratio
• The odds ratio is computed from the coefficients in the linear model equation by simply exponentiating them
• Exponentiated regression coefficients are odds ratios for a unit change in a predictor variable
• The odds ratio for a unit increase in cost.car is 1.88 for choosing carpool vs car
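The exponentiation step is a one-liner. The sketch below uses the rounded coefficient 0.636 quoted earlier in the deck; computed from that rounded value the result is about 1.89, in line with the 1.88 on the slide (the small gap is presumably due to rounding of the displayed coefficient, which is an assumption).

```python
import math

# Coefficient of cost.car in the carpool-vs-car equation, as quoted in the deck.
coef_cost_car = 0.636

# Odds ratio for a 1-unit increase in cost.car = exp(coefficient).
odds_ratio_cost_car = math.exp(coef_cost_car)

print(round(odds_ratio_cost_car, 2))
```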
Goodness of fit
Linear model: OLS, Analysis of Variance, Residual Sum of Squares
GLM: Maximum Likelihood, Analysis of Deviance, Residual Deviance
• Residual Deviance is -2 log L
• Adding more parameters to the model will reduce the Residual Deviance even if they are not going to be useful for prediction
• To control this, a penalty of "2 * number of parameters" is added to the Residual Deviance
• This penalized value of -2 log L is called the AIC criterion
• AIC = -2 log L + 2 * number of parameters
Note: "Multilogit Model with Interaction"
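The AIC formula can be sketched as a small function. The log-likelihood values below are hypothetical numbers chosen for illustration, not results from the deck; they show how a larger model can have a lower residual deviance yet a higher (worse) AIC.

```python
def aic(log_likelihood, n_params):
    """AIC = -2 log L + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2 * n_params


# Hypothetical fits: the larger model has a slightly better log likelihood.
aic_small = aic(log_likelihood=-520.0, n_params=9)
aic_large = aic(log_likelihood=-515.0, n_params=27)

print(aic_small)
print(aic_large)
```

The model with the lower AIC is preferred, so in this made-up comparison the 9-parameter model wins despite its higher residual deviance.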