ExcelR provides Data Analytics training for over 5 universities, namely IIM, ISB, BITS Pilani, UNIMAS and Woxen School of Business, and 40 premier educational institutions. ExcelR is the only organization partnered with UNIMAS to provide Data Analytics certification with global validation.
All our trainers are Data Analytics experts with 15 years of experience, and they hail from IIT, ISB and IIM. As one of the best Data Analytics institutes, we offer a Data Analytics certification course with a tailor-made curriculum to suit professionals as well as freshers.
We provide 100% placements, with more than 400 participants successfully placed with top MNCs such as E&Y, Accenture, VMWare, Infosys and IBM.
Advanced Regression
AGENDA
• Negative Binomial
• Poisson Regression
• Zero-Inflated
• Multinomial Regression
© 2013 ExcelR Solutions. All Rights Reserved
Multinomial Regression
• Logistic regression (binomial distribution) is used when the output has 2 categories
• Multinomial regression (a classification model) is used when the output has more than 2 categories
• Extension to logistic regression
• No natural ordering of categories

Mode of transport   Car    Carpool   Bus    Rail   All modes
Count               218    32        81     122    453
Probability         0.48   0.07      0.18   0.27   1

• The response variable has more than 2 categories, hence we apply multilogit
• Understand the impact of cost & time on the various modes of transport
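The Probability row in the table is simply each count divided by the total of 453. A minimal Python sketch of that arithmetic, using only the counts shown on the slide (the variable names are illustrative):

```python
# Counts per mode of transport, taken from the table on the slide.
counts = {"car": 218, "carpool": 32, "bus": 81, "rail": 122}

# Total number of observations (the "All modes" column).
total = sum(counts.values())

# Empirical probability of each category = count / total.
probabilities = {mode: n / total for mode, n in counts.items()}

for mode, p in probabilities.items():
    print(f"{mode:8s} {p:.2f}")
```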
Multinomial Regression
Whether we have ‘Y’ (response) or ‘X’ (predictor) that is categorical with ‘s’ categories:
• The lowest numerical/lexicographical value is chosen as the baseline/reference
• The level missing from the output is the baseline level
• We can choose the baseline level of our choice using the ‘relevel’ function in R
• The model formulates the relationship between transformed (logit) Y & numerical X linearly
• Modeling quantitative variables linearly might not always be correct
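The baseline rule can be illustrated without R. The sketch below is plain Python that only imitates the idea behind R's ‘relevel’: categories are sorted, the first one acts as the baseline, and a chosen level can be moved to the front. The helper names `default_levels` and `relevel` and the sample data are hypothetical.

```python
def default_levels(values):
    """Return the distinct categories in sorted order; the first is the baseline."""
    return sorted(set(values))


def relevel(levels, ref):
    """Move `ref` to the front so that it becomes the baseline level."""
    if ref not in levels:
        raise ValueError(f"unknown level: {ref!r}")
    return [ref] + [lv for lv in levels if lv != ref]


# Hypothetical sample of observed choices.
choices = ["car", "carpool", "bus", "rail", "car", "bus"]

levels = default_levels(choices)
print("default baseline:", levels[0])       # lexicographically lowest level

releveled = relevel(levels, "car")
print("chosen baseline: ", releveled[0])    # baseline picked by the analyst
```

With these inputs the lexicographic default is ‘bus’, so ‘car’ only becomes the baseline after it is chosen explicitly.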
Multinomial Regression - Output
Iteration History:
• An iterative procedure is used to compute the maximum likelihood estimates
• The number of iterations & the convergence status are provided
• -2logL = 2 * negative log likelihood
• -2logL has an (asymptotic) χ² distribution, which is used for hypothesis testing of goodness of fit
# parameters = 27
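To make -2logL concrete, the sketch below computes it for an intercept-only model, whose fitted probabilities equal the observed shares of each mode. This is an assumption made for illustration; it is not the 27-parameter model from the output. The counts are the ones from the transport table.

```python
import math

# Counts per mode of transport from the earlier table.
counts = {"car": 218, "carpool": 32, "bus": 81, "rail": 122}
total = sum(counts.values())

# Intercept-only model (assumed for illustration): every observation gets
# the same fitted probabilities, equal to the observed shares.
# Log likelihood = sum over observations of log(fitted prob. of the chosen mode).
log_lik = sum(n * math.log(n / total) for n in counts.values())

# -2logL is twice the negative log likelihood.
neg2_log_lik = -2.0 * log_lik

print(f"logL   = {log_lik:.2f}")
print(f"-2logL = {neg2_log_lik:.2f}")
```

A model with predictors should reach a lower -2logL than this intercept-only value; the difference between two nested models is what gets compared against a χ² distribution.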
Multinomial Regression - Output
‘car’ has been chosen as the baseline
• x = vector representing the values of all inputs
• log( P(choice = carpool | x) / P(choice = car | x) ) = β20 + β21*cost.car + β22*cost.carpool + …
• This equation models the log of the ratio of the probability of carpool to the probability of car
• The regression coefficient 0.636 indicates that a 1-unit increase in ‘cost.car’ increases the log odds of ‘carpool’ to ‘car’ by 0.636
• The intercept value does not mean anything in this context
• If we also have a categorical X, say Gender (female = 0, male = 1), then a regression coefficient (say 0.22) indicates that, relative to females, males increase the log odds of ‘carpool’ to ‘car’ by 0.22
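Each non-baseline category has its own log-odds equation against ‘car’, and together these determine the choice probabilities. The sketch below converts log odds into probabilities and checks that adding 0.636 to the carpool equation shifts the carpool-vs-car log odds by exactly that amount. The starting log-odds values are hypothetical, and the other categories' log odds are held fixed purely for illustration; only the 0.636 comes from the slide.

```python
import math


def choice_probabilities(log_odds_vs_baseline, baseline="car"):
    """Convert log odds (each category vs. the baseline) into probabilities."""
    # The baseline's linear predictor is 0 by construction, so exp(0) = 1.
    exp_terms = {baseline: 1.0}
    for category, eta in log_odds_vs_baseline.items():
        exp_terms[category] = math.exp(eta)
    denom = sum(exp_terms.values())
    return {category: v / denom for category, v in exp_terms.items()}


# Hypothetical log odds of each mode relative to 'car' for one traveller.
eta = {"carpool": -1.9, "bus": -1.0, "rail": -0.6}
probs = choice_probabilities(eta)

# Effect of a 1-unit increase in cost.car on the carpool equation only.
eta_after = dict(eta, carpool=eta["carpool"] + 0.636)
probs_after = choice_probabilities(eta_after)

shift = (math.log(probs_after["carpool"] / probs_after["car"])
         - math.log(probs["carpool"] / probs["car"]))
print(f"change in log odds (carpool vs car): {shift:.3f}")
```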
Probability
• Let p = p(x|A) be the probability of any event (say attrition) under condition A (say gender = female)
Odds
• Then p(x|A) ÷ (1 - p(x|A)) is called the odds associated with the event
Odds Ratio
• If there are two conditions A (gender = female) & B (gender = male), then the ratio [p(x|A) ÷ (1 - p(x|A))] / [p(x|B) ÷ (1 - p(x|B))] is called the odds ratio of A with respect to B
Relative Risk
• p(x|A) ÷ p(x|B) is called the relative risk
https://en.wikipedia.org/wiki/Relative_risk
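The three definitions above translate directly into code. In this sketch the attrition probabilities for the two groups are hypothetical numbers chosen only to show how the odds ratio and the relative risk differ.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds ratio of condition A with respect to condition B."""
    return odds(p_a) / odds(p_b)


def relative_risk(p_a, p_b):
    """Relative risk of condition A with respect to condition B."""
    return p_a / p_b


# Hypothetical attrition probabilities.
p_female = 0.20  # condition A
p_male = 0.10    # condition B

print("odds (A):     ", odds(p_female))
print("odds (B):     ", odds(p_male))
print("odds ratio:   ", odds_ratio(p_female, p_male))
print("relative risk:", relative_risk(p_female, p_male))
```

The two measures are close only when both probabilities are small; here they already differ noticeably.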
Odds Ratio
• The odds ratio is computed from the coefficients in the linear model equation by simply exponentiating
• Exponentiated regression coefficients are the odds ratio for a unit change in a predictor variable
• The odds ratio for a unit increase in cost.car is 1.88 for choosing carpool vs car
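The exponentiation step can be checked in a couple of lines. The sketch uses the rounded coefficient 0.636 quoted for cost.car, so its result may differ slightly in the last digit from the 1.88 on the slide, which was presumably computed from the unrounded estimate.

```python
import math

# Coefficient of cost.car in the carpool-vs-car equation (rounded, from the slide).
beta_cost_car = 0.636

# Odds ratio for a 1-unit increase in cost.car.
odds_ratio = math.exp(beta_cost_car)

# The same effect expressed as a percentage change in the odds.
percent_change = (odds_ratio - 1.0) * 100.0

# For a k-unit increase the odds ratio compounds: exp(k * beta) = odds_ratio ** k.
odds_ratio_two_units = math.exp(2 * beta_cost_car)

print(f"odds ratio per unit:     {odds_ratio:.3f}")
print(f"percent change in odds:  {percent_change:.1f}%")
print(f"odds ratio for 2 units:  {odds_ratio_two_units:.3f}")
```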
Goodness of fit

Linear                     GLM
OLS                        Maximum Likelihood
Analysis of Variance       Analysis of Deviance
Residual Sum of Squares    Residual Deviance

• Residual Deviance is -2logL
• Adding more parameters to the model will reduce the Residual Deviance even if they are not going to be useful for prediction
• In order to control this, a penalty of “2 * number of parameters” is added to the Residual Deviance
• This penalized value of -2logL is called the AIC criterion
• AIC = -2logL + 2 * number of parameters
Note: “Multilogit Model with Interaction”
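The AIC formula is easy to apply once -2logL and the parameter count are known. In the sketch below the two -2logL values and the 3-parameter model are hypothetical; only the formula and the parameter count of 27 come from the slides. It shows how a larger model can have a lower deviance yet a worse (higher) AIC.

```python
def aic(neg2_log_lik, n_params):
    """AIC = -2logL + 2 * number of parameters."""
    return neg2_log_lik + 2 * n_params


# Hypothetical residual deviances (-2logL) for two competing models.
small_model = aic(neg2_log_lik=1100.0, n_params=3)
large_model = aic(neg2_log_lik=1095.0, n_params=27)

print("AIC, small model:", small_model)
print("AIC, large model:", large_model)

# The model with the lower AIC is preferred.
preferred = "small" if small_model < large_model else "large"
print("preferred model: ", preferred)
```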