Hypothesis Test for Proportions. Chapter 9, Section 2. Statistical Methods II. QM 3620. The Sample. Suppose that we are not so interested in estimating this proportion as we are in verifying if some claim is true.
Hypothesis Test for Proportions
Chapter 9, Section 2
Statistical Methods II
QM 3620
The Sample
• Suppose that we are not so interested in estimating this proportion as we are in verifying if some claim is true.
• For example, management may want to know if more than 50% of our customers are satisfied.
• We could calculate a confidence interval and compare it to 50%, or we can test this concept using a hypothesis test.
The Hypotheses

We would want a random sample of customers and enough evidence from that sample of customers to show that the opposing viewpoint (that 50% or less of the customers are satisfied) is not reasonable. This sets up our hypotheses…
The Hypotheses (cont’d)

The viewpoint that we are trying to “prove” (more than 50% are satisfied) is our alternative hypothesis. Note that these claims are about the population, not the sample.

H0: π ≤ .50
HA: π > .50
Sampling Distribution

If we assume that the null hypothesis is true, then the closest the proportion could be to the alternative hypothesis (and still be part of the null) is if the proportion is .50. So… if the null is true, and the proportion in the population is .50, then what proportions could we expect to see in samples? Well, in large samples, the sample proportion would likely be really close to the actual proportion, whereas in small samples, the sample proportion could be expected to vary much more.

H0: π ≤ .50
HA: π > .50
n = 100, assume π = .50

For instance, if we were taking samples of size 100, we could determine that 95% of the samples would have a proportion of satisfied customers between approximately 40% and 60%. So, getting a sample of 100 in which the proportion of satisfied customers is 55% is not really proof of the alternative hypothesis… it could easily occur if the null hypothesis is true.
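The 40% to 60% range quoted above can be reproduced with a short calculation. This is only a sketch under the normal approximation; the function name `sampling_range` and its parameters are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def sampling_range(pi0, n, coverage=0.95):
    """Range expected to contain `coverage` of sample proportions
    when the population proportion is pi0 (normal approximation)."""
    standard_error = sqrt(pi0 * (1 - pi0) / n)
    # z-value that leaves (1 - coverage) / 2 in each tail, about 1.96 for 95%
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return pi0 - z * standard_error, pi0 + z * standard_error


low, high = sampling_range(0.50, 100)
print(f"{low:.3f} to {high:.3f}")  # roughly 0.402 to 0.598
```

A sample proportion of 55% falls well inside this range, which is why it is not convincing evidence against the null hypothesis.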
Making a Decision

So, what evidence would be enough? Certainly only sample proportions that are above 50% would qualify. A sample proportion of less than 50% could occur even if the population of customers has a greater than 50% satisfaction rate, but it would never be enough evidence to conclude the proportion was above 50% in the population. What we are looking for are sample results that are “improbable” if the null hypothesis is true. How improbable? That is where the alpha level kicks in. Say alpha = .05.

H0: π ≤ .50
HA: π > .50
Assume π = .50; reject in the upper .05 tail.

If the sample result occurs less than 5% of the time if the null hypothesis is true, then that is enough evidence. If that is too probable, then reduce the alpha level.
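To make the alpha level concrete, one can ask how large a sample proportion would have to be before it lands in the rejection region. The sketch below assumes the normal approximation and reuses the n = 100 sample size from the earlier slide; `upper_tail_cutoff` is a hypothetical helper name, not something defined in the course material.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_cutoff(pi0, n, alpha):
    """Smallest sample proportion that falls in the upper-tail
    rejection region for H0: pi <= pi0 (normal approximation)."""
    standard_error = sqrt(pi0 * (1 - pi0) / n)
    # z-value with area alpha to its right
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return pi0 + z_alpha * standard_error


cutoff = upper_tail_cutoff(0.50, 100, 0.05)
print(f"{cutoff:.3f}")  # about 0.582
```

Under these assumptions, a sample of 100 would need roughly 58% or more satisfied customers before the result counts as "improbable" at alpha = .05.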
Proportion Test Possibilities

Just like our hypothesis tests of the population mean, we have three alternatives for the setup of the test. We can test to see if the population proportion is greater than some number (Example 1), less than some number (Example 2), or different from some number (Example 3).

Example 1: H0: π ≤ .50, HA: π > .50
Example 2: H0: π ≥ .50, HA: π < .50
Example 3: H0: π = .50, HA: π ≠ .50

But, remember that all of these tests are about the population, not the sample… and they still require that you take a good random sample and don’t try to fudge the results. Statistics is built to handle the natural variation in samples, not the deliberate variation caused by incompetence.
Remember the Process
1) Specify population parameter of interest
2) Formulate the null and alternative hypotheses
3) Specify the desired significance level, α
4) Take a random sample and calculate relevant statistics
5) Calculate p-value for test and compare it to α
6) Reach a decision and draw a conclusion
The Test Statistic

As stated earlier, the amount of variation in proportions from sample to sample can be calculated if you know the sample size (n) and the hypothesized proportion in the population (π0). The formula is:

$\sigma_p = \sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}$

where σp is the standard error of the sample proportions (i.e. the amount of variation in the proportion from sample to sample).

We will use this standard error to help us determine if our sample could have come from the population with the hypothesized proportion (π0). If we divide the difference between the sample proportion and the hypothesized proportion, p − π0, by the standard error, we get a test statistic that can be compared to the standard normal distribution via the NORMSDIST function in Excel. That will get us a p-value.

The test statistic is

$TS = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$
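The two formulas above translate directly into code. The following is a minimal sketch; the function names `standard_error` and `z_statistic` are illustrative, and the inputs reuse the 55%-of-100 sample mentioned on the sampling distribution slide.

```python
from math import sqrt


def standard_error(pi0, n):
    """Standard error of the sample proportion under the hypothesized pi0."""
    return sqrt(pi0 * (1 - pi0) / n)


def z_statistic(p, pi0, n):
    """Distance of the sample proportion p from pi0, in standard errors."""
    return (p - pi0) / standard_error(pi0, n)


ts = z_statistic(0.55, 0.50, 100)
print(round(ts, 4))  # one standard error above the hypothesized proportion
```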
Excel’s NORMSDIST: Upper Tailed

Excel’s NORMSDIST function is a cumulative function, which means it will always aggregate the area under the distribution to the left of some point.

Suppose the hypothesized proportion (π0) is .50.

H0: π ≤ .50
HA: π > .50

$TS = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$

p-value = 1-NORMSDIST(TS)

To find the area in the upper tail, we need to subtract the area to the left of the test statistic from 1 (since the total area under the distribution equals 1).

Remember, if the sample proportion is actually below .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.
Excel’s NORMSDIST: Lower Tailed

Using Excel’s NORMSDIST function to test a lower-tailed alternative hypothesis is the easiest case. We are just finding the area to the left of the test statistic.

Suppose the hypothesized value (π0) is .50.

H0: π ≥ .50
HA: π < .50

$TS = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$

p-value = NORMSDIST(TS)

Remember, if the sample proportion is actually above .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.
Excel’s NORMSDIST: Two Tailed

Using Excel’s NORMSDIST function to test a two-tailed alternative hypothesis is not that hard, but you have to watch the parentheses.

Suppose the hypothesized value (π0) is .50.

H0: π = .50
HA: π ≠ .50

$TS = \dfrac{p - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$

p-value = (1-NORMSDIST(ABS(TS)))*2

To find the total probability in both areas, you need to use the ABS() function (Excel’s absolute value function) so that you always have a positive test statistic. Also, be sure to multiply by 2 only as the final step in the calculation: enclose the entire calculation in parentheses so it can be calculated before multiplying by 2. This gives us the total area in both tails.
Summary: Finding the p-value with Excel

Lower tail test
Example: H0: π ≥ .50, HA: π < .50
p-value: =NORMSDIST(TS), where TS is the value of the test statistic.

Two tailed test
Example: H0: π = .50, HA: π ≠ .50
p-value: =(1-NORMSDIST(ABS(TS)))*2, using the absolute value of the test statistic via the ABS() function, then multiplying the area by 2 to adjust for 2 tails.

Upper tail test
Example: H0: π ≤ .50, HA: π > .50
p-value: =1-NORMSDIST(TS), where TS is the value of the test statistic. Since NORMSDIST() is a cumulative function, we need to subtract the result from one.
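For readers working outside Excel, the three spreadsheet formulas can be mirrored with Python's standard library, whose `NormalDist().cdf` is the standard normal cumulative distribution (the same quantity NORMSDIST returns). This is a sketch; the function name `p_value` and the `tail` labels are assumptions made for illustration.

```python
from statistics import NormalDist


def p_value(ts, tail):
    """p-value for a test statistic ts; tail is 'lower', 'upper' or 'two'."""
    cdf = NormalDist().cdf  # area to the left of a point, like NORMSDIST
    if tail == "lower":
        return cdf(ts)                   # =NORMSDIST(TS)
    if tail == "upper":
        return 1 - cdf(ts)               # =1-NORMSDIST(TS)
    if tail == "two":
        return (1 - cdf(abs(ts))) * 2    # =(1-NORMSDIST(ABS(TS)))*2
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


for tail in ("lower", "upper", "two"):
    print(tail, round(p_value(1.0, tail), 4))
```

Because of the absolute value, the two-tailed result is the same for a test statistic of 1.0 and of -1.0.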
The Assumption

We do make one assumption (beyond the random sample assumption) which can be easily checked. We need to make sure that we have enough observations in our sample in order to use the normal distribution. That can be checked with two easy formulas…

Hypothesis Test for π
If nπ0 ≥ 5 and n(1−π0) ≥ 5: the sampling distribution of p is normal… proceed with testing.
If nπ0 < 5 or n(1−π0) < 5: call a statistician.
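The sample-size check is simple enough to automate. A minimal sketch follows; the function name `normal_approximation_ok` and the `minimum` parameter are illustrative choices, and the example inputs are the sample sizes and hypothesized proportions used in the two applications later in this section.

```python
def normal_approximation_ok(n, pi0, minimum=5):
    """True when both n*pi0 and n*(1 - pi0) reach the minimum count."""
    return n * pi0 >= minimum and n * (1 - pi0) >= minimum


print(normal_approximation_ok(600, 0.01))  # n*pi0 is about 6, n*(1-pi0) about 594
print(normal_approximation_ok(90, 0.72))   # about 64.8 and 25.2
print(normal_approximation_ok(100, 0.01))  # n*pi0 is about 1: too few
```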
Statistical Terms

The Null and Alternative Hypothesis
The same three possibilities exist for the proportion as for the mean. We can test to see if the proportion in the population is less than some number, greater than some number, or different from some number. The main difference in the null and alternative hypotheses is that the number is now a percentage.

The Alpha Level
The alpha level is the acceptable chance of a type I error: the chance that you will accept the alternative hypothesis when the null hypothesis is true.
Statistical Terms

The Test Statistic
Most test statistics have the same overall approach. They are a measure of how far a value is from a hypothesized value in terms of standard errors. In this module, we will be using the standard error of the sample proportion (as we are testing to see if the sample proportion is consistent with the hypothesized proportion).

The p-value
If the p-value shows support for the null hypothesis, then it will be large (or at least larger than α). If the p-value supports the alternative hypothesis, then it will be small (less than or equal to α).

The Assumption
Besides a random sample, we need enough observations to ensure that the normal distribution can be used.
Application Time
Let’s try this for real
Business Application Highlights

Read the business application on page 400.
First American Bank & Trust provides automobile loans to customers.
It is important that documentation be provided for each loan to prevent fraud and to comply with regulations.
Internal auditors check the compliance of files by evaluating a sample of the 22,500 outstanding loans. It is not possible to audit each loan file due to time and resource constraints.
No more than 1% of the loans can be out of compliance with the bank’s standards.
A sample of 600 files is taken and 9 files are found to be out of compliance.
The Approach

In the problem, we are trying to arrive at some conclusion about the population of files, but we only have information on 600 (which sounds like a lot, but that means that nearly 22,000 files were not analyzed). We must decide whether there is evidence that the proportion of all files that are out of compliance exceeds the acceptable level. The proportion may well be greater than the acceptable level in the sample, but we need conclusive evidence that the same condition would be true if we examined all of the 22,500 files.

We will start with the assumption that the null hypothesis (the files are within the acceptable compliance level) is true. We will look at the information in the data to see if there is evidence to support the alternative hypothesis (the files are not within the acceptable compliance level). The data will be the evidence, but we need more than compelling evidence… we need conclusive evidence.

Hypothesis:
H0: Files out of compliance ≤ 1%
HA: Files out of compliance > 1%
Using Statistics

The Null and Alternative Hypothesis
We are going to assume that the files are compliant (H0: π ≤ 0.01) and see if we can prove that they are out of compliance (HA: π > 0.01). The null hypothesis always has the “=” sign embedded in it, and we assume that the null hypothesis is true until proven otherwise.

The Alpha Level
The problem states that the alpha level should be set to 0.02, an unusual setting.

The p-value
Statistical calculations will tell us the likelihood that 9 out of 600 files would be found to be noncompliant if the null hypothesis is true (the files in total are in compliance). That is the p-value; it is a likelihood or a probability. If the p-value is small (i.e. the sample data is unusual if the null hypothesis is true), then we will conclude that there is more evidence to support the alternative hypothesis. If the p-value is large, then the evidence is not unlikely under the null hypothesis… or at least not unlikely enough for us to reject it outright. How small or how large is enough is determined by our required “level of significance”, which is our predetermined decision point.

The Assumptions
The assumptions are solely based on sample size, which is checked by requiring that π0n ≥ 5 and (1−π0)n ≥ 5. If those conditions hold, then the normal distribution can be used in the hypothesis test to calculate the p-value. Since n = 600 and π0 = 0.01, then π0n = 6 and (1−π0)n = 594, so the assumption holds.
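The numbers given for the bank example (9 noncompliant files out of 600, π0 = 0.01, alpha = 0.02) are enough to carry the upper-tailed test through. The sketch below applies the test statistic and p-value formulas from the earlier slides; the variable names are illustrative, and the calculation is not taken from the textbook's own solution.

```python
from math import sqrt
from statistics import NormalDist

n = 600            # files sampled
noncompliant = 9   # files found out of compliance
pi0 = 0.01         # hypothesized proportion under H0
alpha = 0.02       # significance level stated in the problem

p = noncompliant / n                        # sample proportion, 0.015
standard_error = sqrt(pi0 * (1 - pi0) / n)  # variation expected under H0
ts = (p - pi0) / standard_error             # test statistic
p_value = 1 - NormalDist().cdf(ts)          # upper tail: 1-NORMSDIST(TS)
reject_null = p_value <= alpha

print(f"TS = {ts:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Under these assumptions the test statistic is roughly 1.23 and the p-value is roughly 0.11, which is larger than 0.02, so this sample alone would not be conclusive evidence that more than 1% of all files are out of compliance.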
Business Application Highlights

Read problem 9-42 on page 406.
The Investment Company Institute is interested in whether individuals fund their IRAs using a rollover from an employer-sponsored retirement plan.
When an employee leaves a company, the employee is no longer allowed to participate in the employer-sponsored retirement plan and must dispose of the money in the account somehow.
Some individuals take the money as a disbursement, but that triggers a tax event which also incurs penalties payable to the IRS.
An alternative is to roll over the money in the account to an IRA, which prevents any tax or penalty from being payable. This is the wisest approach.
The question that should be asked of this study is whether these individuals even left a job that had an employer-sponsored retirement plan. If they did not, then the question about a rollover is moot.
Note that the basis for the hypothesis value, the 72%, is also from a sample and therefore has some variability in it. However, it is also based on 3500 observations, so the variation will be minimal.
The question posed is whether the proportion of individuals who roll over their employer-sponsored retirement plans is different in Miami than the 72% found in the country in general.
This result might be useful to companies that market IRAs.
Using Statistics

The Null and Alternative Hypothesis
We are going to assume that the proportion in Miami is the same as the rest of the country (H0: π = 0.72) and see if we can prove that the proportion is different (HA: π ≠ 0.72).

The Alpha Level
The problem states that the alpha level should be set to 0.10, a relatively high chance of a type I error.

The p-value
The p-value will tell us the likelihood that our result in the sample would be found if the null hypothesis is true (the proportion is the same in Miami as in the rest of the country).

The Assumptions
We can check the assumption by calculating and comparing π0n and (1−π0)n to 5. If both calculations are at least 5, the assumptions hold. Since n = 90 and π0 = 0.72, then π0n = 64.8 and (1−π0)n = 25.2. We can continue using this approach.
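The slides do not give the Miami sample result, so no p-value can be computed here. What can be sketched from the stated values (n = 90, π0 = 0.72, alpha = 0.10) is how far the sample proportion would have to be from 72% before a two-tailed test rejects the null hypothesis. The variable names below are illustrative, and the normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

n = 90         # sample size stated in the problem
pi0 = 0.72     # hypothesized proportion (national figure)
alpha = 0.10   # significance level stated in the problem

standard_error = sqrt(pi0 * (1 - pi0) / n)
# Two-tailed test: alpha is split evenly between the two tails
z = NormalDist().inv_cdf(1 - alpha / 2)
lower_cutoff = pi0 - z * standard_error
upper_cutoff = pi0 + z * standard_error

print(f"Reject H0 if p < {lower_cutoff:.3f} or p > {upper_cutoff:.3f}")
```

Under these assumptions, a Miami sample proportion below roughly 64% or above roughly 80% would have a two-tailed p-value under 0.10.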