
Hypothesis Test for Proportions

Hypothesis Test for Proportions. Chapter 9, Section 2. Statistical Methods II. QM 3620. The Sample. Suppose that we are not so interested in estimating this proportion as we are in verifying if some claim is true.

auryon

Presentation Transcript


  1. Hypothesis Test for Proportions. Chapter 9, Section 2. Statistical Methods II. QM 3620.

  2. The Sample
     • Suppose that we are not so interested in estimating this proportion as we are in verifying if some claim is true.
     • For example, management may want to know if more than 50% of our customers are satisfied.
     • We could calculate a confidence interval and compare it to 50%, or we can test this concept using a hypothesis test.

  3. The Hypotheses
     We would want a random sample of customers and enough evidence from that sample of customers to show that the opposing viewpoint (that 50% or less of the customers are satisfied) is not reasonable. This sets up our hypotheses…

  4. The Hypotheses (cont’d)
     The viewpoint that we are trying to “prove” (more than 50% are satisfied) is our alternative hypothesis. Note that these claims are about the population, not the sample.
     H0: π ≤ .50
     HA: π > .50

  5. Sampling Distribution
     If we assume that the null hypothesis is true, then the closest the population proportion could be to the alternative hypothesis (and still be part of the null) is .50. So… if the null is true, and the proportion in the population is .50, then what proportions could we expect to see in samples? Well, in large samples, the sample proportion would likely be really close to the actual proportion, whereas in small samples, the sample proportion could be expected to vary much more.
     H0: π ≤ .50
     HA: π > .50
     (n = 100; assume π = .50)
     For instance, if we were taking samples of size 100, we could determine that 95% of the samples would have a proportion of satisfied customers between approximately 40% and 60%. So, getting a sample of 100 in which the proportion of satisfied customers is 55% is not really proof of the alternative hypothesis… it could easily occur if the null hypothesis is true.
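The 40%-to-60% range quoted on this slide can be checked with a short calculation. The sketch below is in Python rather than the Excel used in the slides, relies on the normal approximation, and uses an illustrative function name (central_range) that does not come from the course material.

```python
from math import sqrt
from statistics import NormalDist


def central_range(pi0, n, coverage=0.95):
    """Range expected to contain `coverage` of sample proportions
    when the population proportion is pi0 (normal approximation)."""
    se = sqrt(pi0 * (1 - pi0) / n)                 # standard error of the sample proportion
    z = NormalDist().inv_cdf(0.5 + coverage / 2)   # about 1.96 for 95% coverage
    return pi0 - z * se, pi0 + z * se


low, high = central_range(0.50, 100)
print(f"{low:.3f} to {high:.3f}")
```

With n = 100 and π = .50 the standard error is .05, so the range works out to roughly .40 to .60, and a sample proportion of .55 sits comfortably inside it.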

  6. Making a Decision
     So, what evidence would be enough? Certainly only sample proportions that are above 50% would qualify. A sample proportion of less than 50% could occur even if the population of customers has a greater than 50% satisfaction rate, but it would never be enough evidence to conclude the proportion was above 50% in the population. What we are looking for are sample results that are “improbable” if the null hypothesis is true. How improbable? That is where the alpha level kicks in. Say alpha = .05.
     H0: π ≤ .50
     HA: π > .50
     (Assume π = .50; the rejection region is the upper .05 of the sampling distribution.)
     If the sample result occurs less than 5% of the time if the null hypothesis is true, then that is enough evidence. If that is too probable, then reduce the alpha level.
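To make the rejection region concrete, the following sketch finds the sample proportion that marks off the upper 5% of the sampling distribution. It assumes n = 100 (carried over from the previous slide; this slide does not state a sample size) and uses Python's standard library in place of Excel.

```python
from math import sqrt
from statistics import NormalDist

pi0 = 0.50     # boundary value of the null hypothesis
n = 100        # assumption: sample size borrowed from the previous slide
alpha = 0.05   # significance level named on this slide

se = sqrt(pi0 * (1 - pi0) / n)              # standard error under the null
z_crit = NormalDist().inv_cdf(1 - alpha)    # z value with 5% of the area above it
p_crit = pi0 + z_crit * se                  # smallest "improbable" sample proportion

print(f"reject H0 if the sample proportion exceeds about {p_crit:.3f}")
```

Under these assumptions the cutoff is a little above 58%, which is why a sample result of 55% would not count as enough evidence.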

  7. Proportion Test Possibilities
     Just like our hypothesis tests of the population mean, we have three alternatives for the setup of the test. We can test to see if the population proportion is greater than some number (Example 1), less than some number (Example 2), or different from some number (Example 3).
     Example 1: H0: π ≤ .50   HA: π > .50
     Example 2: H0: π ≥ .50   HA: π < .50
     Example 3: H0: π = .50   HA: π ≠ .50
     But, remember that all of these tests are about the population, not the sample… and they still require that you take a good random sample and don’t try to fudge the results. Statistics is built to handle the natural variation in samples, not the deliberate variation caused by incompetence.

  8. Remember the Process
     1) Specify population parameter of interest
     2) Formulate the null and alternative hypotheses
     3) Specify the desired significance level, α
     4) Take a random sample and calculate relevant statistics
     5) Calculate p-value for test and compare it to α
     6) Reach a decision and draw a conclusion

  9. The Test Statistic
     As stated earlier, the amount of variation in proportions from sample to sample can be calculated if you know the sample size (n) and the hypothesized proportion in the population (π0). The formula is:
     $\sigma_p = \sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}$
     where σp is the standard error of the sample proportions (i.e. the amount of variation in the proportion from sample to sample).
     We will use this standard error to help us determine if our sample could have come from the population with the hypothesized proportion (π0). If we divide the difference between the sample proportion and the hypothesized proportion, p - π0, by the standard error, we get a test statistic that can be compared to the standard normal distribution via the NORMSDIST function in Excel. That will get us a p-value.
     The test statistic is
     $TS = \dfrac{p-\pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$
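As a minimal sketch, the two formulas on this slide can be written as small Python functions. The function names are illustrative, and the example values (p = .55, π0 = .50, n = 100) are taken from the earlier sampling-distribution slide.

```python
from math import sqrt


def standard_error(pi0, n):
    """Standard error of the sample proportion under the hypothesized proportion pi0."""
    return sqrt(pi0 * (1 - pi0) / n)


def z_statistic(p, pi0, n):
    """Distance between sample proportion p and pi0, measured in standard errors."""
    return (p - pi0) / standard_error(pi0, n)


se = standard_error(0.50, 100)
ts = z_statistic(0.55, 0.50, 100)
print(f"standard error = {se:.3f}, test statistic = {ts:.2f}")
```

Here the sample proportion is about one standard error above the hypothesized value, which is not unusual for a standard normal variable.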

  10. Excel’s NORMSDIST: Upper Tailed
     Excel’s NORMSDIST function is a cumulative function, which means it will always aggregate the area under the distribution to the left of some point.
     H0: π ≤ .50
     HA: π > .50
     Suppose the hypothesized proportion (π0) is .50.
     To find the area in the upper tail, we need to subtract the area to the left of the test statistic from 1 (since the total area under the distribution equals 1).
     p-value = 1-NORMSDIST(TS), where $TS = \dfrac{p-\pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$
     Remember, if the sample proportion is actually below .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.

  11. Excel’s NORMSDIST: Lower Tailed
     Using Excel’s NORMSDIST function to test a lower-tailed alternative hypothesis is the easiest case. We are just finding the area to the left of the test statistic.
     H0: π ≥ .50
     HA: π < .50
     Suppose the hypothesized value (π0) is .50.
     p-value = NORMSDIST(TS), where $TS = \dfrac{p-\pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$
     Remember, if the sample proportion is actually above .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.

  12. Excel’s NORMSDIST: Two Tailed
     Using Excel’s NORMSDIST function to test a two-tailed alternative hypothesis is not that hard, but you have to watch the parentheses.
     H0: π = .50
     HA: π ≠ .50
     Suppose the hypothesized value (π0) is .50.
     To find the total probability in both areas, you need to use the ABS() function (Excel’s absolute value function) so that you always have a positive test statistic. Also, be sure to multiply by 2 only as the final step in the calculation.
     p-value = (1-NORMSDIST(ABS(TS)))*2, where $TS = \dfrac{p-\pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}$
     Enclose the entire calculation in parentheses so it can be calculated before multiplying by 2. This gives us the total area in both tails.

  13. Summary: Finding the p-value with Excel

     Test               Example hypotheses          p-value formula
     Lower-tailed test  H0: π ≥ .50, HA: π < .50    =NORMSDIST(TS)
     Two-tailed test    H0: π = .50, HA: π ≠ .50    =(1-NORMSDIST(ABS(TS)))*2
     Upper-tailed test  H0: π ≤ .50, HA: π > .50    =1-NORMSDIST(TS)

     Notes:
     • Lower-tailed test: TS is the value of the test statistic.
     • Two-tailed test: take the absolute value of the test statistic using the ABS() function, then multiply the area by 2 to adjust for 2 tails.
     • Upper-tailed test: TS is the value of the test statistic. Since NORMSDIST() is a cumulative function, we need to subtract the result from one.
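The three Excel formulas in the summary can be mirrored outside of Excel. The sketch below uses Python's statistics.NormalDist, whose cdf method returns the cumulative area to the left of a point on the standard normal curve, the same role NORMSDIST plays in the slides. The function names and the tail labels are illustrative choices, not part of the course material.

```python
from statistics import NormalDist


def norm_s_dist(z):
    """Cumulative area to the left of z under the standard normal curve."""
    return NormalDist().cdf(z)


def p_value(ts, tail):
    """p-value for test statistic ts; tail is 'lower', 'upper', or 'two'."""
    if tail == "lower":
        return norm_s_dist(ts)                  # =NORMSDIST(TS)
    if tail == "upper":
        return 1 - norm_s_dist(ts)              # =1-NORMSDIST(TS)
    if tail == "two":
        return (1 - norm_s_dist(abs(ts))) * 2   # =(1-NORMSDIST(ABS(TS)))*2
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


for tail in ("lower", "two", "upper"):
    print(tail, round(p_value(1.0, tail), 4))
```

For a test statistic of 1.0 the lower-tailed and upper-tailed p-values add up to 1, and the two-tailed p-value is twice the upper-tailed one, which matches the structure of the table.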

  14. The Assumption
     We do make one assumption (beyond the random sample assumption) which can be easily checked. We need to make sure that we have enough observations in our sample in order to use the normal distribution. That can be checked with two easy formulas…
     Hypothesis test for π:
     • If nπ0 ≥ 5 and n(1-π0) ≥ 5: the sampling distribution of p is normal… proceed with testing.
     • If nπ0 < 5 or n(1-π0) < 5: call a statistician.
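The two checks amount to a one-line test. A minimal sketch follows, with an illustrative function name. The first two calls use the sample sizes and hypothesized proportions from the two business applications later in the deck; the third uses a made-up sample size of 400 only to show a failing case.

```python
def normal_approximation_ok(pi0, n, minimum=5):
    """True when both n*pi0 and n*(1 - pi0) reach the minimum count of 5."""
    return n * pi0 >= minimum and n * (1 - pi0) >= minimum


print(normal_approximation_ok(0.01, 600))  # loan files: 6 and 594
print(normal_approximation_ok(0.72, 90))   # IRA rollovers: 64.8 and 25.2
print(normal_approximation_ok(0.01, 400))  # hypothetical: n*pi0 is only 4
```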

  15. Statistical Terms
     The Null and Alternative Hypothesis: The same three possibilities exist for the proportion as for the mean. We can test to see if the proportion in the population is less than some number, greater than some number, or different than some number. The main difference in the null and alternative hypotheses is that the number is now a percentage.
     The Alpha Level: The alpha level is the acceptable chance of a type I error, the chance that you will accept the alternative hypothesis when the null hypothesis is true.

  16. Statistical Terms
     The Test Statistic: Most test statistics have the same overall approach. They are a measure of how far a value is from a hypothesized value in terms of standard errors. In this module, we will be using the standard error of the sample proportion (as we are testing to see if the sample proportion is consistent with the hypothesized proportion).
     The p-value: If the p-value shows support for the null hypothesis, then it will be large (or at least larger than α). If the p-value supports the alternative hypothesis, then it will be small (less than or equal to α).
     The Assumption: Besides a random sample, we need enough observations to ensure that the normal distribution can be used.

  17. Application Time
     Let’s try this for real.

  18. Business Application Highlights
     Read the business application on page 400.
     • First American Bank & Trust provides automobile loans to customers.
     • It is important that documentation be provided for each loan to prevent fraud and to comply with regulations.
     • Internal auditors check the compliance of files by evaluating a sample of the 22,500 outstanding loans. It is not possible to audit each loan file due to time and resource constraints.
     • No more than 1% of the loans can be out of compliance with the bank’s standards.
     • A sample of 600 files is taken and 9 files are found to be out of compliance.

  19. The Approach
     In the problem, we are trying to arrive at some conclusion about the population of files, but we only have information on 600 (which sounds like a lot, but that means that nearly 22,000 files were not analyzed). We must decide whether there is evidence that the proportion of all files that are out of compliance exceeds the acceptable level. The proportion may well be greater than the acceptable level in the sample, but we need conclusive evidence that the same condition would be true if we examined all of the 22,500 files.
     We will start with the assumption that the null hypothesis (the files are within the acceptable compliance level) is true. We will look at the information in the data to see if there is evidence to support the alternative hypothesis (the files are not within the acceptable compliance level). The data will be the evidence, but we need more than compelling evidence… we need conclusive evidence.
     Hypothesis:
     H0: Files out of compliance ≤ 1%
     HA: Files out of compliance > 1%

  20. Using Statistics
     The Null and Alternative Hypothesis: We are going to assume that the files are compliant (H0: π ≤ 0.01) and see if we can prove that they are out of compliance (HA: π > 0.01). The null hypothesis always has the “=” sign embedded in it, and we assume that the null hypothesis is true until proven otherwise.
     The Alpha Level: The problem states that the alpha level should be set to 0.02, an unusual setting.
     The p-value: Statistical calculations will tell us the likelihood that 9 out of 600 files would be found to be noncompliant if the null hypothesis is true (the files in total are in compliance). That is the p-value; it is a likelihood or a probability. If the p-value is small (i.e. the sample data is unusual if the null hypothesis is true), then we will conclude that there is more evidence to support the alternative hypothesis. If the p-value is large, then the evidence is not unlikely under the null hypothesis… or at least not unlikely enough for us to reject it outright. How small or how large is enough is determined by our required “level of significance,” which is our predetermined decision point.
     The Assumptions: The assumptions are solely based on sample size, which is checked by requiring that π0n ≥ 5 and (1-π0)n ≥ 5. If those conditions hold, then the normal distribution can be used in the hypothesis test to calculate the p-value. Since n = 600 and π0 = 0.01, then π0n = 6 and (1-π0)n = 594, so the assumption holds.
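The numbers given for this application (9 noncompliant files out of 600, π0 = 0.01, alpha = 0.02) are enough to carry the upper-tailed test through. The slides stop before the final calculation, so the following is only a sketch of how it might go, written in Python instead of Excel and relying on the normal approximation checked above.

```python
from math import sqrt
from statistics import NormalDist

n = 600                  # files sampled
out_of_compliance = 9    # noncompliant files found in the sample
pi0 = 0.01               # hypothesized proportion from H0
alpha = 0.02             # significance level stated in the problem

p = out_of_compliance / n            # sample proportion
se = sqrt(pi0 * (1 - pi0) / n)       # standard error under H0
ts = (p - pi0) / se                  # test statistic
p_value = 1 - NormalDist().cdf(ts)   # upper-tailed, like =1-NORMSDIST(TS)
reject_null = p_value <= alpha

print(f"TS = {ts:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Under these assumptions the test statistic is about 1.23 and the p-value is roughly 0.11, well above the 0.02 alpha level, so this sample would not be conclusive evidence that more than 1% of all files are out of compliance.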

  21. ANOTHER EXAMPLE

  22. Business Application Highlights
     Read problem 9-42 on page 406.
     • The Investment Company Institute is interested in whether individuals fund their IRAs using a rollover from an employer-sponsored retirement plan.
     • When an employee leaves a company, the employee is no longer allowed to participate in the employer-sponsored retirement plan and must dispose of the money in the account somehow.
     • Some individuals take the money as a dispersal, but that triggers a tax event which also incurs penalties payable to the IRS.
     • An alternative is to roll over the money in the account to an IRA, which prevents any tax or penalty from being payable. This is the wisest approach.
     • The question that should be asked of this study is whether these individuals even left a job that had an employer-sponsored retirement plan. If they did not, then the question about a rollover is moot.
     • Note that the basis for the hypothesis value, the 72%, is also from a sample and therefore has some variability in it. However, it is also based on 3500 observations, so the variation will be minimal.
     • The question posed is whether the proportion of individuals that roll over their employer-sponsored retirement plans is different in Miami than the 72% found in the country in general.
     • This result might be useful to companies that market IRAs.

  23. Using Statistics
     The Null and Alternative Hypothesis: We are going to assume that the proportion in Miami is the same as the rest of the country (H0: π = 0.72) and see if we can prove that the proportion is different (HA: π ≠ 0.72).
     The Alpha Level: The problem states that the alpha level should be set to 0.10, a relatively high chance of a type I error.
     The p-value: The p-value will tell us the likelihood that our result in the sample would be found if the null hypothesis is true (the proportion is the same in Miami as in the rest of the country).
     The Assumptions: We can check the assumption by calculating and comparing π0n and (1-π0)n to 5. If the calculations are both at least 5, the assumptions hold. Since n = 90 and π0 = 0.72, then π0n = 64.8 and (1-π0)n = 25.2. We can continue using this approach.
