1 / 35

V13: Causality

V13: Causality. Aims : (1) understand the causal relationships between the variables of a network (2) interpret a Bayesian network as a causal model whose edges have causal significance . For standard probabilistic queries it does not matter

Download Presentation

V13: Causality

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. V13: Causality Aims: (1) understandthecausalrelationshipsbetweenthe variables of a network (2) interpret a Bayesiannetworkas a causalmodelwhoseedges havecausalsignificance. Forstandardprobabilisticqueriesitdoes not matter whether a Bayesianmodeliscausalor not. Itmattersonlythatitencodethe „right“ distribution. A correlationbetween 2 variables X and Y canarise in multiple settings: • when X causes Y • when Y causes X • orwhen X and Y arebotheffectsof a singlecause. Mathematics of Biological Networks

  2. Intervention queries Whensome variable W causallyaffectsboth X and Y, wegenerallyobserve a correlationbetweenthem. Ifweknowabouttheexistenceof W andcanobserveit, wecandisentanglethecorrelationbetween X and Y thatisinducedby W. Oneapproachtomodelingcausalrelationshipsisusing thenotionof ideal intervention– interventionsofthe form do(Z := z), whichforcethevariable Z totakethe valuez, andhavenoother immediate effect. In interventionqueries, weareinterested in answering queriesoftheform P(Y | do(z)) or P(Y | do(z), X = x) Mathematics of Biological Networks

  3. Case study: studentexample Letusrevisitour simple studentexamplefrom lecture V8 andconsider a particularstudent Gump. Conditioning an observationthat Gump receives an A in theclassincreasestheprobabilitythat he has high intelligence, hisprobabilityofgetting a high SAT score, andhisprobabilityofgetting a job. Consider a situationwhere Gump islazyandratherthanworkinghard toget an A in theclass, he payssomeoneto hack intothe universitycomputersystemandchangehis grade in thecourseto an A. Whatishisprobabilityofgetting a goodjob in thiscase? Intuitively, thecompanywhere Gump isapplyingonlyhasaccesstohistranscript. Thus, weexpect P(J | do(g1) = P(J | g1). Mathematics of Biological Networks

  4. Case study: studentexample Whatabouttheother 2 probabilities? Intuitively, wefeelthatthemanipulationtoGump‘sgradeshould not affectourbeliefsabouthisintelligencenorabouthisSAT score. Thus, weexpect P(i1 | do(g1) ) = P(i1) and P(s1| do(g1) ) = P(s1) An appropriategraphicalmodelforthepostinter- ventionsituationisshownin therightfigure. Here, thegrade (andthushischances togetthejob) do nolongerdepend on intelligenceordifficultyoftheclass. This modelisan instanceof amutilatednetwork. Mathematics of Biological Networks

  5. Latent variables In practice, however, thereis a hugesetofpossiblelatent variables, representingfactors in theworldthatwecannotobserve andoftenare not evenawareof. A latent variable mayinducecorrelationsbetweentheobserved variables that do not correspondtocausalrelationsbetweenthem, andhenceforms a confoundingfactor in ourgoalofdeterminingcausalinteractions. Forthepurposeofcausalinference, itiscriticaltodisentangle thecomponent in thecorrelationbetween X and Y thatis due to causalrelationshipsandthecomponent due totheseconfoundingfactors. Unfortunately, itisvirtuallyimpossible in complex real-worldsettings, toidentify all relevant latent variables andquantifytheireffects. Mathematics of Biological Networks

  6. Causalmodels A causalmodelhasthe same form as a probabilisticBayesiannetwork. Itconsistsof a directedacyclicgraphovertherandom variables in thedomain. The modelassertsthateach variable X isgovernedby a causalmechanismthat (stochastically) determinesitsvaluebased on thevaluesofitsparents. A causalmechanismtakesthe same form as a standard CPD. For a node X anditsparentsU, thecausalmodelspecifies foreachvalueuofU a distributionoverthevaluesof X. The differenceto a probabilisticBayesiannetworkis in theinterpretationofedges. In a causalmodel, weassumethatX‘sparentsareitsdirectcauses. Mathematics of Biological Networks

  7. Causalmodels The assumptionthat CPDs correspondtocausalmechanisms formsthebasisforthetreatmentofinterventionqueries. Wen weintervene at a variable X, settingitsvalueto x, wereplaceits original causalmechanismwithonethatdictatesthatitshouldtakethevalue x. The modelpresentedbeforeis an instanceofthemutilatednetwork. In a mutilatednetwork BZ=z, weeliminate all incomingedgesintoeach variable Zi  Z, (here: grade) andsetitsvalue tobeziwithprobability 1. Mathematics of Biological Networks

  8. Causalmodels Definition: A causalmodelC over X is a Bayesiannetworkover X, which in additiontoansweringprobabilisticqueries, can also answerinterventionqueries P(Y | do(z), x) asfollows: This approachdealsappropriatelywiththestudentexample. Mathematics of Biological Networks

  9. Causalmodels Let Cstudentbetheappropriatecausalmodel. Whenweintervene in thismodelbysetting Gump‘sgrade to an A, weobtainthemutilated networkshownbefore. The distributioninducedbythisnetworkoverGump‘sSAT score isthe same asthepriordistributionoverhisSAT score in the original network. Thus asexpected Conversely, thedistributioninduced on Gump‘sjobprospectsis Mathematics of Biological Networks

  10. Causalmodels Assume thatwestart out with a somewhat different studentmodel. In thiscase, therecruitercan also base her hiringdecision on thestudent‘sSAT score. Fig. 21.1.b Now, thequery is answeredbythemutilatednetwork shownbelow. The answerisclearly not due tothedirectcausalinfluenceofhis grade on hisjobprospects. Itis also not equalto becausethenewnetwork also includesan influence via G ← I → S → J whichis not present in themutilatedmodel. Mathematics of Biological Networks

  11. Simpson‘s paradox Considertheproblemoftryingtodeterminewhether a drugisbeneficial in curing a particulardiseasewithinsomepopulationofpatients. Statisticsshowthat, withinthepopulation, 57.5% ofpatientswhotookthedrug (D) arecured(C), whereasonly 50% ofthepatientswhodid not takethedrugarecured. Onemay belief giventhesestatisticsthatthedrugisbeneficial. However, withinthesubpopulationofmale patients 70% whotookthedrugarecured, whereas 80% whodid not takethedrugarecured. Withinthesubpopulationoffemalepatients 20% ofwhotookthedrugarecured, whereas 40% ofthosewhodid not takethedrugarecured. Mathematics of Biological Networks

  12. Simpson‘s paradox Thus, despitetheapparentlybeneficialeffectofthedrug on theoverallpopulation, thedrugappearstobedetrimentaltobothmenandwomen. Wehave This surprisingcasecanoccurbecausetakingthisdrugiscorrelatedwithgender. In thisparticularexample, 75% ofmentakethedrug, but only 25% ofwomen. Ifthepopulationcontains 200 people, equallydistributedinto 100 menand 100 women, 75 mentakethedrug, ofthese 52.5 arecured (70%) 25 mendid not takethedrug, ofthese 20 arecured (80%). 25 womentakethedrug, ofthese 5 arecured (20%) 75 womendid not takethedrug. Ofthese 30 arecured (40%). → of 100 people (men + women) takingthedrug, 57.5 arecured (57.5%). → of 100 people not takingthedrug, 50 arecured (50%). Mathematics of Biological Networks

  13. Simpson‘s paradox The conceptualdifficultybehindthis paradox is thatitis not clearwhichstatisticsoneshoulduse whendecidingwhethertodescribethedrugto a patient. The causalframeworkprovides an answertothe problem on which variables weshouldcondition on. The appropriatequeryweneedtoanswer in determiningwhethertoprescribethedrug is not but rather. We will showlaterthatthecorrectansweristhat thedrugis not beneficial, asexpected. Mathematics of Biological Networks

  14. StructuralCausalIdentifiability Fullyspecifying a causalmodelisoftenimpossible. At least, forinterventionqueries, we must disentangle thecausalinfluenceof X and Y fromotherfactors leadingtocorrelationsbetweenthem. Mathematics of Biological Networks

  15. StructuralCausalIdentifiability Consider a pair of variables X, Y with an observedcorrelationbetweenthem, andimaginethatourtaskistodetermine P(Y | do(X) ). Letusevenassumethat X temporallyprecedes Y, andthereforeweknowthat Y cannotcause X. However, ifweconsiderthepossibilitythat least someofthe correlationbetween X and Y is due to a hiddencommoncause, wehavenowayofdetermininghowmucheffectperturbing X wouldhave on Y. If all ofthecorrelationis due to a causal link, thenP(Y | do(X) ) = P(Y | X). Conversely, if all ofthecommoncorrelationis due tothe hiddencommoncause, thenP(Y | do(X) ) = P(Y). In general, anyvaluebetweenthose 2 distributionsispossible. Mathematics of Biological Networks

  16. Query Simplification Rules Which interventionqueriesareidentifiable? We will nowseethatthestructureof a causalmodel givesrisetocertainequivalencerulesoverinterventionalqueries. This allowsonequerytobereplacedby an equivalentone thatmayhave a simpler form. These rulescanbedefined in termsof an augmentedcausalmodel thatencodesthepossibleeffectofinterventionsexplicitly withinthegraphstructure. We will viewtheprocessof an intervention in terms of a newdecision variable thatdetermines whetherweintervene at Z, andif so, whatitsvalueis. Mathematics of Biological Networks

  17. Query Simplification Rules The variable takes on values in . If, then Z behavesas a random variable whosedistributionisdeterminedbyitsusual CPD . If , thenitdeterministicallysetsthevalueof Z tobe z withprobability 1. Letdenotetheset In thosecases, whereZ‘svalueisdeterministicallyset byoneparent, all ofZ‘sotherparentsUbecome irrelevant so thattheiredgesto Z canberemoved. Letbetheaugmentedmodelfor G. Letbethegraphobtainedfromexceptthateveryhasonlythesingleparent. Mathematics of Biological Networks

  18. Query Simplification Rules The firstquerysimplificationruleallowsus toinsertordeleteobservationsinto a query. Proposition. Let C be a causalmodeloverthegraphstructure G. Then: ifWis d-separatedfromY givenZ, X in thegraph. Nomenclature: WesaythatXandYare d-separatedgivenZ, ifthereisnoactivetrailbetweenanynode X  Xand Y  YgivenZ. Mathematics of Biological Networks

  19. Query Simplification Rules The secondruleissubtlerandallowsus toreplace an interventionwiththecorrespondingobservation. Proposition. Let C be a causalmodeloverthegraphstructureG.Then: ifYis d-separatedfromgivenX, Z, W in thegraph. This ruleholdsbecauseittellsusthatwe do not getmoreinformationregardingYfromthefact that an interventiontookplace at Xthanthevaluesxthemselves. Mathematics of Biological Networks

  20. Query Simplification Rules The thirdruleallowsustointroduceordeleteinterventions. Proposition. Let C be a causalmodeloverthegraphstructureG.Then: ifYis d-separatedfromgivenZ, W in thegraph. Mathematics of Biological Networks

  21. Iterated Query Simplification Therearemanyquerieswherenoneofthe 3 rulesapplydirectly. But we will seethatwecan also performothertransformations on thequeryallowingtherulestobeapplied. Letusrevisittherightfigure whichinvolvesthequery P(J | do(G) ). None ofourrulesapplydirectlytothisquery. • wecannoteliminatetheinterventionas P(J | do(G) )  P(J) • we also cannot turn theinterventioninto an observation, P(J | do(G) )  P(J | G) becauseintervening at G onlyaffects J via thesingleedge G → J, whereasconditioning G also influences J bytheindirecttrail G ← I → S → J. This trailiscalleda back-doortrail, sinceitleaves G via the „back door“. Mathematics of Biological Networks

  22. Iterated Query Simplification However wecanusestandardprobabilisticreasoningandobtain: Bothoftheterms in thesummationcanbefurthersimplified. Forthefirstterm, wehavethattheonlyactivetrail from G to J isthedirectedge G → J. I.e. J is d-separatedfrom G given S in thegraphwhereoutgoingarcs fromG havebeendeleted. Thus wecanapplythesecondrule andconclude Mathematics of Biological Networks

  23. Iterated Query Simplification For thesecondterm, we already argued (hackingthecomputersystemdoes not changeour belief on hisintelligence/SAT score). Puttingthetwotogetheryields thushackingthecomputersystemdoes not affectthejobchancesanymore. A back-doortrailfrom X to Y is an activetrailthatleaves X via a parentof X. For a query, a setWsatisfiesthe back-doorcriterionifnonode in Wis a descendantofX , andWblocks all back-doorpathsfromXtoY. Onecanshowthatif a setWsatisfiesthe back-doorcriterion for a query, then Mathematics of Biological Networks

  24. RevisitSimpson‘s paradox We will reconsiderSimpson‘s paradox usingthe back-doorcriterion. Consideragainthequery. The variable G (gender) introduces a back-doortrailbetween C and D. Wecanaccountforitsinfluenceusingtheeq. just derived: weobtain: Therefore, weshould not prescribethedrug. Mathematics of Biological Networks

  25. Case study: lungcancer In theearly 1960s, following a significantincrease in thenumber ofsmokersthatoccurredaround World War II, peoplebegan tonotice a substantial increase in thenumberofcasesoflungcancer. After manystudies, a correlation was noticedbetweensmokingandlungcancer. This correlation was noticed in bothdirections: The frequencyofsmokersamonglungcancerpatients was substantiallyhigherthan in thegeneralpopulation, Also, thefrequencyoflungcancerpatientswithinthepopulationof smokers was substantiallyhigherthanwithinthepopulationofnonsmokers. This ledtheSurgeon General, in 1964, toissue a report linkingcigarettesmokingtolungcancer. This reportcameundersevereattackbythetobaccoindustry. Mathematics of Biological Networks

  26. Case study: lungcancer The industryclaimedthattheobservedcorrelation can also beexplainedby a model in whichthereis nocausalrelationshipbetweensmokingandlungcancer. Instead, an observedgenotypemightexistthat simultaneouslycausescancerand a desirefornicotine. Thereexistseveralpossiblemodels DirectcausaleffectIndirectinfluence via a latent commonparentgenotype Mathematics of Biological Networks

  27. Case study: lungcancer The 2 modelscan express preciselythe same setofdistributions overthe observable variables S and C. Thus, theycan do an equallygoodjobofrepresenting theempiricaldistributionoverthese variables, andthereisnoway todistinguishbetweenthembased on observationaldataalone. Bothmodels will providethe same answerto standardprobabilisticqueries such as However, relative tointerventionalqueries, thesemodelshavevery different consequences. Mathematics of Biological Networks

  28. Case study: lungcancer According totheSurgeonGeneral‘smodel, wewouldhave In otherwords, ifweforcepeopleto smoke, theirprobabilityof gettingcanceristhe same astheprobabilityconditioned on smoking, whichismuchhigherthanthepriorprobability. Accordingtothetobaccoindustrymodel, wehave In otherwords, makingthepopulation smoke orstopsmoking wouldhavenoeffect on the rate ofcancercases. Pearl (1995) proposed a formal analysisofthisdilemma. He proposedthatweshouldcombinethese 2 modelsinto a singlejointmodel. Mathematics of Biological Networks

  29. Case study: lungcancer Pearl model Wenowneedtoassessfromthe marginal distributionover theobserved variables alonetheparametrizationofthe 3 links. Unfortunately, itisimpossibletodeterminetheparameters ofthese links fromtheobservationaldataalone, sinceboth original modelscanexplainthedataperfectly. Pearl thusrefinedthemodelsomewhatbyintroducing an additional assumption, andcouldthendetermineestimatesforthe links. Mathematics of Biological Networks

  30. Case study: lungcancer Assume thatwedeterminethattheeffectofsmoking on cancer is not a directeffect, but occursthroughtheaccumulationof tardeposits in thelungs, seefigure. Here, weassumethatthedeposition oftar in thelungsis not directly affectedbythe latent Genotype variable. We will nowshowthat, ifwecan measuretheamountoftardeposits in thelungsofvariousindividuals (e.g. by X-rayor in autopsies), wecandeterminetheprobabilityoftheinterventionquery usingobservedcorrelationsalone. Mathematics of Biological Networks

  31. Case study: lungcancer We areinterested in whichis an interventionquery whosemutilatednetworkis Standard probabilisticreasoningshowsthat Wenowconsiderandsimplifyeachtermseparately. Mathematics of Biological Networks

  32. Case study: lungcancer The secondterm, whichmeasurestheeffectofsmoking on tar, canbesimplifieddirectlyusingourruleforconvertinginterventions toobservations (secondrule). Here, is d-separatedfrom T given S in graph. Itfollowsthat Intuitively, theonlyactivetrailfromto T goes via S, andtheeffectofthattrailisidenticalregardlessof whetherwecondition on S orintervene at S. Mathematics of Biological Networks

  33. Case study: lungcancer The firsttermmeasurestheeffectoftar on cancer in thepresenceofourintervention on S. Unfortunately, wecannotdirectlyconverttheintervention at S to an observation, since C is not d-separatedfromgiven S, T in . However, wecanconverttheobservation at T to an intervention, because C is d-separatedfromgiven S,T in thegraph. Mathematics of Biological Networks

  34. Case study: lungcancer We cannoweliminatetheintervention at S fromthisexpressionusingthethirdrule, whichappliesbecause C is d-separatedfromgiven T in thegraph Weobtain Bystandardprobabilisticreasoning andconditioning on S weget By rule 2 because C is d-separatedfrom given T, S in By rule 3 because S is d-separatedfrom in Mathematics of Biological Networks

  35. Case study: lungcancer Putting everythingtogether, weget Thus, ifweagreethattar in thelungsistheintermediary betweensmokingandlungcancer, wecanuniquelydeterminetheextenttowhichsmoking causeslungcancer even in thepresenceof a confounding latent variable (here: genotype). Mathematics of Biological Networks

More Related