350 likes | 504 Views
V13: Causality. Aims : (1) understand the causal relationships between the variables of a network (2) interpret a Bayesian network as a causal model whose edges have causal significance . For standard probabilistic queries it does not matter
E N D
V13: Causality Aims: (1) understandthecausalrelationshipsbetweenthe variables of a network (2) interpret a Bayesiannetworkas a causalmodelwhoseedges havecausalsignificance. Forstandardprobabilisticqueriesitdoes not matter whether a Bayesianmodeliscausalor not. Itmattersonlythatitencodethe „right“ distribution. A correlationbetween 2 variables X and Y canarise in multiple settings: • when X causes Y • when Y causes X • orwhen X and Y arebotheffectsof a singlecause. Mathematics of Biological Networks
Intervention queries Whensome variable W causallyaffectsboth X and Y, wegenerallyobserve a correlationbetweenthem. Ifweknowabouttheexistenceof W andcanobserveit, wecandisentanglethecorrelationbetween X and Y thatisinducedby W. Oneapproachtomodelingcausalrelationshipsisusing thenotionof ideal intervention– interventionsofthe form do(Z := z), whichforcethevariable Z totakethe valuez, andhavenoother immediate effect. In interventionqueries, weareinterested in answering queriesoftheform P(Y | do(z)) or P(Y | do(z), X = x) Mathematics of Biological Networks
Case study: studentexample Letusrevisitour simple studentexamplefrom lecture V8 andconsider a particularstudent Gump. Conditioning an observationthat Gump receives an A in theclassincreasestheprobabilitythat he has high intelligence, hisprobabilityofgetting a high SAT score, andhisprobabilityofgetting a job. Consider a situationwhere Gump islazyandratherthanworkinghard toget an A in theclass, he payssomeoneto hack intothe universitycomputersystemandchangehis grade in thecourseto an A. Whatishisprobabilityofgetting a goodjob in thiscase? Intuitively, thecompanywhere Gump isapplyingonlyhasaccesstohistranscript. Thus, weexpect P(J | do(g1) = P(J | g1). Mathematics of Biological Networks
Case study: studentexample Whatabouttheother 2 probabilities? Intuitively, wefeelthatthemanipulationtoGump‘sgradeshould not affectourbeliefsabouthisintelligencenorabouthisSAT score. Thus, weexpect P(i1 | do(g1) ) = P(i1) and P(s1| do(g1) ) = P(s1) An appropriategraphicalmodelforthepostinter- ventionsituationisshownin therightfigure. Here, thegrade (andthushischances togetthejob) do nolongerdepend on intelligenceordifficultyoftheclass. This modelisan instanceof amutilatednetwork. Mathematics of Biological Networks
Latent variables In practice, however, thereis a hugesetofpossiblelatent variables, representingfactors in theworldthatwecannotobserve andoftenare not evenawareof. A latent variable mayinducecorrelationsbetweentheobserved variables that do not correspondtocausalrelationsbetweenthem, andhenceforms a confoundingfactor in ourgoalofdeterminingcausalinteractions. Forthepurposeofcausalinference, itiscriticaltodisentangle thecomponent in thecorrelationbetween X and Y thatis due to causalrelationshipsandthecomponent due totheseconfoundingfactors. Unfortunately, itisvirtuallyimpossible in complex real-worldsettings, toidentify all relevant latent variables andquantifytheireffects. Mathematics of Biological Networks
Causalmodels A causalmodelhasthe same form as a probabilisticBayesiannetwork. Itconsistsof a directedacyclicgraphovertherandom variables in thedomain. The modelassertsthateach variable X isgovernedby a causalmechanismthat (stochastically) determinesitsvaluebased on thevaluesofitsparents. A causalmechanismtakesthe same form as a standard CPD. For a node X anditsparentsU, thecausalmodelspecifies foreachvalueuofU a distributionoverthevaluesof X. The differenceto a probabilisticBayesiannetworkis in theinterpretationofedges. In a causalmodel, weassumethatX‘sparentsareitsdirectcauses. Mathematics of Biological Networks
Causalmodels The assumptionthat CPDs correspondtocausalmechanisms formsthebasisforthetreatmentofinterventionqueries. Wen weintervene at a variable X, settingitsvalueto x, wereplaceits original causalmechanismwithonethatdictatesthatitshouldtakethevalue x. The modelpresentedbeforeis an instanceofthemutilatednetwork. In a mutilatednetwork BZ=z, weeliminate all incomingedgesintoeach variable Zi Z, (here: grade) andsetitsvalue tobeziwithprobability 1. Mathematics of Biological Networks
Causalmodels Definition: A causalmodelC over X is a Bayesiannetworkover X, which in additiontoansweringprobabilisticqueries, can also answerinterventionqueries P(Y | do(z), x) asfollows: This approachdealsappropriatelywiththestudentexample. Mathematics of Biological Networks
Causalmodels Let Cstudentbetheappropriatecausalmodel. Whenweintervene in thismodelbysetting Gump‘sgrade to an A, weobtainthemutilated networkshownbefore. The distributioninducedbythisnetworkoverGump‘sSAT score isthe same asthepriordistributionoverhisSAT score in the original network. Thus asexpected Conversely, thedistributioninduced on Gump‘sjobprospectsis Mathematics of Biological Networks
Causalmodels Assume thatwestart out with a somewhat different studentmodel. In thiscase, therecruitercan also base her hiringdecision on thestudent‘sSAT score. Fig. 21.1.b Now, thequery is answeredbythemutilatednetwork shownbelow. The answerisclearly not due tothedirectcausalinfluenceofhis grade on hisjobprospects. Itis also not equalto becausethenewnetwork also includesan influence via G ← I → S → J whichis not present in themutilatedmodel. Mathematics of Biological Networks
Simpson‘s paradox Considertheproblemoftryingtodeterminewhether a drugisbeneficial in curing a particulardiseasewithinsomepopulationofpatients. Statisticsshowthat, withinthepopulation, 57.5% ofpatientswhotookthedrug (D) arecured(C), whereasonly 50% ofthepatientswhodid not takethedrugarecured. Onemay belief giventhesestatisticsthatthedrugisbeneficial. However, withinthesubpopulationofmale patients 70% whotookthedrugarecured, whereas 80% whodid not takethedrugarecured. Withinthesubpopulationoffemalepatients 20% ofwhotookthedrugarecured, whereas 40% ofthosewhodid not takethedrugarecured. Mathematics of Biological Networks
Simpson‘s paradox Thus, despitetheapparentlybeneficialeffectofthedrug on theoverallpopulation, thedrugappearstobedetrimentaltobothmenandwomen. Wehave This surprisingcasecanoccurbecausetakingthisdrugiscorrelatedwithgender. In thisparticularexample, 75% ofmentakethedrug, but only 25% ofwomen. Ifthepopulationcontains 200 people, equallydistributedinto 100 menand 100 women, 75 mentakethedrug, ofthese 52.5 arecured (70%) 25 mendid not takethedrug, ofthese 20 arecured (80%). 25 womentakethedrug, ofthese 5 arecured (20%) 75 womendid not takethedrug. Ofthese 30 arecured (40%). → of 100 people (men + women) takingthedrug, 57.5 arecured (57.5%). → of 100 people not takingthedrug, 50 arecured (50%). Mathematics of Biological Networks
Simpson‘s paradox The conceptualdifficultybehindthis paradox is thatitis not clearwhichstatisticsoneshoulduse whendecidingwhethertodescribethedrugto a patient. The causalframeworkprovides an answertothe problem on which variables weshouldcondition on. The appropriatequeryweneedtoanswer in determiningwhethertoprescribethedrug is not but rather. We will showlaterthatthecorrectansweristhat thedrugis not beneficial, asexpected. Mathematics of Biological Networks
StructuralCausalIdentifiability Fullyspecifying a causalmodelisoftenimpossible. At least, forinterventionqueries, we must disentangle thecausalinfluenceof X and Y fromotherfactors leadingtocorrelationsbetweenthem. Mathematics of Biological Networks
StructuralCausalIdentifiability Consider a pair of variables X, Y with an observedcorrelationbetweenthem, andimaginethatourtaskistodetermine P(Y | do(X) ). Letusevenassumethat X temporallyprecedes Y, andthereforeweknowthat Y cannotcause X. However, ifweconsiderthepossibilitythat least someofthe correlationbetween X and Y is due to a hiddencommoncause, wehavenowayofdetermininghowmucheffectperturbing X wouldhave on Y. If all ofthecorrelationis due to a causal link, thenP(Y | do(X) ) = P(Y | X). Conversely, if all ofthecommoncorrelationis due tothe hiddencommoncause, thenP(Y | do(X) ) = P(Y). In general, anyvaluebetweenthose 2 distributionsispossible. Mathematics of Biological Networks
Query Simplification Rules Which interventionqueriesareidentifiable? We will nowseethatthestructureof a causalmodel givesrisetocertainequivalencerulesoverinterventionalqueries. This allowsonequerytobereplacedby an equivalentone thatmayhave a simpler form. These rulescanbedefined in termsof an augmentedcausalmodel thatencodesthepossibleeffectofinterventionsexplicitly withinthegraphstructure. We will viewtheprocessof an intervention in terms of a newdecision variable thatdetermines whetherweintervene at Z, andif so, whatitsvalueis. Mathematics of Biological Networks
Query Simplification Rules The variable takes on values in . If, then Z behavesas a random variable whosedistributionisdeterminedbyitsusual CPD . If , thenitdeterministicallysetsthevalueof Z tobe z withprobability 1. Letdenotetheset In thosecases, whereZ‘svalueisdeterministicallyset byoneparent, all ofZ‘sotherparentsUbecome irrelevant so thattheiredgesto Z canberemoved. Letbetheaugmentedmodelfor G. Letbethegraphobtainedfromexceptthateveryhasonlythesingleparent. Mathematics of Biological Networks
Query Simplification Rules The firstquerysimplificationruleallowsus toinsertordeleteobservationsinto a query. Proposition. Let C be a causalmodeloverthegraphstructure G. Then: ifWis d-separatedfromY givenZ, X in thegraph. Nomenclature: WesaythatXandYare d-separatedgivenZ, ifthereisnoactivetrailbetweenanynode X Xand Y YgivenZ. Mathematics of Biological Networks
Query Simplification Rules The secondruleissubtlerandallowsus toreplace an interventionwiththecorrespondingobservation. Proposition. Let C be a causalmodeloverthegraphstructureG.Then: ifYis d-separatedfromgivenX, Z, W in thegraph. This ruleholdsbecauseittellsusthatwe do not getmoreinformationregardingYfromthefact that an interventiontookplace at Xthanthevaluesxthemselves. Mathematics of Biological Networks
Query Simplification Rules The thirdruleallowsustointroduceordeleteinterventions. Proposition. Let C be a causalmodeloverthegraphstructureG.Then: ifYis d-separatedfromgivenZ, W in thegraph. Mathematics of Biological Networks
Iterated Query Simplification Therearemanyquerieswherenoneofthe 3 rulesapplydirectly. But we will seethatwecan also performothertransformations on thequeryallowingtherulestobeapplied. Letusrevisittherightfigure whichinvolvesthequery P(J | do(G) ). None ofourrulesapplydirectlytothisquery. • wecannoteliminatetheinterventionas P(J | do(G) ) P(J) • we also cannot turn theinterventioninto an observation, P(J | do(G) ) P(J | G) becauseintervening at G onlyaffects J via thesingleedge G → J, whereasconditioning G also influences J bytheindirecttrail G ← I → S → J. This trailiscalleda back-doortrail, sinceitleaves G via the „back door“. Mathematics of Biological Networks
Iterated Query Simplification However wecanusestandardprobabilisticreasoningandobtain: Bothoftheterms in thesummationcanbefurthersimplified. Forthefirstterm, wehavethattheonlyactivetrail from G to J isthedirectedge G → J. I.e. J is d-separatedfrom G given S in thegraphwhereoutgoingarcs fromG havebeendeleted. Thus wecanapplythesecondrule andconclude Mathematics of Biological Networks
Iterated Query Simplification For thesecondterm, we already argued (hackingthecomputersystemdoes not changeour belief on hisintelligence/SAT score). Puttingthetwotogetheryields thushackingthecomputersystemdoes not affectthejobchancesanymore. A back-doortrailfrom X to Y is an activetrailthatleaves X via a parentof X. For a query, a setWsatisfiesthe back-doorcriterionifnonode in Wis a descendantofX , andWblocks all back-doorpathsfromXtoY. Onecanshowthatif a setWsatisfiesthe back-doorcriterion for a query, then Mathematics of Biological Networks
RevisitSimpson‘s paradox We will reconsiderSimpson‘s paradox usingthe back-doorcriterion. Consideragainthequery. The variable G (gender) introduces a back-doortrailbetween C and D. Wecanaccountforitsinfluenceusingtheeq. just derived: weobtain: Therefore, weshould not prescribethedrug. Mathematics of Biological Networks
Case study: lungcancer In theearly 1960s, following a significantincrease in thenumber ofsmokersthatoccurredaround World War II, peoplebegan tonotice a substantial increase in thenumberofcasesoflungcancer. After manystudies, a correlation was noticedbetweensmokingandlungcancer. This correlation was noticed in bothdirections: The frequencyofsmokersamonglungcancerpatients was substantiallyhigherthan in thegeneralpopulation, Also, thefrequencyoflungcancerpatientswithinthepopulationof smokers was substantiallyhigherthanwithinthepopulationofnonsmokers. This ledtheSurgeon General, in 1964, toissue a report linkingcigarettesmokingtolungcancer. This reportcameundersevereattackbythetobaccoindustry. Mathematics of Biological Networks
Case study: lungcancer The industryclaimedthattheobservedcorrelation can also beexplainedby a model in whichthereis nocausalrelationshipbetweensmokingandlungcancer. Instead, an observedgenotypemightexistthat simultaneouslycausescancerand a desirefornicotine. Thereexistseveralpossiblemodels DirectcausaleffectIndirectinfluence via a latent commonparentgenotype Mathematics of Biological Networks
Case study: lungcancer The 2 modelscan express preciselythe same setofdistributions overthe observable variables S and C. Thus, theycan do an equallygoodjobofrepresenting theempiricaldistributionoverthese variables, andthereisnoway todistinguishbetweenthembased on observationaldataalone. Bothmodels will providethe same answerto standardprobabilisticqueries such as However, relative tointerventionalqueries, thesemodelshavevery different consequences. Mathematics of Biological Networks
Case study: lungcancer According totheSurgeonGeneral‘smodel, wewouldhave In otherwords, ifweforcepeopleto smoke, theirprobabilityof gettingcanceristhe same astheprobabilityconditioned on smoking, whichismuchhigherthanthepriorprobability. Accordingtothetobaccoindustrymodel, wehave In otherwords, makingthepopulation smoke orstopsmoking wouldhavenoeffect on the rate ofcancercases. Pearl (1995) proposed a formal analysisofthisdilemma. He proposedthatweshouldcombinethese 2 modelsinto a singlejointmodel. Mathematics of Biological Networks
Case study: lungcancer Pearl model Wenowneedtoassessfromthe marginal distributionover theobserved variables alonetheparametrizationofthe 3 links. Unfortunately, itisimpossibletodeterminetheparameters ofthese links fromtheobservationaldataalone, sinceboth original modelscanexplainthedataperfectly. Pearl thusrefinedthemodelsomewhatbyintroducing an additional assumption, andcouldthendetermineestimatesforthe links. Mathematics of Biological Networks
Case study: lungcancer Assume thatwedeterminethattheeffectofsmoking on cancer is not a directeffect, but occursthroughtheaccumulationof tardeposits in thelungs, seefigure. Here, weassumethatthedeposition oftar in thelungsis not directly affectedbythe latent Genotype variable. We will nowshowthat, ifwecan measuretheamountoftardeposits in thelungsofvariousindividuals (e.g. by X-rayor in autopsies), wecandeterminetheprobabilityoftheinterventionquery usingobservedcorrelationsalone. Mathematics of Biological Networks
Case study: lungcancer We areinterested in whichis an interventionquery whosemutilatednetworkis Standard probabilisticreasoningshowsthat Wenowconsiderandsimplifyeachtermseparately. Mathematics of Biological Networks
Case study: lungcancer The secondterm, whichmeasurestheeffectofsmoking on tar, canbesimplifieddirectlyusingourruleforconvertinginterventions toobservations (secondrule). Here, is d-separatedfrom T given S in graph. Itfollowsthat Intuitively, theonlyactivetrailfromto T goes via S, andtheeffectofthattrailisidenticalregardlessof whetherwecondition on S orintervene at S. Mathematics of Biological Networks
Case study: lungcancer The firsttermmeasurestheeffectoftar on cancer in thepresenceofourintervention on S. Unfortunately, wecannotdirectlyconverttheintervention at S to an observation, since C is not d-separatedfromgiven S, T in . However, wecanconverttheobservation at T to an intervention, because C is d-separatedfromgiven S,T in thegraph. Mathematics of Biological Networks
Case study: lungcancer We cannoweliminatetheintervention at S fromthisexpressionusingthethirdrule, whichappliesbecause C is d-separatedfromgiven T in thegraph Weobtain Bystandardprobabilisticreasoning andconditioning on S weget By rule 2 because C is d-separatedfrom given T, S in By rule 3 because S is d-separatedfrom in Mathematics of Biological Networks
Case study: lungcancer Putting everythingtogether, weget Thus, ifweagreethattar in thelungsistheintermediary betweensmokingandlungcancer, wecanuniquelydeterminetheextenttowhichsmoking causeslungcancer even in thepresenceof a confounding latent variable (here: genotype). Mathematics of Biological Networks