V7 Foundations of Probability Theory
"Probability": the degree of confidence that an event of an uncertain nature will occur.
"Events": we will assume that there is an agreed-upon space $\Omega$ of possible outcomes. E.g. a normal die has the space $\Omega = \{1, 2, 3, 4, 5, 6\}$.
We also assume that there is a set of measurable events $S$ to which we are willing to assign probabilities. Each event $\alpha \in S$ is a subset of $\Omega$. In the die example, the event $\{6\}$ is the case where the die shows 6. The event $\{1, 3, 5\}$ represents the case of an odd outcome.
Mathematics of Biological Networks
Foundations of Probability Theory
Probability theory requires that the event space $S$ satisfy 3 basic properties:
• It contains the empty event $\emptyset$ and the trivial event $\Omega$.
• It is closed under union → if $\alpha, \beta \in S$, then so is $\alpha \cup \beta$.
• It is closed under complementation → if $\alpha \in S$, then so is $\Omega - \alpha$.
The requirement that the event space is closed under union and complementation implies that it is also closed under other Boolean operations, such as intersection and set difference.
Probability distributions
A probability distribution $P$ over $(\Omega, S)$ is a mapping from events in $S$ to real values that satisfies the following conditions:
(1) $P(\alpha) \geq 0$ for all $\alpha \in S$ → Probabilities are not negative.
(2) $P(\Omega) = 1$ → The probability of the trivial event, which allows all possible outcomes, has the maximal possible probability of 1.
(3) If $\alpha, \beta \in S$ and $\alpha \cap \beta = \emptyset$, then $P(\alpha \cup \beta) = P(\alpha) + P(\beta)$.
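The three conditions can be checked mechanically for the die example. The following is a minimal sketch, assuming a fair die and taking the event space S to be all subsets of the outcome space; the helper names `prob` and `all_events` are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import chain, combinations

# Outcome space of a six-sided die (assumption: fair, each outcome has probability 1/6).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) for an event given as a set of outcomes."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

def all_events(omega):
    """Every subset of omega, i.e. the largest possible event space S."""
    items = sorted(omega)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

S = all_events(OMEGA)

# (1) non-negativity, (2) P(Omega) = 1, (3) additivity for disjoint events
nonnegative = all(prob(a) >= 0 for a in S)
normalized = prob(OMEGA) == 1
additive = all(
    prob(a | b) == prob(a) + prob(b)
    for a in S for b in S if not (a & b)
)
print(nonnegative, normalized, additive)
```

Exact fractions are used so that the additivity check compares equal values rather than rounded floating-point numbers.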
Interpretation of probabilities
The frequentist's interpretation: the probability of an event is the fraction of times the event occurs if we repeat the experiment indefinitely. E.g. throwing of dice, coin flips, card games, … where frequencies will satisfy the requirements of proper distributions.
For an event such as "It will rain tomorrow afternoon", the frequentist approach does not provide a satisfactory interpretation.
Interpretation of probabilities
An alternative interpretation views probabilities as subjective degrees of belief. E.g. the statement "the probability of rain tomorrow afternoon is 50 percent" tells us that, in the opinion of the speaker, the chances of rain and no rain tomorrow afternoon are the same.
When we discuss probabilities in the following, we usually do not explicitly state their interpretation, since both interpretations lead to the same mathematical rules.
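Conditional probability
The conditional probability of $\beta$ given $\alpha$ is defined as
$$P(\beta \mid \alpha) = \frac{P(\alpha \cap \beta)}{P(\alpha)}$$
The probability that $\beta$ is true given that we know $\alpha$ is the relative proportion of outcomes satisfying $\beta$ among those that satisfy $\alpha$. The definition requires $P(\alpha) > 0$.
From this we immediately see that
$$P(\alpha \cap \beta) = P(\alpha) \, P(\beta \mid \alpha)$$
This equality is known as the chain rule of conditional probabilities. More generally, if $\alpha_1, \alpha_2, \dots, \alpha_k$ are events, we can write
$$P(\alpha_1 \cap \dots \cap \alpha_k) = P(\alpha_1) \, P(\alpha_2 \mid \alpha_1) \cdots P(\alpha_k \mid \alpha_1 \cap \dots \cap \alpha_{k-1})$$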
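As a small illustration of the definition and the chain rule, the sketch below conditions on events of a fair die. The fair-die assumption and the names `prob` and `cond_prob` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

# Outcome space of a fair six-sided die (assumption for this illustration).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) under the uniform distribution on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(beta, alpha):
    """P(beta | alpha) = P(alpha and beta) / P(alpha); requires P(alpha) > 0."""
    if prob(alpha) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(alpha & beta) / prob(alpha)

odd = frozenset({1, 3, 5})
at_least_four = frozenset({4, 5, 6})

# Among the three odd outcomes, only 5 is at least four.
p_cond = cond_prob(at_least_four, odd)

# Chain rule: P(alpha and beta) = P(alpha) * P(beta | alpha)
chain_rule_holds = prob(odd & at_least_four) == prob(odd) * p_cond
print(p_cond, chain_rule_holds)
```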
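Bayes' rule
Another immediate consequence of the definition of conditional probability is Bayes' rule. Due to symmetry, we can swap the 2 variables $\alpha$ and $\beta$ in the definition and get the equivalent expression
$$P(\alpha \mid \beta) = \frac{P(\alpha \cap \beta)}{P(\beta)}$$
Both expressions contain $P(\alpha \cap \beta)$, so $P(\beta \mid \alpha) \, P(\alpha) = P(\alpha \mid \beta) \, P(\beta)$. If we rearrange, we get Bayes' rule
$$P(\alpha \mid \beta) = \frac{P(\beta \mid \alpha) \, P(\alpha)}{P(\beta)}$$
A more general conditional version of Bayes' rule, where all probabilities are conditioned on some background event $\gamma$, also holds:
$$P(\alpha \mid \beta \cap \gamma) = \frac{P(\beta \mid \alpha \cap \gamma) \, P(\alpha \mid \gamma)}{P(\beta \mid \gamma)}$$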
Example 1 for Bayes' rule
Consider a student population. Let Smart denote smart students and GradeA denote students who got grade A. Assume we believe that P(GradeA | Smart) = 0.6, and now we learn that a particular student received grade A.
Suppose that P(Smart) = 0.3 and P(GradeA) = 0.2. Then we have
P(Smart | GradeA) = 0.6 · 0.3 / 0.2 = 0.9
In this model, an A grade strongly suggests that the student is smart.
On the other hand, if the test was easier and high grades were more common, e.g. P(GradeA) = 0.4, then we would get
P(Smart | GradeA) = 0.6 · 0.3 / 0.4 = 0.45
which is much less conclusive.
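The arithmetic of this example can be reproduced with a few lines of code. This is only a sketch; the function name `bayes` and its parameter names are chosen here for illustration.

```python
def bayes(prior, likelihood, evidence):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B).

    prior      -- P(A), here P(Smart)
    likelihood -- P(B | A), here P(GradeA | Smart)
    evidence   -- P(B), here P(GradeA); must be positive
    """
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence

# Numbers from the example: P(Smart) = 0.3, P(GradeA | Smart) = 0.6
p_smart_hard_test = bayes(prior=0.3, likelihood=0.6, evidence=0.2)
p_smart_easy_test = bayes(prior=0.3, likelihood=0.6, evidence=0.4)
print(round(p_smart_hard_test, 2), round(p_smart_easy_test, 2))
```

The only input that changes between the two cases is P(GradeA), which shows how strongly the posterior depends on how common the observed evidence is.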
Example 2 for Bayes' rule
Suppose that a tuberculosis skin test is 95% accurate. That is, if the patient is TB-infected, then the test will be positive with probability 0.95, and if the patient is not infected, the test will be negative with probability 0.95.
Now suppose that a person gets a positive test result. What is the probability that the person is infected?
Naive reasoning suggests that if the test result is wrong 5% of the time, then the probability that the subject is infected is 0.95. That would mean that 95% of subjects with positive results have TB.
Example 2 for Bayes' rule
If we consider the problem by applying Bayes' rule, we need to consider the prior probability of TB infection and the probability of getting a positive test result.
Suppose that 1 in 1000 of the subjects who get tested is infected → P(TB) = 0.001
We see that a fraction 0.001 · 0.95 of all subjects are infected and get a positive result, and a fraction 0.999 · 0.05 are uninfected and get a positive result. Thus
P(Positive) = 0.001 · 0.95 + 0.999 · 0.05 = 0.0509
Applying Bayes' rule, we get
P(TB | Positive) = P(TB) · P(Positive | TB) / P(Positive) = 0.001 · 0.95 / 0.0509 ≈ 0.0187
Thus, although a subject with a positive test is much more likely to be TB-infected than a random subject, fewer than 2% of these subjects are TB-infected.
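The same calculation can be written as a short function that first computes P(Positive) by summing over the infected and uninfected cases and then applies Bayes' rule. This is a sketch; the function name and the terms "sensitivity" and "specificity" for the two accuracy figures are labels introduced here, not used on the slides.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Return (P(TB | Positive), P(Positive)).

    prior       -- P(TB)
    sensitivity -- P(Positive | TB)
    specificity -- P(Negative | not TB)
    """
    # Total probability of a positive result over infected and uninfected subjects.
    p_positive = prior * sensitivity + (1.0 - prior) * (1.0 - specificity)
    return prior * sensitivity / p_positive, p_positive

p_tb_given_positive, p_positive = posterior_given_positive(
    prior=0.001, sensitivity=0.95, specificity=0.95
)
print(round(p_positive, 4), round(p_tb_given_positive, 4))
```

Raising the prior in the call (for example, testing only a high-risk group) is a quick way to see how much the low base rate drives the small posterior.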
Random Variables
A random variable is defined by a function that associates a value with each outcome in $\Omega$. For students in a class, this could be a function $f_{Grade}$ that maps each student in the class (in $\Omega$) to his or her grade (1, …, 5).
The event Grade = A is a shorthand for the event $\{\omega \in \Omega : f_{Grade}(\omega) = A\}$.
There exist categorical (or discrete) random variables that take on one of a few values, e.g. intelligence could be "high" or "low". There also exist integer-valued or real-valued random variables that can take on an infinite number of values, e.g. the height of students.
By $Val(X)$ we denote the set of values that a random variable $X$ can take.
Random Variables
In the following, we will consider either categorical random variables or random variables that take real values.
We will use capital letters $X, Y, Z$ to denote random variables. Lowercase letters will refer to the values of random variables, e.g. $x$ for a value of $X$.
When we discuss categorical random variables, we will denote the i-th value as $x_i$.
Bold capital letters are used for sets of random variables: $\mathbf{X}, \mathbf{Y}, \mathbf{Z}$.
Marginal Distributions
Once we define a random variable $X$, we can consider the marginal distribution $P(X)$ over events that can be described using $X$.
E.g. let us take the two random variables Intelligence and Grade and their marginal distributions P(Intelligence) and P(Grade). Let us suppose that each of them is given by a table of values (the table from the slide is not reproduced here).
These marginal distributions are probability distributions satisfying the 3 properties.
Joint Distributions
Often we are interested in questions that involve the values of several random variables. E.g. we might be interested in the event "Intelligence = high and Grade = A". In that case we need to consider the joint distribution over these two random variables.
The joint distribution of 2 random variables has to be consistent with the marginal distribution in that
$$P(x) = \sum_{y} P(x, y)$$
[Figure 2.1]
Conditional Probability
The notion of conditional probability extends to induced distributions over random variables.
$P(Intelligence \mid Grade = A)$ denotes the conditional distribution over the events describable by Intelligence, given the knowledge that the student's grade is A.
Note that the conditional distribution $P(Intelligence \mid Grade = A)$ is quite different from the marginal distribution $P(Intelligence)$.
We will use the notation $P(X \mid Y)$ to represent a set of conditional probability distributions, one for each value of $Y$.
Bayes' rule in terms of conditional probability distributions reads
$$P(X \mid Y) = \frac{P(X) \, P(Y \mid X)}{P(Y)}$$
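Marginal and conditional distributions can both be read off a joint table. The sketch below uses a hypothetical joint distribution over Intelligence and Grade; the numbers are illustrative assumptions and are not taken from the slides, and the helper names are likewise made up for this example.

```python
# Hypothetical joint distribution P(Intelligence, Grade); the numbers are
# illustrative assumptions, not values from the lecture.
joint = {
    ("high", "A"): 0.18, ("high", "B"): 0.09, ("high", "C"): 0.03,
    ("low", "A"): 0.07, ("low", "B"): 0.28, ("low", "C"): 0.35,
}

def marginal(joint, axis):
    """Sum the joint over the other variable, e.g. P(x) = sum_y P(x, y)."""
    result = {}
    for key, p in joint.items():
        result[key[axis]] = result.get(key[axis], 0.0) + p
    return result

def conditional_on_grade(joint, grade):
    """P(Intelligence | Grade = grade) = P(Intelligence, grade) / P(grade)."""
    p_grade = marginal(joint, axis=1)[grade]
    return {i: p / p_grade for (i, g), p in joint.items() if g == grade}

p_intelligence = marginal(joint, axis=0)        # marginal over Grade
p_grade = marginal(joint, axis=1)               # marginal over Intelligence
p_intel_given_a = conditional_on_grade(joint, "A")
print(p_intelligence, p_grade, p_intel_given_a)
```

With these assumed numbers the conditional distribution given Grade = A puts far more weight on "high" than the marginal does, which is the difference the text points out.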
Independence
We usually expect $P(\alpha \mid \beta)$ to be different from $P(\alpha)$. Learning that $\beta$ is true typically changes our probability over $\alpha$.
However, in some situations $P(\alpha \mid \beta) = P(\alpha)$.
Definition: We say that an event $\alpha$ is independent of event $\beta$ in $P$, denoted as $P \models (\alpha \perp \beta)$, if $P(\alpha \mid \beta) = P(\alpha)$ or if $P(\beta) = 0$.
We will now provide an alternative definition for this concept of independence.
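Independence
Proposition: A distribution $P$ satisfies $(\alpha \perp \beta)$ if and only if $P(\alpha \cap \beta) = P(\alpha) \, P(\beta)$.
Proof
• Suppose that $P$ satisfies $(\alpha \perp \beta)$. If $P(\beta) = 0$, then also $P(\alpha \cap \beta) = 0$, so that the equality is fulfilled. Let now $P(\beta) \neq 0$. From the chain rule we get $P(\alpha \cap \beta) = P(\alpha \mid \beta) \, P(\beta)$. Since $\alpha$ is independent of $\beta$, $P(\alpha \mid \beta) = P(\alpha)$. Thus we get $P(\alpha \cap \beta) = P(\alpha) \, P(\beta)$.
• Suppose that $P(\alpha \cap \beta) = P(\alpha) \, P(\beta)$ and $P(\beta) \neq 0$. Then, by definition, we have
$$P(\alpha \mid \beta) = \frac{P(\alpha \cap \beta)}{P(\beta)} = \frac{P(\alpha) \, P(\beta)}{P(\beta)} = P(\alpha)$$
which is what needs to be shown.
Note that $(\alpha \perp \beta)$ implies $(\beta \perp \alpha)$.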
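The product characterization gives a direct test for independence. The following sketch applies it to events of a fair die; the choice of events and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# Outcome space of a fair six-sided die (assumption for this illustration).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) under the uniform distribution on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(alpha, beta):
    """True if P(alpha and beta) == P(alpha) * P(beta)."""
    return prob(alpha & beta) == prob(alpha) * prob(beta)

even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})
at_least_four = frozenset({4, 5, 6})

# P(even) = 1/2, P(at_most_four) = 2/3, P(both) = 1/3 = 1/2 * 2/3
indep_case = independent(even, at_most_four)
# P(even) = 1/2, P(at_least_four) = 1/2, P(both) = 1/3, which differs from 1/4
dep_case = independent(even, at_least_four)
print(indep_case, dep_case)
```

Because the product test is symmetric in its two arguments, it also makes the symmetry of independence immediate, and it covers the zero-probability case without a special branch.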
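Independence of Random Variables
Definition: Let $\mathbf{X}, \mathbf{Y}, \mathbf{Z}$ be sets of random variables. We say that $\mathbf{X}$ is conditionally independent of $\mathbf{Y}$ given $\mathbf{Z}$ in a distribution $P$ if $P$ satisfies $(\mathbf{X} = x \perp \mathbf{Y} = y \mid \mathbf{Z} = z)$ for all values $x \in Val(\mathbf{X})$, $y \in Val(\mathbf{Y})$, $z \in Val(\mathbf{Z})$.
As before, we can give an alternative characterization of conditional independence.
Proposition: The distribution $P$ satisfies $(\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z})$ if and only if
$$P(\mathbf{X}, \mathbf{Y} \mid \mathbf{Z}) = P(\mathbf{X} \mid \mathbf{Z}) \, P(\mathbf{Y} \mid \mathbf{Z})$$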
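Independence properties of distributions
Symmetry: $(\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}) \Rightarrow (\mathbf{Y} \perp \mathbf{X} \mid \mathbf{Z})$
Decomposition: $(\mathbf{X} \perp \mathbf{Y}, \mathbf{W} \mid \mathbf{Z}) \Rightarrow (\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z})$
Weak union: $(\mathbf{X} \perp \mathbf{Y}, \mathbf{W} \mid \mathbf{Z}) \Rightarrow (\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}, \mathbf{W})$
Contraction: $(\mathbf{X} \perp \mathbf{W} \mid \mathbf{Z}, \mathbf{Y})$ and $(\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}) \Rightarrow (\mathbf{X} \perp \mathbf{Y}, \mathbf{W} \mid \mathbf{Z})$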
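Probability Density Functions
A function $p: \mathbb{R} \to \mathbb{R}$ is a probability density function (PDF) for $X$ if it is a nonnegative integrable function so that
$$\int_{Val(X)} p(x) \, dx = 1$$
The function $P(X \leq a) = \int_{-\infty}^{a} p(x) \, dx$ is the cumulative distribution for $X$.
By using the density function we can evaluate the probability of other events. E.g.
$$P(a \leq X \leq b) = \int_{a}^{b} p(x) \, dx$$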
Uniform distribution
The simplest PDF is the uniform distribution.
Definition: A variable $X$ has a uniform distribution over $[a, b]$, denoted $X \sim \mathrm{Unif}[a, b]$, if it has the PDF
$$p(x) = \begin{cases} \frac{1}{b - a} & a \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$$
Thus the probability of any subinterval of $[a, b]$ is proportional to its size relative to the size of $[a, b]$.
If $b - a < 1$, the density can be greater than 1. We only have to satisfy the constraint that the total area under the PDF is 1.
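A short sketch can make the remark about densities greater than 1 concrete. The interval [0, 0.5] and the function names below are assumptions chosen for illustration.

```python
def uniform_pdf(x, a, b):
    """Density of Unif[a, b] at x: 1 / (b - a) inside the interval, 0 outside."""
    if not a < b:
        raise ValueError("need a < b")
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_interval_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Unif[a, b]: length of the overlap divided by b - a."""
    overlap = max(0.0, min(d, b) - max(c, a))
    return overlap / (b - a)

# On [0, 0.5] the density is 2, which exceeds 1 ...
density = uniform_pdf(0.3, 0.0, 0.5)
# ... but the total probability is still 1, and half the interval carries 0.5.
total = uniform_interval_prob(0.0, 0.5, 0.0, 0.5)
half = uniform_interval_prob(0.0, 0.25, 0.0, 0.5)
print(density, total, half)
```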
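Gaussian distribution
A random variable $X$ has a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, denoted $X \sim N(\mu; \sigma^2)$, if it has the PDF
$$p(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left( -\frac{(x - \mu)^2}{2 \sigma^2} \right)$$
A standard Gaussian has mean 0 and variance 1.
[Fig. 2.2]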
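To check that this formula is a valid density, one can integrate it numerically. The sketch below uses a plain midpoint rule; the integration range, the number of steps, and the function names are assumptions made for this illustration.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu; sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

# Standard Gaussian: almost all mass lies within [-10, 10].
total_mass = integrate(gaussian_pdf, -10.0, 10.0)
# P(-1 <= X <= 1), the probability of falling within one standard deviation.
within_one_sigma = integrate(gaussian_pdf, -1.0, 1.0)
print(round(total_mass, 6), round(within_one_sigma, 4))
```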
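Joint density functions
Let $P$ be a joint distribution over continuous random variables $X_1, \dots, X_n$. A function $p(x_1, \dots, x_n)$ is a joint density function of $X_1, \dots, X_n$ if
- $p(x_1, \dots, x_n) \geq 0$ for all values $x_1, \dots, x_n$ of $X_1, \dots, X_n$
- $p$ is an integrable function
- for any choice of $a_1, \dots, a_n$ and $b_1, \dots, b_n$
$$P(a_1 \leq X_1 \leq b_1, \dots, a_n \leq X_n \leq b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} p(x_1, \dots, x_n) \, dx_1 \cdots dx_n$$
From the joint density we can derive the marginal density of any random variable by integrating out the other variables. E.g. if $p(x, y)$ is the joint density of $X$ and $Y$, then
$$p(x) = \int_{-\infty}^{\infty} p(x, y) \, dy$$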
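Conditional density functions
We now want to be able to describe conditional distributions of continuous variables. Applying the previous definition is problematic because the probability of an isolated point $P(X = x)$ is zero. Thus we define
$$P(Y \mid X = x) = \lim_{\epsilon \to 0} P(Y \mid x - \epsilon \leq X \leq x + \epsilon)$$
If there exists a continuous joint density function $p(x, y)$, then we can derive the form of this term. Let us consider some event on $Y$, say $a \leq Y \leq b$. Then
$$P(a \leq Y \leq b \mid x - \epsilon \leq X \leq x + \epsilon) = \frac{\int_{a}^{b} \int_{x - \epsilon}^{x + \epsilon} p(x', y) \, dx' \, dy}{\int_{x - \epsilon}^{x + \epsilon} p(x') \, dx'}$$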
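Conditional density functions
When $\epsilon$ is sufficiently small, we can assume that $p(x')$ is constant on the interval $[x - \epsilon, x + \epsilon]$ and approximate
$$\int_{x - \epsilon}^{x + \epsilon} p(x') \, dx' \approx 2 \epsilon \, p(x)$$
Using a similar approximation for $p(x', y)$, we get
$$P(a \leq Y \leq b \mid x - \epsilon \leq X \leq x + \epsilon) \approx \frac{\int_{a}^{b} 2 \epsilon \, p(x, y) \, dy}{2 \epsilon \, p(x)} = \int_{a}^{b} \frac{p(x, y)}{p(x)} \, dy$$
We conclude that $\frac{p(x, y)}{p(x)}$ is the density of $P(Y \mid X = x)$.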
Conditional density functions
Let $p(x, y)$ be the joint density of $X$ and $Y$. The conditional density function of $Y$ given $X$ is defined as
$$p(y \mid x) = \frac{p(x, y)}{p(x)}$$
When $p(x) = 0$, the conditional density is undefined.
The properties of joint distributions and conditional distributions carry over to joint and conditional density functions. In particular, we have the chain rule
$$p(x, y) = p(x) \, p(y \mid x)$$
and Bayes' rule
$$p(x \mid y) = \frac{p(x) \, p(y \mid x)}{p(y)}$$
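Conditional density functions
Definition: Let $\mathbf{X}$, $\mathbf{Y}$ and $\mathbf{Z}$ be sets of continuous random variables with joint density $p(\mathbf{X}, \mathbf{Y}, \mathbf{Z})$. We say that $\mathbf{X}$ is conditionally independent of $\mathbf{Y}$ given $\mathbf{Z}$ if
$$p(x \mid z) = p(x \mid y, z)$$
for all $x, y, z$ such that $p(z) > 0$.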
Expectation
Let $X$ be a discrete random variable that takes numerical values. Then, the expectation of $X$ under the distribution $P$ is
$$E_P[X] = \sum_{x} x \, P(x)$$
If $X$ is a continuous variable, then we use the density function
$$E_P[X] = \int x \, p(x) \, dx$$
E.g. if we consider $X$ to be the outcome of rolling a fair die with probability 1/6 for each outcome, then
E[X] = 1 · 1/6 + 2 · 1/6 + … + 6 · 1/6 = 3.5
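The die calculation follows directly from the discrete formula. The sketch below represents a distribution as a dictionary from values to probabilities; this representation and the function name are assumptions for illustration.

```python
from fractions import Fraction

def expectation(dist):
    """E[X] = sum over x of x * P(x), for a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

# Fair die: each face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
expected_roll = expectation(fair_die)
print(expected_roll, float(expected_roll))
```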
Properties of the expectation of a random variable
E[a X + b] = a E[X] + b
Let X and Y be two random variables. Then
E[X + Y] = E[X] + E[Y]
Here, it does not matter whether X and Y are independent or not.
What can be said about the expectation of a product of two random variables? In the general case, very little.
Consider 2 variables X and Y that each take on the values +1 and -1 with probability 0.5. If X and Y are independent, then E[X Y] = 0. If they always take the same value (they are perfectly correlated), then E[X Y] = 1.
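The two cases of this example can be enumerated explicitly. In the sketch below, each joint distribution is a dictionary from value pairs to probabilities; the variable and function names are illustrative assumptions.

```python
def expectation(joint, f):
    """E[f(X, Y)] for a joint distribution given as {(x, y): probability}."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

values = (+1, -1)

# Case 1: X and Y independent, so each of the four pairs has probability 0.25.
independent = {(x, y): 0.25 for x in values for y in values}
# Case 2: Y always equals X, so only the pairs (+1, +1) and (-1, -1) occur.
identical = {(x, x): 0.5 for x in values}

e_sum_independent = expectation(independent, lambda x, y: x + y)
e_sum_identical = expectation(identical, lambda x, y: x + y)
e_prod_independent = expectation(independent, lambda x, y: x * y)
e_prod_identical = expectation(identical, lambda x, y: x * y)

print(e_sum_independent, e_sum_identical)    # the sums agree in both cases
print(e_prod_independent, e_prod_identical)  # the products differ
```

E[X + Y] is the same in both cases because both marginals are unchanged, while E[X Y] depends on how X and Y are coupled.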
Properties of the expectation of a random variable
If X and Y are independent, then
$$E[X \cdot Y] = E[X] \cdot E[Y]$$
The conditional expectation of X given y is
$$E_P[X \mid y] = \sum_{x} x \, P(x \mid y)$$
Variance
The expectation of X tells us the mean value of X. However, it does not indicate how far X deviates from this value. A measure of this deviation is the variance of X:
$$Var_P[X] = E_P\left[ (X - E_P[X])^2 \right]$$
The variance is the expectation of the squared difference between X and its expected value. An alternative formulation of the variance is
$$Var[X] = E[X^2] - (E[X])^2$$
If X and Y are independent, then
$$Var[X + Y] = Var[X] + Var[Y]$$
The variance scales quadratically with X: $Var[a X + b] = a^2 \, Var[X]$. For this reason, we are often interested in the square root of the variance, which is called the standard deviation of the random variable. We define
$$\sigma_X = \sqrt{Var[X]}$$
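Both formulations of the variance, and the additivity for independent variables, can be verified on fair dice. This is a sketch under the assumption of two independent fair dice; the helper name `expect` is made up for the example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # assumption: fair die

def expect(values_and_probs):
    """Sum of value * probability over an iterable of (value, probability) pairs."""
    return sum(v * q for v, q in values_and_probs)

mean = expect((x, p) for x in faces)

# Definition: Var[X] = E[(X - E[X])^2]
var_definition = expect(((x - mean) ** 2, p) for x in faces)
# Alternative formulation: Var[X] = E[X^2] - (E[X])^2
var_shortcut = expect((x * x, p) for x in faces) - mean ** 2

# Sum of two independent dice: every pair (x, y) has probability 1/36.
sums = [(x + y, p * p) for x, y in product(faces, faces)]
mean_sum = expect(sums)
var_sum = expect(((s - mean_sum) ** 2, q) for s, q in sums)

print(var_definition, var_shortcut, var_sum)
```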
Variance
Let X be a random variable with Gaussian distribution $N(\mu; \sigma^2)$. Then $E[X] = \mu$ and $Var[X] = \sigma^2$.
Thus, the parameters of the Gaussian distribution specify the expectation and the variance of the distribution.
The form of the Gaussian distribution implies that the density of values of X drops exponentially fast in the distance $(x - \mu) / \sigma$.
Not all distributions show such a rapid decline in the probability of outcomes that are distant from the expectation. However, even for arbitrary distributions, one can show that there is a decline.
The Chebyshev inequality states
$$P(|X - E_P[X]| \geq t) \leq \frac{Var_P[X]}{t^2}$$
or, in terms of $\sigma_X$,
$$P(|X - E_P[X]| \geq k \, \sigma_X) \leq \frac{1}{k^2}$$
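The Chebyshev bound can be compared with exact tail probabilities for a simple distribution. The sketch below uses a fair die as an assumed example; the function names and the chosen thresholds are illustrative.

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)  # assumption: fair die

mean = sum(x * p for x in outcomes)
var = sum((x - mean) ** 2 * p for x in outcomes)

def tail_prob(t):
    """Exact P(|X - E[X]| >= t) for the die."""
    return sum(p for x in outcomes if abs(x - mean) >= t)

def chebyshev_bound(t):
    """Upper bound Var[X] / t^2 from the Chebyshev inequality."""
    return var / Fraction(t) ** 2

# Map each threshold t to (exact tail probability, Chebyshev bound).
checks = {t: (tail_prob(t), chebyshev_bound(t)) for t in (1, 2, 3)}
for t, (exact, bound) in checks.items():
    print(t, exact, bound, exact <= bound)
```

The bound is loose here (for t = 1 it even exceeds 1), which is expected: it has to hold for every distribution with the same variance.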