Finite Model Theory, Lecture 18: Extended 0/1 Laws, or “Getting Real”
Outline
• A better probabilistic model
• Probabilities of conjunctive queries
• Probabilities for FO
• Based on work done with N. Dalvi and G. Miklau, and on papers by Lynch, Shelah and Spencer
Anomalies of 0/1 Laws
Database schema: Employee(name, city, occupation). We are not given the instance.
• Any person belongs to Employee with $\mu = 1/2$!
• The expected size $E[\text{Employee}] = n^3/2 \to \infty$!!
• In practice we need conditional probabilities $\mu(\varphi \mid \psi)$, but they often don't exist [why?]
A Better Model
• Postulate that for each $R \in \sigma$, $E[R] = c_R$ (a constant)
• This leads to: for each tuple $t$, $\Pr[t \in R] = c_R / n^a$, where $a = \text{arity}(R)$
A Better Model
No more anomalies:
• For a given person, the probability of belonging to Employee $\to 0$
• The expected size is $E[R] = c_R$
• Asymptotic conditional probabilities always exist for conjunctive queries
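To make the contrast between the two models concrete, here is a minimal sketch for the Employee relation (arity 3). The constant c = 5 and the helper names are illustrative assumptions, not part of the lecture; the bound on a given person is a plain union bound over the $n^2$ tuples that mention that name.

```python
from fractions import Fraction

def tuple_probability(c, n, arity):
    """Pr[t in R] = c / n**arity under the constant-expected-size model."""
    return Fraction(c) / Fraction(n) ** arity

def expected_size(p, n, arity):
    """Expected number of tuples when each of the n**arity tuples has probability p."""
    return p * n ** arity

def person_bound(p, n, arity):
    """Union bound on Pr[a fixed name occurs in R]: n**(arity-1) tuples carry that name."""
    return min(Fraction(1), p * n ** (arity - 1))

for n in (10, 100, 1000):
    uniform_size = expected_size(Fraction(1, 2), n, 3)   # old model: mu = 1/2 per tuple
    p = tuple_probability(5, n, 3)                       # new model, hypothetical c = 5
    print(n, uniform_size, expected_size(p, n, 3), person_bound(p, n, 3))
```

Under these assumptions the uniform model's expected size grows like $n^3/2$, while the new model keeps it fixed at c and drives the per-person bound to 0.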
Conjunctive Queries
• Have the form $\exists x_1 \ldots \exists x_k.(C_1 \wedge \ldots \wedge C_m)$
• where each $C_i$ is $R(\ldots)$ or $x_i = x_j$ or $x_i \neq x_j$
Example: Employee(x, Seattle, -), Employee(x, y, Clerk), Employee(-, y, Lawyer)
Conjunctive Queries
Theorem: For every $Q$ there are numbers $E$, $C$ s.t. $\Pr[Q] = C/n^E + O(1/n^{E+1})$
Corollary: $\Pr[Q_1 \mid Q_2]$ always has a limit
• Will show next how to compute $C$, $E$
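One way to see why the corollary follows from the theorem is sketched below. The names $C_{12}, E_{12}$ (the constants the theorem gives for the conjunctive query $Q_1 \wedge Q_2$) and $C_2, E_2$ (those for $Q_2$) are notation introduced here, and the sketch assumes $C_{12}, C_2 > 0$.

```latex
\Pr[Q_1 \mid Q_2]
  = \frac{\Pr[Q_1 \wedge Q_2]}{\Pr[Q_2]}
  = \frac{C_{12}/n^{E_{12}} + O(1/n^{E_{12}+1})}{C_2/n^{E_2} + O(1/n^{E_2+1})}
  = \frac{C_{12}}{C_2}\, n^{E_2 - E_{12}} \bigl(1 + O(1/n)\bigr)
```

Since $Q_1 \wedge Q_2$ implies $Q_2$, we have $\Pr[Q_1 \wedge Q_2] \le \Pr[Q_2]$ and hence $E_{12} \ge E_2$. The limit is therefore $C_{12}/C_2$ when $E_{12} = E_2$ and $0$ otherwise.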
Subgraph Properties
• Consider $R(x,y)$
• For every edge, $\Pr(R(u,v)) = c/n^2$
• Given $Q$, let $H = Q^{\neq}$ be obtained by adding all predicates of the form $x_i \neq x_j$
• $H$ checks for the presence of a subgraph
Subgraph Properties
Example 1:
• $Q = R(x,y), R(y,z), R(z,x)$
• $H = Q^{\neq} = R(x,y), R(y,z), R(z,x), x \neq y, y \neq z, z \neq x$
• H = [figure: directed triangle on x, y, z]
Subgraph Properties
$\Pr(H) = \Pr\bigl(\bigvee_{u,v,w} H(u,v,w)\bigr) \le \sum_{u,v,w} \Pr(H(u,v,w)) = n(n-1)(n-2) \cdot \frac{1}{3} \cdot \frac{c^3}{n^6} = \frac{1}{3} \cdot \frac{c^3}{n^3} + O(1/n^4)$
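The arithmetic in this union bound can be checked exactly with rational numbers. The sketch below is only a sanity check under an assumed constant c = 2 (a hypothetical value); the function names are not from the lecture.

```python
from fractions import Fraction

def triangle_union_bound(n, c):
    """n(n-1)(n-2)/3 placements of a directed triangle, each with probability (c/n^2)^3."""
    placements = Fraction(n * (n - 1) * (n - 2), 3)
    return placements * (Fraction(c) / n**2) ** 3

def leading_term(n, c):
    """The claimed leading term (1/3) * c^3 / n^3."""
    return Fraction(c) ** 3 / (3 * n**3)

for n in (10, 100, 1000):
    exact, lead = triangle_union_bound(n, 2), leading_term(n, 2)
    # If the error really is O(1/n^4), the gap rescaled by n^4 stays bounded.
    print(n, float(exact), float(lead), float((lead - exact) * n**4))
```

The rescaled gap in the last column stays bounded as n grows, which is what the $O(1/n^4)$ term asserts.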
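Subgraph Properties
Example 2:
• $Q = R(x,y), R(y,a), R(b,x)$, where a and b are constants
• $H = Q^{\neq} = R(x,y), R(y,a), R(b,x), x \neq y, x \neq a, x \neq b, y \neq a, y \neq b$
• H = [figure: path b → x → y → a]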
Subgraph Properties
$\Pr(H) = \Pr\bigl(\bigvee_{u,v} H(u,v)\bigr) \le \sum_{u,v} \Pr(H(u,v)) = n(n-1) \cdot \frac{1}{1} \cdot \frac{c^3}{n^6} = \frac{c^3}{n^4} + O(1/n^5)$
Subgraph Properties
Let $Q = G_1, G_2, \ldots, G_m$
Lemma: $\Pr(Q) \le \frac{C}{H} \cdot \frac{1}{n^E}$, where
• $V$ = number of variables in $Q$
• $A = \text{arity}(Q) = \text{arity}(G_1) + \ldots + \text{arity}(G_m)$
• $E = A - V$ = “the exponent of Q”
• $H$ = number of automorphisms $Q \to Q$
• $C = c_1 \cdot c_2 \cdot \ldots \cdot c_m$ = “the coefficient of Q”
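The quantities in the lemma can be computed mechanically for small queries. The sketch below assumes a hypothetical representation (a query is a list of `(relation, args)` pairs, with constants listed separately) and counts automorphisms by brute force over variable permutations, so it is only meant for a handful of variables.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def variables_of(atoms, constants=frozenset()):
    """Distinct argument names that are not declared as constants."""
    return sorted({v for _, args in atoms for v in args if v not in constants})

def exponent(atoms, constants=frozenset()):
    """E = A - V: total arity minus the number of variables."""
    total_arity = sum(len(args) for _, args in atoms)
    return total_arity - len(variables_of(atoms, constants))

def automorphisms(atoms, constants=frozenset()):
    """Count variable permutations that map the set of atoms onto itself."""
    variables = variables_of(atoms, constants)
    atom_set = set(atoms)
    count = 0
    for image in permutations(variables):
        rename = dict(zip(variables, image))
        mapped = {(rel, tuple(rename.get(v, v) for v in args)) for rel, args in atoms}
        if mapped == atom_set:
            count += 1
    return count

def lemma_bound(atoms, c, constants=frozenset()):
    """Return (C / H, E); c maps each relation name to its constant c_R."""
    coefficient = Fraction(prod(c[rel] for rel, _ in atoms), automorphisms(atoms, constants))
    return coefficient, exponent(atoms, constants)

triangle = [("R", ("x", "y")), ("R", ("y", "z")), ("R", ("z", "x"))]
path = [("R", ("x", "y")), ("R", ("y", "a")), ("R", ("b", "x"))]
print(lemma_bound(triangle, {"R": 2}))
print(lemma_bound(path, {"R": 2}, constants={"a", "b"}))
```

With the hypothetical value c_R = 2, the triangle has three automorphisms (its rotations) and exponent 3, while the path with constants a, b has only the identity and exponent 4, in line with the factors 1/3 and 1/1 used in the two examples.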
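Subgraph Properties
Lower bound, for the triangle:
$\Pr(H) = \Pr\bigl(\bigvee_{u,v,w} H(u,v,w)\bigr) \ge \sum \Pr(H(u,v,w)) - \sum \Pr\bigl(H(u,v,w) \wedge H(u',v',w')\bigr) = \frac{1}{3} \cdot \frac{c^3}{n^3} + O(1/n^4) - \sum \Pr(HH)$
where $HH$ abbreviates $H(u,v,w) \wedge H(u',v',w')$.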
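Subgraph Properties
• What is $\Pr(HH)$? Each term belongs to one of the following cases:
• Two disjoint triangles (6 edges, 6 nodes): $E = 12 - 6 = 6$
• Two triangles sharing a node (6 edges, 5 nodes): $E = 12 - 5 = 7$
• Two triangles sharing an edge (5 edges, 4 nodes): $E = 10 - 4 = 6$
• A few others… But all have $E > 3$!
• Hence $\Pr(HH)$ is negligible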
Subgraph Properties
• Hence, for the triangle: $\Pr(H) \approx \frac{1}{3} \cdot \frac{c^3}{n^3}$
• This generalizes easily to any subgraph property
Subgraphs with E = 0
• $H = R(x,y)$: $E = 2 - 2 = 0$; what is $\Pr(H)$?
• $H = R(x,y)R(u,v)$: $E = 4 - 4 = 0$; what is $\Pr(H)$?
• $H = R(x,y)R(y,z)R(z,x), R(u,v)$: $E(H) = E(\text{triangle})$
• The exponent in the theorem is always correct, but the coefficient needs to be adjusted
Conjunctive Queries
• Consider the query $R(x,y), R(y,z), R(z,x)$
• Any of the variables x, y, z may be equal, which results in the following subgraphs:
• $H_1 = R(x,y)R(y,z)R(z,x)$, $E = 6 - 3 = 3$
• $H_2 = R(x,x)R(x,z)R(z,x)$, $E = 6 - 2 = 4$
• $H_3 = R(x,x)R(x,x)R(x,x) = R(x,x)$, $E = 2 - 1 = 1$
• Hence $\Pr(Q) \approx \Pr(H_3) = c_R/n$, since $H_3$ has the smallest exponent
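The case analysis over which variables coincide can be enumerated mechanically. The sketch below assumes a query given as a list of `(relation, args)` pairs over variables only (constants are not handled), walks through every partition of the variables, merges each block into one variable, and reports the exponent $E = A - V$ of the merged query. The function names are illustrative.

```python
def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + partition

def merged_queries(atoms):
    """Yield (merged atom set, exponent) for every way of equating variables."""
    variables = sorted({v for _, args in atoms for v in args})
    for partition in set_partitions(variables):
        rep = {v: block[0] for block in partition for v in block}
        # Duplicate atoms collapse, e.g. R(x,x)R(x,x)R(x,x) becomes R(x,x).
        merged = frozenset((rel, tuple(rep[v] for v in args)) for rel, args in atoms)
        arity = sum(len(args) for _, args in merged)
        yield merged, arity - len(partition)

triangle = [("R", ("x", "y")), ("R", ("y", "z")), ("R", ("z", "x"))]
for merged, e in sorted(merged_queries(triangle), key=lambda pair: pair[1]):
    print(e, sorted(merged))
```

The merged query with the smallest exponent contributes the dominant term of $\Pr(Q)$.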
Conjunctive Queries
• Now consider $Q = R(a,x), R(y,b)$
• Two graphs:
• $H_1 = R(a,x)R(y,b)$, $E = 4 - 2 = 2$
• $H_2 = R(a,b)$, $E = 2$
• One can prove: $\Pr(Q) = \Pr(H_1) + \Pr(H_2) = (c + c^2)/n^2$
More General Distributions [Shelah & Spencer, Lynch]
• $\Pr(\text{tuple}) = \beta / n^{\alpha}$
• Example: H = triangle
• $\Pr(H) \approx n^3 \cdot \frac{1}{3} \cdot \beta^3 / n^{3\alpha} = C / n^E$
• Simply redefine $E(H)$ to use $\alpha$
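More General Distributions
• But there is a problem here; let $\alpha = 3/2$:
• E(triangle) = $3\alpha - 3 = 3/2$
• E(triangle plus one extra edge on two new nodes) = $3\alpha - 3 + \alpha - 2 = 1$
• Hence the more complex graph is more likely!
• Solution: adjust $E(H)$ to be the max of $E(H_0)$ for $H_0 \subseteq H$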
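The adjustment can be tried out on the example above. This sketch assumes binary relations only, takes $E(H) = \alpha \cdot |\text{edges}| - |\text{nodes}|$ as the redefined exponent, and lets $H_0$ range over non-empty edge subsets of H (isolated nodes would only lower the exponent, so they are left out). The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def exponent(edges, alpha):
    """E(H) = alpha * |edges| - |nodes| for a graph given as a list of edges."""
    nodes = {v for edge in edges for v in edge}
    return alpha * len(edges) - len(nodes)

def adjusted_exponent(edges, alpha):
    """Maximum of E(H0) over all non-empty edge subsets H0 of H (brute force)."""
    return max(
        exponent(subset, alpha)
        for k in range(1, len(edges) + 1)
        for subset in combinations(edges, k)
    )

alpha = Fraction(3, 2)
triangle = [("x", "y"), ("y", "z"), ("z", "x")]
triangle_plus_edge = triangle + [("u", "v")]
print(exponent(triangle, alpha), exponent(triangle_plus_edge, alpha))
print(adjusted_exponent(triangle_plus_edge, alpha))
```

With the adjustment, the larger graph can no longer have a smaller exponent than the triangle it contains.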
Threshold Functions for Subgraphs [Erdős and Rényi]
Edge probability $\Pr(t) = p(n)$ = some function
Main theorem of random graphs: for any monotone property $C$ there exists a threshold function $t(n)$ s.t.
• If $p(n) \ll t(n)$ then $\lim_n \Pr(C) = 0$
• If $p(n) \gg t(n)$ then $\lim_n \Pr(C) = 1$
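A first-moment heuristic gives a feel for the threshold behaviour in the triangle case. The sketch below assumes $p(n) = n^{-\alpha}$ and computes the expected number of directed triangles; an expectation tending to 0 forces the probability to 0, while a growing expectation is only suggestive of probability 1 (a second-moment argument is needed for that). The function name and the sample values of alpha are assumptions for illustration.

```python
def expected_triangles(n, alpha):
    """Expected number of directed triangles when every edge has probability n**(-alpha)."""
    p = n ** (-alpha)
    return n * (n - 1) * (n - 2) / 3 * p ** 3

for alpha in (0.8, 1.0, 1.2):
    # alpha < 1 puts p(n) above 1/n, alpha > 1 puts it below.
    print(alpha, [round(expected_triangles(n, alpha), 4) for n in (10**2, 10**4, 10**6)])
```

The expectation grows for alpha below 1, shrinks for alpha above 1, and settles near 1/3 at alpha = 1, consistent with a threshold around $1/n$ for the triangle.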
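Threshold Functions [Erdős and Rényi]
The threshold function for the subgraph property H is the following:
• Let $\alpha = \min_{H_0 \subseteq H} |\text{nodes}(H_0)| / |\text{edges}(H_0)|$, so that the densest subgraph of H decides
• Then $t(n) = 1/n^{\alpha}$
• Can derive it from the exponent [show in class]: with $p(n) = 1/n^{\alpha}$, each $H_0$ has exponent $\alpha \cdot |\text{edges}(H_0)| - |\text{nodes}(H_0)|$, and H can appear only if none of these exponents is positive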
Extended 0/1 Laws
• Shelah and Spencer, and Lynch, consider the following general case: $\Pr(t) = \beta / n^{\alpha}$, for $\alpha > 0$
• Lynch: a logic admits an extended 0/1 law if for each $\varphi$ one of the following holds:
• $\Pr(\varphi) \approx C/n^E$, or
• $\Pr(\varphi) < 1/n^E$ for every $E > 0$