
A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

PODS 2012. 2012 ACM SIGMOD/PODS Conference Scottsdale, Arizona, USA. A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies. Benny Kimelfeld IBM Research – Almaden. Deletion Propagation.


Presentation Transcript


  1. PODS 2012: 2012 ACM SIGMOD/PODS Conference, Scottsdale, Arizona, USA • A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies • Benny Kimelfeld, IBM Research – Almaden

  2. Deletion Propagation • Translate a tuple deletion on the view back to the source relations … properly • Classic database problem • Specializing the more general view-update problem • [Dayal & Bernstein 1982; Cosmadakis & Papadimitriou 1984; Keller 1986; Cui & Widom 2001; Buneman & Khanna & Tan 2002; Cong & Fan & Geerts 2006; …] • Renewed motivation: debug/causality for false positives [K, Vondrak, Williams, 2011] • Various definitions of “properly” were studied • Minimize the view side effect • # view tuples lost except the intentional one • Minimize the source side effect • # source tuples to delete • = maximal “responsibility” for an answer [Meliou et al., 2010] This Work!

  3. Example: File Access [Cui & Widom 2001; Buneman et al. 2002] • [Figure: tables Access = UserGroup ⋈ GroupFile] • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • Delete source rows s.t. Emma won’t access a.txt, but maintain maximum access permissions!

  5. Example: File Access [Cui & Widom 2001; Buneman et al. 2002] • [Figure: tables Access = UserGroup ⋈ GroupFile, with a solution marked as side-effect free (& minimal side effect)] • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • Delete source rows s.t. Emma won’t access a.txt, but maintain maximum access permissions!

  6. Formal Definitions • Schema S: relation symbols R1,…,Rm + functional dependencies (FDs) of the form Ri: attribute-set → attribute • Conjunctive Query (CQ) Q, e.g. Q(y1,y2,y3) :– R1(x1,y1), R2(x1,'ibm'), R3(x2,y1,y2,x3), R4(x4,y3), where y1,y2,y3 are head variables, x1,…,x4 are existential variables, and each Ri(…) is an atom • No self joins! • Input: DB D over S; answer a ∈ Q(D) to delete • Solution: E ⊆ D s.t. a ∉ Q(E) • Side-effect free: Q(E) = Q(D) – {a} • Optimal: |Q(E)| is maximal
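
The definitions on this slide can be made concrete with a minimal Python sketch for the Access query. The instance below is hypothetical (the tables shown on the slides are not part of the transcript), and the helper names `access`, `is_solution`, and `is_side_effect_free` are illustrative only.

```python
# Hypothetical instance: the tables from the slides are not in the transcript.
USER_GROUP = {("Emma", "staff"), ("Emma", "admin"), ("Tom", "staff")}
GROUP_FILE = {("staff", "a.txt"), ("admin", "a.txt"), ("admin", "b.txt")}


def access(user_group, group_file):
    """Evaluate Access(u,f) :- UserGroup(u,g), GroupFile(g,f)."""
    return {(u, f) for (u, g) in user_group for (h, f) in group_file if g == h}


def is_solution(e, a):
    """E is a solution if the answer a is no longer in Q(E)."""
    return a not in access(*e)


def lost_answers(d, e, a):
    """View side effect: answers other than a that are lost when moving from D to E."""
    return (access(*d) - {a}) - access(*e)


def is_side_effect_free(d, e, a):
    """Side-effect free: Q(E) = Q(D) - {a}."""
    return access(*e) == access(*d) - {a}


D = (USER_GROUP, GROUP_FILE)
A = ("Emma", "a.txt")
# Candidate E (a sub-instance of D): drop one row from each relation.
E = (USER_GROUP - {("Emma", "staff")}, GROUP_FILE - {("admin", "a.txt")})
```

On this toy instance, `E` removes `A` from the view while keeping every other answer, so it is side-effect free and therefore also optimal.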

  7. Complexity Questions • What is the complexity of: • Deciding if a side-effect-free solution exists? • Finding an optimal solution? • Or one w/ approximately minimal side effect? • Or one w/ approximately maximal # surviving answers? • The two approximation goals are not the same [K, Vondrák, Williams, 2011]

  8. Unirelation Algorithm (1Rel): Example [Buneman et al., 2002] • [Figure: tables Access = UserGroup ⋈ GroupFile] • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • Delete a = (Emma, a.txt)

  9. Unirelation Algorithm (1Rel): Example [Buneman et al., 2002] • [Figure: tables Access = UserGroup ⋈ GroupFile] • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • Delete a = (Emma, a.txt) • This candidate is better than the previous one ⇒ selected solution • Recall: there is an even better solution (side-effect free)

  10. 1Rel: General Case • Input: database D and an undesired answer a ∈ Q(D); Q has k atoms • For i = 1,…,k, solution i: delete from Ri each tuple consistent w/ a • Select the best among solution 1, …, solution k
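
The general case of 1Rel can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes that every term in the query is a variable (no constants), that there are no self joins, and it uses a hypothetical instance. The names `evaluate` and `one_rel` are made up for this sketch.

```python
def evaluate(head, atoms, db):
    """Naive CQ evaluation: extend partial variable assignments atom by atom."""
    assignments = [{}]
    for rel, variables in atoms:
        extended = []
        for asg in assignments:
            for row in db[rel]:
                new = dict(asg)
                # setdefault binds a fresh variable or returns the existing binding.
                if all(new.setdefault(v, c) == c for v, c in zip(variables, row)):
                    extended.append(new)
        assignments = extended
    return {tuple(asg[v] for v in head) for asg in assignments}


def one_rel(head, atoms, db, a):
    """For each atom, delete from its relation every tuple consistent with a;
    return the candidate that keeps the most answers."""
    target = dict(zip(head, a))
    best = None
    for rel, variables in atoms:
        def consistent(row):
            # A row is consistent with a if it agrees with a on all head variables.
            return all(target.get(v, c) == c for v, c in zip(variables, row))

        candidate = dict(db)
        candidate[rel] = {row for row in db[rel] if not consistent(row)}
        answers = evaluate(head, atoms, candidate)
        if best is None or len(answers) > len(best[2]):
            best = (rel, candidate, answers)
    return best


# Hypothetical instance for Access(u,f) :- UserGroup(u,g), GroupFile(g,f).
HEAD = ("u", "f")
ATOMS = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
DB = {
    "UserGroup": {("Emma", "staff"), ("Emma", "admin"), ("Tom", "staff"), ("Ann", "staff")},
    "GroupFile": {("staff", "a.txt"), ("admin", "a.txt"), ("admin", "b.txt")},
}
chosen_rel, chosen_db, surviving = one_rel(HEAD, ATOMS, DB, ("Emma", "a.txt"))
```

On this instance, deleting Emma's rows from UserGroup keeps two answers, while deleting the a.txt rows from GroupFile keeps only one, so the UserGroup candidate is selected. Neither candidate is side-effect free, which matches the slides' remark that 1Rel may miss a better solution.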

  11. Head Domination [K, Vondrák, Williams, 2011] • head domination: ∀C ∈ CC(G∃[Q]) ∃j ∈ atoms(Q) s.t. headVars(C) ⊆ vars(j), where CC denotes the connected components • Q(y1,y2) :– R1(x1,y1), R2(x1,y2), R3(x1,y1,y2) • Q(y1,y2,y3) :– R1(x1,y1), R2(x1,y2), R3(y1,y2), R4(x2,y2,y3) • Q(y1,y2) :– R1(x,y1), R2(x,y2) (the shape of Access(u,f))
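
The head-domination test can be sketched in Python. The sketch assumes that G∃[Q] has the atoms of Q as nodes, with an edge between two atoms that share an existential variable, and that headVars(C) is the set of head variables occurring in the atoms of C; this reading is consistent with the later remark that, without existential variables, every connected component is a single node. Function names are illustrative.

```python
def components(atoms, existential):
    """Connected components of the graph whose nodes are atom indices and whose
    edges join atoms sharing at least one existential variable."""
    remaining = list(range(len(atoms)))
    comps = []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            i = stack.pop()
            for j in list(remaining):
                if set(atoms[i][1]) & set(atoms[j][1]) & existential:
                    remaining.remove(j)
                    comp.add(j)
                    stack.append(j)
        comps.append(comp)
    return comps


def has_head_domination(head, atoms):
    """True if, for every component C, some atom contains all of headVars(C)."""
    head = set(head)
    all_vars = set().union(*(set(vs) for _, vs in atoms))
    existential = all_vars - head
    for comp in components(atoms, existential):
        head_vars = set().union(*(set(atoms[i][1]) for i in comp)) & head
        if not any(head_vars <= set(vs) for _, vs in atoms):
            return False
    return True


# The three example queries from the slide.
Q1 = (("y1", "y2"),
      [("R1", ("x1", "y1")), ("R2", ("x1", "y2")), ("R3", ("x1", "y1", "y2"))])
Q2 = (("y1", "y2", "y3"),
      [("R1", ("x1", "y1")), ("R2", ("x1", "y2")), ("R3", ("y1", "y2")),
       ("R4", ("x2", "y2", "y3"))])
Q3 = (("y1", "y2"), [("R1", ("x", "y1")), ("R2", ("x", "y2"))])
```

Under this reading, `Q1` and `Q2` have head domination and `Q3` (the Access shape) does not.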

  12. Previous Dichotomy Theorem [KVW 2011] • Let Q be a CQ over a schema S (no self joins) • Q(y1,y2) :– R1(x1,y1), R2(x1,y2), R3(x1,y1,y2): PTime (1Rel) • Q(y1,y2,y3) :– R1(x1,y1), R2(x1,y2), R3(y1,y2), R4(x2,y2,y3): PTime (1Rel) • Q(y1,y2) :– R1(x,y1), R2(x,y2) (the shape of Access(u,f)): NP-hard

  13. Access Example Revisited • Delete (Emma, a.txt) • [Figure: tables Access = UserGroup ⋈ GroupFile] • No FDs: NP-hard • group ← file: PTime

  14. Access Example Revisited • Delete (Emma, a.txt) • [Figure: tables Access = UserGroup ⋈ GroupFile] • No FDs: NP-hard • user → group: PTime • group ← file: PTime

  15. Access Example Revisited • Delete (Emma, a.txt) • [Figure: tables Access = UserGroup ⋈ GroupFile] • No FDs: NP-hard • user ← group: PTime • user → group: PTime • group ← file: PTime

  16. Access Example Revisited • Delete (Emma, a.txt) • [Figure: tables Access = UserGroup ⋈ GroupFile] • No FDs: NP-hard • user ← group: PTime • group → file: PTime • user → group: PTime • group ← file: PTime • Every nontrivial set of FDs brings the problem to PTime

  17. Additional Examples • Q(y,y1,y2) :– R1(y1,x1), R(x1,y,x2), R2(y2,x2): NP-hard • Q(y,y1,y2) :– R1(x1,y1), R(x1,y,x2), R2(x2,y2): PTime • Q(y,y1,y2) :– R1(x1,y1), R(x1,y,x2), R2(x2,y2): NP-hard • [The FD annotations that distinguish these cases are not preserved in the transcript]

  18. Dichotomy with FDs • Let Q be a CQ over a schema S (no self joins) • Remove a tuple only if it is used for the undesired answer • Depending on the CQ and FDs, the problem is either straightforward or hard!

  19. FDs Among Variables • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • FD user → group gives u → g • FD group → file gives g → f • Together these imply u → f and {u,g} → f
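
Whether an FD among variables is implied can be checked with the standard attribute-closure computation. The sketch below assumes the FDs have already been translated from attributes to query variables, as on this slide; the names `closure` and `implies` are illustrative.

```python
def closure(variables, fds):
    """Smallest superset of `variables` closed under the FDs.
    Each FD is a pair (lhs_set, rhs_variable)."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def implies(fds, lhs, rhs):
    """True if the FDs imply lhs -> rhs."""
    return rhs in closure(lhs, fds)


# user -> group and group -> file, written over the variables of Access.
ACCESS_FDS = [({"u"}, "g"), ({"g"}, "f")]
```

For example, `implies(ACCESS_FDS, {"u"}, "f")` and `implies(ACCESS_FDS, {"u", "g"}, "f")` both hold, while `implies(ACCESS_FDS, {"f"}, "u")` does not.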

  20. The CQ Q+ • Tractability Condition: Q+ has functional head domination • Q+: add to Q’s head every x s.t. headVars → x • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • group ← file gives g ← {u,f} ⇒ Access+(u,g,f) :– UserGroup(u,g), GroupFile(g,f)
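
The construction of Q+ amounts to closing the head under the FDs. A minimal Python sketch follows, assuming the FDs are given over query variables; `extended_head` is an illustrative name, and the order in which new head variables are appended is an arbitrary choice of this sketch.

```python
def closure(variables, fds):
    """Smallest superset of `variables` closed under FDs given as (lhs_set, rhs_variable)."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def extended_head(head, fds):
    """Head of Q+: the original head followed by every variable x with headVars -> x."""
    added = closure(head, fds) - set(head)
    return tuple(head) + tuple(sorted(added))


# Access(u,f) with the FD file -> group, i.e. f -> g over variables.
ACCESS_PLUS_HEAD = extended_head(("u", "f"), [({"f"}, "g")])
```

Here `ACCESS_PLUS_HEAD` contains u, f, and g, which corresponds to Access+(u,g,f) on the slide up to the order of head variables. With the FD group → file instead, nothing is added and Q+ coincides with Q.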

  21. Functional Head Domination • Tractability Condition: Q+ has functional head domination • head domination: ∀C ∈ CC(G∃[Q]) ∃j ∈ atoms(Q) s.t. vars(j) ⊇ headVars(C) • functional head domination: ∀C ∈ CC(G∃[Q]) ∃j ∈ atoms(Q) s.t. vars(j) → headVars(C) • Access(u,f) :– UserGroup(u,g), GroupFile(g,f) • group → file ⇒ {u,g} → {u,f}
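
The tractability condition can be sketched end to end in Python: close the head under the FDs to obtain Q+, then test functional head domination on Q+. The sketch assumes FDs are given over query variables and that G∃ has atoms as nodes, joined when they share an existential variable; all function names are illustrative, and this is not the authors' implementation.

```python
def closure(variables, fds):
    """Smallest superset of `variables` closed under FDs given as (lhs_set, rhs_variable)."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def components(atoms, existential):
    """Components of the graph on atom indices; edges join atoms sharing an existential variable."""
    remaining = list(range(len(atoms)))
    comps = []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            i = stack.pop()
            for j in list(remaining):
                if set(atoms[i][1]) & set(atoms[j][1]) & existential:
                    remaining.remove(j)
                    comp.add(j)
                    stack.append(j)
        comps.append(comp)
    return comps


def is_tractable(head, atoms, fds):
    """True if Q+ has functional head domination."""
    plus_head = closure(head, fds)  # head variables of Q+
    all_vars = set().union(*(set(vs) for _, vs in atoms))
    existential = all_vars - plus_head
    for comp in components(atoms, existential):
        head_vars = set().union(*(set(atoms[i][1]) for i in comp)) & plus_head
        # Some atom j must satisfy vars(j) -> headVars(C).
        if not any(head_vars <= closure(vs, fds) for _, vs in atoms):
            return False
    return True


ACCESS_HEAD = ("u", "f")
ACCESS_ATOMS = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
```

For the Access query, `is_tractable` returns `False` with no FDs and `True` with any one of u → g, g → u, g → f, or f → g, in line with the "Access Example Revisited" slides.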

  22. Examples • Tractability Condition: Q+ has functional head domination • Q(y,y1,y2) :– R1(x1,y1), R(x1,y,x2), R2(x2,y2): NP-hard • Q(y,y1,y2) :– R1(x1,y1), R(x1,y,x2), R2(x2,y2) with {y,y1,y2} → x2 gives Q+(y,y1,y2,x2) :– R1(x1,y1), R(x1,y,x2), R2(x2,y2): PTime (1Rel*) • [The FD annotations that distinguish the two cases are not preserved in the transcript]

  23. Example: Key-Preserving Views • Tractability Condition: Q+ has functional head domination • Theorem [Cong, Fan, Geerts, 2006]: Q preserves keys* ⇒ deletion propagation in PTime • For CQs w/o self joins, this follows directly from our positive side: Q preserves keys ⇒ Q+ has no existential vars ⇒ G∃[Q+] has no edges ⇒ Q+ trivially has functional head domination (every connected component is a node, dominated by itself…) ⇒ 1Rel* returns an optimal solution • * Each relation has a key; none of the key attributes are projected out

  24. About the Proof • The positive side is fairly simple • … once the tractability condition is found • The negative side is intricate • Reduction from the special case of the Access CQ • Challenge: simulating Access(u,f) by an instance that satisfies all the FDs • Central concept: graph separation on the variable graph of the CQ • Q(y1,y2) :– R1(y1,x), R2(x,y2) → Q'(y1,y2) :– R1(y1,x1,x), R2(x,x2,y2), R3(x1,x2)

  25. Conclusions & Ongoing Work • Studied deletion propagation in the presence of functional dependencies • Established a dichotomy in complexity: • PTime by a straightforward algorithm vs. • Hardness (of approximation) • Generalizes previously established special cases: no FDs, key-preserving views • Ongoing work: deletion of multiple answers • Preview: trichotomy • Straightforward • Hard but approximable (within a constant factor) • Hard to approximate • Questions?
