
Diagnosis of Asynchronous Discrete Event Systems: Datalog to the Rescue!



Presentation Transcript


  1. Diagnosis of Asynchronous Discrete Event Systems: Datalog to the Rescue! Serge Abiteboul (INRIA & U. Paris 11) Zoë Abrams (INRIA & Stanford U.) Stefan Haar (INRIA) Tova Milo (Tel Aviv U.) June 15th, 2005

  2. History • Deductive databases were a hot topic in the late 80s • datalog • query optimization: magic sets and QSQ • Research in this area led to beautiful results, but had little industrial impact

  3. Current Context • Years later, with networks everywhere, recursive data management is becoming increasingly essential • Datalog is hot again! • Jim and Suciu [2001] • Loo, Hellerstein, Stoica, and Ramakrishnan [2005] • PODS Tutorial 1, Monica Lam et al. [2005] • This paper: use datalog for diagnosis of telecommunication systems

  4. Diagnosis of Telecommunication Systems • A telecom system consists of software and hardware pieces distributed over a network • When one piece fails, alarm signals are issued from throughout the network [Figure labels: ack messages; Alarms; task incomplete; messg. unprocessed]

  5. Diagnosis of Telecom Systems cont. • Supervisor: • Collects alarms • Alarms are asynchronous • Knows peer behavior pattern • Goal: determine what could have happened in the global system [Figure labels: ack messages; Alarms; task incomplete; messg. unprocessed]

  6. Deductive Database Formulation • Extensional data: a sequence of alarms received by the supervisor • Intensional data: the possible execution flows that could have created the alarm sequence Can the diagnosis problem be stated in terms of query evaluation in deductive databases? Yes – it can!

  7. Outline • Technical • Datalog and Query-Sub-Query (QSQ) • Adapt QSQ to a distributed setting: dQSQ • Application: Distributed Diagnosis of Telecommunication Systems • Petri Nets and Unfoldings • Datalog formulation of the diagnosis problem • Benefits of using dQSQ

  8. Deductive Database • Explicit information • Rules that enable inferences based on the stored data • Datalog program: stored facts parent(x,y), with rules anc(x,y) :- parent(x,y) and anc(x,y) :- anc(x,z), parent(z,y) • Equivalent logical reading: ∀x,y (anc(x,y) ← parent(x,y)) and ∀x,y,z (anc(x,y) ← anc(x,z), parent(z,y))
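As a rough illustration of what the two anc rules mean, the sketch below applies them repeatedly until no new tuple appears. The function name `ancestors` and the sample facts are assumptions made for this example; they are not part of the slides.

```python
def ancestors(parent):
    """Naive bottom-up evaluation of the two anc rules until a fixpoint."""
    # anc(x,y) :- parent(x,y)
    anc = set(parent)
    while True:
        # anc(x,y) :- anc(x,z), parent(z,y)
        derived = {(x, y) for (x, z) in anc for (z2, y) in parent if z == z2}
        if derived <= anc:
            # No new tuple was derived: the fixpoint has been reached.
            return anc
        anc |= derived


# Hypothetical stored facts, chosen only for illustration.
parent = {("Joyce", "Lois"), ("Lois", "Mark")}
anc = ancestors(parent)
```

With these facts, the recursive rule contributes the single extra tuple ("Joyce", "Mark").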

  9. Query Evaluation • Query: “Who has Joyce as an ancestor?”, written q(y) :- anc(“Joyce”,y) • Naive evaluation: materialize everything, then evaluate query • Goal: Compute query with minimal data materialization

  10. Query-Sub-Query (QSQ) • Known techniques for optimization of Datalog queries: magic sets and QSQ • QSQ rewrites the Datalog program according to the given query • Materializes tuples bottom-up • QSQ is based on two main notions: • Adorned relations • Supplementary relations

  11. Adorned Relations • A variable in a relation can be “bound” to a constant • For each relation, adorned versions based on the bindings of the variables are considered • Example: in anc(“Joyce”,y), the first argument is bound to a constant and the second is free

  12. Adorned Relations • Rewriting using adorned relations: anc^bf(x,y) :- parent(x,y); anc^bf(x,y) :- anc^bf(x,z), parent(z,y); q(y) :- anc^bf(“Joyce”,y) • In the adornment bf, b marks an argument bound to a constant and f marks a free argument • Different adornments of the same relation are treated as different relations during the QSQ computation
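The adornment of a query atom can be read off its arguments. The sketch below handles only that simple case (constants versus variables in a single atom); it does not model bindings passed along a rule body. The `Var` class and the `adornment` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A logical variable; any other argument value is treated as a constant."""
    name: str


def adornment(args):
    """Return one letter per argument: 'b' for a constant, 'f' for a variable."""
    return "".join("f" if isinstance(a, Var) else "b" for a in args)


# The query atom anc("Joyce", y) from the slide.
query_args = ("Joyce", Var("y"))
adorned_name = "anc_" + adornment(query_args)
```

Here `adorned_name` is the string "anc_bf", matching the relation name used in the QSQ rewriting.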

  13. Supplementary Relations • Supplementary relations accumulate the relevant bindings for each position in the rule
      Datalog:
        anc^bf(x,y) :- parent(x,y)
        anc^bf(x,y) :- anc^bf(x,z), parent(z,y)
        q(x) :- anc^bf(“Joyce”,x)
      QSQ rewriting:
        in_anc_bf(“Joyce”) :-
        sup_10(x) :- in_anc_bf(x)
        sup_11(x,y) :- sup_10(x), parent(x,y)
        anc_bf(x,y) :- sup_11(x,y)
        sup_20(x) :- in_anc_bf(x)
        sup_21(x,z) :- sup_20(x), anc_bf(x,z)
        sup_22(x,y) :- sup_21(x,z), parent(z,y)
        anc_bf(x,y) :- sup_22(x,y)
      Positions in the first rule: sup_10(x), sup_11(x,y)
      Positions in the second rule: sup_20(x), sup_21(x,z), sup_22(x,y)

  14. QSQ Example • Input: the stored parent(x,y) facts
      Datalog:
        anc^bf(x,y) :- parent(x,y)
        anc^bf(x,y) :- anc^bf(x,z), parent(z,y)
        q(y) :- anc^bf(“Joyce”,y)
      QSQ rewriting:
        in_anc_bf(“Joyce”) :-
        sup_10(x) :- in_anc_bf(x)
        sup_11(x,y) :- sup_10(x), parent(x,y)
        anc_bf(x,y) :- sup_11(x,y)
        sup_20(x) :- in_anc_bf(x)
        sup_21(x,z) :- sup_20(x), anc_bf(x,z)
        sup_22(x,y) :- sup_21(x,z), parent(z,y)
        anc_bf(x,y) :- sup_22(x,y)
      Materialized tuples:
        sup_10(x): Joyce
        sup_11(x,y): (Joyce, Lois), (Joyce, Ruth)
        sup_20(x): Joyce
        sup_21(x,z): (Joyce, Lois), (Joyce, Ruth), (Joyce, Mark), (Joyce, Andy)
        sup_22(x,y): (Joyce, Mark), (Joyce, Andy)
      Query result: Lois, Ruth, Mark, Andy
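The rewritten program on this slide can be run bottom-up as a small fixpoint loop. The sketch below is one way to do that, under stated assumptions: the slide does not show the parent facts, so the ones used here (including the unrelated pair Paul and Nina) are hypothetical and were chosen so that the materialized tuples match the slide. The function name `qsq_anc` is also an assumption.

```python
def qsq_anc(parent, start):
    """Evaluate the QSQ-rewritten anc program bottom-up for the query anc(start, y)."""
    in_anc_bf = {start}  # in_anc_bf("Joyce") :-
    anc_bf = set()
    while True:
        sup_10 = set(in_anc_bf)  # sup_10(x) :- in_anc_bf(x)
        # sup_11(x,y) :- sup_10(x), parent(x,y)
        sup_11 = {(x, y) for x in sup_10 for (p, y) in parent if p == x}
        sup_20 = set(in_anc_bf)  # sup_20(x) :- in_anc_bf(x)
        # sup_21(x,z) :- sup_20(x), anc_bf(x,z)
        sup_21 = {(x, z) for x in sup_20 for (a, z) in anc_bf if a == x}
        # sup_22(x,y) :- sup_21(x,z), parent(z,y)
        sup_22 = {(x, y) for (x, z) in sup_21 for (p, y) in parent if p == z}
        # anc_bf(x,y) :- sup_11(x,y)   and   anc_bf(x,y) :- sup_22(x,y)
        new_anc = anc_bf | sup_11 | sup_22
        if new_anc == anc_bf:
            return {
                "sup_11": sup_11,
                "sup_21": sup_21,
                "sup_22": sup_22,
                "anc_bf": anc_bf,
                # q(y) :- anc_bf(start, y)
                "q": {y for (x, y) in anc_bf if x == start},
            }
        anc_bf = new_anc


# Hypothetical parent facts; ("Paul", "Nina") is unrelated to the query.
parent = {
    ("Joyce", "Lois"), ("Joyce", "Ruth"),
    ("Lois", "Mark"), ("Ruth", "Andy"),
    ("Paul", "Nina"),
}
result = qsq_anc(parent, "Joyce")
```

The point of the rewriting shows up in `result["anc_bf"]`: only tuples reachable from Joyce are materialized, and the unrelated Paul and Nina fact is never touched.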

  15. Nice Properties of QSQ • Compute the correct answer to the query • Materialize only a minimal set of tuples • Guaranteed to terminate

  16. Beyond Datalog • We allow “object creation” (using Skolem functions) • crucial for our application • In general, may not terminate • OK for our context

  17. Outline • Technical • Datalog and Query-Sub-Query (QSQ) • Adapt QSQ to a distributed setting: dQSQ • Application: Distributed Diagnosis of Telecommunication Systems • Petri Nets and Unfoldings • Datalog formulation of the diagnosis problem • Benefits of using dQSQ

  18. Previous Work: Distribution in Deductive Databases • Van Gelder, 1986 • Jim and Suciu, 2001 • Hulin, 1989

  19. Distributed Environment • Peers: R hosting r, a; S hosting s, b; T hosting t, c
      Centralized Datalog program:
        (r1) r(x,y) :- a(x,y)
        (r2) r(x,y) :- s(x,z), t(z,y)
        (r3) s(x,y) :- r(x,y), b(y,z)
        (r4) t(x,y) :- c(x,y)
      Distribution of the program between 3 peers:
        (r1) r@R(x,y) :- a@R(x,y)
        (r2) r@R(x,y) :- s@S(x,z), t@T(z,y)
        (r3) s@S(x,y) :- r@R(x,y), b@S(y,z)
        (r4) t@T(x,y) :- c@T(x,y)
      If a relation is maintained at some peer, the rules defining it are known at that peer

  20. Distributed QSQ Rewriting • For each rule: The peer in the head of the rule starts the rewriting • When a remote relation is encountered, the peer delegates the remainder of the rule to the remote peer in charge of that relation
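The hand-off described here can be pictured by splitting a rule body into segments, one per responsible peer. The sketch below shows only that delegation order for rules of the form used on slide 19; it does not perform the actual dQSQ rewriting. The function name `delegation_chain` and the tuple encoding of atoms are assumptions made for this example.

```python
def delegation_chain(head, body):
    """Split a rule body into consecutive segments, one per responsible peer.

    Atoms are (relation, peer) pairs. The peer of the head atom starts; each
    time an atom hosted by another peer is met, the rest of the rule is
    handed to that peer.
    """
    current = head[1]
    chain = [(current, [])]
    for relation, peer in body:
        if peer != current:
            # Remote relation: delegate the remainder of the rule.
            current = peer
            chain.append((current, []))
        chain[-1][1].append(relation)
    return chain


# Rules r1, r2 and r3 from the distributed program.
r1 = (("r", "R"), [("a", "R")])
r2 = (("r", "R"), [("s", "S"), ("t", "T")])
r3 = (("s", "S"), [("r", "R"), ("b", "S")])

chain_r1 = delegation_chain(*r1)
chain_r2 = delegation_chain(*r2)
chain_r3 = delegation_chain(*r3)
```

For r2, peer R starts, hands the rule to S when it meets s@S, and S in turn hands the remainder to T when it meets t@T. For r3, control goes from S to R and back to S.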

  21. Nice Properties of dQSQ • Compute the correct answer to the query • Materialize only a minimal set of tuples • As good as QSQ • No need for global knowledge • Need, in general, some standard technique to detect termination

  22. Outline • Technical • Datalog and Query-Sub-Query (QSQ) • Adapt QSQ to a distributed setting: dQSQ • Application: Distributed Diagnosis of Telecommunication Systems • Petri Nets and Unfoldings • Datalog formulation of the diagnosis problem • Benefits of using dQSQ

  23. Petri Net Model • Each piece is described by a Petri net • The communications are modeled as transitions

  24. Petri Net Model • Circles denote places • Marked places model the current state of the peer • Squares denote transitions • A transition node can fire iff all its parent nodes are marked • When a transition fires, the current state changes: children of the transition are marked and parents are unmarked. For example, if transition (i) fires, the marking moves from places 1,7 to places 2,3 • When the transition fires, an alarm symbol is reported to the supervisor. In our example, alarm (b) is reported when (i) fires [Figure legend: place, marked place, transition, alarm symbol]
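The firing rule can be written down in a few lines. The sketch below encodes transition (i) from the slide, with parent places 1 and 7, child places 2 and 3, and alarm b. The dictionary layout and the function names `enabled` and `fire` are assumptions made for this example.

```python
def enabled(marking, transition):
    """A transition can fire iff all of its parent places are marked."""
    return transition["parents"] <= marking


def fire(marking, transition):
    """Unmark the parents, mark the children, and report the alarm symbol."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = (marking - transition["parents"]) | transition["children"]
    return new_marking, transition["alarm"]


# Transition (i) from the slide: places 1,7 -> places 2,3, emitting alarm b.
t_i = {"parents": frozenset({1, 7}), "children": frozenset({2, 3}), "alarm": "b"}
marking = frozenset({1, 7})
new_marking, alarm = fire(marking, t_i)
```

After firing, the marking is {2, 3}, so transition (i) is no longer enabled.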

  25. The Diagnosis Problem • The supervisor receives an alarm sequence (b,p1),(a,p2),…,(c,p1), where a, b, c are alarm symbols and pi is the peer that emitted the alarm • Due to asynchronous communication: • Alarms sent by different peers may not appear in the order they were emitted • We can only assume that the order of alarms is kept for each individual peer • Goal: Find an explanation for a given alarm sequence
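Because only the per-peer order is reliable, two received sequences carry the same information whenever their per-peer projections agree. The sketch below makes that concrete; the function names `per_peer` and `same_observation` and the sample sequences are assumptions made for this example.

```python
def per_peer(sequence):
    """Project an alarm sequence onto each peer, keeping the received order."""
    projection = {}
    for symbol, peer in sequence:
        projection.setdefault(peer, []).append(symbol)
    return projection


def same_observation(seq_a, seq_b):
    """True if the two sequences differ only in how peers are interleaved."""
    return per_peer(seq_a) == per_peer(seq_b)


received = [("b", "p1"), ("a", "p2"), ("c", "p1")]
reordered = [("a", "p2"), ("b", "p1"), ("c", "p1")]
swapped = [("c", "p1"), ("b", "p1"), ("a", "p2")]

ok = same_observation(received, reordered)
bad = same_observation(received, swapped)
```

Here `reordered` only changes when the alarm from p2 arrives, so it is indistinguishable from `received`, while `swapped` reverses the order of two alarms from p1 and is a different observation.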

  26. Unfolding Model • Unfoldings represent all possible sequences of transition firings • The set of shaded nodes in the unfolding is a diagnosis for the alarm sequence (b;p1),(c;p1) • The nodes circled in red are another diagnosis for the alarm sequence (b;p1),(c;p1) • Purple node: not useful in explaining alarm sequence (b;p1),(c;p1) • QSQ goal: eliminate unnecessary portions of the unfolding [Figure: a Petri net and an unfolding of the Petri net]

  27. Outline • Technical • Datalog and Query-Sub-Query (QSQ) • Adapt QSQ to a distributed setting: dQSQ • Application: Distributed Diagnosis of Telecommunication Systems • Petri Nets and Unfoldings • Datalog formulation of the diagnosis problem • Benefits of using dQSQ

  28. Relations for Unfolding • causal(x,y) relation: the transition x was fired, and this eventually led to the firing of node y • conflict(x,y) relation: transitions x and y cannot coexist (i.e. it is not possible for both x and y to have occurred) [Figure: an unfolding of the Petri net]

  29. Constructing the Unfolding with Datalog • The conflict and causal relations capture the information needed to create the unfolding. • The causal relation is similar to the ancestor example • Formulating the conflict relation in Datalog (without negation) was a significant technical challenge: see paper for details

  30. Diagnosis of an Alarm Sequence Using Datalog • Describe unfoldings in distributed Datalog intensionally • Describe the alarm sequence in distributed Datalog extensionally; for (b;p1),(c;p1): alarmSeq@s(a1,b,p1,root) and alarmSeq@s(a2,c,p1,a1) • Describe the query in distributed Datalog: q@s(z,x) :- seqOut@p1(z,a2), transInSeq@p1(z,x)
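The two alarmSeq facts on this slide suggest a simple chained encoding in which each fact names the alarm that precedes it and the first one points to root. The sketch below reproduces that encoding for the slide's single-peer example. Reading the fourth argument as "previous alarm" and the helper name `alarm_seq_facts` are both assumptions; the slide does not say how alarms from several peers would be linked, so the sketch simply chains them in received order.

```python
def alarm_seq_facts(sequence):
    """Encode a received alarm sequence as (id, symbol, peer, previous_id) tuples.

    Assumption: the last argument names the preceding alarm, with "root"
    for the first one, as in the two facts shown on the slide.
    """
    facts = []
    previous = "root"
    for index, (symbol, peer) in enumerate(sequence, start=1):
        alarm_id = f"a{index}"
        facts.append((alarm_id, symbol, peer, previous))
        previous = alarm_id
    return facts


# The alarm sequence (b;p1),(c;p1) from the slide.
facts = alarm_seq_facts([("b", "p1"), ("c", "p1")])
```

The resulting tuples correspond to alarmSeq@s(a1,b,p1,root) and alarmSeq@s(a2,c,p1,a1).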

  31. Outline • Technical • Datalog and Query-Sub-Query (QSQ) • Adapt QSQ to a distributed setting: dQSQ • Application: Distributed Diagnosis of Telecommunication Systems • Petri Nets and Unfoldings • Datalog formulation of the diagnosis problem • Benefits of using dQSQ

  32. The Benefits • We have stated the diagnosis problem using datalog. So what? • Three major benefits: 1. Optimized distributed computation using dQSQ 2. Can solve more general diagnosis problems 3. Implementation language

  33. Benefit 1: Efficiency of dQSQ • Minimal amount of unfolding materialized • Theorem: dQSQ achieves an optimization as good as that previously provided by the dedicated diagnosis algorithms [BFHJ03,BFHJ04]

  34. Benefit 1 continued: Distributed Computation • dQSQ enables distributed computation • The dQSQ rewriting is performed locally at each peer, without any global knowledge • Limited communication: a peer is guaranteed to communicate only with its neighbours in the Petri net • Diagnosis occurs without any global knowledge of the overall net structure

  35. Benefit 2: Problem Generalizations • Hidden transitions: not all alarms are reported to the supervisor • Alarm patterns: alarm patterns described by some regular language (e.g. ab*) • Constraints on the configurations of interest: alarm sequences not containing some known pattern • Issues with termination
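To make the alarm-pattern idea concrete, the sketch below checks whether a sequence of alarm symbols belongs to the regular language ab* using Python's `re` module. This only illustrates what such a pattern describes; it says nothing about how patterns are handled in the Datalog formulation. The helper name `matches_pattern` and the restriction to single-character alarm symbols are assumptions made for this example.

```python
import re


def matches_pattern(symbols, pattern):
    """Check a sequence of single-character alarm symbols against a regular expression."""
    return re.fullmatch(pattern, "".join(symbols)) is not None


# The example pattern ab* from the slide: one a followed by any number of b's.
accepted = matches_pattern(["a", "b", "b"], r"ab*")
rejected = matches_pattern(["b", "a"], r"ab*")
```

The sequence a, b, b is in the language, while b, a is not.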

  36. Benefit 3: Active XML (AXML) • AXML = XML with embedded calls to Web services [INRIA] • Implementation of dQSQ using AXML [Noam Pettel, Tel Aviv] • An AXML document contains both extensional and intensional data • Use of continuous services • Optimization of a fragment of AXML • The original motivation for dQSQ • Extended to “trees” (not in the paper)

  37. Conclusion • Datalog strikes back: relevant to current P2P systems • Contribution • distributed QSQ • an application to network diagnosis • Future work • optimization and analysis (termination, confluence) of AXML and more generally P2P data management

  38. Thank you
