
DECOMPOSED CONFORMANCE

This presentation explores the use of Single-Entry Single-Exit components and the Refined Process Structure Tree (SESE+RPST) for conformance checking and diagnosis in process mining. The authors discuss the benefits and limitations of this approach and present a divide-and-conquer algorithm for diagnosis. They also propose a heuristic-based best-effort analysis method.


Presentation Transcript


  1. DECOMPOSED CONFORMANCE Jorge Munoz-Gama, Josep Carmona and W.M.P. van der Aalst

  2. About Myself • Jorge Munoz-Gama • Barcelona • Universitat Politecnica de Catalunya (UPC) • Advisor: Josep Carmona • Studies • Bachelor in Computer Science (2009) • Master in Computation (2010) • PhD in Computation (expected Oct. 2014) • TUE (2012 and 2013) and NII (2012) • Conformance Checking and Diagnosis in Process Mining • Topics • Precision within Conformance • Arya, Wil and Boudewijn • Decomposed Conformance • Wil and Eric

  3. Abstract Wordle

  4. Outline • Diagnosis using SESE + RPST • SESE / RPST • Benefits and limitations • Valid Decomposition using SESE + RPST • Valid Decomposition • Transform SESE into Valid Decomposition • Alignments and Fitness from Valid Decomposition • Stitching Check • Divide and Conquer Algorithm

  5. Diagnosis using SESE+RPST

  6. Conformance Diagnosis in the Large

  7. Process Diagnosis like a Map

  8. Process Diagnosis like a Map

  9. Decomposition Goals • Intuitive structural decomposition • Low decoupling • Sub-processes within the main process • SESE • Hierarchy between components • Nested components • RPST * Artem Polyvyanyy: Structuring Process Models. PhD Thesis. University of Potsdam (Germany), January 2012

  10. Structure instead of Behavior

  11. Interior, Boundary, Entry, and Exit nodes • Given a subgraph and one of its nodes: • Interior node: connected only to nodes of the subgraph • Boundary node: not interior • Entry node: a boundary node with no incoming edge in the subgraph, or with all its outgoing edges in the subgraph • Exit node: a boundary node with no outgoing edge in the subgraph, or with all its incoming edges in the subgraph
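The node roles defined on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the graph is given as a list of directed edges, the subgraph as a subset of those edges, and the function name `classify_nodes` and the example graph are hypothetical. The entry test is applied before the exit test.

```python
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def classify_nodes(graph_edges: Iterable[Edge],
                   sub_edges: Iterable[Edge]) -> Dict[Hashable, str]:
    """Label every node touched by sub_edges as interior, entry, exit or boundary."""
    graph = set(graph_edges)
    sub = set(sub_edges)
    roles: Dict[Hashable, str] = {}
    for node in {n for edge in sub for n in edge}:
        incoming = {e for e in graph if e[1] == node}
        outgoing = {e for e in graph if e[0] == node}
        in_sub = incoming & sub
        out_sub = outgoing & sub
        if incoming <= sub and outgoing <= sub:
            roles[node] = "interior"   # connected only to nodes of the subgraph
        elif not in_sub or out_sub == outgoing:
            roles[node] = "entry"      # no incoming edge inside, or all outgoing inside
        elif not out_sub or in_sub == incoming:
            roles[node] = "exit"       # no outgoing edge inside, or all incoming inside
        else:
            roles[node] = "boundary"   # boundary node that is neither entry nor exit
    return roles


# Hypothetical example: a diamond b -> {c, d} -> e embedded in a -> b ... e -> f.
graph = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e"), ("e", "f")]
fragment = [("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")]
roles = classify_nodes(graph, fragment)
print(roles)
```

In this example the fragment has exactly one entry node and one exit node, which is the shape the deck calls a SESE.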

  12. SESE, Canonical SESE and RPST • SESE: a set of edges whose induced subgraph has a Single Entry node and a Single Exit node • Canonical SESE: a SESE that does not overlap with any other SESE • Refined Process Structure Tree (RPST): the tree containing the canonical SESEs • Unique • Modular

  13. Example of SESE and RPST

  14. Conformance and Markings • The analysis depends strongly on the markings [Figure: example net with nodes A, B, C, D, E]

  15. Best Effort Analysis • Best-effort analysis oriented to understanding, diagnosis and testing • Include an artificial place when the entry (or exit) is a transition • Short-circuit the component to allow repetitions • Heuristics based on invariants of the whole net • Use of the particularities of the net • Safe, Sound, Bounded, … • But at this point there are no guarantees for the general case

  16. Implementation Package JorgeMunozGama

  17. Implementation Package JorgeMunozGama

  18. Process Conformance and Refinement: Published Work • Hierarchical Conformance Checking of Process Models Based on Event Logs. J. Munoz-Gama, J. Carmona and W. van der Aalst. Petri Nets 2013

  19. Valid Decomposition using SESE+RPST

  20. Hierarchy is not Decomposition • Hierarchy aids in the diagnosis • But it does not make conformance computation faster • Actually, conformance is computed many more times • Possible to limit to some range of levels or to focus on a particular part • No guarantees for the general case • Can we achieve a decomposition of the conformance problem? • That reduces the time? • With guarantees on the fitness result?

  21. Partitioning the RPST • Any cut of the RPST yields a partitioning of the edges • Algorithm to cut by the size of the component (k-partitioning)
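One plausible reading of cutting the RPST by component size is sketched below. It is not the authors' implementation: the `Fragment` class, the function `k_partition` and the example tree are hypothetical, and the sketch assumes that every fragment's edge set is the union of its children's edge sets (leaves carry the edges). The tree is walked top-down and a fragment is kept whole as soon as it has at most k edges.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Fragment:
    """A node of a (hypothetical) RPST: a named set of edges with nested children."""
    name: str
    edges: FrozenSet[str]
    children: List["Fragment"] = field(default_factory=list)


def k_partition(root: Fragment, k: int) -> List[Fragment]:
    """Cut the tree top-down: keep a fragment once it has at most k edges.

    Leaves larger than k cannot be split further and are kept as they are.
    """
    parts: List[Fragment] = []
    stack = [root]
    while stack:
        frag = stack.pop()
        if len(frag.edges) <= k or not frag.children:
            parts.append(frag)
        else:
            # Push children in reverse so they are visited left to right.
            stack.extend(reversed(frag.children))
    return parts


# Hypothetical tree: root = A (2 edges) + B (4 edges), and B = B1 + B2.
b1 = Fragment("B1", frozenset({"e3", "e4"}))
b2 = Fragment("B2", frozenset({"e5", "e6"}))
a = Fragment("A", frozenset({"e1", "e2"}))
b = Fragment("B", b1.edges | b2.edges, [b1, b2])
root = Fragment("root", a.edges | b.edges, [a, b])

parts = k_partition(root, 3)
print([p.name for p in parts])
```

Because the selected fragments form a cut of the tree, their edge sets are disjoint and together cover all edges of the root, which is the partitioning property stated on the slide.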

  22. Properties of the Partitioning • It is faster … • … but what about the guarantees? • Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting

  23. SESE and Decomposed Perfectly Fitting • SESEs (per se) do not satisfy the Decomposed Perfectly Fitting Checking property • 1 token in p => abcdef fits S but not S2 • 2 tokens in p => abdecf fits S1 and S2 but not S

  24. Valid Decomposition • Each place appears in precisely one of the subnets • Each edge appears in precisely one of the subnets • Transitions may appear in multiple subnets • Invisible transitions must appear in precisely one subnet • Duplicate transitions must appear in precisely one subnet • Valid Decompositions satisfy the Decomposed Perfectly Fitting Checking property * Wil M.P. van der Aalst: Decomposing Petri Nets for Process Mining: A Generic Approach. BPMCenter.org, 2012
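The conditions listed on this slide translate directly into a checker. The sketch below is illustrative only: the `Subnet` class and `is_valid_decomposition` are hypothetical names, invisible and duplicate transitions are passed together as one `exclusive` set, and the requirement that every transition is covered by at least one subnet is an added assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


@dataclass(frozen=True)
class Subnet:
    places: FrozenSet[str]
    transitions: FrozenSet[str]
    edges: FrozenSet[Edge]


def exactly_once(items: Iterable, parts: Iterable[Iterable]) -> bool:
    """True if every item occurs in exactly one part and nothing else occurs."""
    counts = Counter(x for part in parts for x in part)
    return set(counts) == set(items) and all(c == 1 for c in counts.values())


def is_valid_decomposition(places: Set[str], transitions: Set[str],
                           edges: Set[Edge], exclusive: Set[str],
                           subnets: List[Subnet]) -> bool:
    """Check the slide's conditions; `exclusive` holds invisible/duplicate transitions."""
    places_ok = exactly_once(places, [s.places for s in subnets])
    edges_ok = exactly_once(edges, [s.edges for s in subnets])
    # Assumption: every transition of the net is covered by some subnet.
    covered = set().union(*(s.transitions for s in subnets)) == set(transitions)
    counts = Counter(t for s in subnets for t in s.transitions)
    exclusive_ok = all(counts[t] == 1 for t in exclusive)
    return places_ok and edges_ok and covered and exclusive_ok


# Hypothetical net a -> p1 -> b -> p2 -> c, split at the shared transition b.
places = {"p1", "p2"}
transitions = {"a", "b", "c"}
edges = {("a", "p1"), ("p1", "b"), ("b", "p2"), ("p2", "c")}
sn1 = Subnet(frozenset({"p1"}), frozenset({"a", "b"}),
             frozenset({("a", "p1"), ("p1", "b")}))
sn2 = Subnet(frozenset({"p2"}), frozenset({"b", "c"}),
             frozenset({("b", "p2"), ("p2", "c")}))

print(is_valid_decomposition(places, transitions, edges, set(), [sn1, sn2]))
print(is_valid_decomposition(places, transitions, edges, {"b"}, [sn1, sn2]))
```

The second call treats the shared transition b as invisible, so the same split is rejected.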

  25. SESE to Valid Decomposition • Create a ‘bridge’ for each shared place

  26. Results (1) • 1 Net – 1h 15min • 7 Subnets – 2min

  27. Results (2)

  28. Topology

  29. Topology and NFCC and NFN • Non Fitting Connected Components (NFCC) • Non Fitting Net (NFN)

  30. Topology Algorithms on Large

  31. DivideAndConquer Package

  32. DecomposedConformance Package

  33. Process Conformance and Refinement Published Work Conformance Checking in the Large: Partitioning and Topology J. Munoz-Gama, J. Carmona and W. van der Aalst Business Process Management (BPM) 2013

  34. Alignments and Fitness on Valid Decompositions

  35. Adapted Cost Function • Adapted cost of a move = (cost of the move involving the task) / (# subnets having the task) • Theorem: The sum of the costs of all the subnets using the adapted cost function is a lower bound on the cost of the overall alignment • Hence an upper bound on the fitness
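A small worked example of the adapted cost function, as a sketch under assumptions: an alignment is a list of (log step, model step) pairs with `None` marking a skipped side, every non-synchronous move has unit cost 1, and the names `adapted_cost` and `subnet_count` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, List, Optional, Tuple

Move = Tuple[Optional[str], Optional[str]]  # (log step, model step); None = skip


def adapted_cost(alignment: List[Move], subnet_count: Dict[str, int],
                 unit_cost: int = 1) -> Fraction:
    """Sum the move costs, dividing each by the number of subnets having the task."""
    total = Fraction(0)
    for log_step, model_step in alignment:
        if log_step == model_step:
            continue  # synchronous move: no cost
        task = log_step if log_step is not None else model_step
        total += Fraction(unit_cost, subnet_count[task])
    return total


# Hypothetical decomposition: task C is shared by two subnets.
subnet_count = {"A": 1, "B": 1, "C": 2, "D": 1}
alignment_sn1 = [("A", "A"), ("B", "B"), ("C", None)]  # log move on C
alignment_sn2 = [("C", None), ("D", "D")]              # same log move, seen again

total = adapted_cost(alignment_sn1, subnet_count) + adapted_cost(alignment_sn2, subnet_count)
print(total)
```

The deviation on C is seen by both subnets, but each counts only half of it, so the summed cost does not count the same deviation twice.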

  36. Lower bound on the costs - Idea [Figure: alignment of a trace on the overall net SN next to the alignments of its projections on the subnets SN1, SN2 and SN3]

  37. Stitching Check • The order of the tasks matters [Figure: alignments on SN, SN1, SN2 and SN3, compared pairwise (SN1-SN2, SN2-SN3, SN1-SN3) on their shared tasks]

  38. Stitching Check Theorem • Theorem: Given a trace, if its subnet alignments agree on the stitching check, the sum of the costs using the adapted cost function is not just a bound but the exact result • An optimal alignment for the whole trace can be constructed straightforwardly from the alignments of the subnets
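The transcript does not spell out the stitching check itself. The sketch below assumes one reading suggested by the surrounding slides: two subnet alignments agree if, restricted to the tasks the subnets share, they contain the same moves in the same order. The function names and the example data are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Tuple

Move = Tuple[Optional[str], Optional[str]]  # (log step, model step); None = skip


def moves_on(alignment: List[Move], tasks: FrozenSet[str]) -> List[Move]:
    """Keep only the moves whose task belongs to `tasks`, preserving order."""
    return [m for m in alignment
            if (m[0] if m[0] is not None else m[1]) in tasks]


def passes_stitching_check(alignments: Dict[str, List[Move]],
                           activities: Dict[str, FrozenSet[str]]) -> bool:
    """Assumed check: every pair of subnets agrees on the moves of shared tasks."""
    for left, right in combinations(sorted(alignments), 2):
        shared = activities[left] & activities[right]
        if moves_on(alignments[left], shared) != moves_on(alignments[right], shared):
            return False
    return True


activities = {"SN1": frozenset({"A", "B"}), "SN2": frozenset({"B", "C"})}

agreeing = {"SN1": [("A", "A"), ("B", "B")],
            "SN2": [("B", "B"), ("C", None)]}
disagreeing = {"SN1": [("A", "A"), ("B", "B")],
               "SN2": [("B", None), ("C", "C")]}

print(passes_stitching_check(agreeing, activities))
print(passes_stitching_check(disagreeing, activities))
```

In the second case SN1 explains B with a synchronous move while SN2 explains it with a log move, so the two alignments cannot be stitched into one.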

  39. Stitching Check Corollary • Corollary: if all the shared transitions are synchronous moves, the sum of the costs is not just a bound but exact • Interesting from a Diagnosis point of view • Two optimal alignments: one with synchronous moves in the shared transitions, and the other without

  40. Two possible optimal alignments
      Alignment 1:  B A - C D E F
                    - A B C D E F
      Alignment 2:  - B A C D E F
                    A B - C D E F

  41. Two possible optimal alignments • They do not agree on B (not even on the number of occurrences) [Figure: two subnet alignments that contain different moves on the shared task B]

  42. Modified Alignment Algorithm • Modify the alignment algorithm to prioritize solutions with synchronous moves for a given set of tasks (if one exists) [Figure: priority queue of candidate solutions ordered by cost]

  43. Estimating Fitness • If all traces in the log satisfy the stitching check, the fitness is exact (unlikely) • If even one trace does not satisfy it, the fitness is not formally guaranteed • However, it should still be accurate in practice: the error is negligible • Example: 10000 traces satisfy the stitching check and 1 trace does not

  44. Fitness Interval • Give the percentage of traces with an exact value • But also a confidence interval on the fitness • Upper bound of a trace: its fitness • Lower bound of a trace: its fitness (if it satisfies the stitching check), 0 (if it does not) • The bounds for the log are the averages of the bounds per trace
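The interval described on this slide is a pair of averages, which can be sketched as follows. The input format (one `(fitness, passes_check)` pair per trace), the function name `fitness_interval` and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple


def fitness_interval(traces: List[Tuple[float, bool]]) -> Tuple[float, float, float]:
    """Return (lower bound, upper bound, share of traces with an exact value).

    Each entry is (decomposed fitness of the trace, whether it passed the
    stitching check). A failing trace contributes 0 to the lower bound.
    """
    if not traces:
        raise ValueError("the log must contain at least one trace")
    n = len(traces)
    upper = sum(fitness for fitness, _ in traces) / n
    lower = sum(fitness if ok else 0.0 for fitness, ok in traces) / n
    exact_share = sum(1 for _, ok in traces if ok) / n
    return lower, upper, exact_share


# Hypothetical log of four traces, one of which fails the stitching check.
sample = [(1.0, True), (0.8, True), (0.5, False), (1.0, True)]
lower, upper, exact_share = fitness_interval(sample)
print(f"fitness in [{lower:.3f}, {upper:.3f}], exact for {exact_share:.0%} of traces")
```

When every trace passes the check, the two bounds coincide and the fitness is exact.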

  45. Merging Subnets • If they don’t agree, merge them

  46. Stitching Matrix • Stitching problems between subnets • Blueprint for merging

  47. Stitching Matrix • Stitching problems between subnets • Blueprint for merging

  48. Decomposed Conformance Algorithm
      dc (L, SN)
        L[ ], SN[ ] = decompose (L, SN)
        A[ ] = align (L[ ], SN[ ])
        Lp, Ap[ ] = pass_stitching_check (L, A[ ])
        Lf, Af[ ] = fail_stitching_check (L, A[ ])
        while (not_final_condition)
          Ms = stitching_matrix (Af[ ])
          SN[ ] = merge_subnets (Ms)
          L[ ] = project_log (Lf, SN[ ])
          A[ ] = align (L[ ], SN[ ])
          Lp, Ap[ ] = Lp, Ap[ ] + pass_stitching_check (L, A[ ])
          Lf, Af[ ] = fail_stitching_check (L, A[ ])
        compute_fitness (Ap[ ], Af[ ])
        compute_alignments (Ap[ ])
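The control flow of the pseudocode can be sketched in Python as a loop that re-aligns only the traces that failed the stitching check. This is a skeleton under assumptions, not the authors' implementation: subnets are reduced to sets of activities, the aligner, the stitching check and the merging strategy are passed in as callables (`align`, `passes_check`, `merge`), the stitching matrix is folded into `merge`, and the final condition is taken to be "no failing traces, a single subnet left, or `max_rounds` reached". The stubs at the bottom exist only to exercise the loop.

```python
from typing import Callable, Dict, FrozenSet, List

Subnets = Dict[str, FrozenSet[str]]


def project(trace: List[str], activities: FrozenSet[str]) -> List[str]:
    """Project a trace onto the activities of one subnet."""
    return [a for a in trace if a in activities]


def decomposed_conformance(log: List[List[str]], subnets: Subnets,
                           align: Callable, passes_check: Callable,
                           merge: Callable, max_rounds: int = 10):
    """Return (passed, failed): per-trace subnet alignments, keyed by trace index."""
    passed: Dict[int, dict] = {}
    failed: Dict[int, dict] = {}
    failing = list(range(len(log)))
    rounds = 0
    while True:
        failed = {}
        for idx in failing:
            per_subnet = {name: align(project(log[idx], acts), name)
                          for name, acts in subnets.items()}
            if passes_check(per_subnet, subnets):
                passed[idx] = per_subnet
            else:
                failed[idx] = per_subnet
        failing = sorted(failed)
        rounds += 1
        if not failing or len(subnets) <= 1 or rounds >= max_rounds:
            break
        # Merge the subnets that disagree, then retry only the failing traces.
        subnets = merge(subnets, failed)
    return passed, failed


# Stubs for illustration only: every step is aligned synchronously, the check
# passes when SN1 saw task "b" or when a single subnet is left, and merging
# collapses all subnets into one.
log = [["a", "b", "c"], ["a", "c"]]
subnets = {"SN1": frozenset("ab"), "SN2": frozenset("bc")}
stub_align = lambda projection, name: [(x, x) for x in projection]
stub_check = lambda per, sn: len(sn) == 1 or ("b", "b") in per["SN1"]
stub_merge = lambda sn, failed: {"+".join(sorted(sn)): frozenset().union(*sn.values())}

passed, failed = decomposed_conformance(log, subnets, stub_align, stub_check, stub_merge)
print(sorted(passed), sorted(failed))
```

With these stubs the first trace passes in the first round, and the second trace passes after the two subnets are merged.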

  49. Conclusions • How SESE and RPST may help for diagnosis • How to create Valid Decompositions from SESE • Partitioning the Problem • Bridging • Topology and Topological Algorithms • Estimating fitness from Valid Decompositions • Stitching Check • Fitness Interval • Decomposed Conformance Algorithm

  50. Future Work • New approaches for creating Valid Decompositions • Based on Transition-Separation Pairs • SESE+Passages • Study on the decomposed fitness • When it’s more effective and when to stop • More complex merging strategies • Real-case scenarios • Conformance Checking in Hierarchy
