Mining Constraints for Artful Processes
Claudio Di Ciccio and Massimo Mecella
Claudio Di Ciccio (cdc@dis.uniroma1.it)
15th International Conference on Business Information Systems (BIS 2012)
Tuesday, May 22nd, Vilnius, Lithuania
Process Mining: Definition
Mining Constraints for Artful Processes (Di Ciccio, C. - DIAG, SAPIENZA Università di Roma)
• Process Mining [Aalst2011.book], also referred to as Workflow Mining, is the set of techniques that extract process descriptions from a set of recorded real executions (logs).
• ProM [AalstEtAl2009] is one of the most widely used plug-in based software environments for implementing workflow mining (and other) techniques.
  • The new version 6.0 is available for download at www.processmining.org
Process Mining: Definition
• Process Mining involves:
  • Process discovery
    • Control flow mining, organizational mining, decision mining
  • Conformance checking
  • Operational support
• We will focus on control flow mining
• Many control flow mining algorithms have been proposed:
  • α [AalstEtAl2004] and α++ [WenEtAl2007]
  • Fuzzy [GüntherAalst2007]
  • Heuristic [WeijtersEtAl2001]
  • Genetic [MedeirosEtAl2007]
  • Two-step [AalstEtAl2010]
  • …
A different context: Artful processes and knowledge workers
• Artful processes [HillEtAl06]
  • informal processes typically carried out by people whose work is mental rather than physical (managers, professors, researchers, engineers, etc.)
  • “knowledge workers” [ACTIVE09]
• Knowledge workers create artful processes “on the fly”
• Though artful processes are frequently repeated, they are not exactly reproducible, even by their originators, nor can they be easily shared.
A different context: Email conversations
• In collaborative contexts, knowledge workers share their information and outcomes with other knowledge workers
  • E.g., a software development manager
• Typically, by means of several email conversations
• Email conversations are actual traces of running processes that knowledge workers adhere to
A different context: Processes from email conversations
• From the collection of email messages, you can extract the processes that lie behind them
  • Related email conversations are traces of their runs
• Valuable advantages for users
  • Automated discovery of formal representations
    • with no effort for knowledge workers
  • Tidy organization for naïve best practices kept only in mind
  • Opportunity to share and compare the knowledge on methodologies
  • Automated discovery of bottlenecks, delays, structural defects
    • from the analysis of previous runs
• Email conversations are a kind of semi-structured text
  • this approach is not tailored to electronic mail only
  • it can be extended to the analysis of other semi-structured texts
MailOfMine: What is MailOfMine?
MailOfMine is the approach and the implementation of a collection of techniques whose aim is to automatically build, on top of a collection of email messages, a set of workflow models that represent the artful processes lying behind the knowledge workers’ activities. [DiCiccioEtAl11] [DiCiccioMecella/TR12]
The MailOfMine approach: From the e-mail archive to key parts
[Pipeline diagram] Mail archive (multi-format mail storage) → plug-in based crawlers → Mail Database → [ZardettoEtAl10]-based clustering algorithm → Conversations → [CarvalhoEtAl04]-based filter → Key Parts
From key parts to processes
[Pipeline diagram; labels: Key Parts, Concatenation, Activity indicium, [ZardettoEtAl10]-based, [CohenEtAl04, SakuraiEtAl05]-based task extractor, Tasks, Processes]
[Pipeline diagram] Tasks → MINERful → Processes
MINERful: The mining algorithm in MailOfMine
• MINERful is a workflow mining algorithm
• Its input is a collection of strings T and an alphabet ΣT
  • Each string t is a trace
  • Each character is an event (enacted task)
  • The collection represents the log
• Its output is a declarative process model
  • What is a declarative process model?
On the visualization of processes: The imperative model
• Represents the whole process at once
• The most used notation is based on a subclass of Petri Nets (namely, the Workflow Nets)
On the visualization of processes: The declarative model
• Rather than using a procedural language to express the allowed sequences of activities, it describes workflows through constraints
  • the idea is that every task can be performed, except the ones which do not respect such constraints
  • this technique fits processes that are highly flexible and subject to change, such as artful processes
• Example constraints from the diagram:
  • If A is performed, B must be performed, no matter whether before or afterwards (responded existence)
  • Whenever B is performed, C must be performed afterwards, and B cannot be repeated until C is done (alternate response)
• The notation here is based on [AalstEtAl06, MaggiEtAl11] (DecSerFlow, Declare)
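To make the two example constraints concrete, the following minimal sketch checks them on a single trace written as a string of task letters. The function names and the string encoding are assumptions made for illustration; they are not taken from Declare or from MINERful.

```python
def responded_existence(trace, a, b):
    """If a occurs anywhere in the trace, b must occur too (before or after)."""
    return a not in trace or b in trace


def alternate_response(trace, b, c):
    """Every b must be followed by a c, and b may not repeat until c is done."""
    pending = False  # True while some b is still waiting for its c
    for event in trace:
        if event == b:
            if pending:
                return False  # b was repeated before c occurred
            pending = True
        elif event == c:
            pending = False
    return not pending  # a trailing b without c violates the constraint


if __name__ == "__main__":
    print(responded_existence("BxA", "A", "B"))  # B occurs, so A is allowed
    print(alternate_response("BCBC", "B", "C"))  # each B is answered by a C
    print(alternate_response("BBC", "B", "C"))   # B repeated before C
```

Any trace that passes every check is allowed, which is what makes the model declarative: nothing prescribes the order of the remaining tasks.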
On the visualization of processes: Imperative vs declarative
• Declarative models work better in the presence of a partial specification of the process scheme
[Figure panels: Declarative, Imperative]
A real discovered process model
[Figure: “Spaghetti process” [Aalst2011.book]]
Declare constraint templates: Constraint templates as Regular Expressions (REs)
[Table: Declare constraint templates and their regular expressions]
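As an illustration of how constraint templates can be written as REs, the sketch below uses one commonly cited encoding in which each task is a single character. These expressions are an assumption, not a transcription of the slide's table, and the `TEMPLATES` and `satisfies` names are hypothetical helpers.

```python
import re

# Assumed RE encodings of some Declare templates; a and b are single-character tasks.
TEMPLATES = {
    "Participation":      lambda a: f"[^{a}]*({a}[^{a}]*)+",
    "Uniqueness":         lambda a: f"[^{a}]*({a})?[^{a}]*",
    "End":                lambda a: f".*{a}",
    "RespondedExistence": lambda a, b: f"[^{a}]*(({a}.*{b})|({b}.*{a}))*[^{a}]*",
    "Response":           lambda a, b: f"[^{a}]*({a}.*{b})*[^{a}]*",
    "Precedence":         lambda a, b: f"[^{b}]*({a}.*{b})*[^{b}]*",
    "Succession":         lambda a, b: f"[^{a}{b}]*({a}.*{b})*[^{a}{b}]*",
    "AlternateResponse":  lambda a, b: f"[^{a}]*({a}[^{a}]*{b}[^{a}]*)*[^{a}]*",
}


def satisfies(trace, template, *tasks):
    """True if the whole trace matches the RE of the instantiated template."""
    pattern = TEMPLATES[template](*tasks)
    return re.fullmatch(pattern, trace) is not None


if __name__ == "__main__":
    print(satisfies("rrp", "Response", "r", "p"))
    print(satisfies("pr", "Response", "r", "p"))
```

The whole trace has to match (`re.fullmatch`), since a constraint speaks about the complete execution rather than a fragment of it.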
MINERful by example: Scenario
• A project meeting is scheduled
• We suppose that a final agenda will be committed (“confirmAgenda”) after requests for a new proposal (“requestAgenda”), proposals themselves (“proposeAgenda”) and comments (“commentAgenda”) have been circulated.
• Shortcuts for tasks (process alphabet):
  • p (“proposeAgenda”)
  • r (“requestAgenda”)
  • c (“commentAgenda”)
  • n (“confirmAgenda”)
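The shortcuts above turn a sequence of enacted tasks into a string, which is the input format the algorithm expects. A minimal sketch of that encoding follows; `encode_trace` and `encode_log` are hypothetical helper names, not part of MailOfMine.

```python
# Task-to-character shortcuts, as listed in the scenario.
SHORTCUTS = {
    "proposeAgenda": "p",
    "requestAgenda": "r",
    "commentAgenda": "c",
    "confirmAgenda": "n",
}


def encode_trace(tasks):
    """Turn one sequence of task names into a string (one character per event)."""
    return "".join(SHORTCUTS[task] for task in tasks)


def encode_log(traces):
    """Return the log as a list of strings, together with the process alphabet."""
    log = [encode_trace(tasks) for tasks in traces]
    alphabet = sorted(set(SHORTCUTS.values()))
    return log, alphabet


if __name__ == "__main__":
    run = ["requestAgenda", "proposeAgenda", "commentAgenda", "confirmAgenda"]
    print(encode_trace(run))
```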
MINERful by example: Constraints on tasks
• Existence constraints (the agenda)
  • Participation(n): must be confirmed,
  • Uniqueness(n): only once;
  • End(n): it is the last thing to do.
• Relation constraints (during the compilation)
  • Response(r,p): the proposal follows a request;
  • RespondedExistence(c,p): if a comment circulates, there has been / will be a proposal;
  • Succession(p,n): after the proposal, there will be a confirmation, and there can be no confirmation without a preceding proposal.
MINERful by example: Testing by replay
• In order to validate the algorithm
  • We translate constraints into REs
  • The overall process is expressed by the intersection of REs
  • We use an RE-driven random string builder [Xeger] for creating a test-and-validation set
  • We analyze the result and evaluate the performance
• In order to see how it works now
  • We follow a run of MINERful over a string built by Xeger: r r p c r p c r c p c n
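The replay idea can be sketched as follows: a string belongs to the intersection of the REs exactly when it matches every one of them. The REs below are an assumed encoding of the scenario's six constraints, and naive rejection sampling stands in for Xeger (a Java library that is not used here), so this illustrates the principle rather than the authors' test harness.

```python
import random
import re

ALPHABET = "prcn"

# Assumed RE encodings of the six constraints of the scenario.
CONSTRAINTS = {
    "Participation(n)": r"[^n]*(n[^n]*)+",
    "Uniqueness(n)": r"[^n]*(n)?[^n]*",
    "End(n)": r".*n",
    "Response(r,p)": r"[^r]*(r.*p)*[^r]*",
    "RespondedExistence(c,p)": r"[^c]*((c.*p)|(p.*c))*[^c]*",
    "Succession(p,n)": r"[^pn]*(p.*n)*[^pn]*",
}


def complies(trace):
    """A trace is in the intersection iff it matches every constraint's RE."""
    return all(re.fullmatch(rx, trace) for rx in CONSTRAINTS.values())


def random_compliant_traces(count, max_len=8, seed=0):
    """Rejection sampling: draw random strings and keep the compliant ones."""
    rng = random.Random(seed)
    traces = []
    while len(traces) < count:
        length = rng.randint(1, max_len)
        candidate = "".join(rng.choice(ALPHABET) for _ in range(length))
        if complies(candidate):
            traces.append(candidate)
    return traces


if __name__ == "__main__":
    print(complies("rrpcrpcrcpcn"))  # the string followed in this example
    print(random_compliant_traces(5))
```

Rejection sampling wastes most draws, which is tolerable for short strings over a four-letter alphabet; a generator driven by the REs themselves avoids that cost.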
MINERful by example: Building the “ownplay” of p and n
String: r r p c r p c r c p c n
• p
  • p occurred 3 times in 1 string: γp(3) = 1
  • For each m ≠ 3, γp(m) = 0
  • p did not occur as the first nor as the last character: gi(p) = 0, gl(p) = 0
• n
  • γn(1) = 1
  • For each m ≠ 1, γn(m) = 0
  • n occurred as the last character in 1 string: gi(n) = 0, gl(n) = 1
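The ownplay counts can be reproduced with a few counters. This is a minimal sketch under the reading suggested by the slide (γ counts strings by number of occurrences, gi and gl count strings in which a character is first or last); the function name and data layout are assumptions.

```python
from collections import Counter, defaultdict


def ownplay(log, alphabet):
    """Compute gamma, gi and gl over a log given as a list of strings."""
    gamma = defaultdict(Counter)  # gamma[a][m]: strings in which a occurs m times
    gi = Counter()                # gi[a]: strings whose first character is a
    gl = Counter()                # gl[a]: strings whose last character is a
    for trace in log:
        if not trace:
            continue  # an empty string carries no events
        occurrences = Counter(trace)
        for a in alphabet:
            gamma[a][occurrences[a]] += 1
        gi[trace[0]] += 1
        gl[trace[-1]] += 1
    return gamma, gi, gl


if __name__ == "__main__":
    gamma, gi, gl = ownplay(["rrpcrpcrcpcn"], "prcn")
    print(gamma["p"][3], gi["p"], gl["p"])
    print(gamma["n"][1], gi["n"], gl["n"])
```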
MINERful by example: Building the “interplay” of p and n
• With respect to the occurrences of p, n occurred…
  • Never before: 3 times, δp,n(-∞) = 3
  • 2 characters after: 1 time, δp,n(2) = 1
  • 6 characters after: 1 time, δp,n(6) = 1
  • 9 characters after: 1 time, δp,n(9) = 1
• Alternating:
  • Onwards: 2 times, b→p,n = 2
  • Backwards: never, b←p,n = 0
• Looking at the string: r r p c r p c r c p c n
MINERful by example: Building the “interplay” of r and p
String: r r p c r p c r c p c n
• b→r,p = 1
• b←r,p = 0
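The interplay values for (p, n) and (r, p) can be recomputed with the sketch below. It encodes one reading of the slides that reproduces their numbers: δ counts the distance of every σ from every ρ, plus the occurrences of ρ with no σ before or after, while b→ counts repetitions of ρ that happen before the next σ is reached (b← does the same on the reversed string). The function names, and the reading itself, are assumptions.

```python
from collections import Counter

NEVER_BEFORE = float("-inf")
NEVER_AFTER = float("inf")


def distances(trace, rho, sigma):
    """delta[d]: how often sigma occurs d characters away from an occurrence of rho."""
    delta = Counter()
    rho_positions = [i for i, ch in enumerate(trace) if ch == rho]
    sigma_positions = [j for j, ch in enumerate(trace) if ch == sigma]
    for i in rho_positions:
        for j in sigma_positions:
            delta[j - i] += 1
        if not any(j < i for j in sigma_positions):
            delta[NEVER_BEFORE] += 1  # no sigma precedes this rho
        if not any(j > i for j in sigma_positions):
            delta[NEVER_AFTER] += 1   # no sigma follows this rho
    return delta


def repetitions_onwards(trace, rho, sigma):
    """Count repetitions of rho that occur before the next sigma is reached."""
    total = buffered = 0
    pending = False  # True once a rho has been read and no sigma has followed yet
    for ch in trace:
        if ch == rho:
            if pending:
                buffered += 1
            pending = True
        elif ch == sigma:
            total += buffered  # repetitions count only if sigma eventually comes
            buffered, pending = 0, False
    return total


def repetitions_backwards(trace, rho, sigma):
    """Same count, reading the string from the end to the beginning."""
    return repetitions_onwards(trace[::-1], rho, sigma)


if __name__ == "__main__":
    s = "rrpcrpcrcpcn"
    print(distances(s, "p", "n"))
    print(repetitions_onwards(s, "p", "n"), repetitions_backwards(s, "p", "n"))
    print(repetitions_onwards(s, "r", "p"), repetitions_backwards(s, "r", "p"))
```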
MINERful by example: Workflow discovery by constraints inference
• Interplay and ownplay constitute the Knowledge Base (KB) of MINERful
  • The KB construction is such that each new string adds information
  • The algorithm does not need to read the strings more than once each
• Constraints are determined by the evaluation of boolean queries on the KB
  • This allows the discovery of constraints with a faster procedure on a smaller set than the whole input
• MINERful is a two-step algorithm
MINERful by example: Some queries for inferring constraints
• RespondedExistence(r,p)
  • ¬(δr,p(0) > 0)
  • There is no string where p does not occur, if r is read
• Response(r,p)
  • RespondedExistence(r,p) ∧ ¬(δr,p(+∞) > 0)
  • RespondedExistence(r,p) holds and there is no string where p does not follow r
• Precedence(r,p)
  • RespondedExistence(p,r) ∧ ¬(δp,r(-∞) > 0)
  • RespondedExistence(p,r) holds and there is no string where r does not precede p
• Succession(p,n)
  • Response(p,n) ∧ Precedence(p,n)
• …
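The two-step idea (build the KB first, then answer boolean queries on it without re-reading the log) can be sketched as follows. The KB here is deliberately reduced to the three δ entries the queries need, counted per occurrence of ρ; Precedence(r,p) is read as "every p is preceded by an r", so its query inspects the occurrences of p. Names and data layout are assumptions, not MINERful's actual implementation.

```python
from collections import Counter

NOT_AT_ALL = 0                 # delta(0): sigma never occurs in the string
NEVER_BEFORE = float("-inf")   # delta(-inf): sigma never occurs before rho
NEVER_AFTER = float("inf")     # delta(+inf): sigma never occurs after rho


def build_kb(log, alphabet):
    """Step 1: collect, for each ordered pair of tasks, the counts the queries need."""
    kb = {(rho, sigma): Counter()
          for rho in alphabet for sigma in alphabet if rho != sigma}
    for trace in log:
        for (rho, sigma), delta in kb.items():
            sigma_positions = [j for j, ch in enumerate(trace) if ch == sigma]
            for i, ch in enumerate(trace):
                if ch != rho:
                    continue
                if not sigma_positions:
                    delta[NOT_AT_ALL] += 1
                if not any(j < i for j in sigma_positions):
                    delta[NEVER_BEFORE] += 1
                if not any(j > i for j in sigma_positions):
                    delta[NEVER_AFTER] += 1
    return kb


# Step 2: boolean queries evaluated on the KB only.
def responded_existence(kb, a, b):
    return kb[(a, b)][NOT_AT_ALL] == 0


def response(kb, a, b):
    return responded_existence(kb, a, b) and kb[(a, b)][NEVER_AFTER] == 0


def precedence(kb, a, b):
    return responded_existence(kb, b, a) and kb[(b, a)][NEVER_BEFORE] == 0


def succession(kb, a, b):
    return response(kb, a, b) and precedence(kb, a, b)


if __name__ == "__main__":
    kb = build_kb(["rrpcrpcrcpcn"], "prcn")
    print(response(kb, "r", "p"), succession(kb, "p", "n"))
```

A pair whose first task never occurs in the log satisfies every query vacuously, so a real miner would presumably also need some notion of how often a constraint is actually exercised.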
Conclusions: Recap
• MailOfMine is a system designed for mining artful processes out of email collections
• MINERful is the workflow mining algorithm designed for MailOfMine
• MINERful is
  • Independent of the formalism used for expressing constraints
  • Modular (two-phase)
  • Capable of eliminating redundancy in the process model
Conclusions: On the asymptotic complexity of MINERful
• Linear w.r.t. the number of strings in the testbed, |T|
• Quadratic w.r.t. the size of strings in the testbed, |tmax|
• Quadratic w.r.t. the size of the alphabet, |ΣT|
• Hence, polynomial in the size of the input: O(|T| · |tmax|² · |ΣT|²)
References: Cited articles and resources, in order of appearance
• [Aalst2011.book] van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer (2011).
• [AalstEtAl2009] van der Aalst, W.M.P., van Dongen, B.F., Günther, C.W., Rozinat, A., Verbeek, E., Weijters, T.: ProM: The process mining toolkit. In de Medeiros, A.K.A., Weber, B., eds.: BPM (Demos). Volume 489 of CEUR Workshop Proceedings, CEUR-WS.org (2009).
• [AalstEtAl2004] van der Aalst, W.M.P., Weijters, T., Maruster, L.: Workflow mining: Discovering process models from event logs. IEEE Trans. Knowl. Data Eng. 16(9) (2004) 1128–1142.
• [WenEtAl2007] Wen, L., van der Aalst, W.M.P., Wang, J., Sun, J.: Mining process models with non-free-choice constructs. Data Min. Knowl. Discov. 15(2) (2007) 145–180.
• [GüntherAalst2007] Günther, C.W., van der Aalst, W.M.P.: Fuzzy Mining - Adaptive Process Simplification Based on Multi-perspective Metrics. BPM 2007: 328–343.
• [WeijtersEtAl2001] Weijters, A., van der Aalst, W.: Rediscovering workflow models from event-based data using little thumb. Integrated Computer-Aided Engineering 10 (2001) 2003.
• [MedeirosEtAl2007] Medeiros, A.K., Weijters, A.J., Aalst, W.M.: Genetic process mining: an experimental evaluation. Data Min. Knowl. Discov. 14(2) (2007) 245–304.
• [AalstEtAl2010] van der Aalst, W., Rubin, V., Verbeek, H., van Dongen, B., Kindler, E., Günther, C.: Process mining: a two-step approach to balance between underfitting and overfitting. Software and Systems Modeling 9 (2010) 87–111. doi:10.1007/s10270-008-0106-z.
References: Cited articles and resources, in order of appearance
• [HillEtAl06] Hill, C., Yates, R., Jones, C., Kogan, S.L.: Beyond predictable workflows: Enhancing productivity in artful business processes. IBM Systems Journal 45(4), 663–682 (2006).
• [ACTIVE09] Warren, P., Kings, N., et al.: Improving knowledge worker productivity - the active integrated approach. BT Technology Journal 26(2), 165–176 (2009).
• [DiCiccioEtAl11] Di Ciccio, C., Mecella, M., Catarci, T.: Representing and Visualizing Mined Artful Processes in MailOfMine. USAB 2011: 83–94.
• [DiCiccioMecella12] Di Ciccio, C., Mecella, M.: Mining constraints for artful processes. In W. Abramowicz, D. Kriksciuniene, V.S., ed.: 15th International Conference on Business Information Systems. Volume 117 of Lecture Notes in Business Information Processing, Springer (2012) (to appear).
• [DiCiccioMecella/TR12] Di Ciccio, C., Mecella, M.: MINERful, a mining algorithm for declarative process constraints in MailOfMine. Technical report, Dipartimento di Ingegneria Informatica, Automatica e Gestionale “Antonio Ruberti”, SAPIENZA Università di Roma (2012).
• [AalstEtAl06] van der Aalst, W.M.P., Pesic, M.: DecSerFlow: Towards a truly declarative service flow language. Proc. WS-FM 2006.
• [MaggiEtAl11] Maggi, F.M., Mooij, A.J., van der Aalst, W.M.P.: User-guided discovery of declarative process models. In: CIDM, IEEE (2011) 192–199.
• [Xeger] http://code.google.com/p/xeger/