Action Rules Discovery /Lecture I/ by Zbigniew W. Ras, UNC-Charlotte, USA
Interestingness measure
E = [Cond1 => Cond2]. Presumptive. Objective.
Rule: two conditions occur together, with some confidence.
Data Mining Task: for a given dataset D, interestingness measure $I_D$, and threshold c, find association E such that $I_D(E) > c$. The knowledge engineer defines c.
Interestingness Function
Two types of interestingness measure [Silberschatz and Tuzhilin, 1995]: subjective and objective.
Subjective measures are user-driven and domain-dependent. They include unexpectedness [Silberschatz and Tuzhilin, 1995], novelty, and actionability [Piatetsky-Shapiro & Matheus, 1994].
Objective measures are data-driven and domain-independent. They evaluate rules based on statistics and structures of patterns, e.g., support and confidence.
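Objective Interestingness
Basic measures for a rule [Cond1 => Cond2], where card[·] counts the objects of D that satisfy a condition and n is the number of objects in D:
• Domain: $\mathrm{card}[\mathrm{Cond1}]$
• Support or Strength: $\mathrm{card}[\mathrm{Cond1} \wedge \mathrm{Cond2}]$
• Confidence or Certainty Factor: $\mathrm{card}[\mathrm{Cond1} \wedge \mathrm{Cond2}] / \mathrm{card}[\mathrm{Cond1}]$
• Coverage Factor: $\mathrm{card}[\mathrm{Cond1} \wedge \mathrm{Cond2}] / \mathrm{card}[\mathrm{Cond2}]$
• Leverage: $\mathrm{card}[\mathrm{Cond1} \wedge \mathrm{Cond2}]/n - [\mathrm{card}[\mathrm{Cond1}]/n] \cdot [\mathrm{card}[\mathrm{Cond2}]/n]$
• Lift: $n \cdot \mathrm{card}[\mathrm{Cond1} \wedge \mathrm{Cond2}] / [\mathrm{card}[\mathrm{Cond1}] \cdot \mathrm{card}[\mathrm{Cond2}]]$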
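The objective measures listed above can be computed by simple counting. The following is a minimal sketch, assuming a dataset stored as a list of dictionaries and conditions given as predicates; the function name `measures`, the attribute names `b` and `d`, and the toy data are illustrative assumptions, not part of the lecture.

```python
def measures(rows, cond1, cond2):
    """Objective measures for the rule [cond1 => cond2] over `rows`.

    Assumes each condition holds for at least one row (no zero division).
    """
    n = len(rows)
    n1 = sum(1 for r in rows if cond1(r))                 # card[Cond1]
    n2 = sum(1 for r in rows if cond2(r))                 # card[Cond2]
    n12 = sum(1 for r in rows if cond1(r) and cond2(r))   # card[Cond1 and Cond2]
    return {
        "domain": n1,
        "support": n12,
        "confidence": n12 / n1,
        "coverage": n12 / n2,
        "leverage": n12 / n - (n1 / n) * (n2 / n),
        "lift": n * n12 / (n1 * n2),
    }


# Hypothetical toy dataset: flexible attribute b, decision attribute d.
rows = [
    {"b": 1, "d": "H"},
    {"b": 1, "d": "H"},
    {"b": 1, "d": "L"},
    {"b": 2, "d": "L"},
    {"b": 2, "d": "H"},
]

# Rule: [b = 1 => d = H]
m = measures(rows, lambda r: r["b"] == 1, lambda r: r["d"] == "H")
print(m)
```

A lift above 1 (equivalently, a positive leverage) indicates that the two conditions co-occur more often than they would if they were independent.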
Subjective Interestingness
A rule is interesting if it is:
• unexpected, if it contradicts the user's beliefs about the domain and therefore surprises the user
• novel, if it contributes to some extent to new knowledge
• actionable, if the user can take an action to his/her advantage based on this rule
Unexpectedness [Suzuki, 1997] /does not depend on domain knowledge/: if $r = [A \to B_1]$ has a high confidence and $r_1 = [A * C \to B_2]$ has a high confidence, then $r_1$ is unexpected.
Unexpectedness [Padmanabhan & Tuzhilin]: $A \to B$ is unexpected with respect to the belief $\alpha \to \beta$ on the dataset D if the following conditions hold:
• $B \wedge \beta = \mathrm{False}$ [B and $\beta$ logically contradict each other]
• $A \wedge \alpha$ holds on a large subset of D
• $A * \alpha \to B$ holds, which means $A * \alpha \to \neg\beta$
Actionable rules
• Action rules suggest a way to re-classify objects (for instance, customers) to a desired state.
• Action rules can be constructed from classification rules.
• To discover action rules, the set of conditions (attributes) must be partitioned into stable and flexible attributes.
• For example, date of birth is a stable attribute, and the interest rate on a customer account is a flexible attribute (it depends on the bank).
The notion of action rules was proposed by [Ras & Wieczorkowska, PKDD’00]. Slowinski et al. [JETAI, 2004] introduced a similar notion called intervention.
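Action Rules
Decision table: any information system of the form $S = (U, A_{Fl} \cup A_{St} \cup \{d\})$, where
• $d \notin A_{Fl} \cup A_{St}$ is a distinguished attribute called the decision.
• $A_{St}$ - stable attributes, $A_{Fl} \cup \{d\}$ - flexible attributes.
Action rule [Ras & Wieczorkowska]: $[t(A_{St}) \wedge (b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$, where $(\forall i)[(1 \le i \le p) \to (b_i \in A_{Fl})]$.
E-Action rule [Ras & Tsay]: $[t(A_{St}) \wedge (b_1, \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, \to w_p)](x) \Rightarrow [(d, k_1 \to k_2)](x)$, where $(\forall i)[(1 \le i \le p) \to (b_i \in A_{Fl})]$.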
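To make the two rule forms concrete, here is a minimal sketch of how an action rule might be represented and matched against an object. The class names `Change` and `ActionRule`, the helper functions, and the sample objects are all hypothetical; an omitted left-hand value (the E-action form) is modelled as `None`, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class Change:
    attr: str
    old: Optional[Any]  # None models the E-action term (b, -> w): any current value
    new: Any


@dataclass
class ActionRule:
    stable: Dict[str, Any]        # t(A_St): required values of stable attributes
    changes: Tuple[Change, ...]   # terms (b_i, v_i -> w_i) on flexible attributes
    decision: Change              # (d, k1 -> k2)


def matches(rule: ActionRule, x: Dict[str, Any]) -> bool:
    """True if object x satisfies the left-hand side and is currently in class k1."""
    if any(x.get(a) != v for a, v in rule.stable.items()):
        return False
    if x.get(rule.decision.attr) != rule.decision.old:
        return False
    return all(c.old is None or x.get(c.attr) == c.old for c in rule.changes)


def proposed_changes(rule: ActionRule, x: Dict[str, Any]) -> Dict[str, Any]:
    """Flexible attribute values that the rule suggests changing for x."""
    return {c.attr: c.new for c in rule.changes if x.get(c.attr) != c.new}


# (a = 2) and (b, 2 -> 1) => (d, L -> H)
rule = ActionRule({"a": 2}, (Change("b", 2, 1),), Change("d", "L", "H"))
# E-action variant: (a = 2) and (b, -> 1) => (d, L -> H)
e_rule = ActionRule({"a": 2}, (Change("b", None, 1),), Change("d", "L", "H"))

x1 = {"a": 2, "b": 2, "d": "L"}
x2 = {"a": 0, "b": 2, "d": "L"}   # stable attribute a does not match
x3 = {"a": 2, "b": 3, "d": "L"}   # b = 3: only the E-action rule applies

print(matches(rule, x1), proposed_changes(rule, x1))
print(matches(rule, x2), matches(rule, x3), matches(e_rule, x3))
```

Keeping the stable part separate from the list of changes mirrors the partition of attributes into stable and flexible: only the flexible terms ever produce a suggested change.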
Action Rules Discovery (Tsay & Ras)
Stable attributes: {a, c}. Flexible attribute: b. Decision attribute: d.
[Table: set of rules R with supporting objects]
[Figure: (d, L)-tree]
[Figure: (d, H)-tree]
Action rules obtained from the pair (T3, T1):
$(a = 2) \wedge (b, 2 \to 1) \Rightarrow (d, L \to H)$
$(a = 2) \wedge (b, 3 \to 1) \Rightarrow (d, L \to H)$
Application domain: Customer Attrition
Facts:
• On average, most US corporations lose half of their customers every five years (Rombel, 2001).
• The longer a customer stays with the organization, the more profitable he or she becomes (Pauline, 2000; Hanseman, 2004).
• The cost of attracting new customers is five to ten times higher than the cost of retaining existing ones.
• About 14% to 17% of accounts are closed for reasons that can be controlled, such as price or service (Lunt, 1993).
Action:
• Reducing the outflow of customers by 5% can double a typical company’s profit (Rombel, 2001).
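Action Rules Discovery
Decision table $S = (U, A_{Fl} \cup A_{St} \cup \{d\})$.
Assumption: $\{a_1, a_2, \dots, a_p\} \subseteq A_{St}$, $\{b_1, b_2, \dots, b_q\} \subseteq A_{Fl}$, $a_{i,1} \in Dom(a_i)$, $b_{i,1} \in Dom(b_i)$.
Rule: $r = \underbrace{[a_{1,1} \wedge a_{2,1} \wedge \dots \wedge a_{p,1}]}_{\text{stable part}} \wedge \underbrace{[b_{1,1} \wedge b_{2,1} \wedge \dots \wedge b_{q,1}]}_{\text{flexible part}} \to d_1$
Question: do we have to consider pairs of classification rules in order to construct action rules?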
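Action Rules Discovery
Decision table $S = (U, A_{Fl} \cup A_{St} \cup \{d\})$.
Assumption: $\{a_1, a_2, \dots, a_p\} \subseteq A_{St}$, $\{b_1, b_2, \dots, b_q\} \subseteq A_{Fl}$, $a_{i,1} \in Dom(a_i)$, $b_{i,1} \in Dom(b_i)$.
Rule: $r = \underbrace{[a_{1,1} \wedge a_{2,1} \wedge \dots \wedge a_{p,1}]}_{\text{stable part}} \wedge \underbrace{[b_{1,1} \wedge b_{2,1} \wedge \dots \wedge b_{q,1}]}_{\text{flexible part}} \to d_1$
Action rule $r[d_2 \to d_1]$ associated with r and the re-classification task $(d, d_2 \to d_1)$:
$[a_{1,1} \wedge a_{2,1} \wedge \dots \wedge a_{p,1}] \wedge [(b_1, \to b_{1,1}) \wedge (b_2, \to b_{2,1}) \wedge \dots \wedge (b_q, \to b_{q,1})] \Rightarrow (d, d_2 \to d_1)$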
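Action Rules Discovery
Action rule $r[d_2 \to d_1]$: $[a_{1,1} \wedge a_{2,1} \wedge \dots \wedge a_{p,1}] \wedge [(b_1, \to b_{1,1}) \wedge (b_2, \to b_{2,1}) \wedge \dots \wedge (b_q, \to b_{q,1})] \Rightarrow (d, d_2 \to d_1)$
Support: $Sup(r[d_2 \to d_1]) = \{x \in U: (a_1(x) = a_{1,1}) \wedge (a_2(x) = a_{2,1}) \wedge \dots \wedge (a_p(x) = a_{p,1}) \wedge (d(x) = d_2)\}$
/d2-objects which potentially can be reclassified by $r[d_2 \to d_1]$ to d1/
$Sup(R[d_2 \to d_1]) = \bigcup\{Sup(r[d_2 \to d_1]): r \in R\}$, where R is the set of classification rules extracted from S.
/d2-objects which potentially can be reclassified to d1 by rules from R/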
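Action Rules Discovery
Action rule $r[d_2 \to d_1]$: $[a_{1,1} \wedge a_{2,1} \wedge \dots \wedge a_{p,1}] \wedge [(b_1, b'_{1,1} \to b_{1,1}) \wedge (b_2, b'_{2,1} \to b_{2,1}) \wedge \dots \wedge (b_q, \to b_{q,1})] \Rightarrow (d, d_2 \to d_1)$
Support: $Sup(r[d_2 \to d_1]) = \{x \in U: (b_1(x) = b'_{1,1}) \wedge (b_2(x) = b'_{2,1}) \wedge (a_1(x) = a_{1,1}) \wedge (a_2(x) = a_{2,1}) \wedge \dots \wedge (a_p(x) = a_{p,1}) \wedge (d(x) = d_2)\}$
/d2-objects which potentially can be reclassified by $r[d_2 \to d_1]$ to d1/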
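The support set defined above is a filter over the decision table. The sketch below is one possible reading, assuming the table is a dictionary from object identifiers to attribute-value dictionaries; the function name `support`, its parameters, and the toy table are illustrative assumptions. Passing an empty `flexible_from` gives the variant in which no left-hand values are specified for the flexible attributes.

```python
def support(table, stable, flexible_from, d_attr, d_from):
    """Objects matching the stable values, the required current flexible
    values (flexible_from), and the current decision class d_from."""
    result = set()
    for obj_id, x in table.items():
        ok_stable = all(x[a] == v for a, v in stable.items())
        ok_flexible = all(x[b] == v for b, v in flexible_from.items())
        if ok_stable and ok_flexible and x[d_attr] == d_from:
            result.add(obj_id)
    return result


# Hypothetical decision table: stable attribute a, flexible attribute b, decision d.
table = {
    "x1": {"a": 2, "b": 2, "d": "L"},
    "x2": {"a": 2, "b": 3, "d": "L"},
    "x3": {"a": 2, "b": 1, "d": "H"},
    "x4": {"a": 0, "b": 2, "d": "L"},
}

# Rule with no left-hand flexible values: (a = 2) and (b, -> 1) => (d, L -> H)
sup_any = support(table, {"a": 2}, {}, "d", "L")
# Rule with a left-hand flexible value: (a = 2) and (b, 2 -> 1) => (d, L -> H)
sup_from_2 = support(table, {"a": 2}, {"b": 2}, "d", "L")
print(sorted(sup_any), sorted(sup_from_2))
```

Specifying left-hand values can only shrink the support set, since it adds conditions to the same filter.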
Action Rules Discovery
Let $U_{d_2} = \{x \in U: d(x) = d_2\}$. Then $B_{d_2 \to d_1} = U_{d_2} - Sup(R[d_2 \to d_1])$ is the set of d2-objects in S which are d1-resistant.
Let $Sup(R[\to d_1]) = \bigcup\{Sup(R[d_2 \to d_1]): d_2 \ne d_1\}$. Then $B_{\to d_1} = U - Sup(R[\to d_1])$ is the set of objects in S which are d1-resistant (cannot be re-classified to class d1).
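The d1-resistant objects can be obtained with plain set operations. This is a minimal sketch assuming that each rule in R contributes only its stable part to the support test and that all rules passed in conclude the target class d1; the function names, the default decision attribute name `d`, and the toy table are hypothetical.

```python
def sup(table, stable, d_from, d_attr="d"):
    """Objects in class d_from that match the stable part of one rule."""
    return {
        i for i, x in table.items()
        if x[d_attr] == d_from and all(x[a] == v for a, v in stable.items())
    }


def resistant(table, stable_parts, d_from, d_attr="d"):
    """U_{d2} minus the union of the supports of all rules (B_{d2 -> d1})."""
    u_d2 = {i for i, x in table.items() if x[d_attr] == d_from}
    covered = set().union(*(sup(table, s, d_from, d_attr) for s in stable_parts))
    return u_d2 - covered


# Hypothetical table with one stable attribute a and decision d.
table = {
    "x1": {"a": 2, "d": "L"},
    "x2": {"a": 0, "d": "L"},
    "x3": {"a": 2, "d": "H"},
    "x4": {"a": 1, "d": "L"},
}

# Stable parts of two classification rules assumed to conclude d = H.
rules_to_H = [{"a": 2}, {"a": 1}]

b_L_to_H = resistant(table, rules_to_H, "L")
print(sorted(b_L_to_H))
```

In this toy table, object x2 has a = 0, which no rule covers, so it cannot be moved from L to H by any rule in the set.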
Action Rules Discovery
Action rules $r[d_2 \to d_1]$, $r'[d_2 \to d_3]$ are p-equivalent if $r/b_i = r'/b_i$ always holds when $r/b_i$, $r'/b_i$ are both defined, for every $b_i \in A_{St} \cup A_{Fl}$.
Let $x \in Sup(r[d_2 \to d_1])$. We say that x positively supports $r[d_2 \to d_1]$ if there is no action rule $r'[d_2 \to d_3]$ extracted from S, $d_3 \ne d_1$, which is p-equivalent to $r[d_2 \to d_1]$ and $x \in Sup(r'[d_2 \to d_3])$.
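Action Rules Discovery
Let $Sup^{+}(r[d_2 \to d_1]) = \{x \in Sup(r[d_2 \to d_1]): x \text{ positively supports } r[d_2 \to d_1]\}$.
Confidence:
$Conf(r[d_2 \to d_1]) = \{\mathrm{card}[Sup^{+}(r[d_2 \to d_1])] / \mathrm{card}[Sup(r[d_2 \to d_1])]\} \cdot Conf(r)$.
$Conf(r[\to d_1]) = \{\mathrm{card}[Sup^{+}(r[\to d_1])] / \mathrm{card}[Sup(r[\to d_1])]\} \cdot Conf(r)$.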
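The definitions of p-equivalence, positive support, and confidence can be combined in a few lines. The sketch below assumes that every action rule starts from the same class d2, that each rule is a dictionary with its attribute values (`cond`), target class (`to`), and a precomputed support set (`sup`), and that `conf_r` holds the confidence of the underlying classification rule. All names and numbers are hypothetical.

```python
def p_equivalent(cond_r, cond_r2):
    """Rules agree on every attribute for which both define a value."""
    shared = cond_r.keys() & cond_r2.keys()
    return all(cond_r[b] == cond_r2[b] for b in shared)


def positive_support(rule, all_rules):
    """Objects in Sup(rule) not claimed by a p-equivalent rule with another target."""
    blocked = set()
    for other in all_rules:
        if other["to"] != rule["to"] and p_equivalent(rule["cond"], other["cond"]):
            blocked |= other["sup"]
    return rule["sup"] - blocked


def action_confidence(rule, all_rules):
    """(card[Sup+] / card[Sup]) * Conf(r); assumes Sup(rule) is non-empty."""
    sup_plus = positive_support(rule, all_rules)
    return len(sup_plus) / len(rule["sup"]) * rule["conf_r"]


# Hypothetical rules, all re-classifying from the same class d2.
r = {"cond": {"a": 2, "b": 1}, "to": "H", "sup": {"x1", "x2", "x3"}, "conf_r": 0.9}
r_prime = {"cond": {"a": 2, "c": 0}, "to": "M", "sup": {"x3", "x7"}, "conf_r": 0.8}
r_other = {"cond": {"a": 1}, "to": "M", "sup": {"x1"}, "conf_r": 0.7}
rules = [r, r_prime, r_other]

print(sorted(positive_support(r, rules)), action_confidence(r, rules))
```

Here `r_prime` is p-equivalent to `r` (they agree on the only shared attribute, a) and targets a different class, so x3 does not positively support `r`; `r_other` disagrees on a and therefore blocks nothing.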
Cost of Action Rule [Tzacheva & Ras]
Assumption: S = (X, A, V) is an information system, $Y \subseteq X$. Attribute $b \in A$ is flexible in S and $b_1, b_2 \in V_b$.
By $\rho_S(Y, b_1, b_2)$ we mean a number from $(0, +\infty]$ which describes the average predicted cost of the approved action associated with a possible re-classification of qualifying objects in Y from class b1 to b2.
Object $x \in Y$ qualifies for re-classification from b1 to b2 if b(x) = b1.
$\rho_S(Y, b_1, b_2) = +\infty$ if there is no approved action which is required for a possible re-classification of qualifying objects in Y from class b1 to b2.
If Y is uniquely defined, we often write $\rho_S(b_1, b_2)$ instead of $\rho_S(Y, b_1, b_2)$.
Cost of Action Rule
Action rule r: $[(b_1, v_1 \to w_1) \wedge (b_2, v_2 \to w_2) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$
The cost of r in S: $cost_S(r) = \sum\{\rho_S(v_i, w_i): 1 \le i \le p\}$
Action rule r is feasible in S if $cost_S(r) < \rho_S(k_1, k_2)$.
For any feasible action rule r, the cost of the conditional part of r is lower than the cost of its decision part.
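The cost and feasibility test above amount to a sum and a comparison. The following is a minimal sketch, assuming the cost function is stored as a dictionary keyed by (attribute, old value, new value) and that a missing entry means no approved action exists, i.e. a cost of +infinity. The names `rule_cost`, `is_feasible`, `rho` and all cost values are illustrative assumptions.

```python
import math


def rule_cost(changes, rho):
    """Sum of the costs of all flexible-attribute changes in the rule.

    A change with no entry in rho is treated as having infinite cost.
    """
    return sum(rho.get(ch, math.inf) for ch in changes)


def is_feasible(changes, decision_change, rho):
    """A rule is feasible if its conditional part is cheaper than its decision part."""
    return rule_cost(changes, rho) < rho.get(decision_change, math.inf)


# Hypothetical costs, keyed by (attribute, old value, new value).
rho = {
    ("b1", "v1", "w1"): 3.0,
    ("b2", "v2", "w2"): 4.0,
    ("d", "k1", "k2"): 10.0,
}

changes = [("b1", "v1", "w1"), ("b2", "v2", "w2")]
decision = ("d", "k1", "k2")

print(rule_cost(changes, rho), is_feasible(changes, decision, rho))

# A rule containing a change with no approved action is never feasible.
blocked = changes + [("b3", "v3", "w3")]
print(rule_cost(blocked, rho), is_feasible(blocked, decision, rho))
```

Using `math.inf` for unapproved actions keeps the comparison uniform: an infinite conditional cost can never be strictly lower than the cost of the decision part.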
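Cost of Action Rule
Assumption: the cost of r is too high!
$r = [(b_1, v_1 \to w_1) \wedge \dots \wedge (b_j, v_j \to w_j) \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$
$r_1 = [(b_{j1}, v_{j1} \to w_{j1}) \wedge (b_{j2}, v_{j2} \to w_{j2}) \wedge \dots \wedge (b_{jq}, v_{jq} \to w_{jq})](x) \Rightarrow (b_j, v_j \to w_j)(x)$
Then we can compose r with r1 and in this way replace the term $(b_j, v_j \to w_j)$ by the left-hand side of r1:
$[(b_1, v_1 \to w_1) \wedge \dots \wedge [(b_{j1}, v_{j1} \to w_{j1}) \wedge (b_{j2}, v_{j2} \to w_{j2}) \wedge \dots \wedge (b_{jq}, v_{jq} \to w_{jq})] \wedge \dots \wedge (b_p, v_p \to w_p)](x) \Rightarrow (d, k_1 \to k_2)(x)$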
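The composition step can be sketched as a list substitution followed by a cost comparison. This example assumes changes are (attribute, old value, new value) tuples, that costs are additive as in the cost definition, and that missing costs are infinite; the function names and every number are hypothetical.

```python
import math


def cost(changes, rho):
    """Additive cost of a list of changes; unknown changes cost +infinity."""
    return sum(rho.get(ch, math.inf) for ch in changes)


def compose(r_changes, term, r1_changes):
    """Replace `term` in r by the left-hand side of r1 (a rule that yields `term`)."""
    out = []
    for ch in r_changes:
        out.extend(r1_changes if ch == term else [ch])
    return out


# Hypothetical costs: changing bj directly is expensive.
rho = {
    ("b1", "v1", "w1"): 2.0,
    ("bj", "vj", "wj"): 20.0,
    ("bp", "vp", "wp"): 1.0,
    ("bj1", "vj1", "wj1"): 3.0,
    ("bj2", "vj2", "wj2"): 4.0,
}

r_changes = [("b1", "v1", "w1"), ("bj", "vj", "wj"), ("bp", "vp", "wp")]
term = ("bj", "vj", "wj")
r1_changes = [("bj1", "vj1", "wj1"), ("bj2", "vj2", "wj2")]  # left-hand side of r1

composed = compose(r_changes, term, r1_changes)
print(cost(r_changes, rho), cost(composed, rho))
```

The substitution pays off only when the left-hand side of r1 is cheaper than the term it replaces, which is the case for the assumed costs here (7.0 versus 20.0).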
Class movability-index
$F_S$ - decision attribute ranking: a positive integer associated with a decision value /objects of higher decision attribute ranking are seen as objects more preferably movable between decision classes than objects of lower rank/.
$N_j^{+} = \{i \in N: F_S(d_j) - F_S(d_i) \ge 0\}$.
Class movability-index assigned to $N_j$: $ind(N_j) = \sum\{F_S(d_j) - F_S(d_i): i \in N_j^{+}\}$
Class movability-index
Let $P_j(i) = Sup^{+}(r[d_j \to d_i])$ /$P_j(i)$ - all objects in U which can be reclassified from the decision class $d_j$ to the decision class $d_i$/.
$P_j(N) = \bigcup\{P_j(i): i \in N, i \ne j\}$, for any $N \subseteq \{1, 2, \dots, k\}$, where $\{d_1, d_2, \dots, d_k\}$ are all decision classes.
Class movability-index (m-index) assigned to a $d_j$-object x: $ind_S(x) = \max\{ind(N_j): N_j \subseteq \{1, 2, \dots, k\} \wedge x \in P_j(N)\}$
Questions? Thank You