Real-Time Software Systems
Universitatea “Politehnica” din Bucuresti, 2004-2005
Adina Magda Florea
http://turing.cs.pub.ro/sptr_05
Lecture 11
Learning decision rules by evolutionary algorithms
• Decision rules
• Representation
• Genetic operators
• Fitness function
1. Decision rules
• Attributes may be discrete-valued or continuous-valued
• Rules of the form: if A1 = v1 and x1 < A2 <= v2 … then Class C1
• Learning set E = {e1, ..., eM}
• Each example e ∈ E is described by N attributes A1, ..., AN and labeled with a class c(e) ∈ C
• Discrete-valued attribute: V(Ai) is a finite set
• Continuous-valued attribute: V(Ai) is an interval [li, ui]
Decision rules
• A class ck ∈ C
• Positive examples: E+(ck) = {e ∈ E: c(e) = ck}
• Negative examples: E-(ck) = E - E+(ck)
• A decision rule R: if t1 and t2 .. and tr then ck
• LHS: the conjunction of conditions t1, .., tr
• RHS: class membership of an example
• Rule set RS: a disjunctive set of decision rules with the same RHS
• CRS ∈ C denotes the class on the RHS of an RS
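The split into positive and negative examples can be sketched in a few lines of Python. This is only an illustration: the function name `split_examples` and the choice to identify examples by their index in a label list are assumptions, not part of the slides.

```python
def split_examples(labels, ck):
    """Return (E+(ck), E-(ck)) as sets of example indices."""
    all_examples = set(range(len(labels)))
    positives = {i for i, c in enumerate(labels) if c == ck}
    negatives = all_examples - positives      # E-(ck) = E - E+(ck)
    return positives, negatives

# Hypothetical learning set with two classes.
labels = ["accept", "reject", "accept", "reject", "reject"]
pos_set, neg_set = split_examples(labels, "accept")
print(sorted(pos_set), sorted(neg_set))
```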
Decision rules
• The EA is called once for each class ck ∈ C to find the RS separating E+(ck) from E-(ck)
• Search criterion: the fitness function, which prefers rules consisting of fewer conditions that cover many ex+ and very few ex-
2. Representation
• One chromosome encodes one RS
• Variable-length chromosomes are used (as the number of rules in the RS is not known in advance), together with operators that change the number of rules
• 1 chromosome = a concatenation of strings
• Each fixed-length string = the LHS of one decision rule (the RHS need not be encoded, since all rules in an RS share the same class)
• 1 string is composed of N sub-strings: one condition for each attribute
Representation
• Discrete-valued attributes: binary flags, one per value in V(Ai)
• Continuous-valued attributes: two thresholds li, ui encoding the condition li < Ai <= ui (li may be -∞, ui may be +∞)
• li, ui are selected from a finite set of boundary thresholds
• Boundary threshold = the midpoint between a successive pair of examples, in the sequence sorted by increasing value of Ai, such that one is ex+ and the other is ex-
[Figure: examples sorted by Ai: ex+ ex+ ex+ | ex- ex- ex- | ex+ ex+ | ex- ex-, with boundary thresholds th_i^(k-1), th_i^k, th_i^(k+1) at the three class changes]
• If a condition is not present: li = -∞, ui = +∞
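The boundary thresholds of one continuous attribute can be computed as sketched below. The function name `boundary_thresholds` is hypothetical, and skipping pairs with equal attribute values (where no midpoint separates the two examples) is an assumption the slides do not address.

```python
def boundary_thresholds(values, is_positive):
    """Midpoints between consecutive sorted examples of different class."""
    pairs = sorted(zip(values, is_positive))
    thresholds = []
    for (v1, p1), (v2, p2) in zip(pairs, pairs[1:]):
        # A threshold is placed only where the class changes; ties in the
        # attribute value are skipped (assumption).
        if p1 != p2 and v1 != v2:
            thresholds.append((v1 + v2) / 2)
    return thresholds

# Same class pattern as in the figure: + + + - - - + + - -
values = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
classes = [True, True, True, False, False, False, True, True, False, False]
print(boundary_thresholds(values, classes))
```

With this made-up data the class changes three times, so three thresholds are produced, one per change.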
Representation: example
• 2 continuous-valued attributes: Salary, Amount
• 1 discrete-valued attribute: Purpose (car, house, school)
• Class: Accept

Salary       Amount       Purpose   Decoded rule
-∞    +∞     -∞    250    1 1 1     if Amount < 250 then ACCEPT
100   250    -∞    500    1 1 1     if 100 < Salary <= 250 and Amount < 500 then ACCEPT
750   +∞     -∞    +∞     1 1 0     if Salary > 750 and Purpose = (car or house) then ACCEPT
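The three rules of this example can be written down and evaluated as in the following sketch. The data layout (a tuple of bounds per continuous attribute, a flag list for Purpose) and the function names are assumptions made for illustration. The sketch uses the form li < Ai <= ui throughout, so upper bounds are inclusive.

```python
import math

PURPOSES = ["car", "house", "school"]   # flag order of the discrete attribute

# One rule = (Salary bounds, Amount bounds, Purpose flags).
rule1 = ((-math.inf, math.inf), (-math.inf, 250), [1, 1, 1])
rule2 = ((100, 250), (-math.inf, 500), [1, 1, 1])
rule3 = ((750, math.inf), (-math.inf, math.inf), [1, 1, 0])
rule_set = [rule1, rule2, rule3]

def rule_covers(rule, example):
    """True if the example satisfies every condition of the rule."""
    salary, amount, purpose = example
    (s_lo, s_hi), (a_lo, a_hi), flags = rule
    return (s_lo < salary <= s_hi
            and a_lo < amount <= a_hi
            and bool(flags[PURPOSES.index(purpose)]))

def rule_set_covers(rules, example):
    """A rule set is a disjunction: one covering rule is enough."""
    return any(rule_covers(r, example) for r in rules)

print(rule_set_covers(rule_set, (800, 1000, "car")))     # matched by rule 3
print(rule_set_covers(rule_set, (800, 1000, "school")))  # matched by no rule
```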
3. Genetic operators
4 operators applied to a single rule set:
• Changing condition
• Positive example insertion
• Negative example removal
• Rule drop
2 operators applied to two arguments, RS1 and RS2:
• Crossover
• Rule copy
Genetic operators
Changing condition
• A mutation-like operator: alters a single condition related to an attribute Ai
• If Ai is discrete: randomly chooses a flag and flips it
• If Ai is continuous: randomly replaces a threshold (li or ui) by a boundary threshold
Positive example insertion
• Modifies a single decision rule R in RS so that it covers a new random example e+ ∈ E+(CRS) currently not covered by R
• All conditions in the rule that conflict with e+ have to be altered
• If Ai is discrete: the flag corresponding to Ai(e+) is set
• If Ai is continuous and li < Ai <= ui fails because ui < Ai(e+): ui is replaced by the smallest boundary threshold ui' such that ui' >= Ai(e+); the case li >= Ai(e+) is handled similarly
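Positive example insertion could look roughly like the sketch below. Several details are assumptions: the name `insert_positive`, discrete values given as flag indices, the fallback to ±∞ when no suitable boundary threshold exists, and the symmetric choice of the largest threshold below Ai(e+) for the lower bound (the slide only says the case is similar).

```python
import bisect
import math

def insert_positive(rule, example, thresholds):
    """Widen every condition of `rule` that conflicts with `example`.

    rule[i] is (lo, hi) for a continuous attribute or a 0/1 flag list for a
    discrete one; thresholds[i] is the sorted list of boundary thresholds of
    attribute i, or None if attribute i is discrete.
    """
    new_rule = []
    for cond, x, th in zip(rule, example, thresholds):
        if th is None:
            cond = list(cond)
            cond[x] = 1                        # set the flag for Ai(e+)
        else:
            lo, hi = cond
            k = bisect.bisect_left(th, x)      # th[k] is the first threshold >= x
            if hi < x:                         # smallest ui' with ui' >= Ai(e+)
                hi = th[k] if k < len(th) else math.inf
            if lo >= x:                        # largest li' with li' < Ai(e+) (assumed)
                lo = th[k - 1] if k > 0 else -math.inf
            cond = (lo, hi)
        new_rule.append(cond)
    return new_rule

# Hypothetical rule over (Salary, Amount, Purpose) and an uncovered ex+.
rule = [(100, 250), (-math.inf, 500), [1, 0, 0]]
thresholds = [[100, 250, 750], [250, 500], None]
widened = insert_positive(rule, (300, 600, 2), thresholds)
print(widened)
```

Conditions that already accept the example are left unchanged, so the rule is generalized no more than needed.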
Genetic operators
Negative example removal
• The negative example removal operator alters a single rule R from the rule set RS.
• It selects at random a negative example e- from the set of all the negative examples covered by R.
• Then it alters a random condition in R in such a way that the modified rule does not cover e-.
• If the chosen condition concerns a discrete attribute Ai, the flag which corresponds to Ai(e-) is cleared.
• If Ai is a continuous-valued attribute, then the condition li < Ai <= ui is narrowed down either to li' < Ai <= ui or to li < Ai <= ui', where li' is the smallest boundary threshold such that li' >= Ai(e-) and ui' is the largest boundary threshold such that ui' < Ai(e-).
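A minimal sketch of negative example removal follows, under stated assumptions: the names `covers` and `remove_negative` are hypothetical, discrete values are flag indices, the choice between raising the lower bound and lowering the upper bound is uniform, and the lower bound is raised to the smallest threshold not below Ai(e-) so that the example is excluded. The sketch does not guard against an interval becoming empty.

```python
import bisect
import random

def covers(rule, example, thresholds):
    """True if every condition of the rule holds for the example."""
    for cond, x, th in zip(rule, example, thresholds):
        if th is None:
            if not cond[x]:
                return False
        else:
            lo, hi = cond
            if not (lo < x <= hi):
                return False
    return True

def remove_negative(rule, example, thresholds, rng=random):
    """Alter one random condition so that the rule no longer covers `example`."""
    new_rule = list(rule)
    i = rng.randrange(len(rule))
    x, th = example[i], thresholds[i]
    if th is None:
        flags = list(rule[i])
        flags[x] = 0                           # clear the flag for Ai(e-)
        new_rule[i] = flags
    else:
        lo, hi = rule[i]
        k = bisect.bisect_left(th, x)          # th[k] is the first threshold >= x
        options = []
        if k < len(th):
            options.append((th[k], hi))        # raise the lower bound to li'
        if k > 0:
            options.append((lo, th[k - 1]))    # lower the upper bound to ui'
        if options:
            new_rule[i] = rng.choice(options)
    return new_rule

rule = [(100, 750), [1, 1, 0]]
thresholds = [[100, 250, 750], None]
e_neg = (300, 1)                               # covered by the rule before the change
new_rule = remove_negative(rule, e_neg, thresholds, random.Random(0))
print(covers(rule, e_neg, thresholds), covers(new_rule, e_neg, thresholds))
```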
Genetic operators
Rule drop and rule copy are the only operators capable of changing the number of rules in a rule set.
Rule drop
• The single-argument rule drop operator removes a random rule from a rule set RS.
Rule copy
• Rule copy adds to one of its arguments, RS1, a copy of a rule selected at random from RS2, provided that the number of rules in RS1 is lower than maxR.
• maxR is a user-supplied parameter which limits the maximal number of rules in a rule set.
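The two size-changing operators are simple enough to sketch directly. The function names are hypothetical, and refusing to drop the last remaining rule is an assumption; the slides do not say what happens to a rule set with a single rule.

```python
import copy
import random

def rule_drop(rs, rng=random):
    """Return a copy of the rule set with one random rule removed."""
    if len(rs) <= 1:                 # assumption: never produce an empty rule set
        return list(rs)
    i = rng.randrange(len(rs))
    return rs[:i] + rs[i + 1:]

def rule_copy(rs1, rs2, max_r, rng=random):
    """Return rs1 extended by a copy of a random rule from rs2, if below max_r."""
    if len(rs1) >= max_r or not rs2:
        return list(rs1)
    return list(rs1) + [copy.deepcopy(rng.choice(rs2))]

# Rules are shown as opaque labels here; any rule encoding would do.
rs1, rs2 = ["r1", "r2", "r3"], ["q1", "q2"]
print(rule_drop(rs1, random.Random(1)))
print(rule_copy(rs1, rs2, max_r=4, rng=random.Random(1)))
```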
Genetic operators
Crossover
• The crossover operator selects at random two rules, R1 and R2, from the respective arguments RS1 and RS2. Then it applies a uniform crossover to the strings representing R1 and R2.
• Uses rank-based fitness assignment
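Uniform crossover on two rule strings can be sketched as follows. Exchanging whole per-attribute conditions with probability 0.5 is an assumption about granularity; the slides only state that a uniform crossover is applied to the two strings. The function name is hypothetical.

```python
import random

def uniform_crossover(r1, r2, rng=random):
    """Swap each per-attribute condition between two rules with probability 0.5."""
    c1, c2 = list(r1), list(r2)
    for i in range(len(c1)):
        if rng.random() < 0.5:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

# Two hypothetical rules over (Salary, Amount, Purpose).
r1 = [(100, 250), (0, 500), (1, 1, 1)]
r2 = [(750, 900), (250, 800), (1, 1, 0)]
child1, child2 = uniform_crossover(r1, r2, random.Random(3))
print(child1)
print(child2)
```

At every position the two children together hold exactly the two parental conditions, so no condition is created or lost.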
4. Fitness
• Goal = reduction of the number of errors
• ERS: the set of examples covered by the RS (whose RHS class is CRS)
• E+RS = ERS ∩ E+(CRS): the set of ex+ correctly classified by RS
• E-RS = ERS ∩ E-(CRS): the set of ex- covered by RS
• Total numbers of ex+ and ex-: POS = |E+(CRS)|, NEG = |E-(CRS)| = M - POS
• The RS correctly classifies pos = |E+RS| ex+ and NEG - neg ex-, where neg = |E-RS|
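These counts follow directly from set operations, as in this small sketch. The function name `coverage_counts` and the use of index sets are assumptions for illustration.

```python
def coverage_counts(covered, positives, m):
    """Return (POS, NEG, pos, neg, correct) for a rule set.

    covered   : indices of the examples covered by the rule set (ERS)
    positives : indices of the positive examples E+(CRS)
    m         : total number of examples M
    """
    POS = len(positives)
    NEG = m - POS
    pos = len(covered & positives)    # |E+RS|: ex+ covered by the rule set
    neg = len(covered - positives)    # |E-RS|: ex- covered by the rule set
    correct = pos + (NEG - neg)       # covered ex+ plus uncovered ex-
    return POS, NEG, pos, neg, correct

# Hypothetical learning set of 8 examples.
print(coverage_counts({0, 1, 2, 5}, {0, 1, 2, 3}, 8))
```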
Fitness
Ferror = Pr(RS) / Compl(RS)
• Pr(RS) = the probability that RS correctly classifies an example from the learning set
• Compl(RS) = the complexity of RS
• Pr(RS) = (pos + NEG - neg) / (POS + NEG)
• Compl(RS) = (L/N + 1)^α
  L: total number of conditions in RS
  N: number of attributes
  α: user-supplied parameter in [0.001..0.1]
• We are interested in maximizing the probability and minimizing the complexity, to obtain a compact rule set and a correct classification
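The fitness can be evaluated as in the sketch below. It assumes that the complexity term has the form (L/N + 1) raised to the user-supplied parameter, called `alpha` here; the function name and the sample numbers are made up for illustration.

```python
def fitness(pos, neg, POS, NEG, L, N, alpha=0.01):
    """Pr(RS) / Compl(RS), with Compl(RS) assumed to be (L/N + 1) ** alpha."""
    pr = (pos + NEG - neg) / (POS + NEG)   # fraction of correctly classified examples
    compl = (L / N + 1) ** alpha           # grows slowly with the number of conditions
    return pr / compl

# Same classification quality, different rule set sizes.
compact = fitness(pos=40, neg=5, POS=50, NEG=50, L=6, N=3)
bloated = fitness(pos=40, neg=5, POS=50, NEG=50, L=30, N=3)
print(compact > bloated)
```

Because `alpha` is small, accuracy dominates the score and complexity mainly separates rule sets of similar accuracy.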