Paper study - Application Of Variable Precision Rough Set Approach To Car Driver Assessment Presented by: Lichun (Jack) Zhu Course: 60-539 Winter 2006 Instructor: Dr. Christie Ezeife University of Windsor
Agenda • Introduction • Rough Set Theory • Variable Precision Rough Set Theory • Linear Hierarchy of Decision Table (HDTL) Algorithm • How the data is prepared • Result interpretation • Summary and Conclusion • Q & A
Introduction • Problem Statement: • Need to identify unsafe car drivers from historical driving records. • Driving records in the database are incomplete and inaccurate. • Solution: A new approach to analyzing data that contains inaccurate information • Variable Precision Rough Set Theory • Linear Hierarchy of Decision Table algorithm • Classification
Introduction to Rough Set Theory • Background • First introduced by Pawlak (1982) • A mathematical method for describing uncertainty and incompleteness • Basic concept • Terms: Information System S, Universe U, Attributes A (condition attributes, decision attributes) • S = (U, A)
Introduction to Rough Set Theory • Domain V_a: With every attribute a ∈ A we associate a set V_a, the domain of a, such as V_S = {Male, Female} • Indiscernibility relation I(B): For B ⊆ A, I(B) is the relation on U defined by (x, y) ∈ I(B) if and only if a(x) = a(y) for every a ∈ B, where a(x) is the value of attribute a for tuple x. I(B) is an equivalence relation. • B-elementary sets {B1, …, Bi, …}: the blocks of the partition U/I(B), or simply U/B; we also define B(x) = Bi, where x ∈ Bi
Introduction to Rough Set Theory • An example of an Information System: Table 1. U = {1,2,3,4,5,6}, A = {S, G, N, R} • Let B = {S, G, N} • I(B) = {(1,1), (1,6), (6,1), (2,2), (3,3), (4,4), (5,5), (6,6)} • U/B = {{1,6}, {2}, {3}, {4}, {5}} = {B1, B2, B3, B4, B5}
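The partition U/B and the relation I(B) can be computed mechanically from an information table. The following is a minimal Python sketch. The attribute values are hypothetical, because the contents of Table 1 are not reproduced in this text; they were chosen only so that objects 1 and 6 agree on S, G and N. The function names `partition` and `indiscernibility` are illustrative as well.

```python
from collections import defaultdict

# Hypothetical attribute values (NOT the real Table 1): chosen so that
# only objects 1 and 6 share the same values of S, G and N.
table = {
    1: {"S": "M", "G": "a", "N": 0},
    2: {"S": "F", "G": "a", "N": 0},
    3: {"S": "M", "G": "b", "N": 0},
    4: {"S": "M", "G": "a", "N": 1},
    5: {"S": "F", "G": "b", "N": 1},
    6: {"S": "M", "G": "a", "N": 0},
}

def partition(table, attrs):
    """Return U/B: objects grouped by their values on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    # Sort blocks by their smallest member for a stable, readable order.
    return sorted(classes.values(), key=min)

def indiscernibility(table, attrs):
    """Return I(B) as a set of ordered pairs (x, y) that lie in the same block."""
    return {(x, y)
            for block in partition(table, attrs)
            for x in block
            for y in block}

u_over_b = partition(table, ["S", "G", "N"])
i_b = indiscernibility(table, ["S", "G", "N"])
print(u_over_b)
print(sorted(i_b))
```

Because I(B) is built from all ordered pairs within a block, it is reflexive, symmetric and transitive by construction, so both (1,6) and (6,1) appear.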
Introduction to Rough Set Theory • Approximation: for a set of interest X ⊆ U, we define • B-lower(X) = ∪{B(x) : x ∈ U, B(x) ⊆ X} • B-upper(X) = ∪{B(x) : x ∈ U, B(x) ∩ X ≠ ∅} • BNR_B(X) = B-upper(X) – B-lower(X) • For example: if X contains all tuples with high risk, X = {2,3,4,6}, then B-lower(X) = {2,3,4}, B-upper(X) = {1,2,3,4,6}, BNR_B(X) = {1,6}
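The approximations in the example can be checked with a few lines of Python. This sketch starts from the partition U/B = {{1,6}, {2}, {3}, {4}, {5}} and the set X = {2,3,4,6} given on the slides; the function names are illustrative.

```python
def lower_approx(blocks, X):
    """Union of the elementary sets that are entirely contained in X."""
    return set().union(*(b for b in blocks if b <= X))

def upper_approx(blocks, X):
    """Union of the elementary sets that share at least one object with X."""
    return set().union(*(b for b in blocks if b & X))

# Partition U/B and target set X as given in the example.
blocks = [{1, 6}, {2}, {3}, {4}, {5}]
X = {2, 3, 4, 6}

lower = lower_approx(blocks, X)
upper = upper_approx(blocks, X)
boundary = upper - lower

print(lower, upper, boundary)
```

Block {1,6} overlaps X without being contained in it, which is why it ends up in the boundary region.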
Introduction to Rough Set Theory • Figure 1. Rough Set Concept. U = ∪{B1, …, B14}; B-lower(X): yellow region; B-upper(X): yellow and green regions; BNR_B(X): green region
Variable Precision Rough Set Theory • Background Information • Problem with rough sets: the B-lower approximation is often empty when uncertainty is widespread. • Solution: use a probability-based approach • Presented by Ziarko (1993), Yao and Wong (1992), Slezak and Ziarko (2002), etc. • Definitions • Lower limit l: satisfying 0 ≤ l < P(X) < 1 • l-negative region of X: NEG_l(X) = ∪{Bi : P(X|Bi) ≤ l} • Upper limit u: satisfying 0 < P(X) < u ≤ 1 • u-positive region of X: POS_u(X) = ∪{Bi : P(X|Bi) ≥ u} • (l,u)-boundary region of X: BNR_l,u(X) = ∪{Bi : l < P(X|Bi) < u}
Variable Precision Rough Set Theory • For example: for the data in Table 1, P(X) = 4/6 = 2/3 ≈ 0.67. If l = 0.25 and u = 0.75, then NEG_0.25(X) = {5}, POS_0.75(X) = {2,3,4}, BNR_0.25,0.75(X) = {1,6} • Table 2. Sample Decision Table DT_B,X(U) with P(X) = 0.67, l = 0.25, u = 0.75
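The three VPRS regions follow directly from the conditional probabilities P(X|Bi). The sketch below reuses the partition U/B = {{1,6}, {2}, {3}, {4}, {5}} and X = {2,3,4,6} from the slides and estimates P(X|Bi) as |Bi ∩ X| / |Bi|, which is an assumption about how the probabilities are obtained. Exact fractions are used so that the threshold comparisons are not affected by rounding.

```python
from fractions import Fraction

def cond_prob(block, X):
    """Estimate P(X | block) as the share of the block's objects that lie in X."""
    return Fraction(len(block & X), len(block))

def vprs_regions(blocks, X, l, u):
    """Return (NEG_l, POS_u, BNR_l,u) as sets of objects."""
    neg, pos, bnd = set(), set(), set()
    for block in blocks:
        p = cond_prob(block, X)
        if p <= l:
            neg |= block      # P(X|Bi) <= l
        elif p >= u:
            pos |= block      # P(X|Bi) >= u
        else:
            bnd |= block      # l < P(X|Bi) < u
    return neg, pos, bnd

blocks = [{1, 6}, {2}, {3}, {4}, {5}]
X = {2, 3, 4, 6}
universe = set().union(*blocks)

p_x = Fraction(len(X), len(universe))          # prior P(X)
neg, pos, bnd = vprs_regions(blocks, X, Fraction(1, 4), Fraction(3, 4))

print(p_x, neg, pos, bnd)
```

Here P(X|{1,6}) = 1/2 falls strictly between l and u, so block {1,6} is the only one left in the boundary region.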
Variable Precision Rough Set Theory • Figure 2. VPRS Concept. U = ∪{B1, …, B17}; NEG(X): white region; POS(X): yellow region; BNR(X): green region
Linear Hierarchy of Decision Table Algorithm (Ziarko, 2002) • Counterpart to the Tree-structured Hierarchy of Decision Table Algorithm
Linear Hierarchy of Decision Table (HDTL) Algorithm • Advantage: the Linear Hierarchy of Decision Table algorithm avoids the exponential growth of the decision hierarchy size
Linear Hierarchy of Decision Table (HDTL) Algorithm (supervised approach)
Initialization
1. U ← U’, C ← C’, D ← D’
2. Compute POS_u(X) and NEG_l(X)
Iteration
3. repeat {
4.   while (POS_u(X) = ∅ and NEG_l(X) = ∅) {
5.     C ← new(C, U); define new condition attributes
6.     Compute POS_u(X) and NEG_l(X) }
7.   Output DT_C,X(U); output decision table based on the union of the positive and negative regions
8.   if POS_u(X) ∪ NEG_l(X) = U then exit
9.   U ← U – (POS_u(X) ∪ NEG_l(X))
10.  C ← new(C, U); define new condition attributes
11.  D ← D|U; restrict decision attributes to the current set of data U
12.  Compute POS_u(X) and NEG_l(X) }
• Note: there is a problem in this procedure. When defining new condition attributes fails, the procedure should terminate.
• Note: generating the dataset for the subsequent layer from the remaining boundary data is what embodies the linear approach.
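The loop above can be sketched in Python under a few stated assumptions. The slides do not say how new(C, U) picks new condition attributes, so this sketch takes a caller-supplied list of candidate attribute sets (`attr_layers`, a hypothetical parameter) and tries them in order. It also stops when no candidates remain, which is the termination condition the note above asks for. The data rows, including the attribute K, are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction

def regions(rows, attrs, target, l, u):
    """Return (POS_u, NEG_l) over the partition of rows induced by attrs."""
    blocks = defaultdict(set)
    for obj, row in rows.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    pos, neg = set(), set()
    for block in blocks.values():
        p = Fraction(len(block & target), len(block))
        if p >= u:
            pos |= block
        elif p <= l:
            neg |= block
    return pos, neg

def hdtl(rows, target, attr_layers, l, u):
    """Build a linear hierarchy: one (attrs, POS, NEG) entry per layer."""
    universe = dict(rows)                      # step 1: working copy of U
    candidates = iter(attr_layers)             # stands in for new(C, U)
    attrs = next(candidates, None)
    layers = []
    while attrs is not None and universe:
        # steps 2, 6, 11, 12: regions on the current U, with D restricted to U
        pos, neg = regions(universe, attrs, target & set(universe), l, u)
        if not pos and not neg:                # steps 4-5: nothing decided yet
            attrs = next(candidates, None)     # stop when none are left
            continue
        layers.append((attrs, pos, neg))       # step 7: output this layer
        for obj in pos | neg:                  # steps 8-9: keep only the boundary
            del universe[obj]
        attrs = next(candidates, None)         # step 10
    return layers, set(universe)

# Hypothetical data: S, G, N cannot separate objects 1 and 6; K can.
rows = {
    1: {"S": "M", "G": "a", "N": 0, "K": 0},
    2: {"S": "F", "G": "a", "N": 0, "K": 0},
    3: {"S": "M", "G": "b", "N": 0, "K": 1},
    4: {"S": "M", "G": "a", "N": 1, "K": 0},
    5: {"S": "F", "G": "b", "N": 1, "K": 1},
    6: {"S": "M", "G": "a", "N": 0, "K": 1},
}
X = {2, 3, 4, 6}
layers, unresolved = hdtl(rows, X, [["S", "G", "N"], ["K"]],
                          Fraction(1, 4), Fraction(3, 4))
print(layers, unresolved)
```

Each layer is built only from the objects left in the previous layer's boundary region, so the hierarchy grows by one table per layer rather than branching per boundary class.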
How the data is prepared • Attributes • Sex, Date-of-birth, City-population, Number-of-convictions, Number-of-past-accidents and Has-accident-in-last-year • Data scale: about 29,000 records • Data normalization
Result interpretation • 5 test cycles, generating 5 first-layer decision tables and 3 second-layer decision tables. • A problem can be seen in the test results: in all the presented test cycles, the first-layer boundary sets each contain only one combination of attributes. The generated decision table hierarchy is therefore no different from that of the Tree-structured Hierarchy of Decision Table algorithm at the first two layers. The author does not report any further investigation of boundary sets that contain more than one combination of attributes.
Summary and Conclusion • Strong points • Provides a valuable alternative for rule finding and classification based on inaccurate data. • The HDTL algorithm also avoids the exponential growth of hierarchical data structures. • Weak point • The test results provided are incomplete. They are not strong enough to demonstrate the effectiveness and accuracy of the Linear Hierarchy of Decision Table algorithm.
References • Pawlak, Z., Decision Rules, Bayes Rule and Rough Sets, New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, 7th International Workshop, RSFDGrC’99, Yamaguchi, Japan, November 1999, Proceedings, p. 1-9. • Ziarko, W., Incremental Learning with Hierarchies of Rough Decision Tables, Proceedings of North American Fuzzy Information Processing Society Conf. (NAFIPS04), Banff, Alberta (2004), p. 802-808.
Q & A • Thank you