Feature Selection Based on Relative Attribute Dependency: An Experimental Study
Jianchao Han¹, Ricardo Sanchez¹, Xiaohua Hu²
¹Computer Science Dept., California State University Dominguez Hills
²College of Information Science and Technology, Drexel University
RSFDGrC 2005, Regina, Canada
Agenda
• Introduction
• Rough Set Approach
• Relative Attribute Dependency Based on Rough Set Theory
• A Heuristic Algorithm for Finding Optimal Reducts
• Experiment Results
• Related Work
• Summary and Future Work
Introduction
• Data Reduction
  • Horizontal reduction – sampling
  • Vertical reduction – feature selection
• Feature Selection
  • Statistical feature selection
  • Significant feature selection
  • Rough set feature selection
• Search Process
  • Top-down search
  • Bottom-up search
  • Exhaustive search
  • Heuristic search
Rough Set Based Feature Selection
• Bottom-up search
• Brute-force search
• Heuristic search
• Rough Set Theory
  • Introduced by Pawlak in the 1980s
  • An efficient tool for data mining, concept generation, induction, and classification
Rough Set Theory -- Information System
• An information system is IS = ⟨U, C, D, {V_a}_{a ∈ C∪D}, f⟩, where
  • U = {u1, u2, ..., un} is a non-empty set of tuples, called the data table
  • C is a non-empty set of condition attributes
  • D is a non-empty set of decision attributes
  • C ∩ D = ∅
  • V_a is the domain of attribute a, with at least two elements
  • f is a function U × (C ∪ D) → V, where V = ∪_{a ∈ C∪D} V_a
Approximation
• Let A ⊆ C ∪ D, t_i, t_j ∈ U, and let X be a subset of U
• Define R_A = {⟨t_i, t_j⟩ ∈ U × U : ∀a ∈ A, t_i[a] = t_j[a]}
• R_A is an equivalence relation on U, called the indiscernibility relation, denoted IND
• The approximation space (U, IND) partitions U into the equivalence classes [A] = {A1, A2, ..., Am} induced by R_A
• Lower approximation, or positive region, of X: Low_A(X) = ∪{A_i ∈ [A] | A_i ⊆ X, 1 ≤ i ≤ m}
• Upper approximation of X based on A: Upp_A(X) = ∪{A_i ∈ [A] | A_i ∩ X ≠ ∅, 1 ≤ i ≤ m}
• Boundary area of X: Boundary_A(X) = Upp_A(X) − Low_A(X)
• Negative region of X: Neg_A(X) = ∪{A_i ∈ [A] | A_i ⊆ U − X, 1 ≤ i ≤ m}
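The lower and upper approximations above are straightforward to prototype. The following minimal Python sketch (the dict-per-tuple table representation and the function names are illustrative choices, not from the paper) partitions a table into the equivalence classes of IND(A) and collects the two approximations of a target set X:

```python
from collections import defaultdict

def partition(U, A):
    """Indices of U grouped into the equivalence classes of IND(A):
    tuples are equivalent iff they agree on every attribute in A."""
    classes = defaultdict(list)
    for i, t in enumerate(U):
        classes[tuple(t[a] for a in A)].append(i)
    return list(classes.values())

def approximations(U, A, X):
    """Lower and upper approximations of X (a set of tuple indices)
    with respect to the attribute set A."""
    lower, upper = set(), set()
    for cls in partition(U, A):
        block = set(cls)
        if block <= X:    # block entirely inside X -> positive region
            lower |= block
        if block & X:     # block overlaps X -> upper approximation
            upper |= block
    return lower, upper

U = [
    {"a": 0, "d": "yes"},
    {"a": 0, "d": "no"},   # indiscernible from tuple 0 on {a}
    {"a": 1, "d": "yes"},
]
X = {0, 2}  # tuples whose decision is "yes"
low, upp = approximations(U, ["a"], X)
print(sorted(low), sorted(upp))  # [2] [0, 1, 2]; boundary area = {0, 1}
```

The boundary area falls out for free as `upp - low`, and the negative region as all indices outside `upp`.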
Core Attributes and Reduct
• Let [D] = {D1, D2, ..., Dk} be the set of elementary sets partitioned by R_D
• Approximation aggregation:
  • Low_A([D]) = ∪_{j=1..k} Low_A(D_j)
  • Upp_A([D]) = ∪_{j=1..k} Upp_A(D_j)
• a ∈ C is a core attribute of C if Low_C([D]) ≠ Low_{C−{a}}([D]), and a dispensable attribute otherwise
• R ⊆ C is a reduct of C in U w.r.t. D if Low_R([D]) = Low_C([D]) and ∀B ⊂ R, Low_B([D]) ≠ Low_C([D])
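The core-attribute test can be phrased directly as "does dropping the attribute shrink the positive region?". A small illustrative Python sketch (function names and the toy table are mine, not the paper's):

```python
from collections import defaultdict

def positive_region(U, A, D):
    """Indices of tuples in the positive region Low_A([D]): union of the
    lower approximations of the decision classes U/IND(D) w.r.t. A."""
    blocks = defaultdict(list)
    for i, t in enumerate(U):
        blocks[tuple(t[a] for a in A)].append(i)
    pos = set()
    for idxs in blocks.values():
        decisions = {tuple(U[i][d] for d in D) for i in idxs}
        if len(decisions) == 1:   # block lies inside a single decision class
            pos |= set(idxs)
    return pos

def is_core(U, C, D, a):
    """a is a core attribute iff removing it changes the positive region."""
    rest = [c for c in C if c != a]
    return positive_region(U, rest, D) != positive_region(U, C, D)

U = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 1, "b": 0, "d": 1},
]
print(is_core(U, ["a", "b"], ["d"], "a"))  # True: without a, d is ambiguous
print(is_core(U, ["a", "b"], ["d"], "b"))  # False: b is dispensable
```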
Calculation of Reducts
• Finding all reducts is NP-hard
• Traditional method: decision matrix
• Some newer methods still suffer from intensive computation of
  • either discernibility functions
  • or positive regions
• Our method:
  • A new, equivalent definition of reducts
  • Counts distinct tuples (rows) in the IS table
  • An efficient algorithm for finding reducts
Relative Attribute Dependency
• Let P ⊆ C ∪ D; Π_P(U) denotes the projection of U on P
• Let Q ⊆ C; the degree of relative dependency of Q on D over U is
  δ_Q(D) = |Π_Q(U)| / |Π_{Q∪D}(U)|
• |Π_X(U)| is the number of equivalence classes in U/IND(X)
• Theorem. Assume U is consistent. Q ⊆ C is a reduct of C with respect to D if and only if
  1) δ_Q(D) = δ_C(D) = 1, and
  2) for every q ∈ Q, δ_{Q−{q}}(D) < δ_Q(D)
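Since δ_Q(D) is a ratio of counts of distinct projected rows, it reduces to hashing tuples, which is the efficiency argument of the slide before. A minimal Python sketch (representation and names are illustrative):

```python
def num_distinct(U, attrs):
    """|Pi_attrs(U)|: number of distinct rows after projecting U onto
    attrs, i.e. the number of equivalence classes in U/IND(attrs)."""
    return len({tuple(t[a] for a in attrs) for t in U})

def rel_dependency(U, Q, D):
    """Degree of relative dependency of Q on D over U."""
    return num_distinct(U, Q) / num_distinct(U, list(Q) + list(D))

# Consistent toy decision table: C = {a, b}, D = {d}
U = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 0, "b": 1, "d": 1},
    {"a": 1, "b": 0, "d": 1},
]
print(rel_dependency(U, ["a", "b"], ["d"]))  # 1.0: {a, b} determines d
print(rel_dependency(U, ["a"], ["d"]))       # < 1: a alone does not
```

No discernibility function or lower/upper approximation is computed here; one pass over the rows per candidate subset suffices.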
Computation Model (RAD)
• Input: a decision table U, condition attribute set C, and decision attribute set D
• Output: a minimum reduct R of the condition attribute set C with respect to D in U
• Computation: find a subset R of C such that δ_R(D) = δ_C(D) and no proper subset of R satisfies this equality
A Heuristic Algorithm for Finding Optimal Reducts
• Given the partition U/IND(D) = {D1, ..., Dk} of U by D, the entropy, or expected information, based on the partition {X1, ..., Xm} = U/IND({q}) of U by an attribute q is
  E(q) = Σ_{j=1..m} (|X_j| / |U|) × I(X_j)
  where I(X_j) = −Σ_{i=1..k} p_ij log2 p_ij and p_ij = |X_j ∩ D_i| / |X_j|
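The measure above is the standard expected-information (conditional entropy) of the decision partition given q; the exact formula on the original slide was lost in extraction, so this sketch assumes that standard form. Illustrative Python:

```python
import math
from collections import Counter, defaultdict

def entropy(U, q, D):
    """E(q): expected information of the decision partition U/IND(D)
    within each block of the partition induced by attribute q."""
    blocks = defaultdict(list)
    for t in U:
        blocks[t[q]].append(tuple(t[d] for d in D))
    n, e = len(U), 0.0
    for decisions in blocks.values():
        w = len(decisions) / n                 # weight |Xj| / |U|
        counts = Counter(decisions)
        e += w * -sum(c / len(decisions) * math.log2(c / len(decisions))
                      for c in counts.values())
    return e

U = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 1, "b": 0, "d": 1},
]
print(entropy(U, "a", ["d"]))  # 0.0: a fully determines d
print(entropy(U, "b", ["d"]))  # 1.0: b carries no information about d
```

A high E(q) means q tells us little about the decision, which is why the algorithm tries to discard high-entropy attributes first.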
Algorithm Design
• R ← C, Q ← ∅
• For each attribute q ∈ C do
  • Compute the entropy E(q) of q
  • Q ← Q ∪ {⟨q, E(q)⟩}
• While Q ≠ ∅ do
  • q ← the attribute in Q with the maximum entropy E(q)
  • Q ← Q − {⟨q, E(q)⟩}
  • If δ_{R−{q}}(D) = δ_C(D) then
    • R ← R − {q}
• Return R
• Algorithm complexity: polynomial in |C| and |U|
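The backward-elimination loop above can be sketched end to end in Python. This is a minimal illustration, not the authors' implementation: the table representation, function names, and the use of conditional entropy as E(q) are my assumptions.

```python
import math
from collections import Counter, defaultdict

def num_distinct(U, attrs):
    """Number of distinct rows when U is projected onto attrs."""
    return len({tuple(t[a] for a in attrs) for t in U})

def rel_dependency(U, Q, D):
    """Degree of relative dependency of Q on D over U."""
    return num_distinct(U, Q) / num_distinct(U, list(Q) + list(D))

def entropy(U, q, D):
    """Expected information of the decision partition within each block
    of the partition induced by attribute q (assumed form of E(q))."""
    blocks = defaultdict(list)
    for t in U:
        blocks[t[q]].append(tuple(t[d] for d in D))
    e = 0.0
    for ds in blocks.values():
        counts = Counter(ds)
        e += (len(ds) / len(U)) * -sum(
            c / len(ds) * math.log2(c / len(ds)) for c in counts.values())
    return e

def find_reduct(U, C, D):
    """Backward elimination: attributes with the highest entropy w.r.t.
    the decision are tried for removal first; a removal is kept only if
    the relative dependency on D is unchanged."""
    R = list(C)
    target = rel_dependency(U, C, D)
    for q in sorted(C, key=lambda a: entropy(U, a, D), reverse=True):
        trial = [a for a in R if a != q]
        if trial and rel_dependency(U, trial, D) == target:
            R = trial
    return R

# b is uninformative, so the reduct keeps only a
U = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 1, "b": 0, "d": 1},
]
print(find_reduct(U, ["a", "b"], ["d"]))  # ['a']
```

Each dependency check is one hashing pass over the rows, and at most |C| removals are attempted, which is what keeps the algorithm polynomial.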
Experiments
• Data sets – UCI repository
  • 10 data sets
  • Various sizes (number of tuples and attributes)
  • Categorical attributes
• Preprocessing – remove all inconsistent tuples
Experiment Results
• [Bar charts comparing the original data with the reduct on each of the 10 data sets (AS, BCW, DER, HV, LC, SPE, YS, ZOO, AUD, SOY): a) number of rows, b) number of condition attributes]
Classification Accuracy
Related Work
• Grzymala-Busse
  • LERS
  • Rough measure of a rule vs. our relative dependency measure
• Nguyen et al.
  • Similar: also uses radix sorting
  • Ours: no discernibility relation, and no lower or upper approximations need to be maintained
• Others
Conclusion
• Summary
  • Relative attribute dependency
  • Computation model
  • Algorithm implementation
  • Experiments
• Future work
  • Refinement
  • Application
  • Extension to numerical attributes