Feature transformation through rule induction: A case study with the k-NN classifier Antal van den Bosch Tilburg University, The Netherlands http://ilk.uvt.nl - Antal.vdnBosch@uvt.nl
Outline
• General idea
• Feature space transform for k-NN
• k-NN classification over rules
• An implementation using RIPPER
• Intermezzo: parameter optimization
• Experiments on UCI data
• Conclusions
Feature transformation

Original instances:
  f1 f2 f3 | c
  A  C  F  | Z
  B  B  D  | Y
  B  C  C  | X
  B  B  C  | ?

Induced rules:
  r1: If f1=A then c=Z
  r2: If f1=B and f2=B then c=Y
  r3: If f2=C then c=X
  r4: If f3=C then c=X

Transformed instances (one binary feature per rule; 1 = rule matches):
  r1 r2 r3 r4 | c
  1  0  1  0  | Z
  0  1  0  0  | Y
  0  0  1  1  | X
  0  1  1  1  | ?
k-NN over rules
• Different classification:
  • not "class of first rule that matches"
  • instead, produce the majority class of the nearest neighbors that share the most matching rules with the new instance (weighted, …)
• Different outcomes possible
  • the rule's class is not considered, only the nearest neighbors' classes
• Rules become features with weights; they can outweigh and outnumber the others
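A minimal sketch of this classification step, assuming instances have already been recoded as binary rule-match vectors. Plain unweighted overlap stands in for TiMBL's richer similarity metrics and feature weights:

```python
from collections import Counter

def knn_over_rules(query, train, k=3):
    """train: list of (rule_vector, class_label) pairs.
    Neighbors are ranked by how many rule features they share with the
    query; the majority class among the k nearest wins. The classes the
    rules themselves predict play no role here."""
    overlap = lambda a, b: sum(x == y for x, y in zip(a, b))
    ranked = sorted(train, key=lambda tc: overlap(query, tc[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Recoded training instances from the example slide
train = [([1, 0, 1, 0], "Z"), ([0, 1, 0, 0], "Y"), ([0, 0, 1, 1], "X")]
print(knn_over_rules([0, 1, 1, 1], train, k=1))  # X
```

With k=1 the query is closest to the X instance (three shared rule features), even though it also matches the rule predicting Y — illustrating that the rules' own classes are ignored.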
Related work
• Sébag and Schoenauer (1994)
  • same transformation, but for local regression; interesting dimension reduction
• Generalizing instances to rules in k-NN
  • Salzberg (1990), NGE (hyperrectangles)
  • Domingos (1996), RISE (merging with wildcards)
  • Van den Bosch (1999), FAMBL (merging by disjuncting values)
• Van den Bosch (2000)
  • earlier version, only on natural language processing tasks
Implementation
• RIPPER (Cohen, 1995)
  • sequential covering, MDL-driven
  • induces sets of rules per class
  • uses partitioning to validate and select rules
  • many heuristics, many parameters, fast
• Procedure:
  • apply RIPPER to the training set
  • recode the training and test sets using the RIPPER rules
  • train and test k-NN (IB1 in TiMBL 5.0)
Variants
• Transformed IB1 (T-IB1): the new features replace the original ones
• IB1 plus new features (IB1+T): the new features are added to the original ones
• Compared against RIPPER and IB1
  • 10-fold CV
  • unpaired one-tailed t-tests
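The two variants differ only in how the recoded vectors are combined with the original features. A sketch, where `rule_matches` is a hypothetical recoder returning one binary value per induced rule:

```python
def t_ib1_features(instance, rule_matches):
    # T-IB1: the binary rule-match features replace the original features
    return rule_matches(instance)

def ib1_plus_t_features(instance, rule_matches):
    # IB1+T: the rule-match features are appended to the original features
    return list(instance) + rule_matches(instance)

# Toy two-rule recoder for illustration only
matches = lambda inst: [int(inst[0] == "A"), int(inst[1] == "C")]

print(t_ib1_features(["A", "C", "F"], matches))      # [1, 1]
print(ib1_plus_t_features(["A", "C", "F"], matches)) # ['A', 'C', 'F', 1, 1]
```

Under IB1+T the rule features compete with the originals during feature weighting, which is how they can "outweigh and outnumber" them.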
UCI data sets
• Artificial data sets
  • fully known underlying concept
  • known conditional dependencies
• Natural data sets
  • partly understood underlying problem
  • unknown conditional dependencies
Intermezzo
• Parameter settings matter, but
  • a good setting is unpredictable
  • parameters interact
  • exhaustive wrapping is not an option
• Both k-NN (TiMBL) and RIPPER have many parameters
• Heuristic: wrapped progressive sampling (Van den Bosch, 2004)
WPS parameter spaces
• RIPPER:
  • F (min. # instances per rule)
  • a (class order)
  • n (negation)
  • S (simplify)
  • O (# optimization passes)
  • L (loss ratio)
  • 648 combinations
• IB1 (TiMBL):
  • k (k-NN)
  • w (feature weighting)
  • m (similarity metric)
  • d (distance weighting)
  • L (metric back-off)
  • 925 combinations
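Wrapped progressive sampling can be sketched roughly as follows: score all candidate settings on a small training sample, keep the better-scoring half, and repeat on progressively larger samples. This is a simplified reading of Van den Bosch (2004); the scoring function, halving schedule, and sample sizes below are placeholders:

```python
def wrapped_progressive_sampling(settings, sample_sizes, score):
    """settings: candidate parameter combinations; score(setting, n) gives
    validation accuracy when training on a sample of n instances.
    Each round halves the pool, so only the surviving settings are ever
    evaluated on the expensive larger samples."""
    pool = list(settings)
    for n in sample_sizes:
        if len(pool) == 1:
            break
        ranked = sorted(pool, key=lambda s: score(s, n), reverse=True)
        pool = ranked[:max(1, len(pool) // 2)]  # keep the top half
    return pool[0]

# Toy usage: pretend accuracy peaks at k=5, regardless of sample size
best = wrapped_progressive_sampling(
    settings=[1, 3, 5, 7, 9, 11, 15, 25],
    sample_sizes=[100, 500, 2000],
    score=lambda k, n: -abs(k - 5))
print(best)  # 5
```

The point of the heuristic is that the 648 RIPPER and 925 TiMBL combinations are only exhaustively scored on cheap small samples, never on the full training set.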
Discussion
• T-IB1 ≈ RIPPER
  • RIPPER classification can be interchanged with k-NN classification
• IB1+T outperforms IB1 and RIPPER
  • the extra features add useful new views on the task
• Effects mainly on artificial data
  • complex "game" rules help k-NN find the correct nearest neighbors
One example: tic-tac-toe
• Tic-tac-toe
  • donated by David Aha to the UCI repository
  • 958 possible endings of the 3x3 board game
  • class: whether the board constitutes a win for X
  • yes or no (no can be a win for O, or a draw)
• Typical 100% correct eight-rule set:
  • check the 2 diagonals, 3 horizontals, and 3 verticals for three consecutive X's
• RIPPER usually finds these eight, but sometimes induces other rules
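The eight-rule concept described above is easy to state directly. A sketch, with the board as a 9-cell list in row-major order (cell values follow the UCI encoding: 'x', 'o', or 'b' for blank):

```python
def x_wins(board):
    """board: list of 9 cells ('x', 'o', 'b'), row-major.
    The eight rules: three horizontals, three verticals, and two
    diagonals of consecutive x's."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # horizontals
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # verticals
             (0, 4, 8), (2, 4, 6)]              # diagonals
    return any(all(board[i] == "x" for i in line) for line in lines)

# x o o
# o x x   -> win for X on the main diagonal
# x o x
print(x_wins(["x", "o", "o", "o", "x", "x", "x", "o", "x"]))  # True
```

Each of the eight line checks corresponds to one rule in the typical RIPPER rule set, so a board's rule-match vector simply records which winning lines X occupies.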
Tic-tac-toe freak rule
[Figure: two example boards matched by the same freak rule]
• Tests on O in three locations that are not three in a row
• X may win
• But it may also mean a draw!
IB1+T saves the day
[Figure: the misclassified board and its nearest neighbor]
• Finds a nearest neighbor that
  • mismatches in two positions
  • matches on the rule feature
  • represents a draw
Future work
• A more relaxed and redundant rule inducer (more rules per instance)
• Bigger context: plug k-NN onto RIPPER, maxent, SVM, Winnow, …
• AUC instead of accuracy