Part II: Practical Implementations.

Modeling the Classes Stochastic Discrimination

Algorithm for Training an SD Classifier
• Generate a projectable weak model
• Evaluate the model w.r.t. the training set; check enrichment
• Check uniformity w.r.t. the existing collection
• Add to the discriminant
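The training loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the cited paper: the weak model is a hypothetical interval on a single feature in [0, 1], and the names `make_interval`, `train_sd`, `min_gap` and `max_tries`, as well as the exact form of the uniformity test, are invented for the example.

```python
import random

def make_interval(rng, width=0.5):
    """Hypothetical projectable weak model: an interval on a 1-D feature in [0, 1]."""
    start = rng.uniform(0.0, 1.0 - width)
    return (start, start + width)

def covers(model, x):
    return model[0] <= x <= model[1]

def rate(model, points):
    """Fraction of `points` covered by `model`."""
    return sum(covers(model, x) for x in points) / len(points)

def train_sd(class1, class2, n_models=100, min_gap=0.1, max_tries=10000, seed=0):
    rng = random.Random(seed)
    kept = []
    counts = [0] * len(class1)  # how often each class-1 point is covered so far
    for _ in range(max_tries):
        if len(kept) == n_models:
            break
        model = make_interval(rng)                                # 1. generate
        if rate(model, class1) - rate(model, class2) < min_gap:   # 2. enrichment
            continue
        covered = [i for i, x in enumerate(class1) if covers(model, x)]
        if not covered:
            continue
        # 3. uniformity (assumed test): reject models that mostly cover
        #    points the collection already covers more than average
        mean_covered = sum(counts[i] for i in covered) / len(covered)
        if mean_covered > sum(counts) / len(counts):
            continue
        for i in covered:                                         # 4. add
            counts[i] += 1
        kept.append(model)
    return kept

def discriminant(x, models):
    """Y = fraction of kept models that cover x."""
    return sum(covers(m, x) for m in models) / len(models)

# Made-up 1-D training data: class 1 on the left, class 2 on the right.
class1 = [0.05 * k for k in range(1, 9)]
class2 = [0.55 + 0.05 * k for k in range(1, 9)]
models = train_sd(class1, class2)
y1 = sum(discriminant(x, models) for x in class1) / len(class1)
y2 = sum(discriminant(x, models) for x in class2) / len(class2)
print(len(models), y1, y2)
```

Because every retained model covers class 1 at a rate at least `min_gap` higher than class 2, the average Y over class 1 exceeds the average Y over class 2 by at least `min_gap`.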

Dealing with Data Geometry: SD in Practice

2D Example • Adapted from [Kleinberg, PAMI, May 2000]

An “r=1/2” random subset in the feature space that covers ½ of all the points

Watch how many such subsets cover a particular point, say, (2,17)

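Coverage of point (2,17) as subsets are generated one at a time (Y = fraction of models so far that cover the point):
Model 1: Out. It’s in 0/1 models, Y = 0/1 = 0.0
Model 2: In. It’s in 1/2 models, Y = 1/2 = 0.5
Model 3: In. It’s in 2/3 models, Y = 2/3 = 0.67
Model 4: In. It’s in 3/4 models, Y = 3/4 = 0.75
Model 5: In. It’s in 4/5 models, Y = 4/5 = 0.8
Model 6: In. It’s in 5/6 models, Y = 5/6 = 0.83
Model 7: Out. It’s in 5/7 models, Y = 5/7 = 0.71
Model 8: In. It’s in 6/8 models, Y = 6/8 = 0.75
Model 9: In. It’s in 7/9 models, Y = 7/9 = 0.78
Model 10: In. It’s in 8/10 models, Y = 8/10 = 0.8
Model 11: Out. It’s in 8/11 models, Y = 8/11 = 0.73
Model 12: Out. It’s in 8/12 models, Y = 8/12 = 0.67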

Fraction of “r=1/2” random subsets covering point (2,17) as more such subsets are generated
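The running fraction Y for a single point can be simulated directly. The sketch below assumes a hypothetical 20x20 grid as the feature space and draws each "r=1/2" subset as a uniformly random half of the grid points; the grid size, the seed and the name `running_coverage` are illustrative choices, not taken from the slides.

```python
import random

def running_coverage(point, points, n_models, seed=0):
    """Return Y after 1, 2, ..., n_models random r=1/2 subsets."""
    rng = random.Random(seed)
    half = len(points) // 2
    hits = 0
    fractions = []
    for k in range(1, n_models + 1):
        subset = set(rng.sample(points, half))  # one weak model covering half the points
        hits += point in subset                 # True counts as 1
        fractions.append(hits / k)
    return fractions

# Hypothetical 20x20 discrete feature space containing the point (2, 17).
grid = [(x, y) for x in range(20) for y in range(20)]
ys = running_coverage((2, 17), grid, n_models=2000)
print(ys[9], ys[99], ys[-1])
```

Early values of Y swing widely, as in the 12-model walk-through, and later values settle near 1/2, which is what the law of large numbers predicts for unbiased subsets.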

Fractions of “r=1/2” random subsets covering several selected points as more such subsets are generated

Distribution of model coverage for all points in space, with 100 models

Introducing enrichment: For any discrimination to happen, the models must have some difference in coverage for different classes.

Figure panels: Class distribution; A biased (enriched) weak model
• Enforcing enrichment (adding in a bias): require each subset to cover more points of one class than of the other
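One way to make the enrichment requirement concrete is to compare the coverage rates of a subset on the two classes. In this sketch a weak model is simply a Python set of points, and the threshold `min_gap` is a hypothetical parameter; the slides only require that one class be covered more than the other.

```python
def coverage(model, points):
    """Fraction of `points` that fall inside `model` (a set of points)."""
    return sum(p in model for p in points) / len(points)

def enrichment(model, class1, class2):
    """Coverage of class 1 minus coverage of class 2."""
    return coverage(model, class1) - coverage(model, class2)

def is_enriched(model, class1, class2, min_gap=0.1):
    # min_gap is an assumed margin, not a value from the slides
    return enrichment(model, class1, class2) >= min_gap

# Made-up toy data: four points per class.
class1 = [(1, 1), (2, 1), (1, 2), (2, 2)]
class2 = [(8, 8), (9, 8), (8, 9), (9, 9)]

biased = {(1, 1), (2, 1), (1, 2), (8, 8)}   # covers 3/4 of class 1, 1/4 of class 2
unbiased = {(1, 1), (8, 8)}                 # covers 1/4 of each class

print(enrichment(biased, class1, class2), is_enriched(biased, class1, class2))      # 0.5 True
print(enrichment(unbiased, class1, class2), is_enriched(unbiased, class1, class2))  # 0.0 False
```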

Distribution of model coverage for points in each class, with 100 enriched weak models

Error rate decreases as the number of models increases.
Decision rule: if Y < 0.5 then class 2, else class 1
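The decision rule is a threshold on the coverage fraction Y. A minimal sketch, assuming each weak model is a set of points and that the models were enriched toward class 1 (which is what the direction of the rule implies); the function names and toy models are illustrative.

```python
def discriminant(x, models):
    """Y = fraction of models that cover x."""
    return sum(x in m for m in models) / len(models)

def classify(x, models, threshold=0.5):
    """Decision rule from the slide: class 2 if Y < 0.5, else class 1."""
    return 2 if discriminant(x, models) < threshold else 1

# Toy collection of 12 models: 8 cover `target`, 3 cover `other`, 1 covers neither.
target = (2, 17)
other = (5, 5)
models = [{target}] * 8 + [{other}] * 3 + [set()]

print(discriminant(target, models), classify(target, models))  # Y = 8/12, class 1
print(discriminant(other, models), classify(other, models))    # Y = 3/12 = 0.25, class 2
```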

• Sparse Training Data: Incomplete knowledge about class distributions (figure panels: Training Set; Test Set)

Distribution of model coverage for points in each class, with 100 enriched weak models (figure panels: Training Set; Test Set)

No discrimination! • Distribution of model coverage for points in each class, with 5000 enriched weak models (figure panels: Training Set; Test Set)

Models of this type, when enriched for the training set, are not necessarily enriched for the test set (figure panels: Training Set; Test Set; Random model with 50% coverage of space)

Introducing projectability: Maintain local continuity of class interpretations. Neighboring points of the same class should share similar model coverage.

Figure panels: Class distribution; A projectable model
• Allow some local continuity in model membership, so that the interpretation of a training point can generalize to its immediate neighborhood
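The difference between a pointwise random subset and a projectable model can be shown with a small example. Here the projectable model is assumed to be an axis-aligned box, which is only one possible choice; the `Box` class, the coordinates and the neighbor point are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Hypothetical projectable weak model: an axis-aligned rectangle."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def covers(self, p):
        x, y = p
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

train_point = (2.0, 17.0)
unseen_neighbor = (2.3, 16.8)   # a nearby point that was not in the training set

pointwise_model = {train_point}           # membership defined only point by point
box_model = Box(1.0, 4.0, 15.0, 18.0)     # membership defined on a whole region

print(train_point in pointwise_model, unseen_neighbor in pointwise_model)  # True False
print(box_model.covers(train_point), box_model.covers(unseen_neighbor))    # True True
```

The pointwise subset says nothing about the neighbor, while the box carries the training point's membership over to its neighborhood.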

Distribution of model coverage for points in each class, with 100 enriched, projectable weak models (figure panels: Training Set; Test Set)

Promoting uniformity: All points in the same class should be equally likely to be covered by a model of each particular rating. Retain models that cover the points that the current collection covers less often.
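A simple way to act on this is to score each candidate model by how much coverage its points already have and to prefer the candidate with the lowest score. This is an assumed heuristic for illustration only; the slides do not specify the selection criterion, and the names `coverage_counts`, `uniformity_score` and `pick_most_uniform` are invented here.

```python
def coverage_counts(points, models):
    """Number of models in the current collection that cover each point."""
    return {p: sum(p in m for m in models) for p in points}

def uniformity_score(candidate, points, counts):
    """Mean current coverage of the points the candidate covers (lower is better)."""
    covered = [p for p in points if p in candidate]
    if not covered:
        return float("inf")
    return sum(counts[p] for p in covered) / len(covered)

def pick_most_uniform(candidates, points, models):
    counts = coverage_counts(points, models)
    return min(candidates, key=lambda c: uniformity_score(c, points, counts))

# Toy class with four points; the current collection favors a and b.
a, b, c, d = (0, 0), (0, 1), (1, 0), (1, 1)
points = [a, b, c, d]
current = [{a, b}, {a, b}, {a, c}]      # coverage so far: a=3, b=2, c=1, d=0
candidates = [{a, b}, {c, d}]

best = pick_most_uniform(candidates, points, current)
print(sorted(best))  # [(1, 0), (1, 1)]
```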
