
A Framework for Learning Rules from Multi-Instance Data



  1. A Framework for Learning Rules from Multi-Instance Data
Yann Chevaleyre and Jean-Daniel Zucker, University of Paris VI – LIP6 – CNRS

  2. Motivations
• The choice of a good representation is a central issue in ML tasks. Representations form a spectrum:
– atomic description (relational representation): (+) high expressivity, (-) intractability unless strong biases are used
– global description (att/val representation): (+) tractable, (-) low expressivity
– the MI representation lies between these two extremes
• Most available MI learners use numerical data, and generate hypotheses that are not easily interpretable.
• Our goal: design efficient MI learners handling numeric and symbolic data, and generating interpretable hypotheses, such as decision trees or rule sets.

  3. Outline
• 1) Multiple-instance learning: the multiple-instance representation, where MI data can be found, the MI learning problem
• 2) Extending a propositional algorithm to handle MI data: the method, extending the Ripper rule learner
• 3) Analysis of the multiple-instance extension of Ripper: misleading literals, irrelevant literals, the literal selection problem
• 4) Experiments & applications
• Conclusion and future work

  4. The multiple-instance representation: definition
• Standard A/V representation: example i is represented by an A/V vector x_i plus a {0,1}-valued label l_i.
• Multiple-instance representation: example i is represented by a bag of A/V vectors x_i,1, x_i,2, ..., x_i,r plus a {0,1}-valued label l_i.
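As an illustration (ours, not from the slides), the two representations can be sketched in Python; the class and field names below are hypothetical:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Bag:
        """A multiple-instance example: A/V vectors x_i,1 ... x_i,r and one {0,1} label l_i."""
        instances: List[List[float]]
        label: int

    # A standard A/V example is the degenerate case of a one-instance bag.
    single = Bag(instances=[[1.0, 2.0]], label=1)
    multi = Bag(instances=[[1.0, 2.0], [3.0, 4.0], [5.5, 0.7]], label=0)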

  5. Where can we find MI data?
• Many complex objects, such as images or molecules, can easily be represented with bags of instances.
• Relational databases (e.g., tables linked by a 1 to 0..n relation) may also be represented this way.
• More complex representations, such as Datalog facts, may be MI-propositionalized [Zucker 98], [Alphonse and Rouveirol 99].

  6. Representing time series as MI data
• By encoding each sub-sequence (s(t_k), ..., s(t_k+n)) as an instance, the representation becomes invariant under translation.
• [Figure: a signal s(t) over time t, with two candidate windows starting at t_k and t_j.]
• Windows of various sizes can be chosen to make the representation invariant under rescaling.
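A minimal sketch of this encoding, assuming a fixed window size (the function name is ours):

    def series_to_bag(s, n):
        """Encode a time series as a bag: each sub-sequence (s[k], ..., s[k+n])
        becomes one instance, making the representation translation-invariant."""
        return [s[k:k + n + 1] for k in range(len(s) - n)]

    # Pooling bags built with several window sizes would add rescaling invariance.
    bag = series_to_bag([0.1, 0.5, 0.9, 0.4, 0.2, 0.3], n=2)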

  7. The multiple-instance learning problem
• From B+, B-, sets of positive (resp. negative) bags, find a consistent hypothesis H.
• Unbiased multiple-instance learning problem: there exists a function f such that lab(b) = 1 iff ∃x ∈ b, f(x).
• Single-tuple bias, i.e. multi-instance learning [Dietterich 97]: find a function h covering at least one instance per positive bag and no instance from any negative bag.
• Note: the domain of h is the instance space, instead of the bag space.
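Consistency under the single-tuple bias is easy to state in code; a sketch (names are ours):

    def single_tuple_consistent(h, pos_bags, neg_bags):
        """h must cover at least one instance of every positive bag
        and no instance of any negative bag."""
        covers_some = all(any(h(x) for x in b) for b in pos_bags)
        covers_none = not any(h(x) for b in neg_bags for x in b)
        return covers_some and covers_none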

  8. Extending a propositional learner
• We need to represent the bags of instances as a single set of vectors: add a bag-id and the bag's label to each instance (e.g., for bags b1+ and b2-).
• Measure the degree of multiple-instance consistency of the hypothesis being refined.
• Single-tuple coverage measure: instead of measuring p(r), n(r), the numbers of vectors covered by r, compute p*(r), n*(r), the numbers of bags for which r covers at least one instance.
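A sketch of this bag-level count (the helper name is hypothetical):

    def star_coverage(rule, bags):
        """p*(r) or n*(r): the number of bags in which `rule` covers at
        least one instance, replacing the instance-level counts p(r), n(r)."""
        return sum(1 for bag in bags if any(rule(x) for x in bag))

    # p_star = star_coverage(rule, positive_bags)
    # n_star = star_coverage(rule, negative_bags)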

  9. Extending the Ripper algorithm (Cohen 95)
• Ripper (Cohen 95) is a fast and efficient top-down rule learner, which compares to C4.5 in terms of accuracy while being much faster.
• Naive-RipperMi is the MI extension of Ripper.
• Naive-RipperMi was tested on the musk tasks (Dietterich 97). On musk1 (avg. of 5.2 instances per bag), it achieved good accuracy. On musk2 (avg. 65 instances per bag), only 77% accuracy.

  10. Empirical analysis of Naive-RipperMi
• Goal: analyse pathologies linked to the MI problem and to the Naive-RipperMi algorithm: misleading literals, irrelevant literals, the literal selection problem.
• We analyse the behaviour of Naive-RipperMi on a simple 2-D dataset with 5 positive bags (the white triangles bag, the white squares bag, ...) and 5 negative bags (the black triangles bag, the black squares bag, ...).

  11. Analysing Naive-RipperMi
• Learning task: induce a rule covering at least one instance of each positive bag.
• Target concept: X > 5 & X < 9 & Y > 3

  12. Analysing Naive-RipperMi: misleading literals
• Target concept: X > 5 & X < 9 & Y > 3
• 1st step: Naive-RipperMi induces the rule X > 11 & Y < 5, built from misleading literals.

  13. Analysing Naive-RipperMi: misleading literals
• 2nd step: Naive-RipperMi removes the covered bag(s), and induces another rule...

  14. Analysing Naive-RipperMi: misleading literals
• Misleading literals: literals bringing information gain but contradicting the target concept. This is a multiple-instance-specific phenomenon.
• Unlike single-instance pathologies (overfitting, the attribute selection problem), increasing the number of examples won't help.
• The "cover-and-differentiate" algorithm reduces the chance of finding the target concept.
• If l is a misleading literal, then ¬l is not. It is thus sufficient, when the literal l has been induced, to examine ¬l at the same time ⇒ partition the instance space (see the sketch below).
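A minimal sketch of this idea, reusing star_coverage from above (the scoring function is a placeholder, not Ripper's actual information gain):

    def refine_with_negation(rule, literal, pos_bags, neg_bags):
        """When literal l is induced, also examine its negation: if l is
        misleading, then not-l agrees with the target concept."""
        neg_literal = lambda x: not literal(x)
        def score(literals):
            covers = lambda x: all(l(x) for l in literals)
            return (star_coverage(covers, pos_bags)
                    - star_coverage(covers, neg_bags))
        return max([rule + [literal], rule + [neg_literal]], key=score)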

  15. Analysing Naive-RipperMi: misleading literals
• Build a partition of the instance space.
• Extract the best possible rule: X < 11 & Y < 6 & X > 5 & Y > 3

  16. Analysing Naive-RipperMi: irrelevant literals
• In multiple-instance learning, irrelevant literals can occur anywhere in the rule (e.g. Y < 6 in Y < 6 & Y > 3 & X > 5 & X < 9), instead of mainly at the end of the rule as in the single-instance case.
• Remedy: use global pruning.

  17. Analysing Naive-RipperMi: the literal selection problem
• When the number of instances per bag increases, any literal covers any bag. Thus, we lack information to select good literals.


  19. Analysing Naive-RipperMi: the literal selection problem
• We must take into account the number of covered instances.
• Making an assumption on the distribution of instances can lead to a formal coverage measure.
• The single-distribution model: a bag is made of r instances drawn i.i.d. from a unique distribution D. (+) widely studied in MI learning [Blum 98, Auer 97, ...]; (+) simple coverage measure and good learnability properties; (-) very unrealistic.
• The two-distribution model: a positive (resp. negative) bag is made of r instances drawn i.i.d. from D+ (resp. D-), with at least one (resp. none) covered by f. (+) more realistic; (-) complex formal measure, useful for small numbers of instances (log # bags).
• Goal: design algorithms or measures which "work well" with these models.

  20. Analysing Naive-RipperMi: the literal selection problem
• Compute for each positive bag Pr(at least one of the k covered instances ∈ target concept).
• [Figure: the candidate literal Y > 5 plotted against the target concept.]
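Under the single-distribution model, if each covered instance falls in the target concept independently with probability p, this quantity has a simple closed form; a sketch under that assumption:

    def prob_one_in_concept(k, p):
        """Pr(at least one of the k covered instances lies in the target
        concept), assuming each does so i.i.d. with probability p."""
        return 1.0 - (1.0 - p) ** k

    # e.g. prob_one_in_concept(k=5, p=0.3) is about 0.83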

  21. Analysis of RipperMi: experiments
• Artificial datasets of 100 bags with a variable number of instances per bag.
• Target concept: monomials (hard to learn with 2 instances per bag [Haussler 89]).
• [Figure: error rate (%) as a function of the number of instances per bag.]
• On the mutagenesis problem: Naive-RipperMi 78%, RipperMi-refined-cov 82%.

  22. Application: anchoring symbols [with Bredeche]
• [Figure: a robot's perception of the world ("What is all this?") is segmented and labelled (lab = door), yielding the symbol "I see a door".]
• Learned rule: IF Color = blue AND size > 53 THEN DOOR
• Early experiments with Naive-RipperMi reached 80% accuracy.

  23. Conclusion & future work
• Many problems which existed in relational learning appear clearly within the multiple-instance framework.
• The algorithms presented here are aimed at solving these problems; they were tested on artificial datasets.
• Future work: other realistic models leading to better heuristics; instance selection and attribute selection; MI-propositionalization; applying multiple-instance learning to data-mining tasks.
• Many ongoing applications...
