  1. Conjunctive Formulation of the Random Set Framework for Multiple Instance Learning: Application to Remote Sensing. Jeremy Bolton, Paul Gader. CSI Laboratory, University of Florida

  2. Highlights • Conjunctive forms of Random Sets for Multiple Instance Learning: • Random Sets can be used to solve the MIL problem when multiple concepts are present • Previously developed formulations assume a disjunctive relationship between the learned concepts • The new formulation provides for a conjunctive relationship between concepts; its utility is exhibited on a Ground Penetrating Radar (GPR) data set

  3. Outline • Multiple Instance Learning • MI Problem • RSF-MIL • Multiple Target Concepts • Experimental Results • GPR Experiments • Future Work

  4. Multiple Instance Learning

  5. Standard Learning vs. Multiple Instance Learning • Standard supervised learning • Optimize some model (or learn a target concept) given training samples and corresponding labels • MIL • Learn a target concept given multiple sets of samples and corresponding labels for the sets • Interpretation: learning with uncertain labels / a noisy teacher

  6. Multiple Instance Learning (MIL) • Given: • A set of I bags • Each labeled + or - • The ith bag is a set of Ji samples in some feature space • Interpretation of labels • Goal: learn the target concept • What characteristic is common to the positive bags that is not observed in the negative bags? (A formal statement of the bag-labeling assumption is sketched below.)
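
As a hedged formalization (the notation here is assumed, not from the slides): under the standard MIL assumption, a bag label is the disjunction of its hidden instance labels.

```latex
% Standard MIL bag-labeling assumption (reconstruction; notation assumed).
% Bag B_i = \{x_{i1}, \dots, x_{iJ_i}\} with hidden instance labels y_{ij} \in \{0,1\}.
Y_i = \max_{j = 1,\dots,J_i} y_{ij}
\qquad\Longleftrightarrow\qquad
Y_i = 1 \iff \exists\, j : y_{ij} = 1 .
```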

  7. Multiple Instance Learning • Traditional classification: each sample is labeled individually, e.g. x1 label = 1, x2 label = 1, x3 label = 0, x4 label = 0, x5 label = 1 • Multiple instance learning: labels attach to whole bags, e.g. {x1, x2, x3, x4} label = 1, {x1, x2, x3, x4} label = 1, {x1, x2, x3, x4} label = 0

  8. MIL Application: Example GPR [Figure: EHD (edge histogram descriptor) feature vector] • Collaboration: Frigui, Collins, Torrione • Construction of bags • Collect 15 EHD feature vectors from the 15 depth bins • Mine images = + bags • False-alarm (FA) images = - bags

  9. Standard vs. MI Learning: GPR Example [Figure: EHD feature vector] • Standard learning • Each training sample (feature vector) must have a label • An arduous task • Many feature vectors per image and multiple images • Difficult to label given GPR echoes, ground-truthing errors, etc. • The label of each vector may not be known

  10. Standard vs. MI Learning: GPR Example [Figure: EHD feature vector] • Multiple instance learning • Each training bag must have a label • No need to label all feature vectors; just identify images (bags) where targets are present • Implicitly accounts for class-label uncertainty

  11. Random Set Framework for Multiple Instance Learning

  12. Random Set Brief • Random Set

  13. How can we use Random Sets for MIL? • Random set for MIL: bags are sets (multi-sets) • The idea of finding the commonality of positive bags is inherent in the random set formulation • Sets have an empty-intersection or non-empty-intersection relationship • Find commonality using the intersection operator • The random set's governing functional is based on the intersection operator • Capacity functional T, a.k.a. the noisy-OR gate (Pearl 1988): "it is NOT the case that EACH element is NOT the target concept" (the double negation is sketched below)
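
A minimal sketch of the connection, with notation assumed (the slide's equations were not transcribed): the capacity functional evaluated on a bag is exactly a noisy-OR over its instances.

```latex
% Capacity functional of a random set X (standard definition):
T_X(K) = P(X \cap K \neq \emptyset), \quad K \text{ a test set (here, a bag)}.

% Noisy-OR reading for a bag B = \{x_1, \dots, x_J\}, assuming independent
% coverage events: the bag is positive unless EACH element fails to hit the
% target concept ("it is NOT the case that EACH element is NOT the concept"):
P(X \cap B \neq \emptyset) = 1 - \prod_{j=1}^{J} \big(1 - P(x_j \in X)\big).
```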

  14. Random Set Functionals • Capacity functionals for the intersection calculation • Use a germ-and-grain model to model the random set • Multiple (J) concepts • Calculate the probability of intersection given X and the germ-and-grain pairs • Grains are governed by random radii with an assumed cumulative distribution [Figure: random set model parameters; germs and grains]
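
The slide's equations were lost in transcription; the following is a hedged reconstruction of a typical germ-and-grain capacity, with germs c_j, concept-specific radius CDFs F_j, and Euclidean distance all assumed.

```latex
% Grain j is a ball centered at germ c_j with random radius R_j ~ F_j.
% Probability that point x is NOT covered by grain j:
P\big(R_j < \lVert x - c_j \rVert\big) = F_j\big(\lVert x - c_j \rVert\big)

% Capacity functional at a single point, J independent germ-grain pairs:
T_X(\{x\}) = 1 - \prod_{j=1}^{J} F_j\big(\lVert x - c_j \rVert\big)
```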

  15. RSF-MIL: Germ and Grain Model [Figure: scatter of instances (x) and target concepts (T) in feature space] • Positive bags = blue • Negative bags = orange • Distinct shapes = distinct bags

  16. Multiple Instance Learning with Multiple Concepts

  17. Multiple Concepts: Disjunction or Conjunction? • Disjunction • When you have multiple types of concepts • When each instance can indicate the presence of a target • Conjunction • When you have a target type that is composed of multiple (necessary) concepts • When each instance can indicate a concept, but not necessarily the composite target type

  18. Conjunctive RSF-MIL • Previously developed disjunctive RSF-MIL (RSF-MIL-d): noisy-OR combination across concepts and samples • Conjunctive RSF-MIL (RSF-MIL-c): standard noisy-OR for each single concept j, combined with a noisy-AND across concepts
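
The slide's formulas did not survive transcription; the following is a hedged reconstruction from the stated combination rules, writing p_{ij} for the probability that instance x_i expresses concept j (notation assumed).

```latex
% RSF-MIL-d: noisy-OR across BOTH concepts and samples (disjunctive);
% the bag is positive if ANY sample expresses ANY concept.
P_d(+ \mid B) = 1 - \prod_{j=1}^{J} \prod_{i=1}^{n} (1 - p_{ij})

% RSF-MIL-c: standard noisy-OR within each concept j, then a
% noisy-AND across concepts; EVERY concept must appear in the bag.
P_c(+ \mid B) = \prod_{j=1}^{J} \Big( 1 - \prod_{i=1}^{n} (1 - p_{ij}) \Big)
```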

  19. Synthetic Data Experiments • The Extreme Conjunct data set requires that a target bag exhibit two distinct concepts rather than one or none [Table: AUC, with AUC when initialized near the solution in parentheses]

  20. Application to Remote Sensing

  21. Disjunctive Target Concepts • Using large overlapping bins (gross extraction), the target concept can be encapsulated within one instance; therefore a disjunctive relationship exists [Diagram: target concept types 1 … n, each through a noisy-OR, combined by an OR into "Target concept present?"]

  22. What if we want features with finer granularity? • Fine extraction • More detail about the image and more shape information, but we may lose the disjunctive nature between (multiple) instances • Our features have more granularity, so our concepts may be constituents of a target rather than encapsulating the target concept [Diagram: constituent concept 1 (top of hyperbola) and constituent concept 2 (wings of hyperbola), each through a noisy-OR, combined by an AND into "Target concept present?"]

  23. GPR Experiments • Extensive GPR data set • ~800 targets • ~5,000 non-targets • Experimental design • Run RSF-MIL-d (disjunctive) and RSF-MIL-c (conjunctive) • Compare both feature extraction methods • Gross extraction: bins large enough to encompass the target concept • Fine extraction: non-overlapping bins • Hypothesis • RSF-MIL-d will perform well when using gross extraction, whereas RSF-MIL-c will perform well using fine extraction

  24. Experimental Results [Figures: results for gross extraction and fine extraction] • Highlights • RSF-MIL-d using gross extraction performed best • RSF-MIL-c performed better than RSF-MIL-d when using fine extraction • Other influencing factors: the optimization methods for RSF-MIL-d and RSF-MIL-c are not the same

  25. Future Work • Implement a general form that can learn the disjunctive or conjunctive relationship from the data • Implement a general form that can learn the number of concepts • Incorporate spatial information • Develop an improved optimization scheme for RSF-MIL-c

  26. Backup Slides

  27. MIL Example (AHI Imagery) • Robust learning tool • MIL tools can learn a target signature with limited or incomplete ground truth • Which spectral signature(s) should we use to train a target model or classifier? [Figure annotations: spectral mixing; background signal; ground truth not exact]

  28. MI-RVM • Addition of set observations and inference using noisy-OR to an RVM model • Prior on the weight w
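
The slide's equations were not transcribed; a plausible reconstruction of the MI-RVM bag likelihood and weight prior (the sigmoid instance model and the notation are assumptions) is:

```latex
% Instance probability under an RVM-style linear model with sigmoid link:
p_{ij} = \sigma\big(w^{\top}\phi(x_{ij})\big), \qquad \sigma(a) = \frac{1}{1 + e^{-a}}

% Noisy-OR bag likelihood: bag i is positive unless every instance fails.
P(y_i = 1 \mid B_i, w) = 1 - \prod_{j=1}^{J_i} (1 - p_{ij})

% Zero-mean Gaussian prior on the weights, precision matrix A:
p(w \mid A) = \mathcal{N}\big(w \mid 0,\; A^{-1}\big)
```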

  29. SVM review • Classifier structure • Optimization
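
The equations on this slide were images and are lost; for reference, the standard soft-margin SVM (a reconstruction, not necessarily the slide's exact content) is:

```latex
% Classifier structure: sign of an affine function of the input.
f(x) = \operatorname{sign}\big(w^{\top}x + b\big)

% Soft-margin primal: maximize the margin while penalizing violations.
\min_{w, b, \xi}\; \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i} \xi_{i}
\quad \text{s.t.}\quad y_{i}\big(w^{\top}x_{i} + b\big) \ge 1 - \xi_{i}, \qquad \xi_{i} \ge 0 .
```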

  30. MI-SVM Discussion • The RVM was altered to fit the MIL problem by changing the form of the target variable's posterior to model a noisy-OR gate • The SVM can be altered to fit the MIL problem by changing how the margin is calculated • Boost the margin between the bag (rather than the samples) and the decision surface • Look for the MI separating linear discriminant • At least one sample from each positive bag lies in the positive half-space

  31. mi-SVM • Enforce the MI scenario using extra constraints • At least one sample in each positive bag must have a label of 1 • All samples in each negative bag must have a label of -1 • Mixed integer program: must find the optimal hyperplane and the optimal labeling set (a reconstruction is sketched below)
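
A hedged sketch of the mixed-integer program, following the standard mi-SVM formulation (indexing assumed):

```latex
% mi-SVM: jointly optimize instance labels and the hyperplane.
\min_{\{y_i\}}\; \min_{w, b, \xi}\;
\frac{1}{2}\lVert w\rVert^{2} + C \sum_{i} \xi_{i}

% subject to, for every instance i:
\text{s.t.}\quad y_{i}\big(w^{\top}x_{i} + b\big) \ge 1 - \xi_{i}, \quad \xi_{i} \ge 0,
\quad y_{i} \in \{-1, +1\}

% label constraints from the bags:
y_{i} = -1 \;\; \forall i \in B^{-} \text{ (negative bags)}, \qquad
\sum_{i \in B} \frac{y_{i} + 1}{2} \ge 1 \;\; \forall B \in \mathcal{B}^{+} .
```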

  32. Current Applications • Multiple Instance Learning • MI Problem • MI Applications • Multiple Instance Learning: Kernel Machines • MI-RVM • MI-SVM • Current Applications • GPR imagery • HSI imagery

  33. HSI: Target Spectra Learning • Given labeled areas of interest: learn target signature • Given test areas of interest: classify set of samples

  34. Overview of MI-RVM Optimization • Two-step optimization • Step 1: estimate the optimal w, given the posterior of w • There is no closed-form solution for the parameters of the posterior, so a gradient update method is used • Iterate until convergence, then proceed to step 2 • Step 2: update the parameter on the prior of w • The distribution on the target variable has no specific parameters • Until the system converges, continue at step 1

  35. 1) Optimization of w • Optimize the posterior (Bayes' rule) of w • Update the weights using the Newton-Raphson method (sketched below)
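
The update itself was not transcribed; a generic Newton-Raphson step on the log-posterior (notation assumed) looks like:

```latex
% Newton-Raphson step on the log-posterior L(w) = \log p(w \mid \text{data}):
% g is the gradient and H the Hessian of L at the current w.
w^{(t+1)} = w^{(t)} - H^{-1} g,
\qquad
g = \nabla_{w} L\big(w^{(t)}\big), \quad
H = \nabla_{w}\nabla_{w} L\big(w^{(t)}\big) .
```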

  36. 2) Optimization of Prior • Optimization of the covariance of the prior • Under a large number of simplifying assumptions, the diagonal elements of A can be estimated

  37. Random Sets: Multiple Instance Learning • Random set framework for multiple instance learning • Bags are sets • The idea of finding the commonality of positive bags is inherent in the random set formulation • Find commonality using the intersection operator • The random set's governing functional is based on the intersection operator

  38. MI issues • MIL approaches • Some approaches are biased toward the belief that only one sample in each bag caused the target concept • Some approaches can only label bags • It is not clear whether anything is gained over supervised approaches

  39. RSF-MIL [Figure: scatter of instances (x) and target concepts (T) in feature space] • MIL-like • Positive bags = blue • Negative bags = orange • Distinct shapes = distinct bags

  40. Side Note: Bayesian Networks • Noisy-OR Assumption • Bayesian Network representation of Noisy-OR • Polytree: singly connected DAG
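
For reference (a standard result, not transcribed from the slide), the noisy-OR conditional probability factorizes over causes:

```latex
% Noisy-OR gate (Pearl 1988): cause X_k fires the effect Z with
% probability q_k, independently; Z is 0 only if every active cause fails.
P(Z = 1 \mid X_1, \dots, X_n) = 1 - \prod_{k=1}^{n} (1 - q_k)^{X_k},
\qquad X_k \in \{0, 1\} .
```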

  41. Side Note • A full Bayesian network may be intractable • Occurrences of causal factors are rare (sparse co-occurrence) • So assume a polytree • So assume the result has a Boolean relationship with the causal factors • Absorb I, X, and A into one node, governed by the randomness of I • These assumptions greatly simplify the inference calculation • Calculate Z based on probabilities rather than constructing a distribution using X

  42. Diverse Density (DD) • Probabilistic approach • Goal: • Standard statistical approaches identify areas in a feature space with a high density of target samples and a low density of non-target samples • DD: identify areas in a feature space with a high "density" of samples from EACH of the positive bags ("diverse"), and a low density of samples from negative bags • Identify attributes or characteristics similar to the positive bags and dissimilar from the negative bags • Assume t is a target characterization • Goal, assuming the bags are conditionally independent (a reconstruction is sketched below)
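
The objective did not survive transcription; the standard Diverse Density objective (Maron & Lozano-Perez; notation assumed) is:

```latex
% Diverse Density: find the point t that every positive bag supports
% and no negative bag supports, assuming conditional independence.
\hat{t} = \arg\max_{t}\;
\prod_{i} P\big(t \mid B_i^{+}\big) \prod_{i} P\big(t \mid B_i^{-}\big)
```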

  43. Diverse Density • Calculation (noisy-OR model): "it is NOT the case that EACH element is NOT the target concept" (see the sketch below) • Optimization
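
A hedged reconstruction of the noisy-OR calculation used by DD (the Gaussian-like instance model is the usual choice; notation assumed):

```latex
% Per-bag noisy-OR: positive bags should contain at least one instance
% near t, negative bags none.
P\big(t \mid B_i^{+}\big) = 1 - \prod_{j}\Big(1 - P\big(x_{ij} \in \text{concept } t\big)\Big),
\qquad
P\big(t \mid B_i^{-}\big) = \prod_{j}\Big(1 - P\big(x_{ij} \in \text{concept } t\big)\Big)

% Common instance model:
P\big(x_{ij} \in \text{concept } t\big) = \exp\big(-\lVert x_{ij} - t\rVert^{2}\big)
```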

  44. Random Set Brief • Random Set

  45. Random Set Functionals • Capacity and avoidance functionals (definitions sketched below) • Given a germ-and-grain model • Assumed random radii
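
The functionals themselves were images on the slide; the standard definitions (a reconstruction) are:

```latex
% Capacity functional: probability that the random set X hits K.
T_X(K) = P(X \cap K \neq \emptyset)

% Avoidance functional: probability that X misses K entirely.
Q_X(K) = P(X \cap K = \emptyset) = 1 - T_X(K)
```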

  46. When disjunction makes sense • Using large overlapping bins, the target concept can be encapsulated within one instance; therefore a disjunctive relationship exists [Diagram: noisy-OR combination into "Target concept present"]

  47. Theoretical and Developmental Progress • Previous optimization: • Did not necessarily promote diverse density • Current optimization: • Better for context learning and MIL • Previously no feature relevance or selection (hypersphere) • Improvement: included learned weights on each feature dimension • Previous TO DO list • Improve existing code • Develop joint optimization for context learning and MIL • Apply MIL approaches (broad scale) • Learn similarities between feature sets of mines • Aid in training existing algorithms: find the "best" EHD features for training / testing • Construct set-based classifiers?

  48. How do we impose the MI scenario?: Diverse Density (Maron et al.) • Calculation (noisy-OR model): "it is NOT the case that EACH element is NOT the target concept" • Inherent in the Random Set formulation • Optimization: a combination of exhaustive search and gradient ascent

  49. How can we use Random Sets for MIL? • Random set for MIL: bags are sets • The idea of finding the commonality of positive bags is inherent in the random set formulation • Sets have an empty-intersection or non-empty-intersection relationship • Find commonality using the intersection operator • The random set's governing functional is based on the intersection operator • Example: • Bags with target: {l,a,e,i,o,p,u,f}, {f,b,a,e,i,z,o,u}, {a,b,c,i,o,u,e,p,f}, {a,f,t,e,i,u,o,d,v} • Bags without target: {s,r,n,m,p,l}, {z,s,w,t,g,n,c}, {f,p,k,r}, {q,x,z,c,v}, {p,l,f} • Target concept = (intersection of positive bags) \ (union of negative bags) = {a,e,i,o,u,f} \ {f,s,r,n,m,p,l,z,w,t,g,c,v,q,k,x} = {a,e,i,o,u} (the worked example is reproduced in code below)
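
To make the worked example concrete, here is a minimal Python sketch (not from the original slides) that reproduces it with set operations:

```python
# Reproduces the slide's vowel example: the target concept is what all
# positive bags share and no negative bag contains.
positive_bags = [
    {"l", "a", "e", "i", "o", "p", "u", "f"},
    {"f", "b", "a", "e", "i", "z", "o", "u"},
    {"a", "b", "c", "i", "o", "u", "e", "p", "f"},
    {"a", "f", "t", "e", "i", "u", "o", "d", "v"},
]
negative_bags = [
    {"s", "r", "n", "m", "p", "l"},
    {"z", "s", "w", "t", "g", "n", "c"},
    {"f", "p", "k", "r"},
    {"q", "x", "z", "c", "v"},
    {"p", "l", "f"},
]

# Intersection of all positive bags: elements common to every positive bag.
common = set.intersection(*positive_bags)   # {'a','e','i','o','u','f'}

# Union of all negative bags: anything ever seen in a negative bag.
seen_negative = set.union(*negative_bags)

# Target concept: common to positives, absent from negatives.
print(sorted(common - seen_negative))       # ['a', 'e', 'i', 'o', 'u']
```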
