

  1. Active Learning Query Strategies (September 27, 2010)

  2. Outline
  • Previous lecture:
    • Uncertainty Sampling
    • Density-Weighted Methods
  • This lecture:
    • Query-By-Committee
    • Expected Model Change
    • Expected Error Reduction
    • Variance Reduction

  3. Query-By-Committee
  • Idea: limit the size of the version space
  • Maintain a committee of models (e.g., a set of classifiers) trained on the same labeled dataset
  • Each model represents a different region of the version space
  • Each member of the committee casts a vote on every query instance
  • Pick the instance on which the committee most disagrees

  4. Query-By-Committee
  (figure: hypotheses in the version space and the region of disagreement between them)
  • Pick unlabeled instances within the disagreement region

  5. Query-By-Committee
  • How do we measure the disagreement?
  • Vote entropy: $x^*_{VE} = \arg\max_x -\sum_i \frac{V(y_i)}{C} \log \frac{V(y_i)}{C}$, where $V(y_i)$ is the number of votes that label $y_i$ receives and $C$ is the committee size
  • KL divergence: $x^*_{KL} = \arg\max_x \frac{1}{C} \sum_{c=1}^{C} D\big(P_{\theta^{(c)}} \,\|\, P_{\mathcal{C}}\big)$, where $P_{\mathcal{C}}(y_i \mid x) = \frac{1}{C} \sum_{c} P_{\theta^{(c)}}(y_i \mid x)$ is the consensus probability that label $y_i$ is correct
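Both disagreement measures reduce to a few lines of NumPy. The sketch below assumes the committee's vote counts and per-member label distributions have already been computed; the `probs` array at the end is made-up illustration data, not from the slides:

```python
import numpy as np

def vote_entropy(vote_counts):
    """Vote entropy for one candidate instance.

    vote_counts: length-|Y| array; V(y_i) = number of committee
    members that voted for label y_i.
    """
    C = vote_counts.sum()                 # committee size
    p = vote_counts / C                   # vote fraction per label
    p = p[p > 0]                          # drop zero-vote labels (0 * log 0 = 0)
    return float(-(p * np.log(p)).sum())

def mean_kl_disagreement(member_probs):
    """Average KL divergence from each member to the consensus.

    member_probs: (C, |Y|) array; row c holds P_c(y|x) for member c.
    Assumes all probabilities are strictly positive.
    """
    consensus = member_probs.mean(axis=0)  # consensus P_C(y|x)
    kl = (member_probs * np.log(member_probs / consensus)).sum(axis=1)
    return float(kl.mean())

# Example: a 3-member committee over 3 labels. The query strategy
# scores every unlabeled instance this way and picks the argmax.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4]])
print(vote_entropy(np.array([1, 1, 1])))   # one vote each: maximal disagreement
print(mean_kl_disagreement(probs))
```

Vote entropy only sees the hard votes, while the KL measure uses each member's full label distribution, so the two can rank the same pool differently.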

  6. Expected Model Change
  • Pick the instance that would change the current model the most if its label were known
  • How do we measure change in the model?
  • Expected Gradient Length (EGL): the norm of the gradient of the objective function, in expectation over the possible labels: $x^*_{EGL} = \arg\max_x \sum_i P_\theta(y_i \mid x) \, \big\| \nabla \ell_\theta\big(\mathcal{L} \cup \langle x, y_i \rangle\big) \big\|$
  • Assuming training converged in the previous iteration, $\nabla \ell_\theta(\mathcal{L}) \approx 0$, so the gradient of the augmented set is approximately the gradient contributed by $\langle x, y_i \rangle$ alone
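A model-agnostic sketch of EGL follows. It assumes two caller-supplied callbacks (both hypothetical names, not a specific library API): `predict_proba` for the current model's label belief and `grad_fn` for the single-example gradient:

```python
import numpy as np

def egl_query(pool, predict_proba, grad_fn):
    """Expected Gradient Length query selection (sketch).

    pool          - list of unlabeled instances
    predict_proba - callback x -> array of P(y_i | x) under the current model
    grad_fn       - callback (x, y_index) -> gradient of the training
                    objective for the single example <x, y>; because
                    training has converged, the gradient over the labeled
                    set is ~0 and this term approximates the full
                    gradient over L + <x, y>.
    """
    def score(x):
        probs = predict_proba(x)
        # Expected gradient norm, under the model's own label distribution
        return sum(p * np.linalg.norm(grad_fn(x, y))
                   for y, p in enumerate(probs))
    return max(pool, key=score)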

  7. Expected Error Reduction
  • Pick the instance that reduces the expected generalization error
  • Minimize the expectation of the loss function over the unlabeled pool, using either:
  • 0/1 loss
  • Log loss

  8. Expected Error Reduction
  • 0/1 loss: $x^*_{0/1} = \arg\min_x \sum_i P_\theta(y_i \mid x) \Big( \sum_{u=1}^{U} 1 - P_{\theta^{+\langle x, y_i \rangle}}(\hat{y} \mid x^{(u)}) \Big)$, where $\theta^{+\langle x, y_i \rangle}$ denotes the new parameters after retraining on $\mathcal{L} \cup \langle x, y_i \rangle$
  • Log loss: $x^*_{log} = \arg\min_x \sum_i P_\theta(y_i \mid x) \Big( -\sum_{u=1}^{U} \sum_j P_{\theta^{+}}(y_j \mid x^{(u)}) \log P_{\theta^{+}}(y_j \mid x^{(u)}) \Big)$
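A brute-force sketch of the log-loss variant, assuming a scikit-learn-style estimator with `fit`/`predict_proba` (an assumed interface, not a confirmed API of any particular library):

```python
import copy
import numpy as np

def eer_query(model, X_lab, y_lab, pool, labels):
    """Expected Error Reduction with log loss (brute-force sketch).

    X_lab, y_lab - current labeled set L as parallel lists
    pool         - list of unlabeled feature vectors
    labels       - possible label values, assumed to match the column
                   order of predict_proba
    For each candidate x and each label y_i, retrain a copy of the model
    on L + <x, y_i> and measure the total predictive entropy over the
    rest of the pool; return the candidate with the lowest expected entropy.
    """
    best_x, best_risk = None, np.inf
    for k, x in enumerate(pool):
        probs = model.predict_proba([x])[0]          # P_theta(y_i | x)
        rest = [u for j, u in enumerate(pool) if j != k]
        risk = 0.0
        for y_i, p in zip(labels, probs):
            m = copy.deepcopy(model)
            m.fit(X_lab + [x], y_lab + [y_i])        # theta^{+<x, y_i>}
            P = m.predict_proba(rest)
            risk += p * float(-(P * np.log(P + 1e-12)).sum())
        if risk < best_risk:
            best_x, best_risk = x, risk
    return best_x
```

Note the cost: one query requires O(|pool| × |Y|) retrainings plus a full pass over the pool for each, which is exactly why the discussion on slide 9 flags large unlabeled pools and large label sets as problematic for this strategy.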

  9. Discussion
  • Task: gene prediction using CRFs
  • The feature space is large
  • The set of possible labelings is large (structured outputs)
  • The unlabeled pool is large

  10. Discussion

  11. Variance Reduction
  • In expected error reduction, a closed form of the expected loss is available for some models, such as Gaussian random fields
  • What if it is not available?
  • Minimize the risk by minimizing the output variance:
    Risk = (noise) + (model bias) + (output variance)
    • Noise: model-independent
    • Model bias: invariant if the model class is fixed
    • Output variance: minimize this term

  12. Variance Reduction
  • The output variance can be estimated as $\tilde{\sigma}^2_{\hat{y}}(x) \approx \big(\frac{\partial \hat{y}}{\partial \theta}\big)^{\top} F^{-1} \big(\frac{\partial \hat{y}}{\partial \theta}\big)$, where $\frac{\partial \hat{y}}{\partial \theta}$ is the gradient of the predicted output with respect to the model parameters and $F^{-1}$ is the inverse of the Fisher information matrix

  13. Variance Reduction
  • The Fisher information matrix is the negative expected second derivative of the log likelihood with respect to the model parameters: $F = -\mathbb{E}\big[\frac{\partial^2}{\partial \theta^2} \log P_\theta(y \mid x)\big]$
  • It measures how strongly the model parameters affect the objective function
  • Maximizing the Fisher information picks the instances that constrain the model parameters the most
  • By the Cramér–Rao bound, this is equivalent to minimizing its inverse, which in turn minimizes the output variance
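A minimal NumPy sketch of the variance estimate from slide 12. It ranks pool instances by their current estimated output variance, a common simplification of the full criterion (which re-estimates the variance after each hypothetical labeling); all gradients here are made-up illustration data:

```python
import numpy as np

def empirical_fisher(scores):
    """Empirical Fisher information: average outer product of the score
    vectors (per-example gradients of the log likelihood)."""
    S = np.asarray(scores)               # shape (n, d)
    return S.T @ S / len(S)

def output_variance(grad_y, fisher, ridge=1e-8):
    """Estimated output variance for one instance:
    sigma^2(x) ~ (d y_hat / d theta)^T F^{-1} (d y_hat / d theta).
    A small ridge keeps F safely invertible."""
    d = len(grad_y)
    return float(grad_y @ np.linalg.solve(fisher + ridge * np.eye(d), grad_y))

# Variance-reduction query: pick the unlabeled instance whose predicted
# output is least constrained by the current parameters.
rng = np.random.default_rng(0)
F = empirical_fisher(rng.normal(size=(50, 4)))   # fake score vectors
pool_grads = rng.normal(size=(10, 4))            # d y_hat / d theta per instance
query_idx = max(range(len(pool_grads)),
                key=lambda i: output_variance(pool_grads[i], F))
print(query_idx)
```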

  14. Discussion: Variance Reduction
  • What if the parameter space is large?
  • What if the unlabeled pool is large?
  • What if the unlabeled pool is unbalanced?

  15. Discussion
  (figure: two candidate query points, labeled A and B)
  • Which point would you pick, A or B?
  • Which of the following sampling strategies would you pick?
    • Expected Model Change
    • Expected Error Reduction
    • Variance Reduction
  • How about QBC and uncertainty sampling?

  16. References
  1. B. Settles. Active Learning Literature Survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison, 2009.
