
Active learning

Learn about Active Learning algorithms that actively select informative data samples, enhancing learning performance. Explore Membership Queries, Stream-based and Pool-based Sampling Strategies, and Uncertainty Sampling for better data selection. Understand Conformal Prediction to complement machine learning predictions with reliability measures.

merrittt

Presentation Transcript


  1. Active learning • The learning algorithm must have some control over the data from which it learns • It must be able to query an oracle, requesting labels for the data samples that seem most informative for the learning process • Proper selection of samples yields better performance with less data

  2. Scenarios • Learning with membership queries • Stream-based sampling • Pool-based sampling

  3. Strategies • Uncertainty sampling • Query-by-committee • Density-weighted…
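As an illustration of the first strategy, uncertainty sampling, here is a minimal sketch. It assumes the classifier outputs a probability vector per sample; the function names and the three scoring rules (least confidence, margin, entropy) are common choices from the active learning literature, not taken from the slides.

```python
from math import log

def least_confidence(probs):
    """Uncertainty as one minus the probability of the most likely class."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely classes (a small gap means uncertain)."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def entropy(probs):
    """Shannon entropy of the predicted class distribution."""
    return -sum(p * log(p) for p in probs if p > 0)

def most_uncertain(pool_probs, k, score=entropy):
    """Indices of the k pool samples with the highest uncertainty score."""
    order = sorted(range(len(pool_probs)),
                   key=lambda i: score(pool_probs[i]),
                   reverse=True)
    return order[:k]

# Hypothetical class probabilities for three unlabelled samples.
pool_probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
queried = most_uncertain(pool_probs, k=2)
```

Here `queried` holds the indices of the samples that would be sent to the oracle first.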

  4. Conformal prediction • Complements the predictions made by machine learning algorithms with measures of reliability • The label predicted for a new object should make it similar to the previously seen objects • The degree of similarity is used to estimate the confidence in the prediction

  5. Conformal prediction algorithm • Inputs: a training sample and a test sample • Consider every possible value of the label for the test sample; • Compute nonconformity scores and p-values for each candidate label; • Predict the label with the largest p-value; • Output one minus the second-largest p-value as the confidence of the prediction; • Output the largest p-value as the credibility of the prediction.
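The steps of the conformal prediction algorithm can be sketched as follows. This is a toy version under stated assumptions: the nonconformity score is the distance of a sample to the mean of its own class, chosen only to keep the example short (the slides use SVM Lagrange multipliers instead), and all function names are hypothetical.

```python
from math import dist

def class_mean(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(c) / len(points) for c in zip(*points)]

def nonconformity_scores(bag):
    """Assumed score: distance of each (x, y) to the mean of class y in the bag."""
    means = {label: class_mean([x for x, y in bag if y == label])
             for label in {y for _, y in bag}}
    return [dist(x, means[y]) for x, y in bag]

def conformal_predict(train, x_new, labels):
    """Return (prediction, confidence, credibility, p_values) for x_new."""
    p_values = {}
    for y in labels:                      # try every candidate label
        bag = train + [(x_new, y)]
        scores = nonconformity_scores(bag)
        alpha_new = scores[-1]
        # Fraction of examples at least as nonconforming as the new one.
        p_values[y] = sum(a >= alpha_new for a in scores) / len(bag)
    ranked = sorted(p_values, key=p_values.get, reverse=True)
    prediction = ranked[0]
    credibility = p_values[ranked[0]]       # largest p-value
    confidence = 1.0 - p_values[ranked[1]]  # one minus second-largest p-value
    return prediction, confidence, credibility, p_values

# Toy data: two well-separated one-dimensional classes.
train = [([0.0], 0), ([0.2], 0), ([0.1], 0),
         ([1.0], 1), ([1.2], 1), ([1.1], 1)]
prediction, confidence, credibility, p_values = conformal_predict(
    train, [0.15], labels=[0, 1])
```

A high confidence together with a low credibility would indicate that the test sample does not resemble any class well.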

  6. Nonconformity scores and p-values • The Lagrange multipliers computed during SVM training are used as nonconformity scores • Extended to the multiclass setting with a one-vs-rest approach • P-values are computed from the nonconformity scores
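The transcript does not preserve the formula shown on this slide. For reference, the p-value commonly used in conformal prediction is given below; it is an assumption that the slide uses this form. The symbols are introduced here for illustration: \alpha_1, \dots, \alpha_n are the nonconformity scores of the n training examples and \alpha_{n+1}^{y} is the score of the test example when it is assigned the candidate label y.

```latex
p(y) = \frac{\left|\{\, i \in \{1, \dots, n+1\} : \alpha_i \ge \alpha_{n+1}^{y} \,\}\right|}{n+1}
```

A small p-value means that few examples are as nonconforming as the test example under label y, so that label is implausible.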

  7. Active learning algorithm • Inputs • Initial training set T, calibration set C, pool of candidate samples U • Selection threshold τ, batch size β • Train an initial classifier on T • While a stopping criterion is not reached • Apply the current classifier to the pool of samples • Rank the samples in the pool using the uncertainty criterion • Select the top β examples whose certainty level falls below the selection threshold τ • Ask the teacher to label the selected examples and add them to the training set • Train a new classifier on the expanded training set
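The loop above can be sketched roughly as follows. The classifier, the certainty measure and the oracle are hypothetical stand-ins passed in as functions, and a fixed round limit (`max_rounds`) replaces the stopping criterion; the toy example uses a one-dimensional midpoint classifier rather than the SVM and conformal certainty of the slides.

```python
def active_learning(train, pool, oracle, fit, certainty, tau, beta, max_rounds=10):
    """Pool-based loop: query up to beta samples whose certainty is below tau."""
    train, pool = list(train), list(pool)
    model = fit(train)                       # initial classifier on T
    for _ in range(max_rounds):              # stand-in stopping criterion
        # Rank pool samples from least to most certain under the current model.
        scored = sorted(((certainty(model, x), x) for x in pool),
                        key=lambda t: t[0])
        selected = [x for c, x in scored if c < tau][:beta]
        if not selected:                     # nothing below tau, or pool exhausted
            break
        for x in selected:
            train.append((x, oracle(x)))     # ask the teacher for the label
            pool.remove(x)
        model = fit(train)                   # retrain on the expanded set
    return model, train, pool

# Toy stand-ins (assumptions): the model is the midpoint between class means.
def fit(train):
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def certainty(midpoint, x):
    return abs(x - midpoint)                 # far from the boundary = certain

def oracle(x):
    return int(x >= 0.5)                     # plays the role of the teacher

model, labelled, remaining = active_learning(
    train=[(0.0, 0), (1.0, 1)], pool=[0.1, 0.45, 0.55, 0.9, 0.48],
    oracle=oracle, fit=fit, certainty=certainty, tau=0.1, beta=2)
```

In this toy run only the samples closest to the decision boundary are queried; the rest of the pool stays unlabelled.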

  8. Stopping criteria • Pre-specified size for the training set • Exhaustion of the pool of candidate samples • Early stopping • Implemented using the calibration set • Active selection stops when newly trained classifiers no longer improve on the calibration set
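One possible way to implement the early-stopping criterion is sketched below. The `patience` and `min_delta` parameters and the use of plain accuracy on the calibration set are assumptions made for illustration; the slides do not specify how "no improvement" is measured.

```python
def accuracy(predict, calibration):
    """Fraction of calibration examples (x, y) that predict(x) labels correctly."""
    return sum(predict(x) == y for x, y in calibration) / len(calibration)

class EarlyStop:
    """Signals a stop after `patience` consecutive rounds without improvement."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale = 0

    def should_stop(self, score):
        if score > self.best + self.min_delta:
            self.best = score        # new best calibration score
            self.stale = 0
        else:
            self.stale += 1          # this round brought no improvement
        return self.stale >= self.patience

# Hypothetical calibration accuracies after successive retraining rounds.
stopper = EarlyStop(patience=2)
decisions = [stopper.should_stop(s) for s in [0.70, 0.80, 0.80, 0.79]]

# Accuracy of a simple threshold rule on a tiny hypothetical calibration set.
calib_acc = accuracy(lambda x: int(x >= 0.5), [(0.2, 0), (0.7, 1), (0.6, 0)])
```

Inside an active learning loop, `should_stop` would be called once per round with the score of the newly trained classifier on the calibration set.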
