
On discarding, caching and recalling samples from Active Learning






Presentation Transcript


  1. On discarding, caching and recalling samples from Active Learning Written by Ashish Kapoor and Eric Horvitz, 2007

  2. Active Learning • A type of supervised learning • Unlabeled data is abundant, but labels are expensive to obtain • The active learner (AL) can ask a teacher for labels • Pool-based vs. stream-based data

  3. Problem with Active Learning • Labeled data can become outdated in a dynamic environment • One way to handle this: discarding, caching and recalling samples

  4. Motivation • Handwriting recognition • Spam detection • Speech recognition • Tasks where the data might vary over time

  5. Keywords • Value of forgetting (VOF) • Value of recalling (VOR) • Expected value of probing (VOP) • Expected value of information (VOI)
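
All four quantities are expected-utility differences. As a rough, minimal sketch only (not the paper's formulation): utility is approximated below by accuracy on recently seen labeled cases and the classifier is a logistic regression, whereas the paper uses expected misclassification cost under a Gaussian process classifier. The helper names `fit` and `utility` are invented for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit(points):
        """Train a classifier w on a list of (x, y) pairs."""
        X = np.array([x for x, _ in points])
        y = np.array([label for _, label in points])
        return LogisticRegression().fit(X, y)

    def utility(clf, recent):
        """Utility proxy: accuracy on recently seen labeled cases.
        (Assumption: the paper uses expected misclassification cost
        under a Gaussian process classifier instead.)"""
        X = np.array([x for x, _ in recent])
        y = np.array([label for _, label in recent])
        return clf.score(X, y)

In these terms, VOF(x) is roughly the utility after refitting without x minus the utility before, VOR(x) is the same difference for adding a cached point back, and VOP subtracts the cost of the probe from the expected gain of acquiring a new label.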

  6. Decision-theoretic active learning • Input data stream: X_T = {x_1, ..., x_{t-1}, x_t, x_{t+1}, ..., x_T} • Initial classifier: w_0 • Maximum size of the buffer: S_buff • Size of horizon: k_horiz • Set of active points L = {}, cache C = {} and buffer B = {} • for t = 1, ..., T • Observe the data point x_t • B = B ∪ {x_t} • if size(B) > S_buff, discard the oldest point • %Seek cycle: pursuing new labels • %Cache cycle: forgetting & storing labeled cases • %Recall cycle: remembering discarded cases • end
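
A minimal runnable sketch of this loop, with the three cycles passed in as functions (sketched under slides 7-9); `oracle` stands in for the human teacher and is a hypothetical helper, not part of the paper.

    from collections import deque

    def active_learning_stream(stream, oracle, s_buff, k_horiz,
                               seek_cycle, cache_cycle, recall_cycle):
        """Run the loop above. `stream` yields unlabeled points,
        `oracle` plays the teacher; the three cycle functions are
        sketched under slides 7-9."""
        L, C = [], []                # active labeled set and cache
        B = deque(maxlen=s_buff)     # bounded buffer: oldest point drops out
        clf = None                   # classifier w_t (None until first fit)
        for x_t in stream:
            B.append(x_t)                           # observe and buffer x_t
            clf = seek_cycle(x_t, clf, L, oracle)   # seek new labels (VOP)
            clf = cache_cycle(clf, L, C, k_horiz)   # forget stale points (VOF)
            clf = recall_cycle(clf, L, C, k_horiz)  # recall useful points (VOR)
        return clf, L, C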

  7. %Seek cycle • If VOP(x_t, w_{t-1}, k_horiz) > 0 • Add to the active training set: L = L ∪ {x_t} • Update classifier w_t
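
A hedged sketch of the seek cycle, reusing `fit` from the slide 5 sketch. Computing the paper's expected VOP needs the full decision-theoretic machinery, so the proxy below substitutes predictive uncertainty minus a fixed probe cost; `PROBE_COST` and `oracle` are assumptions.

    import numpy as np

    PROBE_COST = 0.1   # assumed fixed cost of one label request

    def seek_cycle(x_t, clf, L, oracle):
        """Ask the teacher for a label when the VOP proxy is positive."""
        if clf is None:
            vop = 1.0 - PROBE_COST                  # no model yet: always probe
        else:
            p = clf.predict_proba(np.array([x_t]))[0]
            vop = (1.0 - p.max()) - PROBE_COST      # uncertainty minus cost
        if vop > 0:
            L.append((x_t, oracle(x_t)))            # add labeled point to L
            if len({label for _, label in L}) > 1:  # need both classes to fit
                clf = fit(L)                        # update classifier w_t
        return clf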

  8. %Cache cycle • For all labeled points x in L • If VOF(x, w_t) > 0 • Remove from the active training set: L = L \ {x} • Add to the cache: C = C ∪ {x} • Update classifier w_t
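
A matching sketch of the cache cycle, with VOF approximated as the utility gain from refitting without a point, evaluated on the last k_horiz labeled cases (an assumption about how the horizon is used; `fit` and `utility` are the helpers from the slide 5 sketch).

    def cache_cycle(clf, L, C, k_horiz):
        """Move points whose VOF proxy is positive from L into the cache C."""
        if clf is None or len(L) <= k_horiz:
            return clf
        for pt in list(L[:-k_horiz]):          # only older points are candidates
            rest = [q for q in L if q is not pt]
            if len({label for _, label in rest}) < 2:
                continue                       # keep: refitting needs both classes
            candidate = fit(rest)              # classifier without this point
            recent = L[-k_horiz:]              # horizon of recent cases
            if utility(candidate, recent) > utility(clf, recent):
                L.remove(pt)                   # forget it from the active set ...
                C.append(pt)                   # ... but remember it in the cache
                clf = candidate                # update classifier w_t
        return clf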

  9. %Recall cycle • For all cached labeled points x in C • If VOR(x, w_t) > 0 • Add to the active training set: L = L ∪ {x} • Remove from the cache: C = C \ {x} • Update classifier w_t
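
And the recall cycle, mirroring the cache cycle: VOR is approximated here as the utility gain from refitting with a cached point added back, under the same assumptions as above.

    def recall_cycle(clf, L, C, k_horiz):
        """Move cached points whose VOR proxy is positive back into L."""
        if clf is None or len(L) < k_horiz:
            return clf
        for pt in list(C):
            candidate = fit(L + [pt])          # classifier with the point recalled
            recent = L[-k_horiz:]              # horizon of recent cases
            if utility(candidate, recent) > utility(clf, recent):
                C.remove(pt)                   # take it out of the cache ...
                L.insert(0, pt)                # ... back into L as an older point
                clf = candidate                # update classifier w_t
        return clf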

  10. Decision-theoretic active learning • (Recap of the loop from slide 6, with the seek, cache and recall cycles now defined.)

  11. Test • System data is recorded, such as: • keyboard and mouse activity • window in focus • time of day and day of week • name of computer • Objective: predict whether the user is busy or not
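
For concreteness, one hypothetical snapshot of the kind of record such a system might log; every field name below is invented for illustration.

    # One hypothetical activity snapshot; every field name is invented.
    snapshot = {
        "keys_per_min": 42,           # keyboard activity
        "mouse_moves_per_min": 110,   # mouse activity
        "focused_window": "Outlook",  # window in focus
        "hour_of_day": 14,            # time of day
        "day_of_week": "Tue",         # day of week
        "machine": "office-desktop",  # name of computer
    }
    label = "busy"                    # the teacher's answer: busy or not busy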

  12. Results in tables • Program manager data • Developer data

  13. Results in graphs

  14. Conclusion • Using VOP, VOF and VOR offers: • Fewer probes • Lower labeling cost • Higher accuracy • All for only a small additional computational cost
