On discarding, caching and recalling samples in Active Learning. Written by Ashish Kapoor and Eric Horvitz, 2007. Active Learning: a type of supervised learning for settings where unlabeled data is abundant but labels are expensive; the active learner (AL) can ask a teacher for help. Pool-based vs. stream-based data.
On discarding, caching and recalling samples in Active Learning • Written by Ashish Kapoor and Eric Horvitz, 2007
Active Learning • Type of supervised learning • Unlabeled data is abundant, but labels are expensive • The active learner (AL) can ask a teacher for help • Pool-based vs. stream-based data
Problem with Active Learning • Data can become outdated in a dynamic environment • One way to handle this: discarding, caching and recalling samples
Motivation • Handwriting recognition • Spam detection • Speech recognition • Tasks where the data might vary over time
Keywords • Value of forgetting = VOF • Value of recalling = VOR • Value of probing = VOP (expected) • Value of information = VOI (expected)
Decision-theoretic active learning
• Input data stream: X_T = {x_1, ..., x_{t-1}, x_t, x_{t+1}, ..., x_T}
• Initial classifier: w_0
• Maximum size of the buffer: S_buff
• Size of horizon: k_horiz
• Set of active points L = {}, cache C = {} and buffer B = {}
• for t = 1, ..., T
  • Observe the data point x_t
  • B = B ∪ {x_t}
  • if size(B) > S_buff: discard the oldest point
  • %Seek cycle: pursuing new labels
  • %Cache cycle: forgetting & storing labeled cases
  • %Recall cycle: remembering discarded cases
• end
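The buffer step of the loop above can be sketched in Python. This is only an illustration of the "discard the oldest point" rule; the function name `stream_with_buffer` and the parameter `s_buff` are made up for this example, and the three cycles are left out.

```python
from collections import deque


def stream_with_buffer(stream, s_buff):
    """Yield (x_t, buffer snapshot), keeping at most s_buff recent points."""
    # A deque with maxlen drops its oldest item automatically when full,
    # which matches "if size(B) > S_buff: discard the oldest point".
    buffer = deque(maxlen=s_buff)
    for x_t in stream:
        buffer.append(x_t)  # B = B ∪ {x_t}
        # The seek, cache and recall cycles would run here (omitted).
        yield x_t, list(buffer)


# Toy stream x_1..x_5 with a buffer of size 3 (illustrative values only).
snapshots = [buf for _, buf in stream_with_buffer(range(1, 6), s_buff=3)]
```

Once the sixth point would arrive, the buffer would hold only the three most recent observations; older points are gone unless they were labeled and moved to L or C.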
%Seek cycle
• If VOP(x_t, w_{t-1}, k_horiz) > 0
  • Probe for the label of x_t and add it to the active training set: L = L ∪ {x_t}
  • Update the classifier to w_t
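A minimal sketch of the seek decision, under loud assumptions: the slides do not say how VOP is computed, so `value_of_probing` below is a toy stand-in (expected misclassification cost over the horizon minus the cost of a label), not the authors' formula. The names `seek_cycle`, `ask_teacher`, `misclass_cost` and `label_cost` are hypothetical.

```python
def value_of_probing(p_positive, misclass_cost, label_cost, k_horiz=1):
    """Toy VOP (assumption): risk of guessing over the horizon minus label cost."""
    # If the classifier guesses the more likely class, it errs with
    # probability min(p, 1 - p); that is the risk a label would remove.
    expected_risk = min(p_positive, 1.0 - p_positive) * misclass_cost
    return k_horiz * expected_risk - label_cost


def seek_cycle(x_t, p_positive, active_set, ask_teacher, **vop_args):
    """Probe for a label only when the toy value of probing is positive."""
    if value_of_probing(p_positive, **vop_args) > 0:
        active_set.append((x_t, ask_teacher(x_t)))  # L = L ∪ {x_t}
        return True  # the caller should now update the classifier
    return False


active = []
teacher = lambda x: int(x > 0)  # hypothetical labeling oracle
probed_uncertain = seek_cycle(0.3, 0.5, active, teacher,
                              misclass_cost=1.0, label_cost=0.2)
probed_confident = seek_cycle(-2.0, 0.95, active, teacher,
                              misclass_cost=1.0, label_cost=0.2)
```

With these toy costs, the uncertain point (p = 0.5) is probed and the confident one (p = 0.95) is not, which is the behavior the "VOP > 0" test is meant to capture.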
%Cache cycle
• For all labeled points x in L
  • If VOF(x, w_t) > 0
    • Remove from the active training set: L = L \ {x}
    • Add to the cache: C = C ∪ {x}
    • Update the classifier w_t
%Recall cycle
• For all cached labeled points x in C
  • If VOR(x, w_t) > 0
    • Add to the active training set: L = L ∪ {x}
    • Remove from the cache: C = C \ {x}
    • Update the classifier w_t
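The set bookkeeping of the cache and recall cycles can be sketched as follows. The scoring functions `vof` and `vor` are passed in as plain callables because the slides do not define how they are computed; the toy lambdas in the usage lines are assumptions for illustration only, and classifier retraining is left to the caller.

```python
def cache_cycle(active, cache, vof):
    """Move points whose value of forgetting is positive from L to C."""
    moved = 0
    for x in list(active):  # iterate over a copy because `active` is mutated
        if vof(x) > 0:
            active.remove(x)  # L = L \ {x}
            cache.add(x)      # C = C ∪ {x}
            moved += 1
    return moved  # if > 0, the caller should update the classifier


def recall_cycle(active, cache, vor):
    """Move points whose value of recalling is positive from C back to L."""
    moved = 0
    for x in list(cache):
        if vor(x) > 0:
            cache.remove(x)  # C = C \ {x}
            active.add(x)    # L = L ∪ {x}
            moved += 1
    return moved


# Toy usage: points are integers and the scores are arbitrary stand-ins.
L, C = {1, 2, 3, 10}, set()
n_cached = cache_cycle(L, C, vof=lambda x: 1 if x < 3 else -1)
n_recalled = recall_cycle(L, C, vor=lambda x: 1 if x == 2 else -1)
```

Nothing is deleted in either cycle: a forgotten point stays in the cache, so it can be recalled later if the environment drifts back.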
Test • System data recorded as features: • keyboard and mouse activity • window in focus • time of day and day of week • name of computer • Objective: predict whether the user is busy or not
Results (tables not reproduced here) • Program manager data • Developer data
Conclusion • Using VOP, VOF and VOR offers: • Fewer probes • Lower cost • Higher accuracy • For only a small additional cost