Exploration & Exploitation in Adaptive Filtering Based on Bayesian Active Learning

Yi Zhang, Jamie Callan (Carnegie Mellon Univ.) and Wei Xu (NEC Lab America)

A Typical Adaptive Filtering System
[Diagram labels: initialization; first request; document stream; filtering system (binary classifier); delivered docs; utility function; feedback; accumulated docs; user profile learning]

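Commonly Used Evaluation
If we assume user satisfaction is mostly influenced by what she/he has seen, then a simplified version of the utility is: $U = A_R R^+ + A_N N^+$, where $R^+$ and $N^+$ are the numbers of relevant and non-relevant documents delivered, and $A_R$ and $A_N$ are the corresponding credit and penalty.
For example: $\text{Utility} = 2R^+ - N^+$ (used in the TREC9, TREC10 and TREC11 Adaptive Filtering Track, trec.nist.gov)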
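
As a minimal sketch of the linear utility on this slide, the Python function below scores a list of delivered documents. The name `linear_utility` and its parameter names are illustrative assumptions, not part of any TREC tooling; the default credit and penalty correspond to Utility = 2R+ - N+.

```python
def linear_utility(delivered_labels, credit_relevant=2.0, penalty_nonrelevant=-1.0):
    """Linear utility computed over delivered documents only.

    delivered_labels: iterable of booleans, True if a delivered document
    was relevant (counts toward R+), False if it was not (counts toward N+).
    """
    labels = list(delivered_labels)
    r_plus = sum(1 for is_relevant in labels if is_relevant)
    n_plus = len(labels) - r_plus
    return credit_relevant * r_plus + penalty_nonrelevant * n_plus


# Three relevant and four non-relevant documents delivered: 2*3 - 4 = 2
print(linear_utility([True, True, True, False, False, False, False]))
```

Documents that were never delivered do not appear in the sum, which matches the assumption that satisfaction depends on what the user has seen.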

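Common Approach in Adaptive Filtering
• Set the dissemination threshold where the immediate utility gain of delivering a document is zero: $U_{immediate} = A_R P(Rel) + A_N P(Nrel) = 0$, where $A_R$ and $A_N$ are the credit and penalty for delivering a relevant and a non-relevant document
• For example, to optimize $\text{Utility} = 2R^+ - N^+$, the system delivers iff $P(Rel) \ge 0.33$, because $U_{immediate} = 2P(Rel) - P(Nrel) \ge 0$ and $P(Nrel) = 1 - P(Rel)$ together give $3P(Rel) \ge 1$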
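
The threshold rule on this slide can be checked numerically. The sketch below solves credit * p + penalty * (1 - p) = 0 for p; the function names are hypothetical and chosen only for illustration.

```python
def immediate_utility(p_rel, credit_relevant=2.0, penalty_nonrelevant=-1.0):
    """Expected immediate utility of delivering a document with P(Rel) = p_rel."""
    return credit_relevant * p_rel + penalty_nonrelevant * (1.0 - p_rel)


def delivery_threshold(credit_relevant=2.0, penalty_nonrelevant=-1.0):
    """P(Rel) at which the immediate expected utility is exactly zero."""
    if credit_relevant <= 0 or penalty_nonrelevant >= 0:
        raise ValueError("expects a positive credit and a negative penalty")
    return -penalty_nonrelevant / (credit_relevant - penalty_nonrelevant)


threshold = delivery_threshold()
print(round(threshold, 4))  # 0.3333
```

For Utility = 2R+ - N+ the threshold is 1/3, which the slide rounds to 0.33.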

Problem with Current Adaptive Filtering Research
• Why deliver a document to the user?
  • It satisfies the information need immediately
  • It gets user feedback, so the system can improve its model of the user’s information need and thus satisfy the information need better in the future
• Current research in adaptive filtering underestimates the utility gain of delivering a document by ignoring the second effect
• Related work: active learning, Bayesian experimental design

Solution: Explicitly Model the Future Utility of Delivering a Document
$U = U_{immediate} + N_{future} \cdot U_{future}$
• $N_{future}$: the discounted number of documents in the future
• Exploitation: estimate the immediate utility of delivering a new document, based on the model learned so far
• Exploration: estimate the future utility of delivering a new document, by considering how much the learned model would improve if we got user feedback about the document
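
A small sketch of how exploration can change a delivery decision. It assumes the total utility is the immediate utility plus the discounted number of future documents times the per-document future gain, and that the system delivers when this total is non-negative. The function names and the numeric values of `u_future` and `n_future` are made up for illustration.

```python
def total_utility(u_immediate, u_future, n_future):
    """Immediate utility plus future gain summed over n_future documents."""
    return u_immediate + n_future * u_future


def should_deliver(u_immediate, u_future, n_future):
    """Deliver when the combined utility is non-negative (assumed rule)."""
    return total_utility(u_immediate, u_future, n_future) >= 0.0


# A document with P(Rel) = 0.3 under Utility = 2R+ - N+ has a slightly
# negative immediate utility, so exploitation alone would not deliver it.
u_now = 2 * 0.3 - 1 * 0.7

print(should_deliver(u_now, u_future=0.0, n_future=1000))    # False
print(should_deliver(u_now, u_future=0.001, n_future=1000))  # True
```

Even a tiny per-document improvement can outweigh a small immediate loss when many documents are still to come.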

Exploitation: Estimate Uimmediate Using Bayesian Inference
Let $P(\theta | D_{t-1})$ be the posterior distribution of the model parameters given the training data set $D_{t-1}$. Using Bayesian inference, we have: $U_{immediate}(x | D_{t-1}) = \int P(\theta | D_{t-1}) \sum_y A_y P(y | x, \theta) \, d\theta$
$A_y$ is the credit/penalty defined by the utility function that models user satisfaction; $y = R$ if the document is relevant, $y = N$ if it is non-relevant.
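
The integral over the posterior is usually approximated by averaging over posterior samples. The sketch below assumes a one-dimensional logistic model, P(R | x, theta) = sigmoid(intercept + slope * x), and a list of (intercept, slope) samples that is invented here for illustration; neither the parameterization nor the sample values come from the slides.

```python
import math


def sigmoid(z):
    """Logistic function, written with tanh so large |z| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def immediate_utility(score, posterior_samples,
                      credit_relevant=2.0, penalty_nonrelevant=-1.0):
    """Monte Carlo estimate of the expected immediate utility.

    Averages sum_y A_y * P(y | x, theta) over the posterior samples.
    """
    total = 0.0
    for intercept, slope in posterior_samples:
        p_rel = sigmoid(intercept + slope * score)
        total += credit_relevant * p_rel + penalty_nonrelevant * (1.0 - p_rel)
    return total / len(posterior_samples)


# Hypothetical posterior samples of (intercept, slope)
samples = [(-1.0, 2.0), (-0.5, 1.5), (-1.5, 2.5)]
print(immediate_utility(0.5, samples))
```

With a single sample this reduces to the point-estimate rule; with many samples, uncertainty about the parameters is averaged into the estimate.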

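Exploration: Utility Divergence to Measure Loss (1)
• If we use $\tilde{\theta}$ while the true model is $\theta$, we incur some loss (utility divergence, UD): $UD(\tilde{\theta} \| \theta) = U(\theta | \theta) - U(\tilde{\theta} | \theta)$, where $U(\theta' | \theta)$ is the expected utility of delivering according to $\theta'$ when the true model is $\theta$
[Figure: document space, showing the region each model would deliver]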

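Exploration: Utility Divergence to Measure Loss (2)
• We do not know $\theta$. However, based on our beliefs about its distribution, we can estimate the expected loss of using $\tilde{\theta}$: $Loss(\tilde{\theta}) = \int P(\theta | D) \, UD(\tilde{\theta} \| \theta) \, d\theta$
• Thus we can measure the quality of the training data $D$ as the expected loss if we use the estimator $\hat{\theta}_D$: $Loss(D) = \int P(\theta | D) \, UD(\hat{\theta}_D \| \theta) \, d\theta$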
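
A rough sketch of how utility divergence and the expected loss might be estimated numerically. Several things here are assumptions rather than the authors' implementation: the one-dimensional logistic model, the use of a finite list of document scores to stand in for the document space, the rule that a model delivers a document when its own expected credit is non-negative, and all names and numeric values.

```python
import math

CREDIT = {"R": 2.0, "N": -1.0}  # A_y for Utility = 2R+ - N+


def p_relevant(score, theta):
    """P(R | x, theta) for an assumed logistic model, theta = (intercept, slope)."""
    intercept, slope = theta
    return 0.5 * (1.0 + math.tanh(0.5 * (intercept + slope * score)))


def expected_credit(score, theta):
    """sum_y A_y * P(y | x, theta) for a single document."""
    p = p_relevant(score, theta)
    return CREDIT["R"] * p + CREDIT["N"] * (1.0 - p)


def utility(decision_theta, true_theta, scores):
    """Average utility of delivering by decision_theta when true_theta holds."""
    total = 0.0
    for x in scores:
        if expected_credit(x, decision_theta) >= 0.0:  # decision_theta delivers x
            total += expected_credit(x, true_theta)    # payoff under the true model
    return total / len(scores)


def utility_divergence(estimate, true_theta, scores):
    """Loss from acting on `estimate` instead of the true model."""
    return utility(true_theta, true_theta, scores) - utility(estimate, true_theta, scores)


def expected_loss(estimate, posterior_samples, scores):
    """Average utility divergence over samples of the unknown true model."""
    losses = [utility_divergence(estimate, theta, scores) for theta in posterior_samples]
    return sum(losses) / len(losses)


# Hypothetical document scores and posterior samples
document_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
posterior_samples = [(-1.0, 2.0), (-0.5, 1.5), (-1.5, 2.5)]
print(expected_loss((-1.0, 2.0), posterior_samples, document_scores))
```

In exact arithmetic the divergence is never negative, because the true model delivers exactly the documents whose expected credit under the true model is non-negative.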

The Whole Process
• Step 1:
• Step 2:

Adaptive Filtering: Logistic Regression to Find Dissemination Threshold
• x: score* indicating how well each document matches the profile
• Metropolis-Hastings algorithm to sample $\theta_i$ for integration
*The scoring function is learned adaptively using the Rocchio algorithm
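
To illustrate the sampling step, here is a minimal random-walk Metropolis-Hastings sketch for the posterior of a one-dimensional logistic regression on document scores. The Gaussian prior, the Gaussian proposal, the step size, the starting point and the toy feedback data are all assumptions made for this example; the slides do not specify them.

```python
import math
import random


def log_sigmoid(z):
    """log(1 / (1 + exp(-z))), computed without overflow."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_posterior(theta, data, prior_std=10.0):
    """Unnormalized log posterior: Gaussian prior plus logistic likelihood."""
    intercept, slope = theta
    log_p = -(intercept ** 2 + slope ** 2) / (2.0 * prior_std ** 2)
    for score, is_relevant in data:
        z = intercept + slope * score
        log_p += log_sigmoid(z) if is_relevant else log_sigmoid(-z)
    return log_p


def metropolis_hastings(data, n_samples=2000, burn_in=500, step=0.5, seed=0):
    """Draw (intercept, slope) samples with a symmetric random-walk proposal."""
    rng = random.Random(seed)
    theta = (0.0, 0.0)
    current = log_posterior(theta, data)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = (theta[0] + rng.gauss(0.0, step),
                    theta[1] + rng.gauss(0.0, step))
        candidate = log_posterior(proposal, data)
        log_ratio = candidate - current
        # Symmetric proposal: accept with probability min(1, posterior ratio)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples


# Toy feedback: (score, was the delivered document relevant?)
feedback = [(-2.0, False), (-1.0, False), (-0.5, False), (0.0, True),
            (0.5, False), (1.0, True), (2.0, True)]
samples = metropolis_hastings(feedback)
mean_slope = sum(slope for _, slope in samples) / len(samples)
print(len(samples), mean_slope)
```

The resulting samples can be plugged into a Monte Carlo average wherever an integral over the posterior is needed.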

Experimental Data Sets and Evaluation Measures

TREC-10 Filtering Data: Reuters Dataset
• Active learning is very effective on the TREC-10 dataset

TREC-9 Filtering Data: OHSUMED Dataset
• On average, only 51 out of 300,000 documents are relevant
• Active learning didn’t improve utility on the TREC-9 dataset, but it didn’t hurt either (the algorithm is robust)

Related Work
• Active learning
  • Uncertainty about the label of a document
    • Request the label of the most uncertain document
    • Minimize the uncertainty about future labels
  • Uncertainty about the model parameters (KL divergence, variance)
• Bayesian Experimental Design
  • Improvement of the utility of the model
• Information Retrieval
  • Mutual information between document and label

Contribution and Future Work
• Our Contribution
  • Derivation of Utility Divergence to measure model quality
  • Combining immediate utility and future utility gain in the adaptive filtering task
  • Empirically robust algorithm
• Future Work
  • High dimensional space
    • Computational issues: variational algorithms, Gaussian approximations, Gibbs sampling, …
    • Amount of training data needed
  • Other active learning applications
    • Online marketing
    • Interactive retrieval
    • …

The End Thanks
