
Efficient Behavior Targeting Using SVM Ensemble Indexing

Jun Li, Peng Zhang, Yanan Cao, Ping Liu, Li Guo. Chinese Academy of Sciences State Grid Energy Institute, China.


Presentation Transcript


  1. Efficient Behavior Targeting Using SVM Ensemble Indexing. Jun Li, Peng Zhang, Yanan Cao, Ping Liu, Li Guo. Chinese Academy of Sciences State Grid Energy Institute, China

  2. Behavior targeting • Behavior Targeting (BT) uses users’ historical behavior data to select the most relevant ads for display. • Example from Yahoo! Research. [Figure labels: user behavior data, ads, targeted users]

  3. Regression for BT • Poisson regression model (Ye Chen, eBay, 2009). • x: ad clicks and views, page views, search queries and clicks. • y: click-through rate (CTR). [Figure labels: view data, Poisson distribution, Poisson regression on views; click data, Poisson distribution, Poisson regression on clicks; ad category] • Reference: Ye Chen et al., Large-scale behavioral targeting (KDD’09 best paper award)
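The slide does not spell out the model. As a rough sketch, assuming the textbook Poisson form with a positive mean λ computed from weights w and the behavior features x (the symbols λ, w, f and the ratio estimate are assumptions for illustration, not taken from the slide), one count model is fitted for clicks and one for views, as the figure labels suggest:

```latex
\begin{aligned}
P(y \mid x) &= \frac{\lambda^{y} e^{-\lambda}}{y!}, \qquad \lambda = f(w, x) > 0, \\
\widehat{\mathrm{CTR}} &\approx \frac{\lambda_{\text{click}}}{\lambda_{\text{view}}}
\end{aligned}
```

Here y is an observed count (clicks or views) and the CTR estimate is the ratio of the two predicted means.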

  4. Limitations • Parameter tuning is very difficult. • The Poisson assumption does not always hold for real-world behavior data. • Clicks are typically several orders of magnitude fewer than views. • User interests are not fixed but transient.

  5. Classification for BT • SVM for classification • Example 1: 3 users on ad a from Nikon (www.nikon.com). [Figure labels: view data, click data, ad category, SVM for classification; view and click data (+), view but no click data (-)] • Challenges 1, 2, 3
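The labeling rule on this slide (view and click is positive, view without click is negative) can be sketched as below. This is a minimal illustration, not the authors' code: the feature names, the `label_views` helper, and the hand-set linear weights are all hypothetical, and no SVM training is performed.

```python
from typing import Dict, List, Tuple

# Hypothetical sparse representation: feature name -> value.
Features = Dict[str, float]


def label_views(views: List[Tuple[Features, bool]]) -> List[Tuple[Features, int]]:
    """Map (features, clicked) view records to SVM training pairs.

    A view followed by a click becomes +1, a view without a click -1.
    """
    return [(x, 1 if clicked else -1) for x, clicked in views]


def linear_decision(weights: Features, bias: float, x: Features) -> int:
    """Sign of a linear SVM score; only the nonzero features of x are touched."""
    score = bias + sum(weights.get(f, 0.0) * v for f, v in x.items())
    return 1 if score >= 0 else -1


# Three hypothetical users who viewed the same ad.
records = [
    ({"query:camera": 1.0, "page:photo": 1.0}, True),
    ({"query:lens": 1.0}, True),
    ({"query:shoes": 1.0}, False),
]
training_pairs = label_views(records)
```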

  6. Classification for BT • Ensemble SVM on data streams • Merits: no complicated parameters; no statistical assumptions; dynamic model on data streams (Challenge 4)
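One common way to build such a dynamic model is to keep only the most recent W classifiers and combine them by weighted vote. The sketch below assumes that scheme; the slide does not say how members are weighted or retired, so the `StreamEnsemble` class, the vote weights, and the fixed-size window are assumptions. Members are passed in as already-trained linear models, since training is out of scope here.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Features = Dict[str, float]
# (weights, bias, vote weight) of one already-trained linear classifier.
LinearModel = Tuple[Features, float, float]


class StreamEnsemble:
    """Keeps the most recent `size` classifiers and predicts by weighted vote."""

    def __init__(self, size: int) -> None:
        # deque(maxlen=...) silently drops the oldest member when full.
        self.members: Deque[LinearModel] = deque(maxlen=size)

    def add(self, weights: Features, bias: float, vote: float = 1.0) -> None:
        self.members.append((weights, bias, vote))

    def predict(self, x: Features) -> int:
        total = 0.0
        for weights, bias, vote in self.members:
            score = bias + sum(weights.get(f, 0.0) * v for f, v in x.items())
            total += vote if score >= 0 else -vote
        return 1 if total >= 0 else -1
```

Dropping the oldest member as each new data chunk arrives is what lets the model follow transient user interests.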

  7. Limitations • Ensemble prediction is too slow for online computing. • Time cost: A (advertisers) × W (ensemble size) × N (support vectors) × T (features). • Example 2: We collected 2 million behavior events in 1 minute; with W = 10, prediction takes 53 minutes.
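The arithmetic behind this slide can be checked with a short sketch. The event count, the 1-minute arrival window, and the 53-minute prediction time come from Example 2; the `naive_cost` helper and the values passed to it are hypothetical, since the slide gives only W = 10.

```python
def naive_cost(advertisers: int, ensemble_size: int,
               support_vectors: int, features: int) -> int:
    """Multiply-adds per incoming event when every support vector is scanned."""
    return advertisers * ensemble_size * support_vectors * features


# Figures from Example 2 on the slide.
EVENTS = 2_000_000
ARRIVAL_SECONDS = 60
PREDICTION_SECONDS = 53 * 60

per_event_ms = PREDICTION_SECONDS / EVENTS * 1000  # average prediction time per event
slowdown = PREDICTION_SECONDS / ARRIVAL_SECONDS    # how far prediction lags behind arrival

# Hypothetical sizes, only to show how the product grows.
example_ops = naive_cost(advertisers=2, ensemble_size=10,
                         support_vectors=100, features=50)
```

Under these figures prediction averages roughly 1.59 ms per event and runs 53 times slower than the data arrives, so the backlog grows without bound.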

  8. Solutions • Construct an index structure for the ensemble SVM. • Why does the index work? • It trades space for time. • Features are shared among multiple support vectors. • Support vectors are sparse. [Figure labels: mapping from text indexing to the ensemble: text terms to features, document to support vector, document set to ensemble SVM] • Reference: P. Zhang et al., knowledge index for online data streams (KDD 2011 & ICDM 2011)

  9. The index structure • The SVM-index structure • Example 3: based on Example 1, consider an SVM with 3 support vectors. [Figure labels: inverted hashing table, support vectors, ensemble information] • Time complexity: O(T)

  10. The index structure • Operations • Search: predict the label of each incoming user record x. Step 1: search for support vectors in the inverted indexes (left side of the figure). Step 2: calculate x’s class label. • Insert: integrate new classifiers into the ensemble. • Delete: drop outdated classifiers from the ensemble. • Memory • See our source code.
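The three operations can be sketched with a small inverted index. This is an illustrative simplification, not the authors' source code: it assumes linear kernels, stores each support vector once per classifier, and the `SVMIndex` class, its field names, and the weighted-vote rule are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Features = Dict[str, float]


class SVMIndex:
    """Inverted index over the support vectors of an ensemble of linear SVMs."""

    def __init__(self) -> None:
        # Inverted hashing table: feature -> {support vector id -> feature value}.
        self.postings: Dict[str, Dict[int, float]] = defaultdict(dict)
        # Support vector table: id -> (classifier id, signed coefficient alpha * y).
        self.sv_info: Dict[int, Tuple[str, float]] = {}
        # Ensemble information: classifier id -> (bias, vote weight).
        self.ensemble: Dict[str, Tuple[float, float]] = {}
        self._next_id = 0

    def insert(self, clf_id: str, support_vectors: List[Tuple[Features, float]],
               bias: float, vote: float = 1.0) -> None:
        """Integrate a new classifier into the ensemble."""
        self.ensemble[clf_id] = (bias, vote)
        for features, coef in support_vectors:
            sv_id = self._next_id
            self._next_id += 1
            self.sv_info[sv_id] = (clf_id, coef)
            for name, value in features.items():
                self.postings[name][sv_id] = value

    def delete(self, clf_id: str) -> None:
        """Drop an outdated classifier and all of its postings."""
        self.ensemble.pop(clf_id, None)
        dead = [sv for sv, (owner, _) in self.sv_info.items() if owner == clf_id]
        for sv_id in dead:
            del self.sv_info[sv_id]
        for name in list(self.postings):
            for sv_id in dead:
                self.postings[name].pop(sv_id, None)
            if not self.postings[name]:
                del self.postings[name]

    def search(self, x: Features) -> int:
        """Predict the label of x by touching only postings of its nonzero features."""
        # Step 1: accumulate dot products with the support vectors that share a feature.
        dots: Dict[int, float] = defaultdict(float)
        for name, x_value in x.items():
            for sv_id, sv_value in self.postings.get(name, {}).items():
                dots[sv_id] += x_value * sv_value
        # Step 2: combine per-classifier scores into a weighted vote.
        scores = {clf: bias for clf, (bias, _) in self.ensemble.items()}
        for sv_id, dot in dots.items():
            clf_id, coef = self.sv_info[sv_id]
            scores[clf_id] += coef * dot
        total = sum(self.ensemble[clf][1] * (1 if s >= 0 else -1)
                    for clf, s in scores.items())
        return 1 if total >= 0 else -1
```

Because support vectors are sparse, a lookup visits only those that share at least one feature with x, which is where the saving over a full scan of every support vector comes from.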

  11. Experiments • Data sets: search engine data • Comparisons: Poisson, E-SVM, E-Index (our method)

  12. Comparisons • Observations: E-Index has sub-linear prediction time. E-SVM consumes more memory.

  13. Comparisons • Ensemble models are more accurate than the Poisson regression model.

  14. Comparisons The index method can significantly improve the efficiency, especially when the ensemble size is large.

  15. Related Work • Behavior targeting: regression models vs. classification models • Stream indexing: Boolean expression indexing in publish/subscribe systems • Ensemble models: concept drifting

  16. Conclusions • Contributions • Identify and address the prediction efficiency problem of ensemble models for behavior targeting. • Convert the ensemble SVM model to a document set, and propose a new type of inverted text index structure to achieve sub-linear prediction time. • Future work • Index more complicated SVM models with non-linear kernels.

  17. Questions? For source code, visit our website streamming.org/homepages/lijun.html
