Learning User Behaviors for Advertisements Click Prediction Chieh-Jen Wang & Hsin-Hsi Chen National Taiwan University Taipei, Taiwan
Introduction • The commercial value of advertisements on the web depends on whether users click on the advertisements • Predicting potential advertisement clicks of users before target advertisements are displayed is important • advertisement recommendation • advertisement placement • presentation pricing • Problem specification • Given a current search session (q1, q2, ..., q(i-1)), we will predict if there is an ad click event when query qi is submitted.
Related Work • Advertisement click prediction model • Feature representation • text features (Richardson et al., 2007) • demographics features (Cheng & Cantú-Paz, 2010) • mouse trajectory features (Guo & Agichtein, 2010) • Machine learning algorithm • logistic regression (Richardson, Dominowska, & Ragno, 2007) • maximum entropy (Cheng & Cantú-Paz, 2010) • support vector machines (Broder et al., 2008) • conditional random field (Guo & Agichtein, 2010)
Related Work • User search intent • navigational, informational and transactional (Broder, 2002) • noncommercial/commercial & navigational/informational (Ashkan et al., 2009) • research & purchase (Guo & Agichtein, 2010) • receptive & not receptive (Guo & Agichtein, 2010) • “receptive” (i.e., an advertisement click is expected in a future search within the current session) • “not receptive” (i.e., no future advertisement clicks are expected within the current session)
Microsoft AdCenter Logs • Time: 2007-08-10 ~ 2007-11-01 (84 days) • The Microsoft AdCenter logs include: • 101 million impressions • 7.82 million clicks • 40.6 million sessions (5.06 million sessions contain at least one click) • An impression is defined as a single search results page described by a set of attributes • A session is defined as repeated search engine usage at intervals of 10 minutes or less, with a total session length of no more than 8 hours
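The session definition above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the AdCenter logs: the function name `split_sessions` is hypothetical, and starting a new session when the 8-hour limit is reached is an assumption, since the slide does not say how over-long sessions were handled.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=10)    # from the slide: intervals of 10 minutes or less
MAX_SESSION = timedelta(hours=8)   # from the slide: total session of at most 8 hours


def split_sessions(timestamps):
    """Group one user's impression timestamps into sessions.

    Assumption: when either limit is exceeded, a new session starts.
    """
    sessions = []
    for ts in sorted(timestamps):
        if (sessions
                and ts - sessions[-1][-1] <= MAX_GAP       # gap to previous impression
                and ts - sessions[-1][0] <= MAX_SESSION):  # span since session start
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


t0 = datetime(2007, 8, 10, 9, 0, 0)
stamps = [t0,
          t0 + timedelta(minutes=5),
          t0 + timedelta(minutes=16),   # 11 minutes after the previous impression
          t0 + timedelta(minutes=20)]
print([len(s) for s in split_sessions(stamps)])
```

In this example the 11-minute pause separates the four impressions into two sessions of two impressions each.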
Data Purification • Some queries are issued, and some advertisements are clicked, by software robots for promotional purposes • Filter criteria • issuing more than 7 queries in any 10-second interval • issuing queries from two distinct places at the same time • clicking an advertisement more than once in any 5-second interval • duplicated impression IDs • Data partition • Training: sessions in the first 56 days that contain at least one advertisement click • Testing: sessions in the last 28 days
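The two rate-based filter criteria can be checked with a sliding window. The sketch below is a hedged illustration: the helper names `exceeds_rate` and `looks_like_robot` are hypothetical, timestamps are assumed to be seconds as plain numbers, the window boundary is treated as inclusive, and the click rule is assumed to apply to clicks on a single advertisement.

```python
from collections import deque


def exceeds_rate(times, max_events, window_seconds):
    """Return True if more than max_events timestamps fall in any window."""
    window = deque()
    for t in sorted(times):
        window.append(t)
        # Drop events that are too old to share a window with t.
        while t - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_events:
            return True
    return False


def looks_like_robot(query_times, click_times):
    """Apply the two rate criteria from the slide (thresholds 7/10 s and 1/5 s)."""
    return (exceeds_rate(query_times, 7, 10)
            or exceeds_rate(click_times, 1, 5))


print(looks_like_robot([0, 1, 2, 3, 4, 5, 6, 7], []))  # 8 queries within 7 seconds
print(looks_like_robot([0, 30], [0.0, 30.0]))          # well-spaced activity
```

The remaining two criteria (simultaneous queries from distinct places, duplicated impression IDs) depend on log fields that the slides do not describe, so they are left out of the sketch.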
Feature Extraction • Feature representation • Every impression qi (1 ≤ i ≤ n) in session s = (q1, q2, ..., q(i-1), qi, q(i+1), ..., qn) is represented as a feature vector • qi itself (Current Impression Level) • the first impression q1 (First Impression Level) • the n-th previous impression q(i-n) (Previous n Impression Level) • all the contextual impressions q1, q2, ..., q(i-1) in s (Contextual Impression Level) • Labeling • click if impression qi contains at least one advertisement click, otherwise non-click.
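As a rough illustration of the four levels and the labeling rule, the sketch below selects the parts of a session that each level draws on. The record layout (dictionaries with `query` and `ad_clicks` keys) and the function names are assumptions made for the example; the slides do not specify a data structure.

```python
def impression_views(session, i, n_prev=(1, 2)):
    """Return the parts of a session used to describe impression i (0-based)."""
    return {
        "current": session[i],       # Current Impression Level
        "first": session[0],         # First Impression Level
        # Previous n Impression Level; None when the session is too short.
        "previous": {n: (session[i - n] if i - n >= 0 else None)
                     for n in n_prev},
        "context": session[:i],      # Contextual Impression Level
    }


def label(impression):
    """click if the impression has at least one advertisement click."""
    return "click" if impression["ad_clicks"] > 0 else "non-click"


session = [
    {"query": "a", "ad_clicks": 0},
    {"query": "b", "ad_clicks": 2},
    {"query": "c", "ad_clicks": 0},
]
views = impression_views(session, 2)
print(views["previous"][1]["query"], label(session[1]))
```

Each view would then be turned into numeric features (query category, intent, click history) before being concatenated into the feature vector for qi.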
Feature Extraction from Current Impression Level • These features aim to capture query information, users’ intent and the similarity between the current query and the previous one • QC (query category) • 14 categories (excluding “Regional” and “World”) on the 2nd level of the Open Directory Project (ODP) ontology are used to represent query categories • QIntent (query intent) • 4,020 intent clusters are learned from an MSN Search Query Log excerpt (Wang et al., 2010) • QIntent is specified by the distribution of the top 100 similar intent clusters
Feature Extraction from First Impression Level • These features aim to capture an initial search goal of a session.
Feature Extraction from Previous n Impression Level • These features aim to capture the advertisement click information of the previous n impressions. • In our experiments, n is set to 1 and 2
Feature Extraction from Contextual Impression Level • These features represent a sequence of users’ behaviors • Weight of intent types of submitted queries (CTQIntent) and clicked advertisements (CTAdIntent) in the access history is defined as: • Pm is the probability of the type m intent • wj denotes a query or a clicked advertisement in qj • Weight of ODP categories (CTQC & CTAdC) • Jelinek-Mercer smoothing
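For readers unfamiliar with Jelinek-Mercer smoothing, the sketch below shows its standard form, a linear interpolation between an observed distribution and a background distribution: P(c) = (1 - λ) · P_history(c) + λ · P_background(c). How the slides combine it with the category weights is not shown, so the function name, the background distribution, and λ = 0.5 are all assumptions made for illustration.

```python
from collections import Counter


def jelinek_mercer(history_categories, background, lam=0.5):
    """Smooth the category distribution of a session history.

    history_categories: ODP categories seen so far (hypothetical input).
    background: assumed collection-wide category probabilities.
    lam: interpolation weight (assumed value, not from the slides).
    Categories missing from `background` are ignored.
    """
    counts = Counter(history_categories)
    total = sum(counts.values())
    smoothed = {}
    for category, p_bg in background.items():
        p_ml = counts[category] / total if total else 0.0
        smoothed[category] = (1 - lam) * p_ml + lam * p_bg
    return smoothed


background = {"Shopping": 0.5, "Sports": 0.25, "News": 0.25}
weights = jelinek_mercer(["Shopping", "Shopping", "Sports"], background)
print(weights)
```

The point of the smoothing is that a category never observed in the session ("News" here) still receives a non-zero weight from the background distribution.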
Click Prediction Model • Four learning algorithms • Conditional Random Fields (CRF) • Support Vector Machine (SVM) • kernel function (RBF, linear kernel) • parameter optimization (grid search for c and g) • Decision Tree • C4.5 Tree • Back-Propagation Neural Networks • Hidden Layer = 2 • Learning rate = 0.8 • Momentum = 0.2
Feature Selection Algorithm • Random Subspace Method (RS) • an ensemble classifier that consists of several classifiers • prediction is through a majority vote from the classifiers • F-Score (FS) & Information Gain (IG) • greedy inclusion algorithm • retain a number of the best terms or features for use by the classifier
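To make the information gain criterion concrete, here is a minimal sketch for discrete features, using the textbook definition IG = H(labels) - H(labels | feature). The function names and the toy feature columns are hypothetical, and the slides do not say how continuous features were discretized, so this should be read as one plausible reading of "retain a number of the best features".

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def top_k_features(columns, labels, k):
    """Keep the k feature names with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


labels = ["click", "click", "non-click", "non-click"]
columns = {
    "prev_ad_click": [1, 1, 0, 0],   # hypothetical, perfectly informative
    "odd_position": [1, 0, 1, 0],    # hypothetical, uninformative
}
print(top_k_features(columns, labels, k=1))
```

A feature that splits the labels perfectly has a gain equal to the label entropy, while a feature independent of the labels has a gain of zero, which is why ranking by this score favors the first column.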
Performance of Advertisements Click Prediction • Metrics • accuracy (Acc), precision (Prec), recall (Rec), and F-measure (F1) • Baseline • guessing the majority class (non-click) is one baseline. • Markov Model (MM), formulated by query transition.
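The four metrics can be computed from the counts of true positives, false positives and false negatives. The sketch below uses the standard definitions; the function name `evaluate` is hypothetical, and reporting precision or recall as 0.0 when the denominator is zero is an assumed convention.

```python
def evaluate(y_true, y_pred, positive="click"):
    """Accuracy, precision, recall and F1 for the positive (click) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"acc": correct / len(pairs), "prec": precision,
            "rec": recall, "f1": f1}


y_true = ["click", "non-click", "non-click", "non-click"]
majority = ["non-click"] * len(y_true)   # majority-class baseline
print(evaluate(y_true, majority))
```

On this toy data the majority-class baseline reaches an accuracy of 0.75 but an F1 of 0, because it never predicts a click. This is why accuracy alone is a weak measure when clicks are rare.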
Conclusion and Future Work • We explore the effects of various intent-related features on advertisement click prediction • The CRF model performs significantly better than the two baselines and SVM • When the random subspace method is introduced for feature selection, the precision of click prediction increases from 0.1663 to 0.1721 • In the future, we plan to expand our model to consider fine-grained user intent and user interactions • In addition, we will extend this approach to predict which advertisements will be clicked
Thank You Q & A