Learning User Behaviors for Advertisements Click Prediction

Learning User Behaviors for Advertisements Click Prediction • Chieh-Jen Wang & Hsin-Hsi Chen • National Taiwan University, Taipei, Taiwan

Introduction • The commercial value of advertisements on the web depends on whether users click on the advertisements • Predicting potential advertisement clicks of users before target advertisements are displayed is important for: • advertisement recommendation • advertisement placement • presentation pricing • Problem specification • Given a current search session (q1, q2, ..., q(i-1)), we predict whether an ad click event occurs when query qi is submitted.

Related Work • Advertisement click prediction model • Feature representation • text features (Richardson et al., 2007) • demographic features (Cheng & Cantú-Paz, 2010) • mouse trajectory features (Guo & Agichtein, 2010) • Machine learning algorithm • logistic regression (Richardson, Dominowska, & Ragno, 2007) • maximum entropy (Cheng & Cantú-Paz, 2010) • support vector machines (Broder et al., 2008) • conditional random fields (Guo & Agichtein, 2010)

Related Work • User search intent • navigational, informational and transactional (Broder, 2002) • noncommercial/commercial & navigational/informational (Ashkan et al., 2009) • research & purchase (Guo & Agichtein, 2010) • receptive & not receptive (Guo & Agichtein, 2010) • “receptive” (i.e., an advertisement click is expected in a future search within the current session) • “not receptive” (i.e., no future advertisement clicks are expected within the current session)

Overview

Microsoft AdCenter Logs • Time: 2007-08-10 ~ 2007-11-01 (84 days) • The Microsoft AdCenter logs include: • 101 million impressions • 7.82 million clicks • 40.6 million sessions (5.06 million sessions contain at least one click) • An impression is defined as a single search results page described by a set of attributes • A session is defined as repeated search engine usage at intervals of 10 minutes or less, with a total session length of no more than 8 hours
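The session definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_sessions` is hypothetical, timestamps are assumed to belong to a single user, and starting a new session once the 8-hour cap is reached is one possible reading of the slide, not necessarily how the logs were segmented.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=10)    # slide: intervals of 10 minutes or less
MAX_SESSION = timedelta(hours=8)   # slide: total session no longer than 8 hours


def split_sessions(timestamps):
    """Group one user's impression timestamps into sessions.

    Assumption: an impression joins the current session only if it is within
    MAX_GAP of the previous impression and within MAX_SESSION of the first
    impression of that session; otherwise a new session starts.
    """
    sessions = []
    for ts in sorted(timestamps):
        current = sessions[-1] if sessions else None
        if (current is not None
                and ts - current[-1] <= MAX_GAP
                and ts - current[0] <= MAX_SESSION):
            current.append(ts)
        else:
            sessions.append([ts])
    return sessions
```

For example, impressions at 9:00, 9:05, 9:14 and 9:30 would fall into two sessions under these assumptions, because the last gap is 16 minutes.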

Data Purification • For promotional purposes, some queries are issued and some advertisements are clicked by software robots • Filter criteria • issues queries more than 7 times in any 10-second interval • issues queries at two distinct places at the same time • clicks an advertisement more than once in any 5-second interval • duplicated impression IDs • Data partition • Training: sessions in the first 56 days which contain at least one advertisement click • Testing: sessions in the last 28 days
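The two rate-based filter criteria lend themselves to a sliding-window check. The sketch below is a minimal illustration, not the authors' implementation: the function names are hypothetical, event times are assumed to be seconds as plain numbers for one user, the window boundary is treated as inclusive, and the click rule is applied to all of a user's ad clicks rather than per advertisement.

```python
from collections import deque


def exceeds_rate(times, max_events, window_seconds):
    """Return True if more than max_events fall inside any window of window_seconds."""
    window = deque()
    for t in sorted(times):
        window.append(t)
        # Drop events that are too old to share a window with the newest event.
        while t - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_events:
            return True
    return False


def looks_like_robot(query_times, click_times):
    """Apply the two rate criteria from the slide to one user's events."""
    too_many_queries = exceeds_rate(query_times, max_events=7, window_seconds=10)
    too_many_clicks = exceeds_rate(click_times, max_events=1, window_seconds=5)
    return too_many_queries or too_many_clicks
```

The location-based criterion and the duplicated impression ID check depend on log fields that the slides do not describe, so they are left out here.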

Experiment Datasets


Feature Extraction • Feature representation • Every impression qi (1 ≤ i ≤ n) in session s = (q1, q2, ..., q(i-1), qi, q(i+1), ..., qn) is represented as a feature vector • qi itself (Current Impression Level) • the first impression q1 (First Impression Level) • the n-th previous impression q(i-n) (Previous n Impression Level) • all the contextual impressions q1, q2, ..., q(i-1) in s (Contextual Impression Level) • Labeling • click if impression qi contains at least one advertisement click, otherwise non-click.
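To make the four levels and the labeling rule concrete, here is a rough sketch of how one training example per impression could be assembled. The field names (`query`, `ad_clicks`), the feature keys, and the function `build_examples` are all hypothetical. The actual features in the talk (query category, query intent, and so on) are richer than the raw queries used here.

```python
def build_examples(session, n_values=(1, 2)):
    """Turn a session into (features, label) pairs, one per impression.

    session: list of dicts, each with a 'query' string and an 'ad_clicks' count
    (hypothetical field names chosen for this illustration).
    """
    examples = []
    for i, impression in enumerate(session):
        features = {
            # Current Impression Level
            "current_query": impression["query"],
            # First Impression Level
            "first_query": session[0]["query"],
            # Contextual Impression Level: everything before position i
            "context_queries": [p["query"] for p in session[:i]],
        }
        # Previous n Impression Level: was an ad clicked n impressions ago?
        for n in n_values:
            previous = session[i - n] if i - n >= 0 else None
            features[f"prev{n}_clicked"] = bool(previous and previous["ad_clicks"] > 0)
        label = "click" if impression["ad_clicks"] > 0 else "non-click"
        examples.append((features, label))
    return examples
```

Only information from impressions before qi enters the contextual and previous-n features, which matches the problem specification of predicting the click before qi is served.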

Feature Extraction from Current Impression Level • These features aim to capture query information, users’ intent and the similarity between the current query and the previous one • QC (query category) • 14 categories (exclusive of “Regional” and “World”) on the 2nd level of the Open Directory Project (ODP) ontology are used to represent query categories • QIntent (query intent) • 4,020 intent clusters are learned from an MSN Search Query Log excerpt (Wang et al., 2010) • QIntent is specified by the distribution of the top 100 similar intent clusters

Feature Extraction from First Impression Level • These features aim to capture the initial search goal of a session.

Feature Extraction from Previous n Impression Level • These features aim to capture the advertisement click information of the n-th previous impression. • In our experiments, n is set to 1 and 2


Feature Extraction from Contextual Impression Level • These features represent a sequence of users’ behaviors • Weight of intent types of submitted queries (CTQIntent) and clicked advertisements (CTAdIntent) in the access history is defined as: • Pm is the probability of the type m intent • wj denotes a query or a clicked advertisement in qj • Weight of ODP categories (CTQC & CTAdC) • Jelinek-Mercer smoothing
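The slide names Jelinek-Mercer smoothing but the transcript carries no formula for it, so the following shows only the standard interpolation form, P(c) = (1 - λ) · P_ML(c) + λ · P_bg(c), applied to category counts. How the talk actually combines it with the ODP category weights is not stated. The function name, the background distribution, and the value of `lam` are assumptions for illustration.

```python
def jelinek_mercer(counts, background, lam=0.5):
    """Interpolate a maximum-likelihood estimate with a background distribution.

    counts: category -> observed count in the session history (assumed input)
    background: category -> background probability, summing to 1 (assumed input)
    lam: interpolation weight given to the background model (assumed value)
    """
    total = sum(counts.values())
    if total == 0:
        # With no observations, fall back to the background model alone.
        return dict(background)
    smoothed = {}
    for category, p_background in background.items():
        p_ml = counts.get(category, 0) / total
        smoothed[category] = (1 - lam) * p_ml + lam * p_background
    return smoothed
```

The effect is that a category never seen in the session still receives a nonzero weight, taken from the background model, instead of a hard zero.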


Click Prediction Model • Four learning algorithms • Conditional Random Fields (CRF) • Support Vector Machine (SVM) • kernel function (RBF, linear kernel) • parameter optimization (grid search for c and g) • Decision Tree • C4.5 Tree • Back-Propagation Neural Networks • Hidden Layer = 2 • Learning rate = 0.8 • Momentum = 0.2

Feature Selection Algorithm • Random Subspace Method (RS) • an ensemble classifier that consists of several classifiers • prediction is through a majority vote from the classifiers • F-Score (FS) & Information Gain (IG) • greedy inclusion algorithm • retain a number of the best terms or features for use by the classifier
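As an illustration of the information-gain route, the sketch below scores each feature by how much it reduces label entropy and keeps the top k. It assumes discrete feature values and uses hypothetical function names. The talk does not say how continuous features were handled or how many features were retained.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Rank features (name -> list of values) by information gain and keep the best k."""
    ranked = sorted(columns, key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]
```

A feature that perfectly separates click from non-click on a balanced sample scores 1 bit, while a feature independent of the label scores 0.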


Performance of Advertisements Click Prediction • Metrics • accuracy (Acc), precision (Prec), recall (Rec), and F-measure (F1) • Baseline • guessing the majority class (non-click) is one baseline. • Markov Model (MM), formulated by query transition.
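The four metrics can be computed from gold and predicted labels as sketched below, with click as the positive class. The function name `evaluate` and the convention of returning 0.0 when a denominator is zero are assumptions made for this illustration.

```python
def evaluate(gold, predicted, positive="click"):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    acc = correct / len(pairs) if pairs else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}
```

On an imbalanced sample, the majority-class baseline (always non-click) can reach high accuracy while its precision, recall and F1 on clicks are all zero, which is why the slide reports all four metrics rather than accuracy alone.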

Performance of Feature Selection

Top-10 Important Features

Conclusion and Future Work • We explore the effects of various intent-related features on advertisement click prediction • The CRF model performs significantly better than the two baselines and SVM • When the random subspace method is introduced for feature selection, the precision of click prediction increases from 0.1663 to 0.1721 • In the future, we plan to expand our model to consider fine-grained user intent and user interactions • In addition, we will extend this approach to predict which advertisements will be clicked

Thank You Q & A
