120 likes | 244 Views
User Intent Based Online Advertising. ADKDD 2010. Motivation. From the publisher/end user point of view: Most traditional approach for online advertising are content based, and those methods mainly deliver ads according to domains
E N D
User Intent Based Online Advertising ADKDD 2010
Motivation • From the publisher/end user point of view: • Most traditional approach for online advertising are content based, and those methods mainly deliver ads according to domains • However, different task might be performed in the same domain, the following figure shows a user who want to repair his/her car, but the ads delivered have little to do with that intent
Motivation • From the advertiser point of view: • In most current sponsored search system, advertisers bid on terms to have their ads delivered, however, words used in queries are numerous and hard to choose
Our Approach • Definition of Intent: A certain task that a user might perform in a given domain • We propose to sell intent under certain domain instead of sell term in general • We will show that deliver ads considering intent would help improve CTR • We will give an algorithm that can predict user intent in a given domain • We will show that using our algorithm to deliver ads can help improve CTR
Related work • Online advertising • The brief history and some basic algorithms is introduced in [2], [3] • However, the most straightforward idea is hard to use directly due to the diversity of vocabulary in queries and queries are tend to be short • As a result, three kind of approach were proposed to enhance the performance • Design algorithms to select ads [6], [20] as well as leverage more information from single query [4],[7],[8]; however, little of those methods can be directly used in classifying user intent • BT, leverage user past behavior to help deliver ads [10], [11]; however, in those approaches, only bag-of-words are considered, thus it is more likely to deliver ads according to domain
Related work • User Intent Classification • First proposed in [16] and define user intent as Navigational, Informational and Transactional. [17],[18] provided some algorithm to classify user intent. However, this helps little on advertising. • So, in order to help deliver ads, [9] proposed Online Commercial Intention(OCI), however, this can only tell us whether to deliver ads, but not what ads to deliver.
Related work • Random walk
Challenge • As point out by [19], it is hard to assigning labels manually even for the navigational, informational and transactional intent, so it is hard to attain the training data in a single domain. • Moreover, we have hundreds of domains to deal with • As a result, we can not manually label training data for classification. So we propose to use random walk to address this challenge. • The basic assumption of our proposal is that most pages can only accomplish a single task, thus user who visit similar pages may have similar intent
Details of our algorithm • First, for a given domain D and intent I, we use the following algorithm to extract training data: • How we perform Random Walk: 1. Let 2. Let 3. Replace all entities in , and get a pattern list PAT 4. Let 5. Build a bipartite graph V=(PAT,URL), if there are k clicks between pattern pat and URL u, the weight for edge (pat,u) is k, note if query q can be represented by pattern pat, and q clicked u, we say pat click u 6. Manually select 7. Starting from , we use random walk to expend the seeds patterns, and get the final training data
Random Walk • We define transition probability as: , where represents the weight for edge (j,k) • We define the probability that after t step, we can reach point i as , and we have: • And we define • Using the above formula, we can calculate the possibility for any given t and i; and we select pattern pat that have , θ is a manually defined parameter
Random Walk • However, not all URLs meet our basic assumption, so in order to get a cleaner training data set, we have to filter out some pages. • Filter 1: If a URL is clicked by too many different queries and the entropy is large enough, like the home page of yahoo or MSN, we filter out those URLs, in detail: we filter out URL u, having entropy(u) > 0 • Filter 2: If a URL is clicked almost equally by different kind of seed pattern, we filter them out. • For example, if u is clicked 100 times from seed pattern of buy, and clicked 150 times from seed pattern of repair, we filter it out
Prediction Model • Using training data extracted in previous step, we get a set of patterns for each intent in a certain domain • Then we use those patterns to train a multi-class classification model • Finally, we use the classification model to classify query intent, and deliver ads according to the intent in the given domain