This study focuses on improving Click-Through Rate (CTR) prediction in online advertising through a transfer learning approach. Challenges arise with small, niche, or newly developed products lacking sufficient data for accurate CTR prediction. By leveraging transfer learning from large to small product data sets, the research presents an effective method for enhancing CTR prediction accuracy and optimizing revenue. Real ad data experiments demonstrate the feasibility and scalability of the proposed approach, highlighting its potential for application in various advertising scenarios. However, limitations include the reliance on gradient information for sampling and a need to address advertisement data sparsity. Future research directions include exploring applications beyond CTR prediction and addressing challenges in multiple-source transfer learning.
Improving Click-Through Rate Prediction Accuracy in Online Advertising by Transfer Learning
Yuhan Su1, Zhongming Jin2, Ying Chen2, Xinghai Sun2, Yaming Yang2, Fangzheng Qiao2, Fen Xia2, Wei Xu1
1Tsinghua University, 2Baidu, Inc.
Online ads revenue: three factors
• Revenue = PV * CTR * ACP
• PV: number of page views
• ACP: average click price
• CTR: click-through rate = #clicks / #views
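The revenue decomposition above can be sketched directly in code. This is a minimal illustration with made-up numbers; the function names are ours, not part of any ads system API.

```python
def ctr(clicks: int, views: int) -> float:
    """Click-through rate: fraction of views that led to a click."""
    return clicks / views

def revenue(page_views: int, ctr_value: float, avg_click_price: float) -> float:
    """Expected ads revenue = PV * CTR * ACP."""
    return page_views * ctr_value * avg_click_price

# Illustrative numbers: 1M views, 20k clicks (2% CTR), $0.50 per click.
r = revenue(1_000_000, ctr(20_000, 1_000_000), 0.50)  # 10000.0
```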
Challenge: small products lack data
• Small, niche-market products
• Newly developed products
• These lack data for a CTR prediction model
• Idea: use large-product data to help small products
• But different products have different distributions
Transferlearning:fromsourcetotarget Source Differentdistribution ? Differentdistribution LargeProduct Target Small Product Transferlearning
Our contributions
• An effective transfer learning approach for small-product CTR prediction
• An efficient MapReduce implementation
• Experiments on real ads data
Related work
• CTR prediction (models, features): prior work studies a single advertisement product; we handle multiple products
• Transfer learning (instance transfer, feature representation transfer, parameter transfer, relational knowledge transfer, deep transfer): few works target large ads data; we handle a much larger dataset
Baidu Alliance Ads system
(1) User surfs a website
(2) Website sends a request and related info to the ADX
(3) ADX sends info to products 1…n
(4) Products return bidding prices and materials
(5) ADX returns the ads
(6) User sees the ads
Our approach: framework
pre-train a target model;
loop for N times {
    sample source data;
    combined training;
    data reweighting;
}
output the ensemble model;
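The loop above can be sketched as runnable Python. Everything here is a stand-in for illustration only: `train` is a trivial majority-label "model", `sample_source` samples uniformly instead of by gradient, and `reweight` is a no-op; none of these names come from the paper.

```python
import random

def train(data):
    """Placeholder 'model': predicts the majority label of its training data."""
    majority = round(sum(y for _, y in data) / len(data))
    return lambda x: majority

def sample_source(source, k):
    """Stand-in for the paper's gradient-proportional sampling: uniform here."""
    return random.sample(source, k)

def reweight(data, model):
    """Stand-in for the reweighting step (no-op in this sketch)."""
    return data

def transfer_learn(target, source, n_iters=3, k=2):
    model = train(target)                      # pre-train a target model
    ensemble = []
    for _ in range(n_iters):                   # loop for N times
        sampled = sample_source(source, k)     # sample source data
        model = train(target + sampled)        # combined training
        target = reweight(target, model)       # data reweighting
        ensemble.append(model)
    # output the ensemble model: average the member predictions
    return lambda x: sum(m(x) for m in ensemble) / len(ensemble)

random.seed(0)
target = [((0.1,), 1), ((0.2,), 1), ((0.3,), 1)]
source = [((0.9,), 0), ((0.8,), 1), ((0.7,), 1), ((0.6,), 0)]
predict = transfer_learn(target, source)
p = predict((0.15,))  # a score in [0, 1]
```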
Our approach: intuition
Initialization → Sampling → Reweighting → Training, repeated over the target and source data
Our approach: sampling strategy
• Source data sampling: the sampling probability is proportional to the gradient on the trained model
• Intuition: the larger the gradient, the more the model needs this data instance
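Gradient-proportional sampling is a standard weighted-sampling recipe; a minimal sketch (not the paper's code, and with gradients reduced to scalar magnitudes for illustration):

```python
import random

def sample_by_gradient(instances, gradients, k, rng=random.Random(0)):
    """Draw k instances with probability proportional to |gradient|."""
    weights = [abs(g) for g in gradients]
    # random.choices performs sampling with replacement by weight
    return rng.choices(instances, weights=weights, k=k)

instances = ["a", "b", "c", "d"]
gradients = [0.01, 0.50, 0.30, 0.02]   # "b" and "c" are hardest for the model
picked = sample_by_gradient(instances, gradients, k=1000)
# Large-gradient instances dominate the sample.
```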
Our approach: data reweighting
• Correctly classified: weight does not change
• Target data misclassified: increase the data weight
• Source data misclassified: decrease the data weight
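The three-case rule above can be written as a one-line update. The multiplicative factors here are placeholder constants; TrAdaBoost-style methods derive them from the weighted training error each iteration.

```python
def reweight(weight, misclassified, is_target, up=2.0, down=0.5):
    """Correct: keep weight. Misclassified target: increase. Misclassified source: decrease."""
    if not misclassified:
        return weight
    return weight * (up if is_target else down)

w_target = reweight(1.0, misclassified=True, is_target=True)    # increased: 2.0
w_source = reweight(1.0, misclassified=True, is_target=False)   # decreased: 0.5
w_correct = reweight(1.0, misclassified=False, is_target=True)  # unchanged: 1.0
```

The asymmetry is the point: a misclassified target instance is genuinely hard and should matter more, while a misclassified source instance likely comes from a different distribution and should matter less.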
Our approach: model ensemble
• As TrAdaBoost proves, if the algorithm runs for N iterations, the average weighted training loss on source data from the ⌈N/2⌉-th iteration to the N-th iteration converges to zero
• The output value lies in [0, 1]
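Following the TrAdaBoost-style scheme described above, only the models from the ⌈N/2⌉-th iteration onward contribute, and averaging their scores keeps the output in [0, 1]. A sketch, with per-iteration models represented as plain scoring functions (an assumption for illustration):

```python
import math

def ensemble_predict(models, x):
    """Average the predictions of the models from iteration ⌈N/2⌉ to N."""
    n = len(models)
    start = math.ceil(n / 2) - 1           # ⌈N/2⌉-th iteration, 0-based index
    late = models[start:]
    return sum(m(x) for m in late) / len(late)   # score in [0, 1]

models = [lambda x: 0.0, lambda x: 1.0, lambda x: 1.0, lambda x: 0.8]
score = ensemble_predict(models, None)  # averages the last 3 models
```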
Experiment settings
• Environment: internal MapReduce-like machine learning framework, 100 computing nodes
• Metric: AUC (area under the ROC curve), in [0, 1]
• Datasets:
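For reference, AUC can be computed via the rank-sum (Mann–Whitney) formulation: the probability that a randomly chosen positive is scored above a randomly chosen negative. A small self-contained sketch (an O(P·N) version for clarity, not the evaluation code used in the experiments):

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
a = auc(labels, scores)  # 3 of 4 pairs ordered correctly -> 0.75
```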
Experiment results
• Source and target have very different data distributions
Experiment results
• Directly combining source and target does not work
Experiment results
• Our method has better AUC
• Our method has less training time (TrAdaBoost 220 min vs. ours 70 min)
Parameter sensitivity: small N makes the ensemble not work
• N: total number of iterations, i.e., the number of ensemble models
• N too small: too few ensemble models, so the algorithm does not work well
Parameter sensitivity: large N makes the model overfit
• N too large: with too many ensemble models, the algorithm tends to overfit
Parameter sensitivity: zero alpha only uses target data
• α: sampling parameter
• α = 0: no source data is used; the algorithm only uses target data with model ensembling, becoming similar to AdaBoost
Parameter sensitivity: large alpha uses all source data
• α too large: the sampling probability exceeds 1, so every source data instance is sampled
Data size ratio (#target / #source): neither too small nor too large
• Fix the target size and vary the source size
• Performance peaks at a ratio of about 0.8
• The data size ratio must be tuned carefully instead of over-utilizing the source data
Promising directions and approach limitations
• Promising directions:
• Directly usable in average click price (ACP) prediction
• Other similar transfer learning scenarios (e.g., user risk prediction)
• Limitations:
• The current sampling strategy only uses gradient information
• It does not take the sparsity of advertisement data into consideration
• Efficient multiple-source transfer remains challenging
Summary
• An iterative transfer learning method for CTR prediction in online ads
• A MapReduce-like implementation makes the approach scalable
• Real-data experiments show effectiveness and promising directions
syhmartin@yeah.net
http://iiis.tsinghua.edu.cn/en/2014311424/
Thank you!