Cassini: A Simulation Framework for Evaluating Designs for Sponsored Search Markets • Soam Acharya, Prabhakar Krishnamurthy, Ketan Deshpande, Tak W. Yan, Chi-Chao Chang • Yahoo! Inc., 2821 Mission College Boulevard, Santa Clara, CA 95054
Topics • Overview/Motivation • Requirements • Architecture • Methodology • Applications/Results • Future Directions
Cassini Overview • What is it? • Discrete event simulation system • Supports simulations of different marketplace designs, policies, and technologies • Provides rapid assessment of the impact on revenue per search (RPS), click-through rate (CTR), and cost per click (CPC) • Compares % change vs. a baseline • Other metrics calculated depend on the specific experiment
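The headline metrics can be illustrated with a minimal Python sketch. The `RunTotals` structure, the function names, and the sample numbers are hypothetical and do not come from Cassini. CTR is assumed here to mean clicks per ad impression; other definitions, such as clicks per search, are also in use.

```python
from dataclasses import dataclass


@dataclass
class RunTotals:
    """Aggregate counts from one simulation run (hypothetical structure)."""
    searches: int
    impressions: int
    clicks: int
    revenue: float


def metrics(t: RunTotals) -> dict:
    """Compute RPS, CTR and CPC from run totals, guarding against zero division."""
    return {
        "rps": t.revenue / t.searches if t.searches else 0.0,
        # Assumption: CTR is clicks per ad impression.
        "ctr": t.clicks / t.impressions if t.impressions else 0.0,
        "cpc": t.revenue / t.clicks if t.clicks else 0.0,
    }


def pct_change(baseline: dict, experiment: dict) -> dict:
    """Percentage change of each metric relative to the baseline run."""
    return {
        name: 100.0 * (experiment[name] - value) / value
        for name, value in baseline.items()
        if value != 0
    }


# Made-up totals for a baseline run and an experimental run.
baseline = metrics(RunTotals(searches=1000, impressions=4000, clicks=200, revenue=100.0))
experiment = metrics(RunTotals(searches=1000, impressions=4000, clicks=220, revenue=121.0))
delta = pct_change(baseline, experiment)
print({name: round(value, 2) for name, value in delta.items()})
```

Reporting relative change rather than absolute values matches the slide's "% change vs. a baseline" framing, which makes runs on different traffic samples easier to compare.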
Contribution • General purpose sponsored search auction simulator • Built-in support • Auction structure, ranking, and payment policies, budgets • Advertisers, campaigns, bids • User click model • Search events • Extensible, modular architecture
Motivation • Alternative: live tests • Problems • Expensive • Time consuming • Preparation, SLAs • Must run long enough for statistical significance • Incomplete • Not possible to explore all aspects of the marketplace • E.g., long-term effects on advertisers
Requirements for a Simulation Framework • Mimic Sponsored Search Auction mechanisms • Ranking, budgeting, pricing • User behavior • Click model • Use actual log traces as input • Advertiser behavior • Advertiser action controls • Performance • Process large quantities of data • Need to complete large numbers of runs quickly • Others: • Extensible • Support for market mechanisms
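To make the "ranking, budgeting, pricing" requirement concrete, here is a minimal Python sketch of a single auction. The slides do not say which policies Cassini implements, so everything here is an assumption chosen for illustration: ads are filtered when the remaining budget cannot cover the bid, ranking uses bid times a hypothetical `quality` score, and pricing follows a generalized second-price rule with a hypothetical `reserve` price.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float          # maximum cost per click
    quality: float      # hypothetical clickability score used for ranking
    budget_left: float  # remaining budget


def run_auction(ads, slots, reserve=0.01):
    """Return [(advertiser, price_per_click)] for the filled slots.

    Assumed policies (not taken from the slides):
      * budget filtering: drop ads whose remaining budget cannot cover the bid
      * ranking: descending bid * quality
      * pricing: generalized second price, i.e. the smallest bid that would
        still keep the ad above the next-ranked ad
    """
    eligible = [a for a in ads if a.budget_left >= a.bid]
    ranked = sorted(eligible, key=lambda a: a.bid * a.quality, reverse=True)
    results = []
    for i, ad in enumerate(ranked[:slots]):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            price = nxt.bid * nxt.quality / ad.quality
        else:
            price = reserve  # no competitor below: pay the reserve price
        results.append((ad.advertiser, round(max(price, reserve), 4)))
    return results


ads = [
    Ad("a", bid=1.00, quality=0.10, budget_left=50.0),
    Ad("b", bid=0.80, quality=0.20, budget_left=50.0),
    Ad("c", bid=2.00, quality=0.10, budget_left=1.0),  # filtered: budget < bid
    Ad("d", bid=0.50, quality=0.10, budget_left=50.0),
]
outcome = run_auction(ads, slots=2)
print(outcome)
```

Keeping filtering, ranking, and pricing as separate steps mirrors the requirement that the framework be extensible: each policy could be swapped without touching the others.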
Overall Architecture • [Diagram] Components shown: Query Trace, Ad Server, Budget Filtering, Ranking, Pricing, External Ad Ranker, Ad Information, Click Generator, Click Model, Offline Click Model Generation, Budget & Advertiser Management, YSM Impression & Click Logs, Metric Computation, Simulation Log Output, Output DB
Methodological Issues • Sponsored search auctions are complex • Advertisers adapt to events and outcomes • Users adapt to market structure and policies and auction outcomes • Advertiser budgets introduce dependencies between markets • Input and event space is multidimensional with interactions • Simulation of joint distribution can be too time consuming
Approach • Simplifying assumptions in the current version of Cassini • Advertiser actions are at equilibrium • Static user click model • Each auction is independent • Except when budget management designs are being evaluated • Traffic selection • Take samples of actual historical search traffic • Focus on only the most significant sources of variation in traffic • Weekend vs. weekday traffic • Samples from different times in history • Sampling • Using full-day traffic for simulation is infeasible • A random sample of searches works well except with budgets • Best option: ignore budgets unless they are the focus of experimentation • A very small proportion of traffic can provide reasonably good estimates of RPS (revenue per search) • With budgets, estimates are biased upwards • Otherwise, reasonably small (almost) closed micro-markets can be used
Micro-market Sampling • A micro-market is a collection of accounts and keywords such that all spend due to these accounts and keywords occurs within the collection • Run simulations with multiple micro-markets
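One way to read the micro-market definition is as a connected component of the graph that links accounts to the keywords they bid on. The Python sketch below takes that reading; it is an assumption for illustration, not Cassini's actual sampling procedure, and the function name and sample data are hypothetical. It finds strictly closed groups, whereas the slides also allow "(almost) closed" ones.

```python
from collections import defaultdict


def micro_markets(bids):
    """Group accounts and keywords into closed micro-markets.

    `bids` is an iterable of (account, keyword) pairs, each meaning that the
    account bids on that keyword. Accounts and keywords connected by any
    chain of bids end up in the same group, so no spend crosses a group
    boundary. Implemented with a small union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for account, keyword in bids:
        a = find(("acct", account))
        k = find(("kw", keyword))
        if a != k:
            parent[a] = k  # merge the two groups

    groups = defaultdict(lambda: {"accounts": set(), "keywords": set()})
    for node in list(parent):
        kind, name = node
        field = "accounts" if kind == "acct" else "keywords"
        groups[find(node)][field].add(name)
    return list(groups.values())


# Made-up bid landscape: A1 and A2 share "shoes", A3 is isolated.
sample_bids = [("A1", "shoes"), ("A2", "shoes"), ("A2", "boots"), ("A3", "flights")]
markets = sorted(micro_markets(sample_bids), key=lambda m: len(m["accounts"]), reverse=True)
print(markets)
```

On real bid data a single large component would likely dominate, which may be why the slides settle for almost-closed markets; a practical variant would have to prune low-spend links before grouping.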
Applications of Cassini at Yahoo! • Screened candidates of ranking algorithms for live testing • Evaluated different design options for matching algorithms • Estimated the potential of budget optimization • Others
Cassini – Future Directions • Advertiser bidding agent • Support automated, adaptive bidding agent • Allow different bidding strategies to be implemented • Bidding languages • Scale to full traffic • Open interface (other groups within Yahoo) • Self-service architecture
Related Work • Simulations of sponsored search auction designs • Feng, Bhargava, Pennock; Kitts, LeBlanc • Simulations of other types of auctions • Yankee Auctions (Bapna, Goes, Gupta); FCC Spectrum Auctions (Csirik et al.) • Bidding Agents, Bots • Wurman, Wellman, and Walsh; Jennings; Powell
Design Decisions • Query-driven metaphor • Allow collaboration: • Leverage models from other groups • Multiple iterations for the same set of inputs • Maintain as much state as possible: • New metrics easily computed • Generate as much state as possible • Copious quantities of log files • Turn off for performance
Advertiser Actions • Static • Adjust bids, budgets • Target • Individual advertisers • Groups • Predefined • Randomly pick a certain percentage of advertisers within each group • E.g., select 25% of advertisers in cluster E
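The group-targeted action described above (pick a percentage of advertisers in a group, then adjust their bids) can be sketched in Python as follows. The function names, the group data, and the 10% bid increase are hypothetical; the slides do not describe how Cassini encodes these controls.

```python
import random


def select_advertisers(groups, fraction, seed=0):
    """Randomly pick `fraction` of the advertisers within each group.

    `groups` maps a group name to a list of advertiser ids. A seeded
    generator keeps the selection reproducible across simulation runs.
    """
    rng = random.Random(seed)
    chosen = {}
    for name, members in groups.items():
        k = round(len(members) * fraction)
        chosen[name] = rng.sample(sorted(members), k)
    return chosen


def scale_bids(bids, advertisers, factor):
    """Return a copy of `bids` with the selected advertisers' bids scaled."""
    selected = set(advertisers)
    return {
        adv: (bid * factor if adv in selected else bid)
        for adv, bid in bids.items()
    }


# Made-up clusters: select 25% of the advertisers in each one.
groups = {
    "E": [f"e{i}" for i in range(8)],
    "F": [f"f{i}" for i in range(4)],
}
picked = select_advertisers(groups, fraction=0.25, seed=1)

# Raise bids by 10% for the advertisers picked from cluster E only.
bids = {adv: 1.0 for members in groups.values() for adv in members}
new_bids = scale_bids(bids, picked["E"], factor=1.1)
print(picked["E"], sum(new_bids.values()))
```

Fixing the seed matters when an experiment is compared against a baseline: both runs should perturb the same advertisers so that differences come from the design under test.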
Cassini Implementation Notes • # of lines: • 22K lines of C++, Perl, shell, SQL • 56K lines of C++ libraries • Performance • Single instance per machine • 80K unique queries over one day • Several million queries • Order of hours • Capacity: memory bound • Speed: Disk I/O bound
Advertiser Actions Examples • [Figure: Setting 1, Setting 2]
Simulation Setup • Inputs • Bid Landscape • Accounts, ads, bids • Other • Budgets • Advertiser actions • Bid and budget changes (stochastic) • Events • Search • Clicks • Advertiser actions • Calibration • User click model • What do we want to use simulation for?
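The three event types listed above (search, clicks, advertiser actions) fit a standard discrete event loop driven by a time-ordered queue. The Python sketch below is a toy version under stated assumptions: the click probabilities in `CLICK_PROB`, the bid-only ranking, the fixed click delay, and the function signature are all hypothetical and far simpler than the architecture the slides describe.

```python
import heapq
import itertools
import random

# Hypothetical static click model: click probability by ad position.
CLICK_PROB = [0.10, 0.05]


def simulate(search_times, bid_changes, bids, seed=0):
    """Process search, click and advertiser-action events in time order.

    search_times: iterable of timestamps at which a search occurs
    bid_changes:  iterable of (timestamp, advertiser, new_bid)
    bids:         initial {advertiser: bid}
    Returns the event log and the final bids.
    """
    rng = random.Random(seed)
    bids = dict(bids)
    counter = itertools.count()  # tie-breaker so payloads are never compared
    queue = [(t, next(counter), "search", None) for t in search_times]
    queue += [(t, next(counter), "action", (adv, bid)) for t, adv, bid in bid_changes]
    heapq.heapify(queue)

    log = []
    while queue:
        t, _, kind, payload = heapq.heappop(queue)
        log.append((t, kind, payload))
        if kind == "action":
            adv, new_bid = payload
            bids[adv] = new_bid
        elif kind == "search":
            # Simplification: rank by bid alone and show the top ads.
            ranked = sorted(bids, key=bids.get, reverse=True)[:len(CLICK_PROB)]
            for pos, adv in enumerate(ranked):
                if rng.random() < CLICK_PROB[pos]:
                    # Schedule the click shortly after the search.
                    heapq.heappush(queue, (t + 0.001, next(counter), "click", (adv, pos)))
    return log, bids


log, final_bids = simulate(
    search_times=[1.0, 2.0, 3.0],
    bid_changes=[(1.5, "a", 0.2)],
    bids={"a": 1.0, "b": 0.5},
)
for event in log:
    print(event)
```

Because every state change passes through the queue, new event types (for example budget resets) could be added without restructuring the loop.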
Design Exploration • Use reference simulation run to verify “invariants”: data and parameters • Similar to production set-up whose performance is well-understood • Compare to actual performance over a number of data samples • Comparison to bucket tests
Future Directions • Open interfaces for click model, ranking algorithm, matching algorithm, etc. • New click models • Self-service – ease of use, user interface, job management • Leverage work from Yahoo Pipes and other log/data processing groups • Better analysis support – pre- and post-simulation analysis • See above