Unseating the Giants Monte Zweben CEO, Splice Machine October 16, 2014
The Big Squeeze Data growing much faster than IT budgets • Source: 2013 IBM Briefing Book • Source: Gartner, Worldwide IT Spending Forecast, 3Q13 Update
Traditional RDBMS Giants Overwhelmed… Scale-up becoming cost-prohibitive Splice Machine | Proprietary & Confidential
Scale-Out: The Future of Databases Dramatic improvement in price/performance • Scale Up (increase server size) vs. Scale Out (more small servers)
Unseating the Giants • Scale-Up Giants vs. Scale-Out Challengers
New application chooses NoSQL
Rocket Fuel • 21,103,424 websites • 19Bn daily impressions • 145 RTB advertising supply partners • ###MM WW consumers • 91,999 devices • Advertiser
Rocket Fuel: New Application Publishers Exchanges AdExchange Data Providers Ad networks Rocket Fuel Platform Real-Time Bidding Auto Optimization Advertisers
[Figure: grid of per-impression prices ranging from $0.00 to $2.78 (e.g., $2.38965, $0.6782, $1.7234, $0.09), with factor labels: User • Brand Affinity • Time of Day • Geo/Weather • Site/Page]
[Architecture diagram, numbered steps 1-9. Components: RTB Exchanges • Exchange Publishers • Direct Publishers • Load Balancers • Bid Servers • Ad Servers • Pixel Servers • Campaign Framework • User Profile Store • HBase • HDFS • Master Database (Master and Slaves) • Response Prediction Models (Hourly Refresh) • Ad Server Logs • Bidder Logs • ETL • Apollo Reporting & Campaign Tools • Apollo Ad-hoc Analytics Tools. Labeled flows: Rocket Fuel Ad Tag • Bid Call • User Lookup • User Lookup & Update • Evaluate Ads • Eligible Ads • Likelihood Scores & Bid Value • Ad-Rejection • Tag for Selected Ad & Bid • Ad Creative Tag]
HBase: Proven Scale-Out • Auto-sharding • Scales with commodity hardware • Cost-effective from GBs to PBs • High availability through failover and replication • LSM-trees
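The LSM-tree bullet above can be illustrated with a toy example. This is a minimal sketch of the general log-structured merge idea, not HBase's implementation: the class name TinyLSM, the memtable_limit threshold, and the in-memory "runs" standing in for on-disk files are all assumptions made for illustration.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: a mutable memtable plus immutable sorted runs.

    Hypothetical illustration only; real systems persist runs to disk,
    keep a write-ahead log, and handle deletes with tombstones.
    """

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # assumed flush threshold
        self.memtable = {}                    # recent writes, unsorted
        self.runs = []                        # sorted (key, value) lists, newest last

    def put(self, key, value):
        # Writes only touch memory, which is what makes them cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a new immutable sorted run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Writes are appended in memory and flushed in sorted batches, so the store never updates data in place; compaction later merges the batches so reads have fewer runs to search.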
Rocket Fuel: Results World-class request velocity on over 10 PB of data
Web application replaces Oracle
Before Architecture: Oracle Oracle too expensive, too slow, and too difficult to scale and modify • Shutterfly Website • Uploader • App • Consumers • Photo File Storage • Metadata Storage
After Architecture: MongoDB Flexibility and scalability of NoSQL ideal for simple web app • Shutterfly Website • Uploader • App • Consumers • Photo File Storage • Metadata Storage
MongoDB Architecture Document data model sharded across commodity servers
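To make "sharded across commodity servers" concrete, here is a minimal sketch of hash-based routing of documents to shards. It is a generic illustration and not MongoDB's actual chunk or balancing mechanism; the function names shard_for and route, the use of MD5, and the sample field names are assumptions.

```python
import hashlib


def shard_for(shard_key_value, num_shards):
    """Map a shard-key value to a shard index with a stable hash (assumed scheme)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def route(documents, shard_key, num_shards):
    """Distribute schema-free documents (plain dicts) across shards by one key."""
    shards = [[] for _ in range(num_shards)]
    for doc in documents:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards


# Hypothetical photo-metadata documents; fields may differ from one document to the next.
photos = [
    {"user_id": 1, "file": "a.jpg"},
    {"user_id": 2, "file": "b.jpg", "tags": ["beach"]},
    {"user_id": 1, "file": "c.jpg", "album": "2014"},
]
shards = route(photos, "user_id", num_shards=3)
```

Because the hash is deterministic, all documents for the same user_id land on the same shard, so a per-user lookup only has to contact one server.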
MongoDB: Compelling Results vs. Oracle • 9x faster through parallelized queries • ⅕ cost with commodity scale-out • Increased agility with flexible schema and “shard on demand”
Existing OLTP & OLAP Apps Replace Oracle
Before Architecture: Oracle RAC Oracle RAC too expensive and too slow, with queries up to ½ hour • Email Marketing • Social Feeds • Web/eCommerce Clickstreams • 1st Party/CRM Data • ETL • 3rd Party Data (e.g., Acxiom) • Operational Reports for Campaign Performance • Ad Hoc Audience Segmentation • POS Data • Data Quality
After Architecture: Hadoop RDBMS RDBMS functionality with proven scale-out from Hadoop • Email Marketing • Social Feeds • Web/eCommerce Clickstreams • 1st Party/CRM Data • ETL • 3rd Party Data (e.g., Acxiom) • Operational Reports for Campaign Performance • Ad Hoc Audience Segmentation • POS Data • Data Quality
Hadoop RDBMS: Best of Both Worlds Hadoop • Scale-out on commodity servers • Proven to 100s of petabytes • Efficiently handle sparse data • Extensive ecosystem • RDBMS • ANSI SQL • Real-time, concurrent updates • ACID transactions • ODBC/JDBC support
Distributed, Parallelized Query Execution • Parallelized computation across cluster • Moves computation to the data • Utilizes HBase co-processors • No MapReduce • Legend: HBase Co-Processor, HBase Server Memory Space
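The "moves computation to the data" idea can be sketched as a scatter-gather aggregate: each data partition computes a small partial result locally, and only those partials travel to the coordinator. This is a hedged illustration, not the Splice Machine or HBase co-processor API; threads stand in for region servers, and the names local_partial_sum and parallel_average and the row fields are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def local_partial_sum(region_rows, predicate):
    """Runs next to the data: filter and aggregate one partition's rows."""
    total = 0
    count = 0
    for row in region_rows:
        if predicate(row):
            total += row["amount"]
            count += 1
    return total, count  # only two numbers leave the partition


def parallel_average(regions, predicate):
    """Coordinator: fan out to every partition, then merge the partials."""
    workers = max(1, len(regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda r: local_partial_sum(r, predicate), regions))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Hypothetical rows split across two partitions.
regions = [
    [{"amount": 10, "country": "US"}, {"amount": 20, "country": "DE"}],
    [{"amount": 30, "country": "US"}],
]
us_average = parallel_average(regions, lambda row: row["country"] == "US")
```

The design point is that the filter and the sum run where the rows live, so network traffic is proportional to the number of partitions rather than the number of rows.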
Hadoop RDBMS: Compelling Results vs. Oracle • 3-7x faster through parallelized queries • ¼ cost with commodity scale-out • 10-20x price/perf with no application, BI or ETL rewrites
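As a rough worked example of how speed and cost combine into a price/performance figure, the sketch below assumes the simplest possible definition: speedup divided by relative cost. That definition and the function name price_performance_gain are assumptions; the deck does not say how its own 10-20x figure was derived, and it presumably rests on different inputs.

```python
from fractions import Fraction


def price_performance_gain(speedup, relative_cost):
    """Naive model (an assumption): gain = speedup / relative cost."""
    return Fraction(speedup) / Fraction(relative_cost)


# Endpoints of the speed range quoted above, at one quarter of the cost.
low = price_performance_gain(3, Fraction(1, 4))
high = price_performance_gain(7, Fraction(1, 4))
```

Under this naive model the endpoints come out at 12x and 28x, the same order of magnitude as the quoted range but not identical to it, which suggests the quoted figure uses a more conservative basis.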
Scale-Up vs. Scale-Out • Scale-Up: Top Reasons • Willing to pay for engineered systems • Lots of custom code (e.g., PL/SQL) • Proven reliability • Avoid risk of newer technologies • Less migration required • Scale-Out: Top Reasons • Reduce costs by 4x-10x • Increase performance by 3x-10x • Ease of scalability • Support for flexible schemas • Huge ecosystem of open source tools
Unseating the Giants: Why is it different this time? It’s not just technology – user requirements fundamentally changed • Seismic User Shift • Budgets flat • Massive increase in data: • Volume • Velocity • Variety • No longer acceptable to throw data away • Disruptive Tech: • Scale-Out • Leverage commodity H/W • Reduce costs by 4-5x • Increase perf by 5-10x • Increase agility
Questions? Monte Zweben CEO, Splice Machine mzeben@splicemachine.com Visit Booth 246 www.splicemachine.com