One Size Fits All: An Idea Whose Time has Come and Gone Michael Stonebraker

Presentation Transcript


  1. One Size Fits All: An Idea Whose Time has Come and Gone • Michael Stonebraker

  2. Alternate Title • The elephants are selling 30 year old “bloatware” • That is not good at anything • And you should send them to the “home for old software”

  3. Three Financial Services Markets • Stream processing (electronic trading) • Tick stores (data warehouses) • OLTP (transaction processing)

  4. Stream Processing (Electronic Trading) • A feed comes out of the wall • Compute a “secret sauce” looking for events of interest • Trade based on the result • But only if you are more nimble than the next guy….

  5. Traditional RDBMS Model: Outbound Processing • Store the data before processing! • Latency • What if the data is not important? • Too many processes! • Optimized for business data processing • Where you don’t trust the app • Too slow to be interesting! [Diagram labels: Processing, Memory, Updates, Disk, Queries]

  6. Stream Processing Engine with StreamSQL: Inbound Processing • Database paradigm (SQL) a good one • But need a different architecture • Straight through processing • No task switches • Lightweight scheduling [Diagram labels: Event Data, StreamBase Application, Alerts, Actions, Memory, Disk, Queries]

  7. StreamSQL Application Example • Example: Every minute, for every stock I am trading: • Calculate VWAP (volume-weighted average price) for my trades and for all trades • Alert whenever my personal trading execution is inferior to the market • 5 StreamBase operators, 30 min to build • Streams of “tuples” (time-series data) flow through the query • Queries run continuously [Diagram labels: Market_Feeds, My_Buys, Alerts]
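
The VWAP check on this slide can be sketched in plain Python. This is not StreamSQL and not the StreamBase application from the talk: it is a minimal batch illustration, assuming trades arrive as (timestamp_in_seconds, symbol, price, volume) tuples and that "inferior" means buying above the market VWAP for the same minute. The function names are hypothetical.

```python
from collections import defaultdict

def vwap_by_minute(trades):
    """Return {(minute, symbol): VWAP} for (timestamp_sec, symbol, price, volume) tuples."""
    notional = defaultdict(float)   # sum of price * volume per window
    volume = defaultdict(float)     # sum of volume per window
    for ts, symbol, price, vol in trades:
        key = (int(ts // 60), symbol)   # one-minute tumbling window (assumed)
        notional[key] += price * vol
        volume[key] += vol
    return {k: notional[k] / volume[k] for k in notional if volume[k] > 0}

def inferior_buys(my_buys, market_feed):
    """List (minute, symbol, my_vwap, market_vwap) where my buys cost more than the market."""
    mine = vwap_by_minute(my_buys)
    market = vwap_by_minute(market_feed)
    return [
        (minute, symbol, mine[(minute, symbol)], market[(minute, symbol)])
        for (minute, symbol) in sorted(mine)
        if (minute, symbol) in market
        and mine[(minute, symbol)] > market[(minute, symbol)]
    ]

if __name__ == "__main__":
    market_feed = [(0, "IBM", 100.0, 100), (30, "IBM", 102.0, 100), (10, "MSFT", 50.0, 200)]
    my_buys = [(15, "IBM", 102.0, 10), (20, "MSFT", 49.0, 10)]
    for alert in inferior_buys(my_buys, market_feed):
        print("ALERT", alert)
```

A streaming engine would evaluate the same windows continuously as tuples arrive, rather than over a finished list as this sketch does.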

  8. StreamSQL Will Dominate Rule Engines • Essentially all applications entail a mix of stored and real-time data • StreamSQL covers both kinds of data in a single paradigm • A rule engine must switch paradigms • StreamSQL amenable to compilation • Know what is the next event to process • In contrast, hard to figure this out in a rule engine

  9. Performance Benchmark • Financial services application: construct a virtual feed of “first arrivers” on a low-end Linux machine • Relational DB: 11,000 messages/sec • StreamBase: 300,000 messages/sec • Another StreamSQL vendor: 20,000 messages/sec • Result: StreamBase was a factor of 27 faster than the relational DB (and a factor of 15 faster than the other StreamSQL vendor)
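
The slide does not define the "first arrivers" feed. One plausible reading, assumed here, is that the same message arrives on several redundant feeds and the virtual feed keeps only the first copy of each. A minimal Python sketch under that assumption, with a hypothetical (feed_name, message_id, payload) tuple layout:

```python
def first_arrivers(messages):
    """Yield each message the first time its id is seen, in arrival order."""
    seen = set()    # ids already forwarded; grows without bound in this toy version
    for feed, msg_id, payload in messages:
        if msg_id not in seen:
            seen.add(msg_id)
            yield feed, msg_id, payload

if __name__ == "__main__":
    arrivals = [("A", 1, "x"), ("B", 1, "x"), ("B", 2, "y"), ("A", 2, "y"), ("A", 3, "z")]
    for message in first_arrivers(arrivals):
        print(message)
```

A production feed handler would also have to expire old ids and cope with gaps; both are left out here.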

  10. Tick Stores (and Other Warehouse Applications) • Store all market data for the last 10 years • To back test “secret sauce” models • To answer ad-hoc queries – “how many times has X happened” • Typical size – 100 Tbytes • Append only

  11. Terminology -- “Row Store” • E.g. DB2, Oracle, Sybase, SQLServer, … [Diagram labels: Record 1, Record 2, Record 3, Record 4]

  12. Rotate Your Thinking 90 Degrees • Column stores read only the columns required • Not all of them • Compression works better • By a factor of 2-3 against the elephants • No record headers • Which are big ticket items • No padding to byte or word boundaries
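
A toy Python sketch of the row-versus-column distinction. It does not show how Vertica or any other product stores data. The three-column tick layout is an assumption, and run-length encoding is used only as one simple example of compression that benefits from column layout; the slide does not name a scheme.

```python
from itertools import groupby

COLUMNS = ("symbol", "price", "volume")   # assumed tick layout

def to_column_store(rows):
    """Pivot a list of row tuples into {column_name: list_of_values}."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}

def run_length_encode(values):
    """Compress consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def count_price_above(columns, threshold):
    """Answer an ad-hoc query while touching only the 'price' column."""
    return sum(1 for price in columns["price"] if price > threshold)

if __name__ == "__main__":
    rows = [
        ("IBM", 100.0, 10), ("IBM", 101.0, 20), ("IBM", 99.5, 5),
        ("MSFT", 30.0, 50), ("MSFT", 30.5, 40),
    ]
    columns = to_column_store(rows)
    print(run_length_encode(columns["symbol"]))
    print(count_price_above(columns, 100.0))
```

In the row layout the same query has to walk every full record, and the repeated symbols are interleaved with prices and volumes, so the runs that make compression cheap never form.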

  13. Benchmark Summary • Vertica has been baked off about 30 times • Typically against the incumbent • Has yet to win by less than a factor of 30 against a row store • Beats most other column stores by around 10X • KX is the only system to come within an order of magnitude

  14. Maybe Elephants are Good at OLTP…… • OLTP is a main memory market • Not a disk-based one • Transactions are short and have no I/O or user stalls • Run to completion (single threaded) • Disaster Recovery (and HA) a requirement • Build it into the bottom of the system
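
The "run to completion (single threaded)" idea can be illustrated with a small Python sketch. It is not H-Store code: the class and the two transaction functions are hypothetical, and disaster recovery and HA are left out. All state sits in an in-memory dict, and each transaction finishes before the next one starts, so no locks or latches are needed.

```python
from collections import deque

class SingleThreadedStore:
    """Toy in-memory store that runs each transaction to completion, one at a time."""

    def __init__(self):
        self.data = {}          # all state lives in main memory
        self.queue = deque()    # pending transactions

    def submit(self, txn, *args):
        self.queue.append((txn, args))

    def run(self):
        """Execute queued transactions serially; return results (or exceptions) in order."""
        results = []
        while self.queue:
            txn, args = self.queue.popleft()
            snapshot = dict(self.data)      # simple undo copy, fine for flat values
            try:
                results.append(txn(self.data, *args))
            except Exception as exc:
                self.data = snapshot        # roll back the failed transaction
                results.append(exc)
        return results

def deposit(data, account, amount):
    data[account] = data.get(account, 0) + amount
    return data[account]

def transfer(data, src, dst, amount):
    data[dst] = data.get(dst, 0) + amount       # credit first, so a failure needs rollback
    if data.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    data[src] -= amount
    return amount

if __name__ == "__main__":
    store = SingleThreadedStore()
    store.submit(deposit, "a", 100)
    store.submit(transfer, "a", "b", 30)
    store.submit(transfer, "a", "b", 500)   # fails and is rolled back
    print(store.run())
    print(store.data)
```

Copying the whole dict per transaction is only acceptable in a toy; a real engine would keep a much cheaper undo record.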

  15. TPC-C Performance on a Low-end Machine • Elephant: 850 TPS (1/2 the land speed record per processor) • H-Store (so far – a university prototype): 70,416 TPS (41X the land speed record per processor) • Factor of 82!!!!!
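
The factor of 82 can be checked from the slide's own figures in two ways: as the ratio of the two throughputs, and as the ratio of the two "land speed record" multiples. The variable names below are only for illustration.

```python
elephant_tps = 850       # from the slide
hstore_tps = 70_416      # from the slide

# Ratio of measured throughputs (about 82.8, quoted on the slide as 82).
speedup = hstore_tps / elephant_tps

# Ratio of the quoted multiples of the per-processor record: 41X versus 1/2.
ratio_of_multiples = 41 / 0.5

if __name__ == "__main__":
    print(round(speedup, 1), ratio_of_multiples)
```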

  16. Implications for the Elephants • They are selling “one size fits all” • Which is 30 year old legacy technology that is good at nothing

  17. Pictorially [Diagram labels: Streaming data, DBMS apps, OLTP, Data Warehouse]

  18. The DBMS Landscape – Performance Needs [Diagram labels: Streaming data, OLTP, Data Warehouse; performance-need markers: high, low, high, high]

  19. One Size Does Not Fit All -- Pictorially • Elephants get only “the crevices” [Diagram labels: Streambase, Open source, H-Store successors, Vertica]

  20. Thank You • London Office: 107-111 Fleet Street, London EC4A 2AB, United Kingdom, +44 (0)20 7936 9050 • Corporate Headquarters: 181 Spring Street, Lexington, Massachusetts 02421, +1 866 STRMBAS, +1 866 787 6227, +1 781 761 0800 • Reston, Virginia Office: 11921 Freedom Drive, Suite 550, Reston, VA 20190, +1 703 608 6958 • New York City Office: 220 West 42nd Street, 20th Floor, New York, New York 10036, +1 866 STRMBAS, +1 866 787 6227