WHAT THE MARKET-LEADING DBMS VENDORS DON’T WANT YOU TO KNOW Disruption is gathering steam
Curt Monash • Analyst since 1981 • Covered DBMS since the pre-relational days • Also analytics, search, etc. • Own firm since 1987 • Publicly available research • Blogs, including DBMS2 (www.dbms2.com -- the source for most of this talk) • Feed at www.monash.com/blogs.html • White papers and more at www.monash.com
Database diversity • Mike Stonebraker, PhD • “One size doesn’t fit all” • Curt Monash, PhD • “Horses for courses” • “Database diversity” • Mike and Curt • The world needs 9 to 11 different kinds of data management software
The case for grand integrated DBMS • Theoretical relational model has great advantages • Actual relational DBMS are versatile and modular • Software developers have economies of scale • Vendor consolidation theoretically saves effort and money • So does database consolidation
The case for database diversity • Different kinds of data require fundamentally different kinds of data management software • Putting all that together in one system is extremely hard • Nobody has ever done it well
Application and use cases • High-end e-commerce • 100-terabyte analytics • High-volume call center • Media-heavy web startup • Simple departmental application • General enterprise or SaaS app • End-user or ISV
Data management distinctions • Fundamental • Data manipulation language • Data access method • Practical • Type of data • Type of hardware • Administrative burden • Performance stresses and metrics
Major components of DBMS cost • License and maintenance • Especially maintenance • Hardware, power, facilities • Mainly for VLDB analytics • Installation and ongoing administration • Time-to-benefit is a factor too • Programming • Sometimes a differentiator
11 kinds of data management software • High-end OLTP/general-purpose DBMS • Mid-range OLTP/general-purpose DBMS • Row-based analytic RDBMS • Column- or array-based analytic RDBMS • Text search engines • XML and OO DBMS (but these may merge with search) • RDF and other graphical DBMS (but these may merge with relational) • Event/stream processing engines (aka CEP) • Embedded DBMS for devices • Sub-DBMS file managers (e.g. MapReduce/Hadoop) • Science DBMS
High-end OLTP/general-purpose DBMS • Oracle, DB2, MS SQL Server, et al. • Amazing throughput and scale-up • Bullet-proofing • 24/7 • Security certifications • Datatype extensibility • Expensive, expensive, expensive
Mid-range OLTP/general-purpose DBMS • Three main groups • Crippled high-end (“Express” editions) • ISV/VAR-focused (Progress, several non-relational) • Open source-based (Postgres, MySQL) • Some are comparable to (or better than) the systems that ran the world in the 1990s • What does the Postgres family still lack? • Generally inexpensive
Row-based analytic RDBMS • Data warehouses should be in separate instances • But that’s not enough • Sequential vs. random reads • MPP vs. SMP • Teradata, Netezza, DATAllegro
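As a rough illustration of the MPP idea on this slide, here is a minimal sketch of shared-nothing aggregation: rows are spread across nodes by key, each node sums its own slice, and the partial results are merged. The function names (`partition`, `local_sum`, `mpp_sum`), the sample rows, and the modulo placement rule are assumptions made for the example, not how Teradata, Netezza, or DATAllegro actually work.

```python
from collections import Counter

def partition(rows, num_nodes, key):
    """Assign each row to a node by integer key modulo node count (hypothetical placement rule)."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return nodes

def local_sum(rows, group_key, value_key):
    """The work one node does on its own slice, with no shared disk or memory."""
    totals = Counter()
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return totals

def mpp_sum(rows, num_nodes, group_key, value_key):
    """Run local_sum on every partition, then merge the partial totals."""
    partials = [local_sum(part, group_key, value_key)
                for part in partition(rows, num_nodes, group_key)]
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values key by key
    return dict(merged)

# Illustrative data only.
rows = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 2, "amount": 5},
    {"customer_id": 1, "amount": 7},
    {"customer_id": 3, "amount": 1},
]
totals = mpp_sum(rows, num_nodes=2, group_key="customer_id", value_key="amount")
```

In this sketch the per-node loop is sequential; in a real MPP system each partition would be scanned in parallel on separate hardware, which is where the advantage over SMP comes from.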
Column- or array-based analytic RDBMS • Retrieving whole rows carries penalties • I/O • Optimization • Columnar is better • But not in all use cases • MOLAP may be superseded
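The I/O penalty of retrieving whole rows can be sketched with back-of-the-envelope arithmetic. The table shape below (1,000 rows, four 8-byte columns) and the column names are assumptions chosen for the example; the model ignores compression, indexes, and block sizes.

```python
# Hypothetical table: 1,000 rows, four fixed-width 8-byte columns.
NUM_ROWS = 1000
COLUMNS = ["order_id", "customer_id", "amount", "ship_date"]
BYTES_PER_VALUE = 8

def bytes_read_row_store(needed_columns):
    """A row store reads whole rows, whichever columns the query needs."""
    return NUM_ROWS * len(COLUMNS) * BYTES_PER_VALUE

def bytes_read_column_store(needed_columns):
    """A column store reads only the columns the query touches."""
    return NUM_ROWS * len(needed_columns) * BYTES_PER_VALUE

# A query like SUM(amount) touches one column out of four.
row_io = bytes_read_row_store(["amount"])        # 32000 bytes
col_io = bytes_read_column_store(["amount"])     # 8000 bytes

# A query that needs every column gets no I/O saving from the columnar layout.
full_row_io = bytes_read_row_store(COLUMNS)
full_col_io = bytes_read_column_store(COLUMNS)
```

The second pair of numbers comes out equal, which is one simple reading of "but not in all use cases": when a query needs whole rows anyway, this toy model shows no columnar advantage.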
Text search engines • “85% of all information is in text” … • … and 16.9% of all statistics are made up out of thin air • There really are a lot of words out there • And search interfaces are hugely important • Text search has its own data access methods • May play more nicely with columnar than row-based RDBMS • Watch integrations with other analytic datatypes • Attivio (relational, a little XML) • Mark Logic (a lot of XML)
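To make "text search has its own data access methods" concrete, here is a minimal sketch of an inverted index, the structure most commonly associated with text search. It is a toy under stated assumptions (whitespace tokenization, lowercase matching, AND semantics); the function names and sample documents are made up for illustration and do not describe Attivio or Mark Logic internals.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Illustrative documents only.
docs = {
    1: "columnar analytic database",
    2: "text search engine",
    3: "analytic text search",
}
index = build_inverted_index(docs)
hits = search(index, "text search")
```

The lookup goes from term to documents rather than from row key to row, which is why this access method differs from the B-tree lookups of a typical row-based RDBMS.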
XML and OO DBMS • Reasons for logical XML structures • Schema flexibility • Dressed-up text • XML is the transport format, and it’s too complex to unpack • The data came from neither an RDBMS nor text store in the first place • Native XML data access methods • Like text and object • So far mainly in niches
RDF and other graphical DBMS • “Semantic web” is overhyped … • … but the world DOES need ontology management systems • Much depends on path length • Analytic RDBMS may do the job
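One way to read "much depends on path length" is that, when triples sit in a single relational table, each extra hop in a path costs one more self-join. The sketch below illustrates that reading only; the triples, the `reports_to` predicate, and the `follow` helper are all hypothetical.

```python
# Hypothetical subject-predicate-object triples.
triples = {
    ("alice", "reports_to", "bob"),
    ("bob", "reports_to", "carol"),
    ("carol", "reports_to", "dave"),
}

def follow(start_nodes, predicate, hops):
    """Follow `predicate` edges for `hops` steps; each loop pass plays the role of one self-join."""
    frontier = set(start_nodes)
    for _ in range(hops):
        frontier = {o for (s, p, o) in triples
                    if p == predicate and s in frontier}
    return frontier

one_hop = follow({"alice"}, "reports_to", 1)
three_hops = follow({"alice"}, "reports_to", 3)
```

Under this reading, short fixed-length paths are a handful of joins that an analytic RDBMS can plausibly handle, while long or unbounded paths are where a dedicated graph store has more to offer.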
Event/stream processing engines • Design point = super-low latency … • … but there are other applications • Data is “executed against” queries rather than vice versa • Could be the future of BI … • … and of social networking
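The inversion on this slide, data "executed against" queries, can be sketched as standing queries that are registered once and then evaluated against each arriving event. The `StreamEngine` class, its methods, and the trade events are assumptions made for the example, not any CEP product's API.

```python
class StreamEngine:
    """Toy engine: queries are stored, events flow past them."""

    def __init__(self):
        self.queries = []  # list of (name, predicate, matches)

    def register(self, name, predicate):
        """Store a standing query; return the list that will collect its matches."""
        matches = []
        self.queries.append((name, predicate, matches))
        return matches

    def push(self, event):
        """Evaluate one incoming event against every standing query."""
        for _name, predicate, matches in self.queries:
            if predicate(event):
                matches.append(event)

engine = StreamEngine()
big_trades = engine.register("big_trades", lambda e: e["qty"] >= 1000)
acme_trades = engine.register("acme_trades", lambda e: e["symbol"] == "ACME")

# Illustrative events only.
for event in [
    {"symbol": "ACME", "qty": 50},
    {"symbol": "XYZ", "qty": 5000},
    {"symbol": "ACME", "qty": 2000},
]:
    engine.push(event)
```

Nothing is written to disk and re-read in this sketch; each event is checked as it arrives, which is the property that makes very low latency the natural design point.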
Embedded DBMS for devices • Products • Sybase SQL Anywhere • solidDB – focused on caching post-acquisition? • Cloudscape – vaporized? • McObject – tiny startup • Features • Load-and-forget • Zero-DBA • Small-footprint • Sometimes -- subsettable library
Matching analytic DBMS to use cases • 100 TB data mart • 50 TB enterprise data warehouse • 5 GB – 5 TB OLTP offload
Matching OLTP/general DBMS to use cases • Market leader • High-end e-commerce • High-volume call center • Mid-range • Web startup • It depends on how locked-in you are • Simple departmental application • General enterprise or SaaS app
Clayton Christensen’s “disruption” narrative • Market leaders have many advantages, including top technology. • Followers come up with good technology too. • The leaders stay ahead by making their products ever better and more complex. • The followers sell into new or non-mainstream markets, at prices the leaders can’t match. So they dominate new markets. • Old markets turn into low-margin commodity-fests. • Unless they diversify, old leaders are doomed.
That’s what’s happening here • Much DBMS complexity is without benefit • Other complexity only benefits a few high-end customers • Data warehouse specialists exploit radically superior technology (e.g., MPP) • Open source vendors have radically different price points and business models • Open source adoption has been strongest in non-traditional markets.
And the big vendors know it • Oracle is diversifying furiously • Oracle has announced a clear focus on top-end customers • IBM is obviously focused on the high end too • Oracle and (to some extent) IBM are buying alternative DBMS technologies • Microsoft and IBM aren’t dependent on the DBMS business anyway