From Startup to Enterprise: A Story of MySQL Evolution
Vidur Apparao, CTO
Stephen O’Sullivan, Manager of Data and Grid Technologies
April 2009
Data Size Growth
• The data volume for a majority of companies increases 50-100% every year. (IDC)
• Common Solutions:
  • Rely on Moore’s Law
  • Spend more money
• But there are other ways…
2009 LiveOps, Inc.
About LiveOps
• Technology Platform for Contact Centers
• On-Demand, Multi-tenanted Contact Center Platform
• Virtual Call Center of 20,000 independent home agents
• Eight years of continuous growth
• Founded in 2000
• Profitable since 2006
• 300 employees
LiveOps’ Data
• Main data classes:
  • Configuration data – low GBs, slow change
  • Logging data – low TBs, ever increasing
  • System state – MBs, rapid change
  • Customer-specific data – high GBs, versioned
• Largest table has 1.4 billion rows
• Tenant key as a column on all tables
• Multi-site deployment for high availability
[Diagram labels: Web Applications; Telephony Applications; Configuration Data; Transaction data; Session data; Configuration Tools; Reporting Tools; Monitoring Tools]
Phase 1: The Basic Model
• Application servers connecting to a single DB
• Replication to a slave for backup & load balancing
[Diagram labels: Web & Telephony Applications; R/W Master; MySQL Replication; Slave/Backup]
Primary Drivers for Change
• Scale
• Availability
• Performance
Options for Improving (Write) Scale
• Sharding
  • Partition data into distinct databases based on a sharding key
• Functional Segmentation
  • Separate functional data classes into distinct databases
• MySQL Partitioning
LiveOps choice: Sharding, Functional Segmentation
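A minimal sketch of the sharding option above, assuming the tenant key serves as the sharding key. The deck only says that every table carries a tenant key, so this pairing is an assumption, and the shard names and the helpers `shard_for` and `group_by_shard` are hypothetical.

```python
import zlib

# Hypothetical shard names; the deck does not name the actual databases.
SHARDS = ["shard_db_0", "shard_db_1", "shard_db_2", "shard_db_3"]


def shard_for(tenant_key, shards=SHARDS):
    """Map a tenant key to one shard using a stable hash."""
    # crc32 gives the same value in every process, unlike the built-in hash().
    bucket = zlib.crc32(tenant_key.encode("utf-8")) % len(shards)
    return shards[bucket]


def group_by_shard(rows, shards=SHARDS):
    """Group (tenant_key, payload) rows by their destination shard."""
    grouped = {}
    for tenant_key, payload in rows:
        target = shard_for(tenant_key, shards)
        grouped.setdefault(target, []).append((tenant_key, payload))
    return grouped
```

Plain modulo hashing moves most tenants when the shard count changes. A lookup table from tenant to shard is a common alternative when shards need to be rebalanced.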
Options for Improving (Query) Performance
• Replication & Load balancing
  • Distribute query load across multiple replicants
• Separation of DB roles
  • Separate fast from slow, OLTP from OLAP
• Caching
  • Reduce dependency on the database for queries
• Consistent query tuning/optimization
LiveOps choice: Load balancing, separation of roles
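One way to picture load balancing combined with separation of roles is a router that keeps a pool of read-only replicants per role and rotates through the healthy ones. This is only a sketch: the class `ReplicaRouter`, the role names, and the host names are hypothetical, and the deck does not describe how LiveOps implemented its balancer.

```python
import itertools


class ReplicaRouter:
    """Round-robin over read-only replicants, grouped by role."""

    def __init__(self, replicas_by_role):
        self._replicas = {role: list(hosts) for role, hosts in replicas_by_role.items()}
        self._cycles = {role: itertools.cycle(hosts) for role, hosts in self._replicas.items()}
        self._down = set()

    def mark_down(self, host):
        """Record that monitoring found this host unhealthy."""
        self._down.add(host)

    def mark_up(self, host):
        """Return a recovered host to rotation."""
        self._down.discard(host)

    def pick(self, role):
        """Return the next healthy host for a role, skipping hosts marked down."""
        hosts = self._replicas[role]
        for _ in range(len(hosts)):
            host = next(self._cycles[role])
            if host not in self._down:
                return host
        raise RuntimeError(f"no healthy replicant for role {role!r}")
```

Usage, with made-up hosts: `ReplicaRouter({"oltp": ["db1", "db2"], "reporting": ["db3"]}).pick("reporting")` sends a slow reporting query to a host that serves no OLTP traffic.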
Options for Improving Availability
• Application resilience
  • Remove requirement of direct write access and/or degrade gracefully
• MySQL Cluster
• Multi-master replication
  • Split tables or databases between ring replicating masters
• DRBD or SAN HA
LiveOps choice: Application resilience, multi-master
Phase 2: A Pure MySQL Solution
1. Data writers use store-and-forward pattern for fault tolerance.
2. Functional segmentation of data between multiple masters.
3. Multi-master replication and quick recovery processes on master failure.
4. Replication to a farm of read-only replicants within and across data centers.
5. Load balancing using DB monitoring and pushed configuration files.
6. Separation of DB roles based on type and cost of query.
All of these techniques still don’t get us to horizontal data scalability
[Diagram labels: Monitoring; DB Monitor/Load Balancer; Config. Push; Queries; Writes; Web & Telephony Applications; Config + Session; Logging; Reporting and Analytics; R/W Masters; Read-only Replicants w/ Roles]
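The store-and-forward pattern named on the Phase 2 slide can be sketched as a writer that appends every record to a local spool file first and forwards it to the database later. The class `StoreAndForwardWriter`, the JSON-lines spool format, and the `send` callback are assumptions made for illustration, not details from the deck.

```python
import json
import os


class StoreAndForwardWriter:
    """Spool records to local disk first, then forward them when the DB is reachable."""

    def __init__(self, spool_path, send):
        self.spool_path = spool_path
        self.send = send  # hypothetical callable: takes one dict, raises on failure

    def write(self, record):
        """Store step: append the record durably to the local spool."""
        with open(self.spool_path, "a", encoding="utf-8") as spool:
            spool.write(json.dumps(record) + "\n")
            spool.flush()
            os.fsync(spool.fileno())

    def flush(self):
        """Forward step: send spooled records in order and keep any that fail."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as spool:
            pending = [line for line in spool if line.strip()]
        sent = 0
        for line in pending:
            try:
                self.send(json.loads(line))
            except Exception:
                break  # stop at the first failure to preserve ordering
            sent += 1
        # Rewrite the spool with only the unsent records, swapping it in atomically.
        tmp_path = self.spool_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as out:
            out.writelines(pending[sent:])
        os.replace(tmp_path, self.spool_path)
        return sent
```

This gives at-least-once delivery: a crash after `send` succeeds but before the spool is rewritten causes a resend, so the receiving side should tolerate duplicates.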
Horizontal Scalability Options
• Distributed Storage Systems
  • DFS for unstructured file storage
  • BigTable/HBase for structured data storage
  • Various vendors with distributed RDBMSs
• Grid Processing
  • MapReduce and Hadoop
Our Approach
• Take logging data out of the transactional databases
• Reduce replication load
• Store logging data as text files in a DFS
• Use MapReduce for ETL into OLAP databases
• Leverage open source tools like ActiveMQ and Hadoop
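To make the MapReduce-for-ETL idea concrete, here is a small in-process imitation of the map, shuffle, and reduce steps that turns raw log lines into per-tenant daily counts ready to load into an OLAP table. It is plain Python rather than Hadoop code, and the tab-separated `timestamp, tenant, event` log format is an assumption; the deck does not show the actual log layout.

```python
from collections import defaultdict


def map_log_line(line):
    """Map step: emit ((tenant, day), 1) for each well-formed log line."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        return  # skip malformed lines instead of failing the whole job
    timestamp, tenant, _event = parts
    yield (tenant, timestamp[:10]), 1  # first 10 chars of an ISO timestamp are the date


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one (tenant, day) key."""
    return key, sum(values)


def run_job(lines):
    """Run map, group by key (the shuffle), then reduce; return rows sorted by key."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_log_line(line):
            groups[key].append(value)
    return [reduce_counts(key, values) for key, values in sorted(groups.items())]
```

Because the map step looks at one line at a time and the reduce step at one key at a time, the same two functions could in principle run in parallel across many files in a DFS.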
Phase 3: MySQL w/ Horizontal Scalability
1. MySQL continues as OLTP solution.
2. Logging data now written to log files on local disk.
3. Log files moved via ActiveMQ to a log repository and DFS.
4. Audit process to reconcile data between log files and DFS.
5. Hadoop as MapReduce system for ETL.
6. MySQL as a data mart.
Horizontal scalability is now in reach!
[Diagram labels: Web & Telephony Applications; R/W Masters; Read-only Replicants; Data Marts; Reporting and Analytics; Audit Process; Broker; ActiveMQ Brokers; Repository Process; Map Reduce; Hadoop; DFS; Backup Storage Array]
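The audit step on the Phase 3 slide reconciles local log files against their copies in the DFS. A minimal sketch of such a check, assuming both sides are visible as directories and files are matched by name (the functions `fingerprint` and `audit` and the `*.log` naming are hypothetical):

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (line_count, sha256 hex digest) for one file."""
    digest = hashlib.sha256()
    lines = 0
    with open(path, "rb") as handle:
        for chunk in handle:  # iterating a binary file yields one line at a time
            digest.update(chunk)
            lines += 1
    return lines, digest.hexdigest()


def audit(source_dir, repository_dir):
    """Return (missing, mismatched) file names, comparing source logs to repository copies."""
    missing, mismatched = [], []
    for src in sorted(Path(source_dir).glob("*.log")):
        copy = Path(repository_dir) / src.name
        if not copy.exists():
            missing.append(src.name)
        elif fingerprint(src) != fingerprint(copy):
            mismatched.append(src.name)
    return missing, mismatched
```

Files reported as missing or mismatched would be candidates for re-sending through the broker.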
Learned Best Practices
• Know your data
• Build and enforce a data access layer
• Put your data only where you need it
• Experiment early and often
Conclusions
• MySQL, other open source technology, and commodity hardware can be used to build a horizontally scalable data solution
• Companies today are left to chart their own evolutionary paths
• Collaboration and communication between companies in this area can help everyone
Thank You.
Vidur Apparao, vidur@liveops.com
Stephen O’Sullivan, sosullivan@liveops.com