1 / 28

NoSQL with MySQL

NoSQL with MySQL. Yekesa Kosuru Distinguished Architect, Nokia Peter Zaitsev, Percona. Agenda. Part 1 : Problem statement Part 2 : Background (What is, Why, ACID, CAP) Part 3 : Building a Key-Value Store Part 4 : NoSQL with MySQL. Part 1: Problem statement. NoSQL : No Formal Definition.

telyn
Download Presentation

NoSQL with MySQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. NoSQL with MySQL Yekesa Kosuru Distinguished Architect, Nokia Peter Zaitsev, Percona

  2. Agenda • Part 1 : Problem statement • Part 2 : Background (What is, Why, ACID, CAP) • Part 3 : Building a Key-Value Store • Part 4 : NoSQL with MySQL

  3. Part 1: Problem statement

  4. NoSQL : No Formal Definition • No Formal Definition • Not Only SQL ? • Don’t use SQL ? • Limit expressive power ? • Underlying issues ? • ACID (Strong Consistency ?, Isolation ?) • SQL • Something else ? • Facts • Hardware will fail • Software will have bugs • Growing traffic, data volumes, unpredictable • No downtime or expose errors to users

  5. Problem to solve • Key Value Store Problem Statement • A consumer facing system with foll. characteristics : • Host large volumes of data & queries • Responsive with predictable performance • Always available (survives failures, software/hardware upgrades) • Low cost of ownership • Minimal administration • Availability : How to service customers (24 X 7) despite failures, upgrades • Scalability : How to scale (capacity, transactions, costs…) • Predictability : Predictable responsiveness • Not suited for all applications

  6. Part 2: BACKGROUND

  7. ACID • Transactions simplify job of developers • Client/developer point of view • Atomicity: Bundled operations • All of the operations (multiple updates, deletes, inserts) in the transaction will complete, or none will • Consistency: The database is in a consistent state when the transaction begins and ends • Referential integrity, and other constraints • All reads must see latest data • Isolation: The transaction will behave as if it is the only operation being performed upon the database. Serialized updates i.e locks • Durability: Upon completion of the transaction, the operation will not be reversed. • ACID in single database is efficient

  8. RequestCommit Prepare Prepare Prepare Prepare Prepare Prepare Prepare Commit Prepare Prepare Prepare Prepared Issues Scaling DBMS • Databases extend ACID to span multiple nodes using 2PC • Shared disk & shared nothing architectures • Cost of 2PC (N nodes involved in transaction) • Consistency & Isolation over 2PC makes it expensive • 3N+1 Message Complexity • N+1 Stable Writes • Locks – lock resources • Availability is a pain point • 2PC is blocking protocol , uses single coordinator and not fault tolerant • Coordinator failure can block transaction indefinitely • Blocked transaction causes increased contention • Rebalancing • Upgrades • Total Cost of Ownership

  9. CAP • Introduced by Prof. Brewer in year 2000 • Consistency, Availability, (Network) Partition Tolerance • Industry proved that we can guarantee two out of three • System point of view • Consistency: Defines rules for the apparent order and visibility of updates • Variations on consistency • Eventual Consistency - eventually replicas will be updated – order of updates may vary - all reads will eventually see it • Availability : All clients can read & write data to some replica, even in the presence of failures • Partition-tolerance : operations will complete, even if network is unavailable i.e disconnected network

  10. Part 3: Building key-value store

  11. Important Considerations • Functional • Simple API ( Get, Put, Delete - like Hashtable) • Primary key lookups only (limit expressive power) • No joins, No non-key lookups, No grouping functions, No constraints … • Data Model (Key-Value, Value = JSON) • Read-Your-Write Consistency • SLA • Highly Available & Horizontally Scalable • Performance (low latencies & predictable) • Uniform cost of queries • Operational • Symmetric, Rolling Upgrades, Backups, Alerts, Monitoring • Total Cost of Ownership (TCO) • Hardware, Software, Licenses …

  12. Making it Available • Make N replicas of data • Use Quorum (R + W > N) • Overlap R & W to make it strongly consistent • Handle Faults (partition tolerance and self heal) • Real faults • Machine down, NIC down … • Eventual Consistency (data) • Prefer A & P in CAP • If a node is down during write, read repair later – let write proceed • Intermittent faults • X and Y are clients of Z; X can talk to Z but Y can’t • Server does not respond within timeout, stress scenarios … • Inconsistent cluster view can cause inconsistencies in data • If two concurrently writing clients have different views of cluster, resolve inconsistency later - let write proceed N = Number of replicas R = Number of Reads W = Number of Writes R + W > N 2 + 2 > 3

  13. Making it Scalable • Sharding (or partition data) works well for Capacity • Large number of logical shards distributed over few physical machines • How to shard • Consistent Hashing • Hash(key) say “2”, move clockwise to find nearest node “B” • Hash to create a uniform distribution • Allows for adding of machines • Without redesign of hashing • Without massive movement of data • Load balancing • If a machine can’t handle load, move shards to another machine A,B,C = nodes 1,2,3,4 = keys

  14. Vector Clocks • Determine history of a key-value, think data versioning • Need to interpret causality of events, to identify missed updates & detect inconsistencies • Timestamps are one solution, but rely on synchronized clocks and don't capture causality • Vector clocks are an alternative method of capturing order in a distributed system • Node (i) coordinating the write generates a clock at the time of write • Construct: ( nodeid : logical sequence) • Node A receives first insert : A:1 • Node B receives an update : A:2 • Clock: (A:1) is predecessor of (A:1, B:1) • Clock: (A:2) and (A:1, B:1) are concurrent • Reads use vector clocks to detect missed updates & inconsistencies

  15. Server Node C Server Node B Client Server Node A Eventual Consistency Example 3/2/2 3 replicas Application PUT V1 A:1 Write V1 A:1 Write V1 A:1 Write PUT NODE 1 DOWN PUT Write V2 (A:1,B:1) V2 (A:1,B:1) PUT NODE 1 UP Outage Upgrade GET Read V1 A:1 Detect Missing Update Read V2 (A:1,B:1) V2 (A:1,B:1) Read GET V2 (A:1,B:1) Read Repair V2 (A:1,B:1) Missed update added

  16. PART 4: nosql with mysql

  17. NoSQL with MySQL • Key-Value store implemented on top of MySQL/ InnoDB • MySQL serves as persistence layer, Java code handles the rest • Atomicity limited to one operation (auto commit), Consistency & Isolation limited to one machine (recall issues with 2PC) • Why MySQL ? • Easy to install, tune and restore • Fast, Reliable, Proven • Performance : • MVCC, row level locking • Buffer pool, insert buffering, direct IO • Data Protection & Integrity : • Automatic crash recovery • Backup, Recovery (full and incremental) • Leverage expertise across ACID and BASE systems

  18. Implementation • InnoDB Engine (innodb_file_per_table) • Simple schema • Auto increment PK • App Key, Value • Metadata • Hash of app key (index) • Clustered Index and Secondary Index • KV use simple queries – no joins or grouping functions • Database is a shard (pick right number of shards) • Backup & Recovery

  19. Tuning • Queries are so simple that tuning is focused on IOPS, Memory, TCP, Sockets, Timeouts, Connection Pool, JDBC, JVM , Queuing etc • Memory to disk ratio • Connector/JuseServerPrepStmts=true useLocalSessionState = truecachePrepStmts = truecacheResultSetMetadata = truetcpKeepAlive=truetcpTrafficClass=24autoReconnect=truemaxReconnects=...allowMultiQueries=trueprepStmtCacheSize=…blobSendChunkSize=…largeRowSizeThreshold=…locatorFetchBufferSize=… • CentOS 5.5, XFS File System

  20. Scalability Test (TPS/IOPS) EC2’s Characteristics m1.xlarge image User Dataset = 256GB (240M 1K values) Machines = 5 Spindles=4 per box XFS File System Buffer Pool=12GB/machine Memory:Disk=1:4 Workload R/W=100/0 Random Access Pattern N/R/W=1/1/1 Incremental load to 100 threads T P S Client Threads

  21. Know Your IOPS

  22. Performance Test Characteristics User Dataset = 900GB Machines = 9 Spindles=8 (15K)/machine XFS File System (RAID0) Buffer Pool=16GB/machine Memory:Disk=1:6 Cache=16% Workload R/W=100/0 Recency Skew i.e Recent Access=80/20 N/R/W=3/2/2 Incremental load to 10K (TPS) S E C S TPS

  23. Tips By Peter Zaitsev Percona

  24. Why XFS

  25. MySQL for NoSQL • Tune for Simple Queries • Memory is the most important • innodb_buffer_pool_size • Are we looking at intensive writes ? • innodb_log_file_size • RAID with BBU • XFS + O_DIRECT (avoid per inode locking) • Restrict Concurrency on database • Connection pool size or innodb_thread_concurrency

  26. More on Scalability • Is contention problem • Use MySQL 5.5 or Percona Server • Tune innodb_buffer_pool_instances • Index update contention • Use Multiple Tables • Or MySQL Partitions if you can't • Eliminate Query Parsing Overhead • Consider HandlerSocket • NoSQL Interface to InnoDB Storage Engine

  27. Work on Result Stability • Ensure you're getting stable results • InnoDB Flushing can cause non uniform performance

  28. THANK YOU Yekesa.kosuru@nokia.com

More Related