DOLLY: Virtualization-Driven Database Provisioning for the Cloud

VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html
Emmanuel Cecchet
Joint work with Rahul Singh, Upendra Sharma and Prashant Shenoy

PROVISIONING IN THE CLOUD
• Virtualization
• No shared disk
• Provisioning based on volume and machine load
[Diagram labels: Internet, frontend/load balancer, app. servers, provisioning logic, databases]

WHY IS IT HARD TO ADD A DB REPLICA?
• Administration issues
  • Install/setup new database
  • Backup/restore database content
  • Configure new replica
  • Synchronize live content
• How much time does it take?
  • Depends on the database size
  • Depends on the backup/restore technique
  • From minutes to hours…
• Dolly: VM cloning to blackbox the database

Dolly • Database replication in the Cloud • Provisioning with Dolly • Prototype & Evaluation

SPAWNING A REPLICA
• Multi-master middleware-based replication
• Three main phases: backup, restore, replay
[Diagram labels: client SQL requests, load balancer, replication middleware, transactional log, management console, DB1, DB2, new replica; four numbered steps labeled backup, restore snapshot, resynchronize and add replica]

SPAWNING A REPLICA WITH CLONING
• Backup and restore are replaced by VM cloning
[Diagram labels: client SQL requests, load balancer, replication middleware, transactional log, management console, DB1 and DB2 running in VM1 to VM3 (each with its own OS); four numbered steps labeled clone, clone, resynchronize and add replica]

Cloning: Backup/Restore in constant time • Filesystem snapshot/copy is DB agnostic • Only depends on VM size

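MODELING SPAWNING TIME
• Predictable backup and restore times are required
• Replay time can be estimated from current write throughput (wt)
• Replica spawning time = backup + restore + replay
[Figure: timeline of the replica spawning time, split into backup (b_i), restore (r_i) and replay of the updates]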
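The spawning-time model on this slide can be sketched as a small function. This is a minimal sketch, not Dolly's implementation: `estimate_spawn_time`, `replay_rate` and `backlog_updates` are hypothetical names, and it assumes that the updates to replay are those already logged plus those written at rate wt during backup and restore, ignoring writes that arrive while the replay itself runs.

```python
def estimate_spawn_time(backup_s: float, restore_s: float,
                        write_throughput: float, replay_rate: float,
                        backlog_updates: float = 0.0) -> float:
    """Estimate replica spawning time in seconds.

    backup_s, restore_s: predictable backup and restore durations (seconds).
    write_throughput:    current write throughput wt (updates per second).
    replay_rate:         assumed speed at which a replica replays updates
                         (updates per second); hypothetical parameter.
    backlog_updates:     updates already logged since the snapshot was taken.
    """
    if replay_rate <= 0:
        raise ValueError("replay_rate must be positive")
    # Assumption: updates keep arriving at wt while backup and restore run.
    updates_to_replay = backlog_updates + write_throughput * (backup_s + restore_s)
    replay_s = updates_to_replay / replay_rate
    # Spawning time = backup + restore + replay.
    return backup_s + restore_s + replay_s
```

For instance, with a 60 s backup, a 120 s restore, 10 updates/s of writes and a replay rate of 100 updates/s, the sketch adds 18 s of replay to the 180 s of backup and restore.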

WHEN TO SNAPSHOT?
• Time to spawn from a live replica
• Time to spawn from an existing snapshot
• It is faster to take a new snapshot j to spawn a new replica than to use an old snapshot i if:
  backup_j + restore_j < restore_i + replay_i
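The snapshot condition on this slide translates directly into a predicate. The function and parameter names below are illustrative, not taken from Dolly; all times are assumed to be in the same unit.

```python
def should_take_new_snapshot(backup_j: float, restore_j: float,
                             restore_i: float, replay_i: float) -> bool:
    """Return True if spawning from a new snapshot j is faster than
    spawning from the existing snapshot i.

    Implements: backup_j + restore_j < restore_i + replay_i
    """
    return backup_j + restore_j < restore_i + replay_i
```

If restore time is roughly the same for both snapshots, as constant-time cloning suggests, the test reduces to comparing the backup time of the new snapshot with the replay time of the old one.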

DOLLY OVERVIEW
• Input
  • capacity prediction
  • write prediction
• Output
  • schedule of snapshots
  • schedule of replica spawning
  • admission control if needed
[Diagram labels: predictors (write predictions, capacity predictions); Dolly components: snapshot scheduler, capacity provisioning, HA adjuster, spawning options, write throttling, paused pool cleaner; scheduler; management API (start/stop, clone/snapshot, delete VM/snapshot); admission control (write throttling/read throttling); monitoring; free pool manager (reclaim)]

PROVISIONING REPLICAS
• Dolly does not provide predictors
• Dolly can work with any predictor (see [Eurosys09])
[Diagram labels: workload prediction, capacity prediction, write prediction]

CLOUD COST FUNCTIONS
• Adapt the provisioning decisions to the cloud platform specifics
• Cost can be $ on public cloud or time on private cloud
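To make the two kinds of cost concrete, here is a hedged sketch of what such cost functions could look like. The function names, the price parameters and the per-hour billing with round-up are assumptions made for illustration; they are not the cost functions used in Dolly.

```python
import math


def private_cloud_cost(machine_seconds: float) -> float:
    """Private cloud: cost is measured as time (machine-seconds in use)."""
    return machine_seconds


def public_cloud_cost(machine_seconds: float, hourly_price: float,
                      storage_gb: float = 0.0,
                      storage_price_per_gb: float = 0.0) -> float:
    """Public cloud: cost is measured in dollars.

    Assumption: instances are billed per started hour, and stored
    volumes or snapshots are billed per GB. All prices are hypothetical
    inputs supplied by the caller.
    """
    billed_hours = math.ceil(machine_seconds / 3600)
    return billed_hours * hourly_price + storage_gb * storage_price_per_gb
```

Under the per-started-hour assumption, running an instance for 10 minutes costs the same as running it for a full hour, which is why the two platforms can lead to different provisioning decisions.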

PROVISIONING REPLICAS
• Parse capacity provisioning predictions
• Decreasing capacity: pause VMs
• Increasing capacity
  • Check if we can reuse a paused VM
  • Check if we can spawn from an existing snapshot
  • Choose the cheapest options according to the spawn_cost function
• Perform admission control if not all replicas can be provisioned in time
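The capacity-increase step can be sketched as a selection among candidate options. This is an illustrative sketch only: `SpawnOption`, `choose_option` and the `deadline_s` parameter are hypothetical, and the cost function is passed in so that it can be either a time-based or a dollar-based one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class SpawnOption:
    kind: str          # "resume_paused_vm" or "spawn_from_snapshot"
    source: str        # hypothetical identifier of the paused VM or snapshot
    ready_in_s: float  # estimated time until the replica can serve requests


def choose_option(options: List[SpawnOption], deadline_s: float,
                  spawn_cost: Callable[[SpawnOption], float]) -> Optional[SpawnOption]:
    """Pick the cheapest option that is ready before the deadline.

    Returns None when no option can be provisioned in time, in which
    case the caller would fall back to admission control.
    """
    feasible = [o for o in options if o.ready_in_s <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=spawn_cost)
```

A caller would build one `SpawnOption` per paused VM and per existing snapshot, then invoke `choose_option` once for each replica that the capacity prediction requires.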

SNAPSHOT SCHEDULING
• How to snapshot?
  • Clone a paused VM
  • Pause an active VM to clone it
• When to snapshot?
  • At time j when backup_j + restore_j < restore_i + replay_i
• If a new snapshot is scheduled, re-run capacity provisioning
• Prediction window must have a minimum size

IMPLEMENTATION
• C-JDBC/Sequoia replication middleware
• OpenNebula cloud management middleware
• Cost functions
  • Private cloud: minimize resource utilization time
  • Amazon EC2: minimize cost
[Diagram labels: TPC-W load injector, Sequoia driver, SQL requests, Sequoia controller (recovery log with dump table and log table, backupers, load balancer, JMX management API), DB1, DB2 and DB3 in VM1 to VM5 and a VM clone, DB3 snapshot, new replicas, backup server or NAS, OpenNebula (start/stop/clone/…) on private cloud and EC2, Dolly (predictions, scheduler, admission control, write throttling, add/remove replica, snapshot/pause/…)]

IMPLEMENTATION – COST FUNCTIONS
• Private cloud: minimize resource utilization
• Amazon EC2: minimize cost

WORKLOAD DESCRIPTION
• TPC-W online bookstore e-commerce benchmark
• Snapshot s0 available at t0

Overprovisioning with 6 replicas – 1h snapshot

Reactive provisioning – 2h snapshot
[Figure annotations: "replica spawning triggered here", "replicas available"]

Dolly – 30m Prediction Window
[Figure: two panels, Private Cloud and Amazon EC2; snapshots s1 and s2 marked; annotation "cheaper to leave instances online"]

CONCLUSION
• VM cloning
  • Solves administration issues by blackboxing the database
  • Constant time backup/restore needed to predict replica spawning time
• New provisioning algorithm
  • Decouples capacity provisioning from snapshot scheduling
  • Cost functions to optimize for cloud platform specifics
VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html

Bonus Slides

Reactive provisioning – 15m snapshot

Reactive provisioning – 1h snapshot

Dolly – 10m Prediction Window
[Figure: two panels, Private Cloud and Amazon EC2; snapshots s1 and s2 marked in each]

SPAWNING IN A PUBLIC CLOUD
• Storage decoupled from computing resource
• Starting a new instance clones the volume
[Diagram labels: instances (OS with DB1 or DB2) attached to volumes Vol1 to Vol4; steps labeled stop, snapshot, register, start and restart]

BACKUP/RESTORE TECHNIQUES
• Database native tools
  • Vendor specific or 3rd party ETL
  • Understand database semantics
• Filesystem copy
  • Low-level data copy
  • Need to know what to copy
• VM cloning
  • Copies database content + configuration + OS
  • Unused space can be compressed

DATABASE SIZES

BACKUP/RESTORE PERFORMANCE (1/3) Performance depends on database content

BACKUP/RESTORE PERFORMANCE (2/3) • File copy is the most effective for small databases

BACKUP/RESTORE PERFORMANCE (3/3) • VM cloning most effective on large databases

BACKUP/RESTORE SUMMARY

DOLLY MAIN ALGORITHM
• Capacity provisioning depends on available snapshots
• Snapshots scheduled according to capacity demand
• Decouple capacity provisioning from snapshot scheduling
    if (predictor.capacity_changes || predictor.write_workload_changes) {
      do {
        capacity_schedule = capacity_provisioning(predictions)
        snapshot_schedule = snapshot_scheduling(predictions)
      } while (snapshot_schedule schedules new snapshots)
      scheduler.schedule(snapshot_schedule)
      scheduler.schedule(capacity_schedule)
    }
    if (time since last operation > threshold) {
      paused_pool_cleaner.release_old_paused_vms()
      paused_pool_cleaner.delete_old_snapshots()
    }
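The iteration between capacity provisioning and snapshot scheduling can be sketched as a runnable loop. This is a sketch under stated assumptions rather than Dolly's code: `plan_schedules`, the callback signatures and the `max_rounds` safety bound are all hypothetical, and the two phases are passed in as plain functions.

```python
from typing import Any, Callable, List, Tuple


def plan_schedules(predictions: Any,
                   known_snapshots: List[str],
                   capacity_provisioning: Callable[[Any, List[str]], list],
                   snapshot_scheduling: Callable[[Any, List[str]], List[str]],
                   max_rounds: int = 10) -> Tuple[list, List[str]]:
    """Alternate the two phases until no new snapshot is scheduled.

    capacity_provisioning(predictions, snapshots) returns a capacity schedule.
    snapshot_scheduling(predictions, snapshots) returns newly scheduled snapshots.
    """
    snapshots = list(known_snapshots)
    snapshot_schedule: List[str] = []
    capacity_schedule: list = []
    for _ in range(max_rounds):
        capacity_schedule = capacity_provisioning(predictions, snapshots)
        new_snapshots = snapshot_scheduling(predictions, snapshots)
        if not new_snapshots:
            break  # fixed point reached: the algorithm terminates
        # New snapshots change the spawning options, so provisioning is re-run.
        snapshot_schedule.extend(new_snapshots)
        snapshots.extend(new_snapshots)
    else:
        # Round limit hit: recompute capacity against the final snapshot set.
        capacity_schedule = capacity_provisioning(predictions, snapshots)
    return capacity_schedule, snapshot_schedule
```

The round limit is a defensive addition for the sketch; the slides describe the loop as ending once snapshot scheduling generates no new snapshots.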

RELEASING RESOURCES
• Paused VMs
  • A VM is never re-used if cost to resume > cost to spawn from last snapshot
• Snapshots
  • Old snapshots can be released based on the cost to keep them around
• Free server pool
  • Can reclaim servers with paused VMs when pool is empty
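The rule for paused VMs can be written as a simple filter. The function name and the dictionary of per-VM resume costs are hypothetical; costs are assumed to come from the same cost function as the spawn cost so that they are comparable.

```python
from typing import Dict, List


def releasable_paused_vms(resume_costs: Dict[str, float],
                          spawn_cost_from_last_snapshot: float) -> List[str]:
    """Return the paused VMs that will never be re-used.

    A paused VM can be released when resuming it costs more than
    spawning a fresh replica from the last snapshot.
    """
    return sorted(vm for vm, cost in resume_costs.items()
                  if cost > spawn_cost_from_last_snapshot)
```

With resume costs of 50 for `vm1` and 200 for `vm2` and a spawn cost of 120, only `vm2` is returned for release.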

TPC-W EVALUATION
• Multi-tier online bookstore benchmark
• 4GB VM for the database
• Large EC2 instances from EBS volumes with CloudWatch

PHASE 1 – CAPACITY PROVISIONING
• Pause VMs when capacity decreases
• Resume VMs when capacity increases
• Spawn VMs from snapshot when additional capacity required
• Not cost effective to pause VMs for less than 1 hour with EC2

PHASE 2 – SNAPSHOT SCHEDULING
• Faster (but more costly) to spawn a new replica and take a new snapshot from it to spawn 3 replicas at d3
• Faster to snapshot a paused VM and spawn a replica from it for d5
• Volume storage on EC2 costs more than IO

PHASE 3 – CAPACITY PROVISIONING
• Schedule new replica spawning using new snapshots
• Next snapshot scheduling does not generate new snapshots and the algorithm terminates

FINAL SCHEDULING
• Different scheduling for private and public cloud
• Minimize resource (energy) usage for private cloud
• Minimize cost for public cloud
