Relational Cloud: A Database as a Service for the Cloud • Topics for today: 1) Efficient multi-tenancy 2) Scalability 3) Privacy • Why cloud computing?
Why cloud? • Attractive for 3 reasons: 1) Lower hardware and energy costs. 2) The cost incurred is proportional to actual usage, for both software licensing and administration tools. 3) Better utilization of hardware resources (up to 100%).
Classical Database Architecture • Figure source: Donald Kossmann, Tim Kraska, Simon Loesing: An Evaluation of Alternative Architectures for Transaction Processing in the Cloud.
Distributed Databases • The database is logically partitioned, and each partition is controlled by a separate database server. • Partitioning alone, however, scales poorly under a fluctuating workload. • To achieve better scalability and fault tolerance, partitioning is combined with replication.
Key technical features of Relational Cloud • A workload-aware approach to multi-tenancy that identifies the workloads that can be co-located on a database server. • The use of a graph-based data partitioning algorithm to achieve scalability. • An adjustable security scheme that enables SQL queries to run over encrypted data.
Efficient multi-tenancy • Multi-tenancy: a single instance of the software runs on a server, serving multiple clients (tenants). • VM approach: pack each individual DB instance into a VM and run multiple VMs on a single physical machine. • Disadvantage. • The paper instead uses a single database server on each machine, which hosts multiple logical databases.
Elastic scalability • What do you do when a resource is overloaded? • Support scale-out: data is partitioned among multiple nodes to achieve higher throughput. • The workload-aware partitioner uses graph partitioning.
System Design • Partition each database across one or more nodes when the load on a database exceeds the capacity of a single machine. • Place the database partitions on the back-end machines, load the database, and migrate and replicate the data for availability. • Secure the data and process the queries.
Data Partitioning • Two purposes: 1) to scale a single database to multiple nodes; 2) to enable more granular placement. • Relational Cloud uses a workload-aware partitioning strategy.
Schism • A novel graph-based, data-driven partitioning system for transactional workloads. • Represents a database and its workload as a graph, where tuples are nodes and transactions are edges connecting the tuples they access. • Partitioning the graph: a minimum-cut partitioning of the graph into k partitions.
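The graph construction and the cut objective above can be sketched in a few lines of Python. This is a minimal illustration, not Schism's implementation: the function names (`build_coaccess_graph`, `cut_cost`, `best_balanced_bisection`) and the toy workload are hypothetical, and the brute-force search over balanced two-way splits stands in for a real graph partitioner, which would be needed for anything beyond a handful of tuples.

```python
from collections import Counter
from itertools import combinations


def build_coaccess_graph(transactions):
    """Return edge weights: how many transactions access both tuples."""
    edges = Counter()
    for accessed in transactions:
        for a, b in combinations(sorted(set(accessed)), 2):
            edges[(a, b)] += 1
    return edges


def cut_cost(edges, assignment):
    """Sum the weights of edges whose endpoints sit in different partitions."""
    return sum(w for (a, b), w in edges.items()
               if assignment[a] != assignment[b])


def best_balanced_bisection(edges, tuples):
    """Try every balanced 2-way split and keep the cheapest (toy sizes only)."""
    tuples = sorted(tuples)
    half = len(tuples) // 2
    best = None
    for left in combinations(tuples, half):
        assignment = {t: (0 if t in left else 1) for t in tuples}
        cost = cut_cost(edges, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical workload: each set lists the tuple ids one transaction touches.
workload = [{1, 2}, {1, 2}, {3, 4}, {3, 4}, {2, 3}]
graph = build_coaccess_graph(workload)
cost, assignment = best_balanced_bisection(graph, {1, 2, 3, 4})
print(cost, assignment)
```

In this toy workload only the transaction touching tuples 2 and 3 spans both partitions, so it is the only one that would become a distributed transaction.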
Graph Representation • Figure source: Schism: A Workload Driven Approach to Database Replication and Partitioning.
Partitioning and replication • Example: consider the tuple (1, carlo, 80k) from the figure. This tuple is accessed by 3 transactions and is therefore represented by 4 nodes. • Replication edges connect these nodes and carry the cost of replicating the tuple: the number of transactions that update the tuple in the workload (2 in this example).
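The node count in this example can be reproduced with a short sketch. It assumes a star layout in which a tuple accessed by n transactions becomes one central node plus one replica node per transaction; the function name `explode_tuple`, the transaction ids, and the choice of which two transactions are updates are all hypothetical.

```python
def explode_tuple(tuple_id, txns):
    """Split one tuple into a central node plus one replica node per transaction.

    txns is a list of (txn_id, is_update) pairs for transactions that access
    the tuple. Every replication edge gets the same weight: the number of
    transactions that update the tuple.
    """
    center = (tuple_id, "center")
    replicas = [(tuple_id, txn_id) for txn_id, _ in txns]
    weight = sum(1 for _, is_update in txns if is_update)
    edges = {(center, replica): weight for replica in replicas}
    return [center] + replicas, edges


# Tuple 1 is accessed by 3 transactions; assume two of them are updates.
nodes, edges = explode_tuple(1, [("t1", True), ("t2", True), ("t3", False)])
print(len(nodes), sorted(set(edges.values())))
```

Cutting a replication edge then corresponds to placing a copy of the tuple in another partition, and the edge weight is the price paid for keeping that copy in sync on every update.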
PLACEMENT AND MIGRATION • Resource allocation is a major challenge. • Problems include monitoring the resource requirements of each workload and predicting the load that multiple workloads will generate when run together on a server. • Solution: Kairos (monitoring and consolidation engine).
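To make the consolidation problem concrete, here is a minimal sketch using first-fit-decreasing bin packing. This is a stand-in technique, not how Kairos works: it assumes each workload has a single predicted load figure and that loads simply add up on a shared server, which is exactly the simplification the slide warns is hard to justify in practice. The function name `consolidate` and the workload numbers are hypothetical.

```python
def consolidate(workloads, capacity):
    """Place workloads on as few servers as first-fit-decreasing manages.

    workloads maps a workload name to its predicted load (integer units,
    e.g. percent of one server). Assumes loads combine additively.
    """
    servers = []  # each entry is [remaining_capacity, [workload names]]
    for name, load in sorted(workloads.items(), key=lambda kv: -kv[1]):
        if load > capacity:
            raise ValueError(f"{name} does not fit on a single server")
        for server in servers:
            if server[0] >= load:
                server[0] -= load
                server[1].append(name)
                break
        else:
            # No existing server has room, so open a new one.
            servers.append([capacity - load, [name]])
    return [names for _, names in servers]


placement = consolidate({"a": 60, "b": 50, "c": 40, "d": 30}, capacity=100)
print(placement)
```

A real engine would replace the additive assumption with measured or modeled interactions between co-located workloads, for example contention for disk I/O and buffer-pool memory.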
PRIVACY • Goal: secrecy of data in an SQL database. • The paper presents CryptDB, which provides provable privacy guarantees without having to trust the DBMS server or the DBAs who maintain and tune the DBMS.
CryptDB • Stores data in an encrypted format and executes SQL queries over encrypted data, without the server having access to the decryption keys. • Uses adjustable query-based encryption: an onion of encryptions, from weaker forms of encryption to stronger forms that reveal no information.
Adjustable query-based encryption • Start the database out with the most secure encryption scheme. • Adjust encryption dynamically. • Strip off levels of the onions when a query needs a weaker layer.
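Onions of encryptions • Figure: three onions of encryption (Onion 1, Onion 2, Onion 3) built from the layers RND, DET, SEARCH, OPE, JOIN, OPE-JOIN, and HOM, wrapped around each value (any value or int value). Source: CryptDB: A Practical Encrypted Relational DBMS.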
Different Techniques • Randomized encryption (RND): maximum security. • Deterministic encryption (DET): weaker privacy, but allows the server to check for equality. • Order-preserving encryption (OPE): more relaxed still, enabling inequality checks and sorting. • Homomorphic encryption (HOM): enables operations over encrypted data, such as addition.
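The HOM bullet is the least intuitive of the four, so a toy example may help. The sketch below implements the Paillier cryptosystem, a well-known additively homomorphic scheme; the slides do not say which construction the HOM layer uses, so treat the choice as an assumption. The primes are tiny and hard-coded purely for illustration, which makes this completely insecure, and the function names are hypothetical.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow(lam, -1, n)  # valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Randomized encryption: the same m gives different ciphertexts."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
total = add_encrypted(pub, encrypt(pub, 80000), encrypt(pub, 20000))
print(decrypt(pub, priv, total))
```

The party running `add_encrypted` needs only the public key, which is the property that lets a server compute a sum over encrypted values without seeing them.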
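Implementation • No change to the DBMS. • Should work on most SQL DBMSs. • Figure (architecture diagram): SQL Interface, Frontend, and Server; the Frontend turns each Query into an Encrypted Query and turns Encrypted Results back into Results; the Server runs an Unmodified DBMS with CryptDB PK tables and CryptDB UDFs (user-defined functions).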
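Example • Table emp with columns rank, name, salary. • Application query: SELECT * FROM emp WHERE salary = 100000 • Onion adjustment sent to the server: UPDATE table1 SET col3onion1 = DecryptRND(key, col3onion1) • Rewritten query: SELECT * FROM table1 WHERE col3onion1 = x5a8c34 • Figure: onion layers RND, DET, JOIN, and SEARCH around any value.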
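The two statements in this example can be replayed against SQLite to see why the onion has to be peeled before an equality query works. Everything cryptographic here is a toy stand-in chosen for brevity: the DET layer is a keyed hash (so it supports equality but, unlike real deterministic encryption, cannot be decrypted), the RND layer is a hash-derived XOR pad, and the keys are hard-coded. The `col1` column is left in the clear only to keep the output readable. `DecryptRND`, `table1`, and `col3onion1` come from the slide; all other names are hypothetical.

```python
import hashlib
import hmac
import os
import sqlite3

DET_KEY = b"toy-det-key"
RND_KEY = b"toy-rnd-key"


def det(value):
    """Toy DET layer: equal plaintexts always map to equal 32-byte tokens."""
    return hmac.new(DET_KEY, str(value).encode(), hashlib.sha256).digest()


def rnd_encrypt(key, data):
    """Toy RND layer: a fresh IV makes every ciphertext different."""
    iv = os.urandom(16)
    pad = hashlib.sha256(key + iv).digest()
    return iv + bytes(a ^ b for a, b in zip(data, pad))


def rnd_decrypt(key, blob):
    """Strip the RND layer, exposing the DET token underneath."""
    iv, body = blob[:16], blob[16:]
    pad = hashlib.sha256(key + iv).digest()
    return bytes(a ^ b for a, b in zip(body, pad))


conn = sqlite3.connect(":memory:")
conn.create_function("DecryptRND", 2, rnd_decrypt)  # server-side UDF
conn.execute("CREATE TABLE table1 (col1 TEXT, col3onion1 BLOB)")
for name, salary in [("alice", 100000), ("bob", 80000)]:
    conn.execute("INSERT INTO table1 VALUES (?, ?)",
                 (name, rnd_encrypt(RND_KEY, det(salary))))

query = "SELECT col1 FROM table1 WHERE col3onion1 = ?"
token = det(100000)  # what the frontend sends instead of 100000

before = conn.execute(query, (token,)).fetchall()
conn.execute("UPDATE table1 SET col3onion1 = DecryptRND(?, col3onion1)",
             (RND_KEY,))
after = conn.execute(query, (token,)).fetchall()
print(before, after)
```

Before the UPDATE the equality match finds nothing because every stored value is randomized; afterwards the column holds deterministic tokens, so the server can match rows without learning the salary itself.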
Current progress • Developed the various components of Relational Cloud; now integrating them into a single coherent system, prior to offering it as a service on a public cloud. • Implemented the distributed transaction coordinator along with the routing, partitioning, replication, and CryptDB components. • Developed a placement and migration engine that monitors database server statistics, OS statistics, and hardware loads, and uses historic statistics to predict the combined load placed by multiple workloads.
Conclusions • Provided an efficient multi-tenancy system and a graph-based partitioning method to spread large databases across many machines. • For privacy, developed the notion of adjustable privacy. • Demonstrating an integrated prototype at CIDR 2011.