Data Management in the Cloud: “Consistency Rationing in the Cloud: Pay Only When It Matters”. Authors: Tim Kraska, Martin Hentschel, Gustavo Alonso, and Donald Kossmann. VLDB ’09, August. Latasha A. Gibbs, CSCE 824 – Secure Database Systems, Spring 2013, University of South Carolina.
AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS
INTRODUCTION • Promise of high scalability and low cost • Existing solutions differ in the level of consistency provided • Implement database-like facilities on top of cloud storage • High consistency means high cost per transaction • Lower consistency is cheaper • Not all data needs to be treated with the same level of consistency
AT WHAT PRICE? • The cost of a CONSISTENCY LEVEL is measured by the number (#) of service calls needed to enforce it
A & B • Category A – Serializability • Expensive in terms of monetary cost and performance • Serializability is provided via two-phase locking (2PL) • Data should be put in Category A when up-to-date views are a must • Category B – Adaptive • Level of consistency depends on the situation • Switches between session consistency and serializability at runtime • Policies are designed to make the switch automatic and dynamic
C • Category C – Session Consistency • Identified as the minimum consistency level that does not result in excessive complexity for the developer • After some time the system converges and becomes eventually consistent • Session consistency is cheap • Permits extensive caching • When inconsistencies cannot occur, cloud databases should place data in Category C
USE CASES CONTINUED… Collaborative Editing • Strategy based on update frequency • Selection of the consistency protocol is based on the likelihood of conflicts • Parts of the document that are updated frequently would be handled with strong consistency guarantees (for instance, Category A)
CONSISTENCY RATIONING? Since strong consistency is expensive… • Use the analysis of categories A, B, and C to categorize the data • Apply different consistency strategies for each category [Figure: INCONSISTENCY COST versus TRANSACTION COST]
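The trade-off between inconsistency cost and transaction cost can be sketched as a simple comparison. This is only an illustration under stated assumptions: the function name `cheaper_level`, its parameters, and the numbers in the usage note are hypothetical and are not taken from the slides or the paper.

```python
def cheaper_level(cost_strong: float, cost_weak: float,
                  conflict_prob: float, penalty: float) -> str:
    """Pick the consistency level with the lower expected cost.

    cost_strong   -- transaction cost under serializability (assumed known)
    cost_weak     -- transaction cost under session consistency
    conflict_prob -- assumed probability that weak consistency causes a conflict
    penalty       -- assumed cost paid when an inconsistency actually occurs
    """
    if not 0.0 <= conflict_prob <= 1.0:
        raise ValueError("conflict_prob must be in [0, 1]")
    # Weak consistency is cheap per transaction but risks a penalty.
    expected_weak = cost_weak + conflict_prob * penalty
    return "serializability" if cost_strong < expected_weak else "session"
```

With a rare conflict, `cheaper_level(0.15, 0.05, 0.001, 10)` selects session consistency; when conflicts are likely, `cheaper_level(0.15, 0.05, 0.5, 10)` selects serializability.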
POLICIES Five different policies are created to adapt the consistency guarantees for data items in Category B • General Policy • Time Policy • Fixed Threshold Policy • Demarcation Policy • Dynamic Policy
GENERAL POLICY • Works on the basis of conflict probability • Looks into the probability of conflict on a given data item and switches to serializability if probability is high enough • Probability of conflicting update on a record is given by the formula below:
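The formula itself is not reproduced in this text. As a stand-in, here is a minimal sketch that assumes updates to a record arrive as a Poisson process with a mean of `lam` updates per time window; this modelling choice, the function names, and the `threshold` default are assumptions made for illustration, not the slide's formula. Under that assumption, the probability of two or more updates in one window is 1 - e^(-lam) * (1 + lam).

```python
import math

def conflict_probability(lam: float) -> float:
    """P(X >= 2) for X ~ Poisson(lam): two or more updates in one window."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    # P(X = 0) + P(X = 1) = e^(-lam) * (1 + lam)
    return 1.0 - math.exp(-lam) * (1.0 + lam)

def choose_consistency(lam: float, threshold: float = 0.05) -> str:
    """Switch to serializability once the conflict probability is high enough.

    The threshold value is a hypothetical tuning knob.
    """
    if conflict_probability(lam) >= threshold:
        return "serializability"
    return "session"
```

A record that is rarely updated (`lam = 0.01`) stays under session consistency, while a hot record (`lam = 1.0`) is switched to serializability.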
HOW IS DATA ABOUT DATA USED? • Each collection contains metadata about its type • Given the collection a record belongs to, the system checks which consistency level should be enforced • For example, if a record is classified as A data, serializability is enforced • For B data, the metadata contains the name of the policy and additional parameters
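The metadata lookup described above could be sketched as follows. The collection names, the `CollectionMeta` type, and the `"threshold"` policy with its rule are all hypothetical placeholders chosen for illustration; the slides do not show the actual data structures.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CollectionMeta:
    category: str                      # "A", "B", or "C"
    policy: Optional[str] = None       # policy name, only used for B data
    params: dict = field(default_factory=dict)

# Hypothetical collections and their metadata.
METADATA = {
    "account": CollectionMeta("A"),
    "stock": CollectionMeta("B", policy="threshold", params={"threshold": 10}),
    "profile": CollectionMeta("C"),
}

def consistency_for(collection: str, value: float = 0.0) -> str:
    """Return the consistency level to enforce for a record in `collection`."""
    meta = METADATA[collection]
    if meta.category == "A":
        return "serializability"
    if meta.category == "C":
        return "session"
    # B data: the named policy and its parameters decide at runtime.
    if meta.policy == "threshold":
        # Placeholder rule: be strict when the value is at or below the limit.
        if value <= meta.params["threshold"]:
            return "serializability"
        return "session"
    raise ValueError(f"unknown policy: {meta.policy!r}")
```

For example, `consistency_for("stock", value=3)` returns `"serializability"`, while `consistency_for("stock", value=500)` returns `"session"`.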
EXPERIMENTS • Database hosted on S3; clients connect to the database via application servers that run on Amazon’s EC2 • Based on the TPC-W benchmark • Relax the requirement for strong consistency guarantees • All experiments were scheduled to run for 300 seconds and were repeated 10 times • Consistency categories were assigned to the data types of the TPC-W benchmark for A data, C data, and Mixed data
SUMMARY • It is possible to assign a very precise monetary cost to consistency protocols • Optimization is based on allowing the database to exhibit inconsistencies if it helps to reduce the cost of a transaction and does not cause higher penalty costs • Consistency Rationing lowers overall cost and improves performance in cloud-based database systems • Step towards probabilistic consistency guarantees… “One small step for man, one giant leap for mankind.” –Neil Armstrong
FUTURE OUTLOOK • Faster statistical methods • Automatic optimization • New policies • Implementation on other platforms • Emergency rationing