Latasha A. Gibbs CSCE 824 – Secure Database Systems Spring 2013 University of South Carolina

Data Management in the Cloud: “Consistency Rationing in the Cloud: Pay Only When It Matters”. Authors: Tim Kraska, Martin Hentschel, Gustavo Alonso, and Donald Kossmann. VLDB ’09, August.

Presentation Transcript


  1. Data Management in the Cloud: “Consistency Rationing in the Cloud: Pay Only When It Matters” • Authors: Tim Kraska, Martin Hentschel, Gustavo Alonso, and Donald Kossmann • VLDB ’09, August • Latasha A. Gibbs, CSCE 824 – Secure Database Systems, Spring 2013, University of South Carolina

  2. AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS

  3. INTRODUCTION • Promise of high scalability and low cost • Existing solutions differ in the level of consistency provided • Implement database-like facilities on top of cloud storage • High consistency means high cost per transaction • Lower consistency is cheaper • Not all data needs to be treated with the same level of consistency

  4. AT WHAT PRICE? • The cost of a CONSISTENCY LEVEL is measured in terms of the number (#) of service calls needed to enforce it

  5. A & B • Category A – Serializability • Expensive in terms of monetary cost and performance • Serializability is provided via two-phase locking (2PL) • Data should be put in Category A when up-to-date views are a must • Category B – Adaptive • Level of consistency depends on the situation • Switches between session consistency and serializability at runtime • Policies are designed to make the switch automatic and dynamic

  6. C • Category C – Session Consistency • Session consistency has been identified as the minimum consistency level that does not result in excessive complexity for the developer • After some time the system converges and becomes eventually consistent • Session consistency is cheap • Permits extensive caching • Data for which inconsistencies cannot occur should be placed in Category C

  7. AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS

  8. USE CASES CONTINUED… Collaborative Editing • Strategy based on update frequency • Selection of the consistency protocol is based on the likelihood of conflicts • Parts of the document that are updated frequently would be handled with strong consistency guarantees (for instance, Category A)

  9. CONSISTENCY RATIONING? Since strong consistency is expensive… • Use the analysis of categories A, B, and C to categorize the data • Apply different consistency strategies for each category • [Figure: inconsistency cost versus transaction cost]

  10. AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS

  11. POLICIES Five different policies are created to adapt the consistency guarantees for data items in Category B • General Policy • Time Policy • Fixed Threshold Policy • Demarcation Policy • Dynamic Policy

  12. GENERAL POLICY • Works on the basis of conflict probability • Estimates the probability of a conflict on a given data item and switches to serializability if that probability is high enough • The probability of a conflicting update on a record is given by a formula that appeared as an image on the slide and is not preserved in this transcript
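The switching rule of the General Policy can be illustrated with a minimal sketch. It is not the formula from the slide or the paper. As a stand-in, it assumes that updates to a record arrive as a Poisson process and treats "two or more updates in the same time window" as a conflict. The names `conflict_probability`, `choose_guarantee`, `update_rate`, `window`, and the default `threshold` are all hypothetical.

```python
import math


def conflict_probability(update_rate: float, window: float) -> float:
    """Probability of two or more updates to one record within `window`.

    Assumption (not from the slides): updates follow a Poisson process
    with `update_rate` updates per unit of time.
    """
    lam = update_rate * window
    # P(X >= 2) = 1 - P(X = 0) - P(X = 1) for X ~ Poisson(lam)
    return 1.0 - math.exp(-lam) * (1.0 + lam)


def choose_guarantee(update_rate: float, window: float,
                     threshold: float = 0.01) -> str:
    """Pick serializability when a conflict is likely enough.

    `threshold` is a hypothetical tuning parameter.
    """
    p = conflict_probability(update_rate, window)
    return "serializability" if p >= threshold else "session"
```

A rarely updated record, such as `choose_guarantee(0.001, 1.0)`, stays under session consistency, while a hot record such as `choose_guarantee(10.0, 1.0)` is switched to serializability.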

  13. AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS

  14. IMPLEMENTATION

  15. BASIC PROTOCOL

  16. HOW IS DATA ABOUT DATA USED? • Each collection contains metadata about its type • Given the collection a record belongs to, the system checks which consistency level should be enforced • For example, if a record is classified as A data, serializability with strong guarantees is enforced • For B data, the metadata contains the name of the policy and additional parameters
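The metadata lookup described on this slide might be sketched as follows. The layout is an assumption made for illustration: the collection names, the `CollectionMeta` class, the `POLICIES` registry, and the `below_threshold` rule are hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class CollectionMeta:
    category: str                      # "A", "B", or "C"
    policy: Optional[str] = None       # policy name, used for B data only
    params: Dict[str, float] = field(default_factory=dict)


def below_threshold(record: dict, params: dict) -> bool:
    """Hypothetical stand-in policy: ask for strong consistency
    when the record's value drops below a configured threshold."""
    return record["value"] < params["threshold"]


# Registry of policy functions, keyed by the name stored in the metadata.
POLICIES: Dict[str, Callable[[dict, dict], bool]] = {
    "below_threshold": below_threshold,
}

# Hypothetical collections and their metadata.
METADATA: Dict[str, CollectionMeta] = {
    "orders": CollectionMeta("A"),
    "stock": CollectionMeta("B", policy="below_threshold",
                            params={"threshold": 10}),
    "profiles": CollectionMeta("C"),
}


def consistency_for(collection: str, record: dict) -> str:
    """Return the guarantee to enforce for a record of `collection`."""
    meta = METADATA[collection]
    if meta.category == "A":
        return "serializability"
    if meta.category == "C":
        return "session"
    # B data: the named policy decides at runtime.
    needs_strong = POLICIES[meta.policy](record, meta.params)
    return "serializability" if needs_strong else "session"
```

Keeping the policy name and its parameters in the metadata means the guarantee for B data can change per record at runtime without touching the application code.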

  17. EXPERIMENTS • Database hosted on Amazon S3; clients connect to the database via application servers that run on Amazon’s EC2 • Based on the TPC-W benchmark • Relax the requirement for strong consistency guarantees • All experiments were scheduled to run for 300 seconds and were repeated 10 times • Consistency categories were assigned to the data types of the TPC-W benchmark for A data, C data, and Mixed data

  18. OPTIMIZATION

  19. PENALTIES

  20. AGENDA • INTRODUCTION • USE CASES • CONSISTENCY RATIONING • POLICIES • IMPLEMENTATION • SUMMARY • FUTURE WORK & QUESTIONS

  21. SUMMARY • It is possible to assign a very precise monetary cost to consistency protocols • Optimization is based on allowing the database to exhibit inconsistencies if it helps to reduce the cost of a transaction and does not cause higher penalty costs • Consistency Rationing lowers overall cost and improves performance in cloud-based database systems • Step towards probabilistic consistency guarantees… “One small step for man, one giant leap for mankind.” –Neil Armstrong
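The trade-off stated in the summary (tolerate inconsistencies when that lowers transaction cost without causing higher penalty costs) can be written as a simple expected-cost comparison. This is a sketch under stated assumptions, not the paper's cost model: the function names and all four parameters are hypothetical, and costs are taken to be in the same monetary unit.

```python
def expected_cost_weak(txn_cost_weak: float,
                       p_inconsistency: float,
                       penalty: float) -> float:
    """Expected cost of one transaction under weak consistency:
    the cheaper transaction plus the expected penalty."""
    return txn_cost_weak + p_inconsistency * penalty


def cheaper_guarantee(txn_cost_strong: float,
                      txn_cost_weak: float,
                      p_inconsistency: float,
                      penalty: float) -> str:
    """Choose the guarantee with the lower expected cost."""
    weak = expected_cost_weak(txn_cost_weak, p_inconsistency, penalty)
    return "session" if weak < txn_cost_strong else "serializability"
```

With a strong transaction costing 1.0, a weak one costing 0.2, and a penalty of 10.0, a 1% chance of inconsistency favors session consistency, while a 50% chance makes serializability the cheaper choice.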

  22. FUTURE OUTLOOK • Faster statistical methods • Automatic optimization • New policies • Implementation on other platforms • Emergency rationing

  24. QUESTIONS?