
Scaleable Structured Datastorage for Web 2.0

Michael Armbrust, David Patterson. October 2007.


Presentation Transcript


  1. Scaleable Structured Datastorage for Web 2.0 • Michael Armbrust, David Patterson • October 2007

  2. RAD Lab 5-year Mission • Today’s Internet systems are complex, fragile, manually managed, and rapidly evolving • To scale eBay, one must build an eBay-sized company • “Moon shot” mission statement: Enable a single person to Develop, Assess, Deploy, and Operate the next-generation IT service • “The Fortune 1 Million” by enabling rapid innovation • Create core technology to enable the vision via synergy across systems, networking, and Statistical Machine Learning • Making the datacenter easier to manage enables the vision of a single person who can analyze, deploy, and operate a scalable IT service

  3. If the Datacenter is the computer… • What is the programming language? • What are the libraries? • How do we trace/monitor programs? • What is the simulator? • What is Computer Aided Design? • What is the Operating System? • What is the Database System?

  4. Storage Status Quo • Current status of data storage for Web 2.0 apps • Large relational databases running on expensive hardware • Manual horizontal and vertical partitioning of data • Problem: Requires redesign at each scaling milestone • Goal: Scaleable structured data storage for Web 2.0
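As a rough illustration of why manual horizontal partitioning tends to force a redesign at each scaling milestone, the Python sketch below assumes a simple modulo-based sharding rule. The rule, the function names, and the shard counts are assumptions made for illustration and do not come from the slides.

```python
def shard_for(user_id: int, num_shards: int) -> int:
    """Pick a shard with a simple modulo rule (a hypothetical scheme)."""
    return user_id % num_shards


def moved_fraction(user_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of users whose shard changes when the shard count changes."""
    moved = sum(
        1 for u in user_ids
        if shard_for(u, old_shards) != shard_for(u, new_shards)
    )
    return moved / len(user_ids)


# Growing from 4 to 5 database servers under this rule.
user_ids = range(10_000)
fraction = moved_fraction(user_ids, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of users would have to be migrated")
```

Under this assumed rule, adding a single server relocates most rows, which is one way the "redesign at each scaling milestone" problem can show up in practice.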

  5. Web 2.0 App Characteristics • Need to scale to YouTube or MySpace sizes • Require geographic replication • Short transactions • No ad-hoc queries • Willing to accept relaxed consistency in exchange for scalability and availability • Photos, not financials

  6. Relaxed Consistency • Some things can be updated lazily • Eventual consistency is often acceptable • However, users should see their own writes immediately • Need to provide simple choices to developers
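One way to picture "eventual consistency, but users see their own writes" is the minimal Python sketch below. It assumes a single primary copy, one lazily updated replica, and a per-session record of the versions that session has written. The `Store` and `Session` classes and their methods are hypothetical and are not part of the system described in the slides.

```python
class Store:
    """Toy store: a primary copy plus one lazily updated replica."""

    def __init__(self):
        self.primary = {}   # key -> (version, value)
        self.replica = {}   # key -> (version, value), may lag behind
        self.pending = []   # updates not yet applied to the replica
        self.version = 0

    def write(self, key, value):
        self.version += 1
        self.primary[key] = (self.version, value)
        self.pending.append((key, self.version, value))
        return self.version

    def replicate(self):
        """Apply queued updates to the replica (the lazy update step)."""
        for key, version, value in self.pending:
            self.replica[key] = (version, value)
        self.pending.clear()


class Session:
    """Per-user session that remembers the versions it has written."""

    def __init__(self, store):
        self.store = store
        self.last_written = {}  # key -> version this session wrote

    def write(self, key, value):
        self.last_written[key] = self.store.write(key, value)

    def read(self, key):
        needed = self.last_written.get(key, 0)
        version, value = self.store.replica.get(key, (0, None))
        if version >= needed:
            return value                    # replica is fresh enough
        return self.store.primary[key][1]   # fall back to the primary


store = Store()
alice, bob = Session(store), Session(store)
alice.write("caption", "sunset")
alice_view = alice.read("caption")     # the writer sees her own write
bob_view_before = bob.read("caption")  # another user may see stale data
store.replicate()
bob_view_after = bob.read("caption")   # eventually everyone converges
```

Here the writer is routed to the primary only while the replica lags behind her own writes, so other readers keep the cheaper, possibly stale path.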

  7. Our Idea • Large-scale distributed database underneath • Runs on 1000+ shared-nothing commodity servers • ActiveRecord-like layer in Ruby on Rails rather than SQL • Provides simple relationships and consistency guarantees between models: has_many, belongs_to, searchable_by (for full-text search) • Pre-compute joins for quick reads
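The idea of has_many / belongs_to relationships with pre-computed joins can be sketched as follows. This is a minimal Python illustration (not the Ruby on Rails layer the slide describes) that assumes a plain dictionary standing in for the distributed store; the key naming scheme and the helper functions are hypothetical.

```python
kv = {}  # stand-in for the distributed key-value store (an assumption)


def save_user(user_id, name):
    kv[f"user:{user_id}"] = {"id": user_id, "name": name}
    kv.setdefault(f"user:{user_id}:photos", [])


def save_photo(photo_id, user_id, title):
    """belongs_to: the photo records its owner.
    has_many: the owner's photo list is updated at write time,
    so the join is already computed when someone reads it."""
    photo = {"id": photo_id, "user_id": user_id, "title": title}
    kv[f"photo:{photo_id}"] = photo
    kv.setdefault(f"user:{user_id}:photos", []).append(
        {"id": photo_id, "title": title}
    )


def photos_of(user_id):
    """has_many read: a single key lookup, no join at read time."""
    return kv.get(f"user:{user_id}:photos", [])


def owner_of(photo_id):
    """belongs_to read: follow the stored owner id."""
    owner_id = kv[f"photo:{photo_id}"]["user_id"]
    return kv[f"user:{owner_id}"]


save_user(1, "ann")
save_photo(10, 1, "sunset")
save_photo(11, 1, "beach")
titles = [p["title"] for p in photos_of(1)]
owner_name = owner_of(10)["name"]
```

The trade-off in this sketch is extra work and duplicated data on every write in return for reads that touch a single key.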

  8. Related Work (we know of) • G. DeCandia, D. Hastorun, et al. Dynamo: Amazon’s highly available key-value store. In SOSP, 2007. • M. Stonebraker and U. Cetintemel. “One size fits all”: An idea whose time has come and gone. pp. 2-11, 2005. • M. Stonebraker, S. R. Madden, et al. The end of an architectural era (it’s time for a complete rewrite). In VLDB, Vienna, Austria, 2007. • D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. Scalable semantic web data management using vertical partitioning. In VLDB, Vienna, Austria, 2007. • F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. In OSDI ’06: Seventh Symposium on Operating System Design and Implementation, November 2006.
