
Bridging the Gap between Real World Data, Schema, and Environment

This talk by Michael Franklin of UC Berkeley discusses the difficulties the relational model has with real-world data and argues that new data and query models are needed to better represent ambiguity and user requirements.


Presentation Transcript


  1. My CIDR Epiphany: Real World Data, Schema, and Environment. Michael Franklin, UC Berkeley. Post SIGMOD PC Research Symposium (old persons track), February 11, 2005.

  2. How it Happened, or why it sometimes pays to hang around until the end of a conference • The “gloom and doom” panel • DeWitt’s gong show challenge • Grappa consumption & staying up too late • A great last session on sensor/stream processing, including: • Jennifer Widom’s Trio Talk • Shawn Jeffery’s HiFi Talk • Sam Madden’s Probabilistic Sensor Net Talk

  3. The SIGMOD Credo: “Codd made relations, all else is the work of man.” Leopold Kronecker (paraphrased by Raghu Ramakrishnan)

  4. Database Management: Then

  5. Database Management: Now

  6. RM has been tremendously successful, but at a cost • Shoehorn the world into regular, flat tables. • This works particularly well for data that looks like regular, flat tables. • Ignore inconvenient facts about the real world. • Source of a multi-billion $/yr consulting industry. • But new applications, environments, devices, and user expectations are finally reaching a tipping point, stretching the model beyond its inherent capabilities.

  7. Relational Model Assumptions: Real World Data • All data in the database is 100% valid • The facts in the database are self-consistent • Anything outside of the DB does not exist • Time and space are just regular attributes • Data items unambiguously map to real world entities

  8. RM Assumptions: Schema • All data conforms to a strict schema • These schemas and their relationship to the data don't change much • Everyone agrees on the meaning of the data • No one cares where the data came from

  9. RM Assumptions: Environment • Users know exactly what they want to ask of the database • Users want absolute answers (no satisficing) • Queries can be independent of the user’s context • All data is always available

  10. Bridging the Physical Divide • We need to build systems that more realistically model the real world (and all its ambiguity) • We need to build systems that support users and conform to their goals, requirements, and habits (not vice versa) • This is going to require new data and query models, and likely another 30 years of work to get it right.

  11. RM Assumption Cheat Sheet (A baker’s dozen) Real World Data: • All data in the database is 100% valid • The facts in the database are self-consistent • Anything outside of the DB does not exist • Time and space are just regular attributes • Data items unambiguously map to real world entities Schema: • All data conforms to a strict schema • These schemas and their relationship to the data don't change much • Everyone agrees on the meaning of the data • No one cares where the data came from Environment: • Users know exactly what they want to ask of the database • Users want absolute answers (no satisficing) • Queries can be independent of the user’s context • All data is always available
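
Two of the assumptions above ("Anything outside of the DB does not exist" and "All data in the database is 100% valid") can be illustrated with a small sketch. This is not taken from the talk: the `readings` table, the `confidence` column, and both helper functions are hypothetical, chosen only to contrast a plain relational answer with one that admits "unknown". It uses Python's standard `sqlite3` module.

```python
import sqlite3


def build_db():
    """Create an in-memory table of hypothetical sensor readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (sensor_id TEXT, temp_c REAL, confidence REAL)"
    )
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        # s2 reports a high temperature, but with low (assumed) confidence.
        [("s1", 21.5, 0.95), ("s2", 80.0, 0.20)],
    )
    return conn


def sensor_is_hot(conn, sensor_id, threshold=50.0):
    """Plain relational answer: every stored row is trusted, and a
    sensor with no row is simply treated as 'not hot'."""
    row = conn.execute(
        "SELECT 1 FROM readings WHERE sensor_id = ? AND temp_c > ?",
        (sensor_id, threshold),
    ).fetchone()
    return row is not None


def sensor_is_hot_hedged(conn, sensor_id, threshold=50.0, min_conf=0.5):
    """Three-way answer: True, False, or None when the sensor is missing
    or its reading falls below the confidence cutoff."""
    row = conn.execute(
        "SELECT temp_c, confidence FROM readings WHERE sensor_id = ?",
        (sensor_id,),
    ).fetchone()
    if row is None or row[1] < min_conf:
        return None
    return row[0] > threshold
```

With this data, the plain query reports the unrecorded sensor `s3` as "not hot" and the low-confidence sensor `s2` as "hot", while the hedged version returns `None` for both. Pushing that distinction into application code, rather than into the data and query model, is the kind of shoehorning the slides describe.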
