320 likes | 434 Views
Relational Databases. COMP 416 Fall 2010 Lecture 21. What’s a database?. A collection of data Examples of collections of data? Library Web Stacks of papers on your desk Set of baseball cards. Are all of these things databases?. Web vs. Library. What’s the big difference? Organization
E N D
Relational Databases COMP 416 Fall 2010 Lecture 21
What’s a database? • A collection of data • Examples of collections of data? • Library • Web • Stacks of papers on your desk • Set of baseball cards. • Are all of these things databases?
Web vs. Library • What’s the big difference? • Organization • In what ways is the library organized? • Databases have organization.
Library vs Baseball Cards • What’s the primary object in these two collections? • Libraries: books • Baseball card collections: baseball cards • How alike are books? • Somewhat, but large variatations • How alike are baseball cards? • Very alike. • Things in databases are highly structured.
One library vs another • What’s the difference between the Brauer Library and the House Library? • Different purposes leads to different priorities (in organization and content). • Databases are built for a purpose. • The more specific the purpose, the more specific its structure and organization.
So, what’s a database • A database is an collection of structured information organized for a specific purpose.
Relational Databases • Relational databases are the most prevalent type of database used. • Information is organized into related tables. • Each table captures information about a different entity. • Columns are different fields of information (attributes of the entity). • Each row represents one instance (a specific example of the entity).
Design Goals • What kinds of information do we want to keep track of? • What do we want to do with that information?
Entities • First step in database design is to identify entities. • Think of entities as “things” that you want to know information about. • What do we care about for our bookstore? • Books (duh?)
Attributes • Next step is to identify attributes of those entities. • An attribute is labeled piece of information (i.e., a name/value pair) • In general, we expect every instance of a particular entity to have specific values for a set of common attributes.
Book Author(s) Title Publisher Genre Price Book Entity
Normalization • Not all database designs are equal. • Experience and research has shown that certain structures and relationships are easier to maintain and process than others. • Normalization: a process through which a database design is “cleaned up” • Well-defined set of “normal forms” which are the incremental result of this process.
1NF • First Normal Form • All attributes are single-valued. • All instances have a unique identifier.
Book Author(s) Title Publisher Genre Price Book Entity Revisited • Is our book entity in 1NF?
Author Book First Title Last Publisher Birthday Genre Price Bookstore Entities (1NF) • Multi-valued attributes generally indicate the need for a new entity.
Unique Identifiers • What in our book and author entities can act as a unique identifier? • Often (almost always) the best way to create a unique identifier is to create an artificial one. • Book ID, Author ID. • Assigned by the database itself. • No inherent semantics.
Book Author ID ID Title First Publisher Last Genre Birthday Price Book Entities (1NF) v2
Book Author ID ID Title First Publisher Last Genre Birthday Price Modeling Relationships • Two relationship types. • One-to-Many • Many-to-Many • For now, we’ll just model this pictorially like this:
2NF • Second Normal Form • Already in 1NF • Non-identifying attributes are dependent on the entity’s unique identifier. • Rule of thumb: if the same value appears multiple times for a particular attribute, think hard if what you really need is another entity.
Book Author ID ID Title First Publisher Last Genre Birthday Price Bookstore Entities • What might we pull out into an entity?
Publisher Book ID ID Name Title Author Price Address ID State First State Abbrev. Last Birthday Genre ID Genre Name Bookstore Entities (2NF)
3NF • Third Normal Form • In 2NF • No attributes dependent on each other. • What part of our data model violates this? • To fix, generally want to pull the dependent attributes out into their own entity.
Publisher Book ID ID Name Title Author Price Address ID First Last Birthday State Genre ID ID Long Name Abbrev. Genre Name Bookstore Entities (3NF)
Logical vs Physical Design • Result so far is “logical” database design. • Still need to implement this design as a specific database. • Relational databases: • Each entity associated with a table. • Attributes are columns of the table. • Each attribute is given a data type. • Unique identifiers are “primary keys” • Relationships are embodied as “foreign keys” • An attribute whose value is the unique identifier in another table.
Implementing 1-to-many • To implement a 1-to-many relationship, add an attribute on the “many” side which is the unique identifier of the “one” side.
Publisher Book ID ID Name Title Author Address Price ID PubID StateID GenreID First Last Birthday State Genre ID ID Long Name Abbrev. Genre Name Implementing 1-to-many
Resolving M-to-M • Many-to-many relationships are hard to implement in a database. • Why is this? • Foreign key attribute which is supposed to implement the relationship requires multiple values. • This breaks 1NF structure. • How might we fix it?
Junction Entities • A junction entity is an abstract entity provides a level of indirection for a many-to-many relationship.
Publisher Book BookAuthor ID ID ID Name Title Author BookID Address Price AuthorID ID PubID StateID GenreID First Last Birthday State Genre ID ID Long Name Abbrev. Genre Name Adding BookAuthor Junction
SQL • Structured Query Language (SQL) • The language in which we express actions to be performed on a relational database. • Standardized to allow portability across different products. • SQL92 (aka SQL2) is the latest standard. • Product specific differences and extensions still exist, but much better than before.
MySQL • MySQL • Open-source • Great for small to mid-sized organizations. • Fast, efficient, cheap • Doesn’t support full SQL but a good portion of it.
Web App Model DB On Disk Browser Web Server HTTP Requests HTTP Responses JavaScript Database PHP Programmable, dynamic interface to the document SQL Programmable, dynamic, document construction Structured, table-based, information storage