Introduction CSCI 6441 Database
• Misunderstood topics
  • Normalization
  • Database design
  • Performance
  • SQL
• Advanced topics
  • Time in databases
  • Translucency
  • Performance
• Realistic experience
  • Realistic team size
  • Accountability
  • Emerging requirements
• Current Developments
  • Big data
  • NOSQL
  • Cloud Computing
Next: 6442
Why?
Early applications:
• Programs wrote information into files on disk
• Programs included lots of information about the files
  • Where they were stored
  • Type of storage
  • Exact format of each record
• Changing programs is, in general, very hard
  • Programming is exacting work
  • Testing takes lots of time
  • People change jobs
• Early programs were very hard to change
  • If data moved, programs had to change
  • If data changed, programs had to change
• Events tend to force changes in data
A Discovery
It was discovered that many programs fit a paradigm:
• They stored some data
• Then later they changed it
• Although the hard problems of changing the structure of data remained, many useful applications could be built on this notion of a “stored data base”
• Database systems were developed to help manage the data
  • They provided uniform backup and recovery
  • Later, they even made changing the data easier
Relational Principles
• Earlier database systems: hierarchies and networks as data models
  • Data could be moved around easily
  • Relationships represented as physical connections
  • Structure of relationships embedded in applications
  • When structure changed, programs had to change
• Relational: independent tables as data model
  • Relationships “represented” by equal values of data
  • Structure of relationships invisible to applications
  • Relationships change as data values change
  • Much greater ease of change
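As a rough illustration of relationships "represented" by equal values, here is a minimal Python sketch. The employee names, department numbers, and the helper `department_of` are all hypothetical and not from the course material; the point is only that the link between an employee and a department is a matching value, so changing the value changes the relationship.

```python
# Hypothetical sample data; names and department numbers are illustrative only.
employees = [
    ("Smith", 10),   # (name, department number)
    ("Jones", 20),
]
departments = [
    (10, "Accounting"),  # (department number, department name)
    (20, "Research"),
]


def department_of(name, employees, departments):
    """Find an employee's department by matching equal department numbers."""
    for emp_name, emp_dept in employees:
        if emp_name == name:
            for dept_no, dept_name in departments:
                if dept_no == emp_dept:
                    return dept_name
    return None


before = department_of("Smith", employees, departments)

# Changing a data value changes the relationship. No physical connection
# is rewired and the lookup code above stays exactly the same.
employees[0] = ("Smith", 20)
after = department_of("Smith", employees, departments)

print(before, after)
```

Nothing in `department_of` knows how the two lists are stored or linked, which is the sense in which the structure of the relationship is invisible to the application.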
Ted Codd
• Inventor of the relational approach
• Received Turing Award
• Mathematician at IBM Research
• Was looking for a true formalism for data
Relational Database
Relational database: a set of relations
Relation
Relation: a set of ordered pairs
• Ordered pair: a pair of values, such that interchanging the two values changes the meaning
• That is, <a,b> = <b,a> iff a = b
• Specifying a relation by enumeration: R = {<a,b>, <c,d>, <e,f>}
• This is a relation consisting of three ordered pairs
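The definitions above can be sketched in Python, assuming we model an ordered pair as a 2-tuple and a relation as a `set` of such tuples. That modeling choice and the helper `swap` are assumptions made for illustration only.

```python
# The relation R = {<a,b>, <c,d>, <e,f>} specified by enumeration.
R = {("a", "b"), ("c", "d"), ("e", "f")}


def swap(pair):
    """Interchange the two values of an ordered pair."""
    first, second = pair
    return (second, first)


# R consists of three ordered pairs.
assert len(R) == 3

# Interchanging the values gives a different pair...
assert swap(("a", "b")) != ("a", "b")

# ...unless the two values are equal: <a,b> = <b,a> iff a = b.
assert swap(("a", "a")) == ("a", "a")
```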
Relation and File
Ordered pairs can model more than two values through nesting:
• <a, b, c> == <<a,b>, c>
• <a, b, c, d> == <<<a,b>, c>, d>
• And so on
• This extends the ordered pair so that it can model a tuple of any length
• Now a relation starts to look like our notion of a file, with each tuple corresponding to our notion of a record
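The nesting rule can be made concrete with a short sketch. The function names `nest` and `unnest` are hypothetical, and Python 2-tuples stand in for ordered pairs; this is one possible encoding, not necessarily the one used in the lecture.

```python
from functools import reduce


def nest(*values):
    """Encode a tuple of two or more values as left-nested ordered pairs."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Start from the first value and wrap one more value per step:
    # a -> (a, b) -> ((a, b), c) -> ...
    return reduce(lambda acc, value: (acc, value), values[1:], values[0])


def unnest(pair, length):
    """Recover the flat tuple of the given length from left-nested pairs."""
    values = []
    for _ in range(length - 1):
        pair, last = pair      # peel off the outermost second component
        values.append(last)
    values.append(pair)        # what remains is the first value
    return tuple(reversed(values))


assert nest("a", "b", "c") == (("a", "b"), "c")
assert nest("a", "b", "c", "d") == ((("a", "b"), "c"), "d")
assert unnest(nest("a", "b", "c", "d"), 4) == ("a", "b", "c", "d")
```

`unnest` needs the length as an argument because a value may itself be a pair, so the nesting depth cannot be inferred from the data alone.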
The Definition
A relation is a set of ordered pairs (modeling a set of tuples), so:
1. Exchanging the order of values within a tuple changes the meaning of the tuple
2. Exchanging the order of tuples within a relation does not change the meaning of the relation
3. Duplicate tuples are not allowed
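A small sketch of these three properties, assuming a relation is modeled as a Python `frozenset` of tuples. The helper `make_relation` and the sample rows are hypothetical.

```python
def make_relation(rows):
    """Build a relation (an unordered, duplicate-free set of tuples) from rows."""
    return frozenset(tuple(row) for row in rows)


# Hypothetical (name, job) rows.
r1 = make_relation([("Smith", "clerk"), ("Jones", "analyst")])

# 1. Exchanging values within a tuple gives a different tuple.
assert ("clerk", "Smith") not in r1

# 2. Exchanging the order of tuples gives the same relation.
r2 = make_relation([("Jones", "analyst"), ("Smith", "clerk")])
assert r1 == r2

# 3. A duplicate tuple adds nothing: the relation still has two tuples.
r3 = make_relation([("Smith", "clerk"), ("Smith", "clerk"), ("Jones", "analyst")])
assert r3 == r1 and len(r3) == 2
```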
Data Modeling
Now we build a database as a collection of independent relations, each describing instances of a single entity type
• For example:
  • Employee (employee#, job, salary, department)
  • Department (department#, departmentname, location)
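One way these two relations might be declared, sketched with Python's built-in `sqlite3` module and an in-memory database. The column names `employee_no` and `department_no` (standing in for `employee#` and `department#`) and all column types are assumptions; the slide gives only the attribute names.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    CREATE TABLE Department (
        department_no   INTEGER PRIMARY KEY,  -- department# on the slide
        department_name TEXT,
        location        TEXT
    );
    CREATE TABLE Employee (
        employee_no INTEGER PRIMARY KEY,      -- employee# on the slide
        job         TEXT,
        salary      INTEGER,
        department  INTEGER                   -- holds a department_no value
    );
""")

# List the tables that now exist in the database.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The two tables are declared independently. `Employee.department` is just a column whose values can be compared with `Department.department_no`.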
Data Language
• We need a way to insert data into the database, retrieve data from the database, and change values that are stored in the database
• We define a data language that can be used from any programming language to do that
• The data language (SQL) has a lot of power and can save a lot of programming work if you understand it
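The three operations named above (insert, retrieve, change) might look like this when SQL is issued from a host language. This is a sketch using Python's built-in `sqlite3` module; the table layout, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "employee_no INTEGER PRIMARY KEY, job TEXT, salary INTEGER, department INTEGER)"
)

# Insert: add rows to the table (hypothetical sample data).
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [(1, "clerk", 30000, 10), (2, "analyst", 50000, 20)],
)

# Retrieve: state which rows are wanted, not how to find them.
analysts = conn.execute(
    "SELECT employee_no, salary FROM Employee WHERE job = ?", ("analyst",)
).fetchall()

# Change: one statement updates every matching row, with no loop in the program.
conn.execute(
    "UPDATE Employee SET salary = salary + 1000 WHERE department = ?", (10,)
)
clerk_salary = conn.execute(
    "SELECT salary FROM Employee WHERE employee_no = 1"
).fetchone()[0]

print(analysts, clerk_salary)
```

The `UPDATE` is a small example of the programming work SQL can save: the host program never reads, tests, or rewrites individual records itself.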
Mechanics
Now we’ll talk about course mechanics