210 likes | 229 Views
File and Database Design. SYS364. Today’s Agenda. WHTSA DBMS, RDBMS, SQL A place for everything and everything in its place. Entity Relationship Diagrams to figure it out Normalization to make it work. Why DB?. Data & Information is lifeblood of most systems, esp. business systems
E N D
File and Database Design SYS364
Today’s Agenda • WHTSA DBMS, RDBMS, SQL • A place for everything and everything in its place. • Entity Relationship Diagramsto figure it out • Normalizationto make it work
Why DB? • Data & Information is lifeblood of most systems, esp. business systems • faster and fewer programs • OLTP for accurate and efficient data • easy retrieval for EIS, DSS, OLAP and data warehouse creation • easy interface to heterogeneous systems via SQL, ODBC, JDBC
Why else? • DB analysts make more money than programmers.
Data Terminology and Concepts • It’s a flat (sequential) file if it needs fscanf. Application programs parse the byte stream into data. • DBs use Entities, fields, records, files and keys so programs do not have to know about physical storage of data, only logical
Flat file data • Quote, comma delimited bytes interpreted by program into data 222222222,"Evans","Gil",234.56,"CPO","3","arrpiano"333333333,“Young",“Dave",34.56,"CPA","4","bass"111111111,"Ferguson","Maynard",123.45,"CPAC","5","trumpet"444444444,“Bickert",“Ed",456.78,"CPA","6",“guitar"666666666,"Koffman","Moe",678.94,"CPAC","5","flute"555555555,"Clarke","Terry",567.88,"CPA","5","drums"888888888,"Krall","Diana",876.54,"CPAC","5","vocalfem"777777777,"Eikhart","Shirley",789.01,"CPAC","5","composer"999999999,"Peterson","Oscar",-9.87,"CPA","6","piano"
DB file • DB knows about the data
Relational Model Advantages • logical and physical characteristics of the DB are separated. e.g. order of rows and columns is immaterial • much more easily understood by humans • powerful operators available, enable complex operations with brief commands. • sound framework for the design of DBs
DBMS library or Relational Database or SQL collection or E-R Diagram:
Key Fields • May be one or two or more fields (combination / multivalued / composite keys) • Primary keys are unique and minimal • Candidate keys (possible Primaries) • Secondary keys (alternate key, views) • Foreign keys (field(s) in one file is a primary key in another file - a relationship! • Referential Integrity - checks relationship
Data Relationships and Entity-Relationship Diagrams • ERD • Graphical model that shows the relationships among system entities • Entities are drawn as rectangles • Labeled with singular nouns • Relationships are drawn as diamonds • Labeled as active verbs
Types of Relationships • One to One (1:1) • One to Many (1:M) • Many to Many (M:N)
Cardinality • how many relationships among entities • Analysts need to understand cardinality to design files and databases that reflect accurately all relationships among entities • 0, 1 or many relationships between Entities • Entity can be mandatory • Entity can be optional • Crows Foot Notation
Creating an ERD • Identify the Entities • Determine all significant events or activities for two or more entities • Analyze the nature of the interaction • Draw the ERD
Try it out yourself • Seneca has courses, sections, professors, and students
Normalization • A process by which you identify and correct inherent problems in record design • Involves three stages • First Normal Form • Second Normal Form • Third Normal Form
First Normal Form • 1NF = no repeating attributes (columns) • Is this déjà vu all over again? • If so, reorganize the data into separate relations (tables) • You are unique – just like everyone else • Establish a minimal & unique primary key • Include a key to identify the repeating columns now in their own table
Second Normal Form • 1NF with no partial dependencies, i.e. where the value of a non-key column is dependent on only part of the key (usually on one column of a multi-column key) • 1NF record with a primary key that is a single column is automatically in 2NF • See next slide for translation
2NF • Who does this belong to? • Each field belongs to the whole primary key • If not, split the row into new tables • When does it matter? (not in the textbooks) • The Item Master file contains a price field. If only the current price matters, this is where the price belongs. • An Invoice Item file has its own copy of the price field because the price is time dependent – it belongs with the invoice at the time of sale.
Third Normal Form • 2NF with no transitive dependencies, i.e. where a non-key column is dependent on another non-key column • To go from 2NF to 3NF, remove the dependent column • Do we know this already? • If have Unit-Price & Qty, do not need Extended-Price • If Postal Code, what don’t you need?
Summary • Entity Relationship Diagrams • Who is in our model? • How do they relate? • Normalization • Confirms ERD • Organizes data: • Data is stored only once • Keys uniquely and minimally identify relations