330 likes | 350 Views
GUS: 0262 Fundamentals of GIS. Lecture Presentation 3: Relational Data Model Jeremy Mennis Department of Geography and Urban Studies Temple University. File Structures. A file: “STUDENT”. field. ID Last First Grade. 3 Smith Jane A 1 Wood Bob C
E N D
GUS: 0262Fundamentals of GIS Lecture Presentation 3: Relational Data Model Jeremy Mennis Department of Geography and Urban Studies Temple University
File Structures A file: “STUDENT” field ID Last First Grade 3 Smith Jane A 1 Wood Bob C 2 Kent Chuck B 4 Boone Dan B record
File Structures Simple ordering is based on order of entry into the file Ordered Sequential ordering is based on numeric or alphabetical ordering Indexed an index provides pointers to certain positions in the file
Databases and Data Models A database is a collection of data files that is structured (organized) to facilitate data storage, manipulation, and retrieval. A database management system (DBMS) is a software package that performs these database functions
Databases and Data Models • A data model is a particular way of conceptually organizing multiple data files in a database: • Hierarchical, Network, Relational • Hierarchical and network data models have generally been replaced by the relational data model. • Relational DBMSs (and their derivatives) dominate the (non-GIS) database market: Oracle, Informix.
Relational Data Model Composed of a set of tables called relations Records (rows) in the table are called tuples Fields (columns) in the table are called domains We will use the terms tables, records, and fields
Relational data model Each record represents a logical entity (e.g. a student) (or a relationship) Each field represents an attribute (property) of the logical entity Student ID Last First Grade Class 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357
Relational data model Each table has a primary key, one field (or a combination of fields) that has a unique value for each and every record in the table Student ID Last First Grade Class 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357
Relational data model Tables can be related (joined or linked) together based on their keys. Student Class ID Last First Grade Class Name #Stud Instructor 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357 Geog357 48 Mennis Geog115 120 Brower Geog20 120 Fountain
Relational data model Primary key Foreign key Primary key Student Class ID Last First Grade Class Name #Stud Instructor 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357 Geog357 48 Mennis Geog115 120 Brower Geog20 120 Mennis
Relational data model Student Class ID Last First Grade Class Name #Stud Instructor 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357 Geog20 120 Brower Geog115 120 Mennis Geog357 48 Mennis Instructor Name Office Mennis 332 Brower 517
Normal Forms The process of structuring tables and table relationships in a logical way that minimizes data redundancy. 3 rules or steps in normalization • first normal form • second normal form • third normal form
Normal Forms First normal form only one value per field for each record Violates first normal form Student ID Last First Grades Classes 1 Wood Bob C, B Geog357, Geog20 2 Kent Chuck B, D Geog115, Geog356 3 Smith Jane A, B Geog357, Geog20 4 Boone Dan B, A Geog357, Geog455
Normal Forms Second normal form each non-primary key field must be totally dependent on the entire primary key (and not on only part of the primary key) primary key Student Year Last First Grade Status Violates second normal form because Status is dependent only on Year, not on Year/Last/First 2 Wood Bob C sophomore 4 Kent Chuck B senior 3 Smith Jane A junior 3 Boone Dan B junior
Normal Forms To resolve second normal form violation - create separate tables Student Year primary key primary key Last First Grade Year Last First Grade Year Status Wood Bob C 2 Kent Chuck B 4 Smith Jane A 3 Boone Dan B 3 Wood Bob C Kent Chuck B Smith Jane A Boone Dan B 1 freshman 2 sophomore 3 junior 4 senior
Normal Forms Third normal form every field that is not a primary key must be totally and directly dependent on the primary key (no transitive dependency) Student ID Last First Grade Class Instructor 1 Wood Bob C Geog357 Mennis 2 Kent Chuck B Geog115 Brower 3 Smith Jane A Geog357 Mennis 4 Boone Dan B Geog357 Mennis
Normal Forms Third normal form every field that is not a primary key must be totally and directly dependent on the primary key (no transitive dependency) Student ID Last First Grade Class Instructor Violates third normal form because Instructor is dependent on Class, not on the primary key ID 1 Wood Bob C Geog357 Mennis 2 Kent Chuck B Geog115 Brower 3 Smith Jane A Geog357 Mennis 4 Boone Dan B Geog357 Mennis
Normal Forms To resolve third normal form violation - create separate tables Instructor is dependent on primary key Name Class is dependent on primary key ID Student Class ID Last First Grade Class Name #Stud Instructor 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357 Geog357 48 Mennis Geog115 120 Brower Geog20 120 Mennis
Why Normalization? 1. If the instructor to a class changed - all students with that class would have to have their instructors changed 2. Every time a student changed a class, the instructor would also have to be changed Student ID Last First Grade Class Instructor 1 Wood Bob C Geog357 Mennis 2 Kent Chuck B Geog115 Brower 3 Smith Jane A Geog357 Mennis 4 Boone Dan B Geog357 Mennis
Why Normalization? ID Last First Grade Class Instructor These update problems may result in logical inconsistencies in the database 1 Wood Bob C Geog357 Mennis 2 Kent Chuck B Geog115 Brower 3 Smith Jane A Geog357 Mennis 4 Boone Dan B Geog357 Mennis Original table Student ID Last First Grade Class Instructor 1 Wood Bob C Geog115 Mennis 2 Kent Chuck B Geog115 Brower 3 Smith Jane A Geog357 Knight 4 Boone Dan B Geog357 Mennis Logical inconsistency Updated table Logical inconsistency
Why Normalization? When the table is in third normal form, these logical inconsistencies cannot take place. When an instructor is changed, the change is enforced for all students When a student changes classes, the change is instructor is automatically enforced Class ID Last First Grade Class Name #Stud Instructor 1 Wood Bob C Geog357 2 Kent Chuck B Geog115 3 Smith Jane A Geog357 4 Boone Dan B Geog357 Geog357 48 Mennis Geog115 120 Brower Geog20 120 Mennis
Relational Algebra Operations on the relational data model are defined by relational algebra join projection selection
Relational Algebra Join: Match records in both tables based on a common field Geography Classes Instructor Result of Join Class Instructor Instructor Office Class Instructor Office Geog357 Mennis Geog115 Brower Geog20 Fountain Geog435 Karnes Mennis 332 Brower 423 Fountain 125 Karnes 312 Geog357 Mennis 332 Geog115 Brower 423 Geog20 Fountain 125 Geog435 Karnes 312
Relational Algebra Projection: reduces one table in the attribute dimension (a selection of a subset of fields, for all records)
Relational Algebra Projection: List all Geography classes, but not the instructors Geography Classes Result of Projection Class Instructor Class Geog357 Mennis Geog115 Brower Geog20 Fountain Geog435 Karnes Geog357 Geog115 Geog20 Geog435
Relational Algebra Selection (restriction): reduces one table in the record dimension (a selection of a subset of records, for all fields) Criteria for selection is called a predicate
Relational Algebra Selection: Find Geography classes taught by Mennis Geography Classes Class Instructor Geog357 Mennis Geog115 Brower Geog20 Fountain Geog435 Karnes Result of Selection Class Instructor Geog357 Mennis
SQL • Structured (Standard) Query Language • Formal language for interacting with relational databases • Implementation and language for relational algebra
SQL SQL - basic syntax SELECT <fields> FROM <tables> WHERE <condition> SELECT Class, Instructor FROM Geography Classes WHERE Instructor = “Mennis” Geography Classes Class Instructor Geog357 Mennis Geog115 Brower Geog20 Fountain Geog435 Karnes Result of Selection Class Instructor Geog357 Mennis
SQL SELECT Class, Office FROM Geography Classes, Instructor WHERE Class = “Geog357” or Instructor = “Karnes” or Office = 125 ORDER BY Office Instructor Geography Classes Result of SQL Query Class Instructor Name Office Class Office Mennis 332 Brower 423 Fountain 125 Karnes 312 Geog357 Mennis Geog115 Brower Geog20 Fountain Geog435 Karnes Geog20 125 Geog435 312 Geog357 332