1 / 9

Introduction to Data Modeling and Normalization

This course provides an introduction to data modeling and normalization, covering topics such as repeating groups, update anomalies, delete anomalies, and insert anomalies. Students will learn the benefits of normalization and how to design efficient and consistent data models.

dianecombs
Download Presentation

Introduction to Data Modeling and Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Department of Computer and Information Science,School of Science, IUPUI Data Modeling Introduction and Normalization Dale Roberts, Lecturer Computer Science, IUPUI E-mail: droberts@cs.iupui.edu

  2. Sample Table

  3. Sample Table Problems • Repeating Groups The course ID, description, and instructor are repeated for each class. If a student needs a third class, you need to go back and modify the table design in order to record it. While you could add CourseID3, CourseID4, CourseID5, etc., along with the associated description and instructor fields, no matter how far you take it there may one day be someone who wants one more class. Additionally, adding all those fields when most students would never use them is a waste of storage. • Update Anomalies Let's say that after entering these rows, you discover that Bruce McKinney's course is actually titled "Intro to Advanced Visual Basic". In order to reflect this change, you would need to examine all the rows and change each individually. This introduces the potential for errors if one of the changes is omitted or done incorrectly. • Delete Anomalies If you no longer wished to offer Joe Garrick's Intro to DAO class, you would need to null out all references to the course. Once you remove all references to the course, you have lost all information about the course such as Course Id, Course Name, and Instructor. Should not offering the course mean that the course no longer exists? • Insert Anomalies Perhaps the department head wishes to add a new class - let's call it "Advanced DAO Programming" - but hasn't yet set up a schedule or even an instructor. What would you enter for the student, advisor, and instructor names?

  4. Normalization Defined • 1st Definition... the process of taking a wide table with lots of columns but few rows and redesigning it as several narrow tables with fewer columns but more rows. A properly normalized design allows you to use storage space efficiently, eliminate redundant data, reduce or eliminate inconsistent data, and ease the data maintenance burden...Jose’s Database Programming Corner www.citilink.com/~jgarrick/vbasic/database • 2nd DefinitionProcedure to ensure that a data model conforms to some useful standards. ... to minimize the duplication of data, to provide the flexibility necessary to support different functional requirements, and to enable the data modeler to verify the business requirements.Oracle 7.3 Developer’s Guide, p. 466 • 3rd DefinitionNormalization is the process of putting things right, making them normal... ...separating elements of data into affinity groups, and defining the normal, or “right”, relationships between them. Oracle: The Complete Reference

  5. Forms of Normalization • First Normal Form (1NF) • no repeating groups. • Second Normal Form (2NF) 1NF + no nonkey attributes depend on a portion of the primary key. • Third Normal Form (3NF) 2NF + no attributes depend on other nonkey attributes. “Each column depends on the key, the whole key, and nothing but the key, so help me Codd.”

  6. Database Management: Normalization • First Normal Form (1NF) • Each column contains values about the same attribute, and each table cell value must be a single value. • Each column has a distince name, order of columns is immaterial. • Each row is distinct, rows cannot be duplicate for the same key • The sequence of rows is immaterial. • Second Normal Form (2NF) • All non-key attributes must be fully dependent on the whole key. • Third Normal Form (3NF) • Each nonkey attribute should be dependent only on the relation’s key, not on any other nonkey.

  7. The Dangers in a Relational Database • How to reduce the confusion… • Normalization1NF:2NF:3NF: • English names for tables and columns • English code names

  8. The Dangers in a Relational Database

  9. Acknowledgements • McFadden and Hoffer. Database Management • Loney, Kevin. Oracle Database 10g The Complete Reference • Ullman, Jeff. Database Systems The Complete Book. • DatabaseAnswers.com

More Related