230 likes | 790 Views
An Introduction to Dimensional Data Warehouse Design. Presented by Joseph J. Sarna Jr. JJS Systems, LLC. Relational Normal Form. Most relational databases are set to 3 rd normal form 1 st Normal form – Tables have unique keys and no repeating groups or multi-value fields
E N D
An Introduction to Dimensional Data Warehouse Design Presented by Joseph J. Sarna Jr. JJS Systems, LLC
Relational Normal Form • Most relational databases are set to 3rd normal form • 1st Normal form – Tables have unique keys and no repeating groups or multi-value fields • 2nd Normal form – Every attribute is dependent ont the entire key of the table • 3rd Normal form – Attributes are dependent only on the key. No derived elements
Why Dimensional Data Warehouses? • Business needs to analyze data so that it can: • Understand trends • Predict future behavior and needs • Personalize contact with customers • Be competitive • All of this in a speedy manner, with the ability to do “What if’s”
Drawbacks to Relational Data Structures • Data is not structured for analytical usage • Multiple Joins are resource intensive • Missing data from external sources, context history, not operational sources
What Is a Dimensional Data Warehouse? “A structured repository of validated and integrated historical information accessible to business people to provide the basis for both tactical and strategic business decisions.” • Centralized extract and staging • Separate from operational system • Structured for analysis • Historically contexted
Dimensional Data Warehouse Architecture Relational Data External Data Enterprise Data Data Distribution Acquisition, Staging, Cleaning, Transformation Data Warehouse Storage Analytical Applications
Levels of Design • Detail Level • Dimensional Normal form • Value and feasibility • Analytical Level • Structured for the required analyses • Summary Level • Summaries for user requirements • Better response time
Dimensional Normal Form • Normalized for maintainability • De-normalized for performance, based on rules • 2 level structure, therefore only one level of joins required for queries
Dimensional Schema • Fact Tables • contain related measures • Usually the largest tables • Usually appended to • Can contain detail or summary data • Measures are usually additive • Dimension Tables • Contain descriptors • Utilize business terminology • Textual and discrete data • Attributes through which the table measures are analyzed
Resources • Books • The Data Warehouse Toolkit, Ralph Kimball • The Data Warehouse Lifecycle Toolkit, Ralph Kimball, et al • Data Warehouse Design Solutions, Adamson / Venerable • Websites • http://www.ralphkimball.com/ • http://www.atre.com • http://www.microsoft.com/sql/evaluation/overview/dataware.asp