160 likes | 344 Views
Tips and Tricks for Dimensional Modeling. By Shawn Jackson. Overview. Set of techniques and concepts used in data warehouse design Intended to support end-user queries and is oriented around understandability and performance Uses the concepts of facts (measures) and dimensions (context)
E N D
Tips and Tricks for Dimensional Modeling By Shawn Jackson
Overview • Set of techniques and concepts used in data warehouse design • Intended to support end-user queries and is oriented around understandability and performance • Uses the concepts of facts (measures) and dimensions (context) • Facts are typically (but not always) numerical values that can be aggregated • Dimensions are groups of hierarchies and descriptors that define the facts
Kimball University:10 Essential Rules of Dimensional Modeling (#1-5) • Load detailed atomic data into dimensional structures • Store data at the lowest grain • Use summary tables/views to improve performance as necessary • Structure dimensional models around business processes • Fact tables should be based on a business event • Complement single process fact tables with consolidated fact tables that combine metrics from multiple processes at the same level of detail • Ensure that every fact table has an associated date dimension table • Ensure that all facts in a single fact table are at the same grain or level of detail • Resolve many-to-many relationships in fact tables
Kimball University:10 Essential Rules of Dimensional Modeling (#6-10) • Resolve many-to-one relationships in dimension tables • Store report labels and filter domain values in dimension tables • Don’t store codes and descriptions in the fact table • Make sure the full description of the code is in the dimension table • Make certain that dimension tables use a surrogate key • Create conformed dimensions to integrate data across the enterprise • Date dimension is a common example • Single version of the truth • Continuously balance requirements and realities to deliver a DW/BI solution that's accepted by business users and that supports their decision-making
Slowly Changing Dimensions • Type 0 • Type 1 • Type 2 • Type 3 • Type 4 • Type 6
SCD Type 0 • Rows are added but never changed • Missing true business / natural key • Typically are only used in derived dimensions • Type 0 attributes are more common
SCD Type 1 • Rows can be updated or added based upon business key • Historical information is not tracked
SCD Type 2 • Rows are only added • A version number or effective dates are used to keep track of history
SCD Type 3 • Rows are updated but not added • Historical information is preserved through extra columns
SCD Type 4 • Combination of type 1 and type 2 dimensions • Rows are updated in the type 1 table and added in the type 2 table
SCD Type 6 / hybrid • Combines type 1, 2 and 3 in one table
Roleplaying Dimensions • Recycled for multiple applications within the same database • Date dimension is commonly used (sale date, delivery date) • Can be used to get different views of data
Factless Fact Tables • Tracking events • Many to many joins