250 likes | 1.16k Views
Dimensional Modeling. Overview and Fundamentals. What Is Dimensional Modeling?. Dimensional Modeling. An Overview Structure Terminology Benefits The Fundamentals Facts and Dimensions The 4 Step Design Process. What does Data Warehouse look like. Kimball Data Warehouse
E N D
Dimensional Modeling Overview and Fundamentals
Dimensional Modeling • An Overview • Structure • Terminology • Benefits • The Fundamentals • Facts and Dimensions • The 4 Step Design Process
What does Data Warehouse look like • Kimball Data Warehouse • Back Room Staging Area (The kitchen) • Data Presentation Area (The Dinning Room) • Star Schema • Corporate Information Factory (CIF) • Normalized Dimensions • Dual ETL Loading, Warehouse and Data Marts • Operational Data Store (ODS) • Hybrid Data Warehouse
Dimensions The time independent, textual and descriptive attributes by which users describe objects. Combining all the attributes including hierarchies, rollups and sub-references into a single dimension is denormalization. Often the “by” word in a query or report Not time dependent Facts Business Measurements Most Facts are Numeric Additive, Semi-Additive, Non-Additive Built from the lowest level of detail (grain) Very Efficient Time dependent Terminology
Star Schema • Singe data (fact) table surrounded by multiple descriptive (dimension) tables
Benefits • Performance (Integer relationships, natural partitioning, Single joins benefit SQL optimizer) • Source system independence and multiple integration • Supports Change management • Usability/Simplicity (easy to read, interpret, join, calculate) • Presentation (Consistency, Taxonomy, Labeling) • Reuse (Conformed dimensions reduce redundancy, Role-plays)
Dimension Change Strategy • Type 1: Is used when the old value of the attribute has no significance or can be discarded. • Easy and Fast • Type2: Partitions history so that fact tables properly reflect original values. • Requires use of Surrogate Keys • Causes table growth due to additional history rows • Users must be aware of the added complexity • Effective Dates used secondary to cleaner fact joins
Dimension Change Strategy • Type 3: Additional attribute used to capture changes. • Used less frequently then Type 1 or 2. • Simultaneously supports two views of the world. • Does not trend changes over time. • Current and Prior or Current and Original Attributes • Hybrid Type: Combination 1, 2 & 3 changes • New attribute for predictable series (such as yearly changes) • Type 2 changes with prior or original attributes included • Expanded dimension table for durable key inclusion in fact • Added complexity to users
Dimension Role Playing • A single table that plays multiple roles (using views) to create synonym dimension attributes. • Most common role playing dimension is the Date Dimension. i.e. separate role playing dimensions for order date and ship date.
Modeling Design Process • Identify the Business Process • Source of “measurements” • Identify the Grain • What does 1 row in the fact table represent or mean? • Identify the Dimensions • Descriptive context, true to the grain • Identify the Facts • Numeric additive measurements, true to the grain
Step 1 - Identify the Business Process • This is a business activity typically tied to a source system. • Not to be confused with a business department or function. An Orders dimensional model should support the activities of both Sales and Marketing. • “If we establish departmentally bound dimensional models, we’ll inevitably duplicate data with different labels and terminology.”
Step 2 - Identify the Grain • The level of detail associated with the fact table measurements. • A critical step necessary before steps 3 and 4. • Preferably it should be at the most atomic level possible. • “How do you describe a single row in the fact table?”
Step 3 - Identify the Dimensions • The list of all the discrete, text-like attributes that emanate from the fact table. • They are the “by” words used to describe the requirements. • Each dimension could be though of as an analytical “entry point” to the facts. • “How do business people describe the data that results from the business process?”
Step 4 - Identify the Facts • Must be true to the grain defined in step 2. • Typical facts are numeric additive figures. • Facts that belong to a different grain belong in a separate fact table. • Facts are determined by answering the question, “What are we measuring?” • Percentages and ratios, such as gross margin, are non-additive. The numerator and denominator should be stored in the fact table.
For More Information • Articles, Design Tips and Newslettershttp://www.kimballgroup.com • Designing A Scalable DW/BI Systemhttp://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297070&EventCategory=4&culture=en-US&CountryCode=US • Microsoft BI Using the Kimball Methodhttp://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297084&EventCategory=4&culture=en-US&CountryCode=US • Using SSIS to Populate a Kimball Method Data Warehousehttp://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032297072&EventCategory=5&culture=en-US&CountryCode=US