100 likes | 193 Views
Data Warehousing. DSCI 4103 Dr. Mennecke. Chapter 4. Complex, multi-stage transactions. In some instances, processes might involve multiple conditional transactions (e.g., one type of payment requires a credit check, another requires specialized invoicing, etc.)
E N D
Data Warehousing DSCI 4103 Dr. Mennecke Chapter 4
Complex, multi-stage transactions • In some instances, processes might involve multiple conditional transactions (e.g., one type of payment requires a credit check, another requires specialized invoicing, etc.) • A choice needs to be made about whether to have multiple fact tables associated with each process
Multiple fact tables • Criteria in deciding • What are the user’s analytical requirements • Will separate processes be examined together or separately? • Are there really multiple separate business processes? • If so, then two or more fact tables are probably more appropriate since these processes may operate independently • Are multiple source systems involved? • If yes, use separate fact tables • How do the processes relate to dimensions? • If separate processes have different dimensions that they interact with, then separate fact tables may be implied
Slowly changing dimensions • What do we do when dimension information changes over time? (e.g., a product category changes)
Dealing with slowly changing dimensions • Overwrite the old attribute value with the new attribute value • Advantage: easy to manage • Disadvantage: lose historical information
Dealing with slowly changing dimensions • Add a new, unique dimension row (e.g., product description 2 will be added to product dimension 1) • Advantage: this maintains the historical information and is more in line with the goal of the data warehouse • Disadvantage: • Requires the creation and management of a new surrogate key for the new dimension record • May result in the creation of very large dimension tables • Tracking across the dimension change may be difficult
Dealing with slowly changing dimensions • Add a new column to the dimension table to hold information about the old dimension information (e.g., the old product category) • Advantage: allows tracking of dimensions over the change in dimensional information • Disadvantage: becomes unwieldy when multiple changes occur
Dealing with slowly changing dimensions • Predictable changes with multiple version overlays creates columns representing the dimension attributes value for each time period • Advantage: allows tracking of dimensions over time • Disadvantage: only handles periodic and predictable changes
Dealing with rapidly changing dimensions • Quickly changing dimensions cannot easily be handled using the previous techniques • Split quickly changing attributes into a separate dimension table • Minidimension: A foreign key for the split table is located in the fact table (e.g., a demographic category such as age would be applied to the customer when the purchase was made) • Outrigger: a foreign key for the split table is located in the slowly changing dimension