170 likes | 457 Views
Procurement Kimball & Ross, Chapter 4. Overview. Value chain reinforcement Blended versus separate transaction schema Slowly changing dimension techniques. Procurement Case Study. Procurement involves a wide range of activities: Negotiation of contracts
E N D
Overview • Value chain reinforcement • Blended versus separate transaction schema • Slowly changing dimension techniques
Procurement Case Study • Procurement involves a wide range of activities: • Negotiation of contracts • Issuing of purchase requisitions & POs • Tracking receipts • Authorizing payments • Common analytic requirements: • Which materials/products are purchased most frequently? Who supplies them? At what prices? • Across the enterprise, are there opportunities to negotiate contracts by consolidating suppliers, single sourcing? • Are we purchasing from the preferred vendors? • How are vendors performing?
Procurement Transactions • Business process to model: Procurement • Transactions: purchase requisitions, purchase orders, shipping notifications, receipts, and payments • Grain: 1 row per procurement transaction • Key dimensions: • Transaction date • Product • Vendor (1 row for each vendor) • Contract terms (1 row for each set of terms negotiated with a vendor) • Procurement transaction type • Transaction date and product are conformed dimensions • Measured facts: • Procured units • Transaction amount See Figure 4.1 – Procurement fact table with multiple transaction types
Multiple- vs. Single-Transaction Fact Tables: Issues • Business users view purchase orders, shipping notices, warehouse receipts, and vendor payments as separate & unique processes • Several procurements transactions come from different source systems • Purchasing system: purchase requisitions and purchase orders • Warehousing system: shipping notices and warehouse receipts • Accounts payable system: vendor payments • Several transaction types have different dimensionality • E.g., discounts applicable to vendor payments but not to other types of transactions • Control numbers such as PO # generated during procurement process are candidates for degenerate dimensions • Design decision: • Build separate fact tables for each transaction type, or • Build a blended transaction fact table with a transaction type dimension • No simple answer.
Multiple- vs. Single-Transaction Fact Tables - Con’t. • Questions to ask: • What are the users’ analytic requirements? • How do users usually analyze data? • Do they analyze multiple transaction types together or do they usually look at a single transaction type? • Are there multiple unique business processes? Yes leaning towards separate tables • Are multiple source systems involved? Yes leaning towards separate tables • What is the dimensionality of the facts? Do some dimensions apply only to some transaction types? Yes leaning towards separate tables • Solution: multiple transaction fact tables
Multiple Fact Tables • See Figure 4.2 – Multiple fact tables for procurement processes • Advantages: • Richer, more descriptive dimensions and attributes • Simplified staging activities, since operational data exist in separate source systems • Loading data into separate fact tables will be less complex than attempting to integrate from multiple sources • Disadvantages: • More time to manage and administer: more tables to load, index, and aggregate
Slowly Changing Dimensions • We have assumed dimensions to be independent of time, but some dimensions (other than natural keys) may change slowly with time • Need to track change, without full-blown normalized structure; without making every dimension time-dependent • For each attribute in our dimension tables, we must specify a strategy to handle change
Techniques for Dealing with Dimension Change – Type 1: Overwrite the Value • Overwrite the old attribute value in the dimension row, replacing it with the current value. the attribute always reflects the most recent assignment • See example on p. 96. • The type 1 response is easy to implement, but it does not maintain any history of prior attribute values • Question: Is there a business need for retaining the old attribute value?
Techniques for Dealing with Dimension Change – Type 2: Add a Dimension Row • Create a new dimension row reflecting the new attribute • Two separate surrogate keys – one for old row and one for new • In the example (p.97) Product Key is used as the primary key instead of the SKU number, which is the natural key and is the same for both rows • Could also use a “most recent row indicator” to tell us which of the two rows is the current • Fact table is again untouched • Adding a dimension row is the primary technique for accurately tracking SCD attributes • Advantage: • New dimension row automatically partitions history in fact table – pre-change fact rows use the pre-change surrogate key • No need to revisit preexisting aggregation tables • Disadvantage: • Accelerated dimension table growth • Does not allow us to associate the new attribute value with the old fact file or vice versa
Techniques for Dealing with Dimension Change – Type 3: Add a Dimension Column • Add a new dimension column containing the old attribute value (E.g., “Prior Department”) • Overwrite the old value with the new • More appropriate when there is a need to associate new attribute values with old fact history • E.g., business need to track both old and new values of department attribute both forward and backward • Management can use either value for analysis • Allows for observing new and historical fact data by either the new or prior attribute values • Used less frequently • Inappropriate to track numerous intermediate attribute values
Hybrid Slowly Changing Dimension Techniques • Two approaches that combine the basic SCD techniques: • Predictable changes with multiple version overlays • Unpredictable changes with single- version overlay • These approaches provide more flexibility at the cost of greater complexity
Predictable Changes with Multiple Version Overlays • Used in cases of sales organization realignments • Example: Over a 5-year period the sales organization is reorganized five times. • At first sight, candidate for Type 2 approach (add dimension row), but more complex business requirements. E.g., • Report each year’s sales using the district map for that year • Report each year’s sales using the district map from an arbitrary different year • Report an arbitrary span of years’ sales using a single district map from a chosen year. • Type 3 is also inappropriate because >2 district maps • Because changes are predictable, an extension of Type 3 is possible Multiple District columns: • Current District; District 2001; District 2002; …
Unpredictable Changes with Single-Version Overlay • Preserve historical accuracy surrounding unpredictable attribute changes while supporting the ability to report historical data according to the current values • Issue a new dimension row (type 2) to capture the change and add a new dimension column to track the historical value (type 3). Also, overwrite “Current Department” value (Type 1). • See example on page 104.
More Rapidly Changing Dimensions • Break off the rapidly changing attributes into one or more separate dimensions • Two foreign keys in fact table: • Primary dimension table • Rapidly changing attribute(s)
Acknowledgements • Ralph Kimball & Margy Ross