150 likes | 276 Views
Data Warehousing (Kimball, Ch.5-12). Dr. Vairam Arunachalam School of Accountancy, MU. Agenda. Value Chain “Clean” construction of DDW Financial Services Subscription Businesses Insurance Factless fact tables Decision Points in DDW construction. Value Chain.
E N D
Data Warehousing(Kimball, Ch.5-12) Dr. Vairam Arunachalam School of Accountancy, MU
Agenda • Value Chain • “Clean” construction of DDW • Financial Services • Subscription Businesses • Insurance • Factless fact tables • Decision Points in DDW construction Dr. Vairam Arunachalam
Value Chain • Concept: integrated view of value-adding components of business process • Example on Demand side: • Finished Good inventory • Manufacturing Shipments to Distribution Center • Distribution Center Inventory • Distribution Center Shipments to Retail Stores • Retail Store Inventory • Retail Store Sales Dr. Vairam Arunachalam
Value Chain (contd.) • Example on Supply side: • Purchase Orders • Receiving • (Raw) Materials Inventory • Process Control • BOM • Finished Goods Inventory • Manufacturing Plans Dr. Vairam Arunachalam
Value Chain (contd.) • Issues related to integration of value chain information (I.e., drill-across): • Shared dimensions • Differences in physical dimension tables • Common dimension tables as a solution • Design Principle: • All constraints on dimensional attributes must evaluate to exactly the same set of dimensional entities from one db to another in the value chain Dr. Vairam Arunachalam
Value Chain (contd.) • Dimensions with reduced detail (e.g., manufacturing lot nos. versus SKUs) • Derived dimensions supporting aggregates (e.g., construction of derived roll-up product dimension and fact tables) Dr. Vairam Arunachalam
“Clean” construction of DDW • Design principle: • A master file, usually the source of unique identification, must be maintained on a regular basis. This needs QA on the p-key and other fields. • Snowflaking: the good (remember normalization?) and the bad (issue of browsing performance) -- Fig.6.2 • Demographic minidimensions -- Fig.6-3 Dr. Vairam Arunachalam
“Clean” DDW (contd.) • Slowly changing dimensions (implications, pro and con): • Type 1 (Overwriting old values; losing ability to track history) • Type 2 (Creating an additional dimension record; segmenting history) • Type 3 (Creating new fields with new attribute values within original dimension record, while keeping original attribute values; describing history both backward and forward) Dr. Vairam Arunachalam
Financial Services • Core fact tables: Household data warehouse (Fig.7.1) • Dirty dimensions • Semiadditive account balances • Heterogeneous products (Fig.7.3): • Design principles: • create a core fact and core dimension tables for crossing types, and a custom fact and custom dimension tables for querying • primary core facts duplicated in custom fact tables Dr. Vairam Arunachalam
Subscription Businesses • Accounting concept underlying payments in advance (I.e., deferred revenues) • Design principle: • Combine transaction-grained fact table with a monthly snapshot-grained fact table in order to get at transaction frequency/timing and earned income in a given period • Cable TV sales transaction and sales monthly snapshot databases (Figs.8.1 & 8.2) Dr. Vairam Arunachalam
Insurance • Good illustration of several important concepts: • business process • grain, dimensions (including degenerate and dirty dimensions) • core & custom dimension and fact tables • transaction & snapshot schemas • heterogeneous products • slowly changing dimensions • minidimensions Dr. Vairam Arunachalam
Insurance • Initial policy transaction and snapshot schemas (Figs.9.1 and 9.3) and claims transaction and snapshot schemas (Figs.9.2 and 9.4) Dr. Vairam Arunachalam
Factless Fact Tables • Concept: no measured facts (still useful) • Types: • event tracking (e.g., which hospital procedures were performed most extensively?) • coverage (e.g., which customers did not purchase any products?) • Hospital patient procedure schema (Fig.10.2) Dr. Vairam Arunachalam
Decision Points in DDW construction 1. Processes -> fact table identification 2. Grain of fact table 3. Dimensions of fact table 4. Facts 5. Dimension attributes 6. Slowly changing dimensions 7. Aggregations, heterogeneity, minidimensions, queries 8. Historical duration of db 9. Timeframe for data extraction/loading into DW Dr. Vairam Arunachalam