200 likes | 291 Views
Data Warehouse (IV): The Star Schema. Developed by: Dr Eddie Ip Modified by : Dr Arif Ansari. Outline. DW design Multidimensional design (star schema) Examples. DW Design. OLTP systems use normalized data structure Problem with normalized structure: hard to understand
E N D
Data Warehouse (IV): The Star Schema Developed by: Dr Eddie Ip Modified by : Dr Arif Ansari
Outline • DW design • Multidimensional design (star schema) • Examples
DW Design • OLTP systems use normalized data structure • Problem with normalized structure: • hard to understand • queries require understanding of SQL • not optimized to support decisional queries
DW Design • Ralph Kimball (1996) : The DW Lifecycle Toolkit • Dimensional Model • Denormalized structure • Schema suitable for decisional support
Dimensional Design • Divide variables into two categories • A. Things we want to track/measure • B. The way how things we measure are broken into
Dimensional Design • Example 1 • “Show me gross margin by product category.” (from manager of store of sporting goods) • The thing he wants to track/measure : gross margin • The way he wants gross margin to be broken into : by product category
Dimensional Design • Example 2 • “Show me average account balances by educational level.” (bank manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?
Dimensional Design • Example 3 • “What is the return rate by supplier?” (store manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?
Dimensional Design • Example 4 • “ What is the sales of PC by region by quarter?” (Frito-Lay manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?
Dimensional Design • The thing (variable) we want to track/ measure = measure or fact • The way we want it to be broken into = dimension
Example • Kitchen appliances manufacturer • Sells products to customers (retailers) through sales reps • p4-8 Adamson & Venerable (reader)
Fact tables • Store numeric data that are of interest to the business • e.g. Retail: sales amount, quantities sold Telecom: length of call, average no. of calls /day Banking: average daily balance
Fact tables • E.g. (more) Airline: ticket cost, baggage weight Finance: ROI, average 60-day share price • fact tables contain lots of records : “deep” tables
Dimension tables • Contain context of facts • Technically: store fields that describe the facts • e.g. Retail: store (attributes: name, zip), product category (attributes=?), time (attribute=?) Telecom: ?? Banking: ??
Star Schema • Fact table (at the center) plus its related dimension tables (on the sides) • Example (p. 6-7 A & V)
Dimension Fact
Dimensional Design • Design to answer business questions • DW: to provide answers to business users • Dimensional design -- user’s view of the DB • Capacity should be built in the DW at the design stage: what facts, what dimension • OLTP: to ensure operations run well • OLTP Design -- IT professional’s view of the DB
Star schema • Issues • Granularity • Multiple fact tables in a single star • different granularity • different processes • core & custom facts • snapshot & transaction
Design principles • Facts : fully normalized (no redundancy) • PK = concatenation of dimension keys • Dimensions : Denormalized (all relevant attributes in one table) • Different processes, different stars • Different grains, different fact tables
Summary • DW design • Dimensional design • Star Schema • Design Principles