Dimensional Modeling Primer Chapter 1 Kimball & Ross
Concepts Discussed • Business driven goals • Data warehouse publishing • Major components • Importance of dimensional modeling for the presentation area • Facts & dimension tables • Myths of dimensional modeling • Pitfalls to avoid
Different Information Worlds • Users of operational systems turn the wheels of an organization • Users of the data warehouse watch the wheels of the organization turn • Warehouse users have drastically different needs than users of operational systems
Returning Themes • We have mountains of data but we cannot access it • We need to slice the data in different ways • We need to make it easy for business users to access the data • Just show me what is important • It drives me crazy when different people present the same metrics with different numbers • Fact-based decision making
Goals of Data Warehouse • Makes an organization’s information easily accessible • Presents the information in a consistent manner • Is adaptive and resilient to change • Secures and protects information • Serves as a foundation for improved decision making • Must be accepted by business users if it is to be useful
Publishing Metaphor • Data warehouse manager is a “publisher” of the right data • Responsible for publishing data collected from a variety of sources and edited for quality and consistency
Components of a Data Warehouse • Operational source systems • Data staging area • Data presentation area • Data access tools
Data Staging Area • Key structural requirement is that it is off-limits to business users and does not provide query and presentation services. • Typical work: correct misspellings, resolve domain conflicts, deal with missing elements, parse into standard formats, combine data from multiple sources. • Normalized structures in the staging area are sometimes called the “enterprise data warehouse”; Kimball considers this a misnomer.
Data Staging Area • Dominated by simple activities such as sorting and sequential processing. • Normalized data is acceptable in staging, although it is not the end goal.
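The staging tasks listed above (standardizing formats, resolving domain conflicts, handling missing elements, combining sources, sorting) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the two source extracts, the field names, and the `STATE_FIXES` lookup are hypothetical and do not come from Kimball & Ross.

```python
from datetime import datetime

# Hypothetical raw extracts from two operational sources (illustrative data only).
orders_system = [
    {"customer": "ACME Corp ", "state": "calif.", "order_date": "03/15/2002", "amount": "120.50"},
    {"customer": "Globex", "state": "NY", "order_date": "03/16/2002", "amount": ""},
]
billing_system = [
    {"customer": "acme corp", "state": "CA", "order_date": "2002-03-17", "amount": "80.00"},
]

# Assumed lookup used to resolve domain conflicts between the two sources.
STATE_FIXES = {"calif.": "CA", "ca": "CA", "ny": "NY"}


def parse_date(text):
    """Parse either source's date format into one standard ISO format."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unparseable dates are flagged as missing


def clean_row(row, source):
    """Standardize one raw record and tag it with its source system."""
    state = row["state"].strip().lower()
    return {
        "customer": row["customer"].strip().upper(),
        "state": STATE_FIXES.get(state, state.upper()),
        "order_date": parse_date(row["order_date"]),
        # An empty amount is a missing element; keep it as None rather than guessing.
        "amount": float(row["amount"]) if row["amount"] else None,
        "source": source,
    }


# Combine both sources, then sort: a simple sequential pass, as on the slide.
staged = sorted(
    [clean_row(r, "orders") for r in orders_system]
    + [clean_row(r, "billing") for r in billing_system],
    key=lambda r: r["order_date"],
)

for row in staged:
    print(row)
```

Nothing here serves queries; the cleaned rows would still have to be loaded into the presentation area before business users see them.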
Data Presentation • Series of integrated data marts. A data mart presents the data from a single business process; it is a wedge of the overall pie. • Data must be presented, stored and accessed in a dimensional schema.
Data Presentation • Data should not be in normalized form. • Data marts must contain detailed atomic data in addition to data in summary form, because the queries are ad hoc and cannot be predicted. • Data marts are built from common facts and dimensions, which are called conformed.
Presentation Area • If it is based on a relational database, the dimensional model is called a star schema. • If it is based on a multidimensional database, or OLAP, then the data is stored in cubes.
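To make the star schema idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`sales_fact`, `date_dim`, `product_dim`), the columns, and the sample rows are all hypothetical; they only illustrate one fact table of atomic rows surrounded by dimension tables, with a summary derived by query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context for the facts.
CREATE TABLE date_dim (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT
);
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
-- Fact table: one atomic row per product per day, keyed to the dimensions.
CREATE TABLE sales_fact (
    date_key     INTEGER REFERENCES date_dim(date_key),
    product_key  INTEGER REFERENCES product_dim(product_key),
    quantity     INTEGER,
    sales_amount REAL
);
INSERT INTO date_dim VALUES (1, '2002-03-15', '2002-03'), (2, '2002-04-01', '2002-04');
INSERT INTO product_dim VALUES
    (10, 'Widget', 'Hardware'), (11, 'Gadget', 'Hardware'), (12, 'Manual', 'Books');
INSERT INTO sales_fact VALUES
    (1, 10, 2, 20.0), (1, 12, 1, 15.0), (2, 10, 1, 10.0), (2, 11, 3, 45.0);
""")

# An ad hoc question answered from atomic data: sales by month and category.
rows = conn.execute("""
    SELECT d.month, p.category, SUM(f.sales_amount)
    FROM sales_fact AS f
    JOIN date_dim    AS d ON d.date_key = f.date_key
    JOIN product_dim AS p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category
""").fetchall()

for month, category, total in rows:
    print(month, category, total)
```

Because the fact rows are atomic, the same tables could just as easily be grouped by product name or by day; the summary is a query result, not a replacement for the detail.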
Data Access Tools • Querying is the whole point of the data warehouse. • A tool can be as simple as an ad hoc query tool or as complex as a data mining or modeling application. • Parameter-driven analytic operations. • Roughly 80 to 90 percent of the users are served by canned applications.
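A parameter-driven canned application can be pictured as a prebuilt query in which the business user supplies only the parameters. The sketch below is one possible reading, not a description of any particular tool; the `sales` table, its columns, and the `sales_by_region` function are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table standing in for the presentation area."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', 'Widget', 20.0), ('East', 'Widget', 5.0),
        ('East', 'Gadget', 45.0), ('West', 'Widget', 10.0);
    """)
    return conn


def sales_by_region(conn, region, min_amount=0.0):
    """Canned report: the SQL is fixed, the user chooses only the parameters."""
    return conn.execute(
        "SELECT product, SUM(amount) FROM sales "
        "WHERE region = ? AND amount >= ? "
        "GROUP BY product ORDER BY product",
        (region, min_amount),
    ).fetchall()


conn = build_demo_db()
print(sales_by_region(conn, "East"))
print(sales_by_region(conn, "East", min_amount=10.0))
```

An ad hoc query tool would instead let the user write or assemble the SQL itself, which is why it suits a smaller group of users.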
Additional Considerations • Metadata • Operational data store