210 likes | 572 Views
Data Warehousing. COT4810 4/3/2008. Data Warehousing. Overview Purpose History Methods Design Tools Other aspects of DW. The Purpose of Data Warehousing. Atomic data has a value now Sometimes the value needs to be realized through analytics Reporting Data mining.
E N D
Data Warehousing COT4810 4/3/2008
Data Warehousing • Overview • Purpose • History • Methods • Design • Tools • Other aspects of DW
The Purpose of Data Warehousing • Atomic data has a value now • Sometimes the value needs to be realized through analytics • Reporting • Data mining
The Purpose of Data Warehousing • Warehouses enable Decision Support • Turn static atomic data into valuable information • Used to support a variety of other Business Intelligence (BI) platforms
The History of Data Warehousing • Concept developed by IBM Researchers in 1980’s • Really takes off in early-mid 1990s • 1991, Bill Inmon • 1996, Ralph Kimball
What is a Data Warehouse? • Bill Inmon’s definition: “A data warehouse is a subject oriented, integrated, non-volatile, time variant collection of data [to support management decisions].”
What is a Data Warehouse? • Subject oriented • Data is arranged by subject area instead of application • Integrated • Data is collected and consistently stored from multiple sources
What is a Data Warehouse? • Non-volatile • Data is static • It doesn’t matter when the question is asked • Time variant • Allows for analysis of data over time
OLTP and OLAP • Online Transaction Processing (OLTP) • Online Analytical Processing (OLAP)
OLTP and OLAP • They work well together… • OLAP supports offloaded OLTP data • No longer needs to accumulate in the application database (or on stored media) • Less data, less index maintenance • Allows OLTP to perform at peak transaction efficiency • Smaller OLTP database = better performance
Data Marts • A Data Mart hosts a specific subset of data • Multiple Data Marts can compose what BI/DSS views as the Data Warehouse
Inside the Warehouse • Data is captured from source from via an ETL process • Extract source data • Transform source data • Load into Warehouse • Typically data is organized into Fact and Dimension tables
Fact Tables • Numerical data representing a specific activity • Details about the sale of a product • How many • How much • When • What store • Fact tables exist at the center of the Star Schema
Dimension Tables • Characteristics of the facts that offer a business perspective • Details about when a product was sold • Was it a weekday or weekend • Was it a holiday • First of the month, last of the month
Star Schema (ROLAP) http://msdn2.microsoft.com/en-us/library/aa902672(SQL.80).aspx
Cubes (MOLAP) • Takes multiple dimensions and optimizes a view http://www.selectorweb.com/images/olap_cube.gif
Other Aspects of the Data Warehouse • Data Mining • User Interface design • Requirements for Business Intelligence predictions or forecasts • Data “Webhouse”
Questions? • Why did Data Warehousing get a boost in the 1990s? • What are the steps in the ETL process?
References • Kimball, Ralph. The Data Warehouse Toolkit. Wiley Publishing. 2nd Edition. April, 2002. • Inmon, W. H. Building the Data Warehouse. Wiley Publishing. 4th Edition. October, 2005. • Dave Browning and J. Mundy. “Data Warehouse Design Considerations.” Microsoft SQL Server Technical Articles. December, 2001.