220 likes | 390 Views
09. Data Warehouse (DW) & On-line Analytic Processing (OLAP). Rev: Feb, 2013 Euiho (David) Suh , Ph.D. POSTECH Strategic Management of Information and Technology Laboratory (POSMIT: http://posmit.postech.ac.kr) Dept. of Industrial & Management Engineering POSTECH. Contents. Cost. Bond.
E N D
09. Data Warehouse (DW) &On-line Analytic Processing (OLAP) Rev: Feb, 2013 Euiho (David) Suh, Ph.D. POSTECH Strategic Management of Information and Technology Laboratory (POSMIT: http://posmit.postech.ac.kr) Dept. of Industrial & Management EngineeringPOSTECH
Cost Bond Sales HR Finance 1. Data Warehouse1) Introduction of Data Warehouse Definition of Data Warehouse Integrated A data warehouse is a collection of data in support of management’s decisions Non-volatile Time variant Cleaned Data Warehouse Query & Distribute to End User Scattered Information Customer • Data Warehouse • Stores static data that has been extracted from other databases in an organization • Central source of data that has been cleaned, transformed, and cataloged • Data is used for data mining, analytical processing, analysis, research, decision support
Data Warehouse Source Data Data Mart Enterprise server Workgroup server Query, Reporting tool SQL External file SQL OLAP tool SQL EIS/DSS Application SQL SQL OLTP System RDB Datamining Application SQL SQL Slice/Dice MDB Web browser Back up file Infra, Data integration and Administration Application development, Data access & Use * Building the Data Warehouse *Use of Data Warehouse 1. Data Warehouse1) Introduction of Data Warehouse Data Warehouse Architecture Data Warehouse architecture
1. Data Warehouse1) Introduction of Data Warehouse Data Warehouse Architecture external data Data Manager Component Data Delivery Component source data Data Acquisition Component Middleware Component Data Access Component warehouse data Information Directory Component Design Component warehouse metadata external metadata Management Component Technical architecture for a data warehousing system
1. Data Warehouse2) Concepts for Data Warehouse Introduction of Database Relational Structure Object-Oriented Structure • Definition of database • Integrated collection of logically related data elements • Common Database Structures (Types) • Hierarchical • Early DBMS structure • Records arranged in tree-like structure • Relationships are one-to-many • Network • Used in some mainframe DBMS packages • Many-to-many relationships • Relational • Most widely used structure • Data elements are stored in tables • Row represents a record; column is a field • Can relate data in one file with data in another, if both files share a common data element • Multidimensional • Variation of relational model • Uses multidimensional structures to organize data • Data elements are viewed as being in cubes • Popular for analytical databases that support Online Analytical Processing (OLAP) • Object-Oriented • Store data together with the appropriate methods for accessing it i.e. encapsulation • Information is represented in the form of objects as used in object-oriented programming
1. Data Warehouse2) Concepts for Data Warehouse Metadata and Data Marts • Metadata • Data about data (similar to catalog card in library) • Define the data in the data warehouse • Enable to find the data in data warehouse, more easily and fast • Data Marts • Collection of database • Comparing with Data Warehouse, data marts are usually smaller and focus on a particular subject or department. • Data marts are subsets of larger Data Warehouse • Data Warehouse vs. Data Mart • Data in Data Warehouse • The data needs to be gathered from all the relevant transactional systems that produce it, cleansed and validated, and made available from a system-of-record that ensures the referential integrity of the data • Data in Data Mart • The data needs to be presented in a structure that is intuitive to the users and facilitates their ability to query the data that is relevant to their needs
1. Data Warehouse2) Concepts for Data Warehouse Information Flow Data Marts Management Reporting Finance MetadataRepository Data Warehouse Accounting Internal / External Database Sales Internal / External Database Marketing Data Warehouse built on top of DB
1. Data Warehouse2) Concepts for Data Warehouse Data Warehouse Components Data Warehouse Components
1. Data Warehouse2) Concepts for Data Warehouse Applications and Data Marts Applications and Data Marts
1. Data Warehouse3) Difficulties and Trends Difficulties in implementing DW • Complete Alignment • Make sure you have full involvement and buy -in from those that represent your users - the consumers of your data warehouse. • Iterative & Frequent Update • Consider all aspects of the process of researching your data sources, capturing and transmitting that data to the data warehouse, transforming and loading it into the data warehouse and accounting for its lineage. • Risk • Make sure you develop a proper risk management plan.
1. Data Warehouse3) Difficulties and Trends Future Trends • Enterprise Data Warehouse • The enterprise data warehouse, whether a single store or integrated data marts across a variety of platforms, yields a view of the operation previously unattainable by Don Hatcher, SAS • Real-time • Organization move to more real-time data transformation and seek to better leverage common metadata across applications by Allan Houpt, CA • Capacity • The future of data warehousing is all about ever larger data warehouses - in fact I just read about a U.S. Government effort to create petabyte repositories by Roman Bukary, SAP Director of Market Strategy
2. OLAP1) Introduction of OLAP Definition of OLAP FAST ANALYSIS SHARED MULTIDIMENSIONAL INFORMATION • OLAP (On-Line Analytical Processing) • The dynamic enterprise analysis required to create, manipulate, animate and synthesis information from Enterprise Data Models * Providing OLAP: An IT Mandate E.F. Codd (1993) • FASMI (Fast Analysis of Shared Multidimensional Information) • This definition was first used in early 1995, and has not needed revision since Pendse& Greeth (1995)
2. OLAP1) Introduction of OLAP OLAP Architecture OLAP Architecture
Information Source Information Broker Information Consumer 2. OLAP2) Concepts for OLAP From OLTP to OLAP • Data used in OLAP • Sales data of June? (OLTP) • Multi-dimensional data (having many features) (OLAP) • Direct Access: EUC Environment • From What to Why • OLTP: Storing primitive data, supporting routine business operation (What) • OLAP: Storing cumulative data, supporting business goal (Why)
2. OLAP2) Concepts for OLAP OLTP vs. OLAP OLTP vs. OLAP
2. OLAP2) Concepts for OLAP Enterprise IT Architecture OLTP/OLAP Enterprise IT Architecture
2. OLAP2) Concepts for OLAP Data Warehouse vs. OLAP Server Data Warehouse vs. OLAP Server
2. OLAP2) Concepts for OLAP Two types of OLAP Query MDBMS MD Processing Clients Respond SQL Query Clients Clients RDBMS MD Processing SQL Respond MOLAP ROLAP
Hierarchy 2. OLAP2) Concepts for OLAP From RDB to MDB Cube Table Field, Row Dimension Record, Column • MDB: OLAP • Basic Data Structure of MDB & RDB • RDB: OLTP, Data Warehouse • RDB as OLAP Server • Cannot handle and represent Multi-dimensional relationship well • Cannot summarize data well • MDB as OLAP Server • Gives many managerial viewpoints • EUC • Supports analysis functionality
Reference Euiho Suh, “EIS_DSS_OLAP_DW (PPT Slide)”, POSMIT Lab. (POSTECH Strategic Management of Information and Technology Laboratory) Euiho Suh, “OLAP (PPT Slide)”, POSMIT Lab. (POSTECH Strategic Management of Information and Technology Laboratory) O’Brien & Marakas, “Introduction to Information Systems – Sixteenth Edition”, McGraw – Hill, Chapter 5