200 likes | 360 Views
Acct 6910 Building Business Intelligence Systems. An Introduction to Data Warehouse. Agenda. Why Data Warehouse What is Data Warehouse Current practice of data warehouse. Why Data Warehouse. Why Database??. Why Data Warehouse. Problems with current database practices:
E N D
Acct 6910Building Business Intelligence Systems An Introduction to Data Warehouse
Agenda • Why Data Warehouse • What is Data Warehouse • Current practice of data warehouse
Why Data Warehouse Why Database??
Why Data Warehouse Problems with current database practices: • Problem 1: Isolated databases distributed in an enterprise • Sub-problems: • Data Inconsistency • No comprehensive view of enterprise’s data sources – information island Sales CRM Inventory
Why Data Warehouse • Problem 1: Isolated databases distributed in an enterprise • Sub-problems: • Data Inconsistency • Performance Sales CRM Inventory
Why Data Warehouse • Problem 2: Historical data is archived in offline storage systems • Sub-problems: • Historical data is always needed to support business decisions Historical Sales Data Sales Archive
Why Data Warehouse • A marketing manager wants to know sales amount distribution by product category and customer state in July? • Query???
Why Data Warehouse • Problem 3: Database is designed to process transactions but not to answer decision support queries • Complex queries • Bad query performance
What is Data Warehouse Data Warehouse is designed to solve problems associated with current database practices: • Problem 1: Isolated databases distributed in an enterprise Sales CRM Data Warehouse Extract, Integrate and Replicate Inventory
Why Data Warehouse • Problem 2: Historical data is archived in offline storage systems Data Warehouse Historical Sales Data Sales Archive Integrate Historical Data with Current Data
What is Data Warehouse • Problem 3: Database is designed to process transactions but not to answer decision support queries • Solution: In data warehouse, organize data in subject –oriented way rather than process-oriented way – dimensional modeling.
What is Data Warehouse Dimensional Modeling ER Modeling
What is Data Warehouse • Data Warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision making process. • 1. Subject-oriented means the data warehouse focuses on the high-level entities of business such as sales, products, and customers. This is in contrast to database systems, which deals with processes such as placing an order.
What is Data Warehouse • 2. Integrated means the data is integrated from distributed data sources and historical data sources and stored in a consistent format. 3. Time-variant means the data associates with a point in time (i.e., semester, fiscal year and pay period) 4. Non-volatile means the data doesn’t change once it gets into the warehouse.
Current Practice of DW* • Expected DW market value is 2002 will grow to $113.5 billion. • Average DW development cost is $1.5 million and average maintenance cost is $0.5 million. * Source: H.J. Watson, “ Current Practicing in Data Warehousing”, I.S. Management, 2001
Current Practice of DW* • Sponsorship for the DW project Sponsor Percentage VP of a business unit 39.8 CIO 26.9 Business unit manager 16.7 CEO 11.1 Other 25.0 * Source: H.J. Watson, “ Current Practicing in Data Warehousing”, I.S. Management, 2001
Current Practice of DW* • DW Benefits • Less effort to produce better information • Better decisions • Improvement of business processes • Supporting for accomplishments of strategic business objectives * Source: H.J. Watson, “ Current Practicing in Data Warehousing”, I.S. Management, 2001
Reading: “ The Data Warehouse Toolkit” – Chapter 1