1 / 33

Defining Data Warehouse Concepts and Terminology

Defining Data Warehouse Concepts and Terminology. Objectives. After completing this lesson, you should be able to do the following: Identify a common, broadly accepted definition of a data warehouse Describe the differences of dependent and independent data marts

Download Presentation

Defining Data Warehouse Concepts and Terminology

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Defining Data Warehouse Concepts and Terminology

  2. Objectives • After completing this lesson, you should be able to do the following: • Identify a common, broadly accepted definition of a data warehouse • Describe the differences of dependent and independent data marts • Identify some of the main warehouse development approaches • Recognize some of the operational properties and common terminology of a data warehouse

  3. Definition of a Data Warehouse • “A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of data in support of management’s decisions.” • — W.H. Inmon • “An enterprise structured repository of subject-oriented, time-variant, historical data used for information retrieval and decision support. The data warehouse stores atomic and summary data.” • — Oracle’s Data Warehouse Definition

  4. Subject- oriented Integrated Data Warehouse Nonvolatile Time-variant Data Warehouse Properties

  5. OLTP Applications Data Warehouse Subject Equity Plans Shares Insurance Loans Savings Customer financial information Subject-Oriented • Data is categorized and stored by business subject rather than by application.

  6. Savings Current Accounts Loans Integrated • Data on a given subject is defined and stored once. Customer OLTP Applications Data Warehouse

  7. Data Warehouse Time-Variant • Data is stored as a series of snapshots, each representing a period of time.

  8. Nonvolatile • Typically data in the data warehouse is not updated or deleted. Operational Warehouse Load Insert, Update, Delete, or Read Read

  9. First time load Refresh Refresh Purge or Archive Refresh Changing Warehouse Data Operational Databases Warehouse Database

  10. Data Warehouse Versus OLTP

  11. Usage Curves • Operational system is predictable • Data warehouse: • Variable • Random

  12. User Expectations • Control expectations • Set achievable targets for query response • Set SLAs • Educate • Growth and use is exponential

  13. Enterprisewide Warehouse • Large scale implementation • Scopes the entire business • Data from all subject areas • Developed incrementally • Single source of enterprisewide data • Synchronized enterprisewide data • Single distribution point to dependent data marts

  14. Data Warehouses Versus Data Marts

  15. OperationalSystems Marketing Finance Sales Flat Files Legacy Data External Data Operations Data External Data Dependent Data Mart Data Marts Data Warehouse Marketing Sales Finance HR

  16. OperationalSystems Flat Files Legacy Data External Data Operations Data External Data Independent Data Mart Sales orMarketing

  17. ODS Typical DataWarehouse Components Source Systems Staging Area Presentation Area Access Tools Legacy Data Warehouse External Data Marts Operational Metadata Repository

  18. Warehouse Development Approaches • “Big bang” approach • Incremental approach: • Top-down incremental approach • Bottom-up incremental approach

  19. Analyze enterprise requirements Build enterprise data warehouse Report in subsets or store in data marts “Big Bang” Approach

  20. Top-Down Approach • Analyze requirements at the enterprise level • Develop conceptual information model • Identify and prioritize subject areas • Complete a model of selected subject area • Map to available data • Perform a source system analysis • Implement base technical architecture • Establish metadata, extraction, and load processes for the initial subject area • Create and populate the initial subject area data mart within the overall warehouse framework

  21. Bottom-Up Approach • Define the scope and coverage of the data warehouse and analyze the source systems within this scope • Define the initial increment based on the political pressure, assumed business benefit and data volume • Implement base technical architecture and establish metadata, extraction, and load processes as required by increment • Create and populate the initial subject areas within the overall warehouse framework

  22. Incremental Approach to Warehouse Development • Multiple iterations • Shorter implementations • Validation of each phase Increment 1 Strategy Definition Analysis Design Build Iterative Production

  23. Data Warehousing Process Components • Methodology • Architecture • Extraction, Transformation, and Load (ETL) • Implementation • Operation and Support

  24. Methodology • Ensures a successful data warehouse • Encourages incremental development • Provides a staged approach to an enterprisewide warehouse: • Safe • Manageable • Proven • Recommended

  25. Architecture • “Provides the planning, structure, and standardization needed to ensure integration of multiple components, projects, and processes across time.” • “Establishes the framework, standards, and procedures for the data warehouse at an enterprise level.” • — The Data Warehousing Institute

  26. Source Staging Area Target Extraction, Transformation, and Load (ETL) • “Effective data extract, transform and load (ETL) processes represent the number one success factor for your data warehouse project and can absorb up to 70 percent of the time spent on a typical data warehousing project.” • — DM Review, March 2001

  27. . . . Implementation Data Warehouse Architecture Ex., Incremental Implementation Implementation Increment 1 Increment 2 Increment n

  28. Operation and Support • Data access and reporting • Refreshing warehouse data • Monitoring • Responding to change

  29. Strategy Definition Analysis Design Build Production Phases of theIncremental Approach • Strategy • Definition • Analysis • Design • Build • Production Increment 1

  30. Strategy Phase Deliverables • Business goals and objectives • Data warehouse purpose, objectives, and scope • Enterprise data warehouse logical model • Incremental milestones • Source systems data flows • Subject area gap analysis

  31. Strategy Phase Deliverables • Data acquisition strategy • Data quality strategy • Metadata strategy • Data access environment • Training strategy

  32. Summary • In this lesson, you should have learned how to: • Identify a common, broadly accepted definition of a data warehouse • Describe the differences of dependent and independent data marts • Identify some of the main warehouse development approaches • Recognize some of the operational properties and common terminology of a data warehouse

  33. Practice 2-1 Overview • This practice covers the following topics: • Answering questions regarding data warehousing concept and terminology • Discussing some of the data warehouse concept and terminology

More Related