170 likes | 829 Views
Data Warehouse Architecture Sakthi Angappamudali Data Architect, The Standard @ Oregon State University, Corvallis 16 th May, 2005. What is a Data Warehouse?. A Data Warehouse is a Subject Oriented Integrated Non-volatile Time variant
E N D
Data WarehouseArchitecture Sakthi Angappamudali Data Architect, The Standard @ Oregon State University, Corvallis 16th May, 2005
What is a Data Warehouse? • A Data Warehouse is a • Subject Oriented • Integrated • Non-volatile • Time variant • collection of detailed and summary data used to support the strategic decision making process for the enterprise
Characteristics of a Data Warehouse A Data Warehouse has the following characteristics: Purpose - Decision Support Users - Operational personnel, Analysts, Strategists Orientation - Discovery-oriented Integration - Maximum integration Data Quality - Enhanced Data Enrichment - Strategic Volatility - Non-volatile Chronology - Time-variant Granularity - Detailed and summary
The Decision Making Roadmap Business Planning Actions Vision Knowledge Transaction Systems Decision Support Systems Executive Information Systems Data Information RUN MANAGE GROW • Operational • Functional • Current • Detailed • Analytical • Subject • History • Detailed • Dimensional • Subject • History • Summary Management Users Knowledge Brokers
A Data Warehouse Is A Process Data Characteristics • Raw Detail • No/Minimal History • Integrated • Scrubbed • History • Summaries • Targeted • Specialized (OLAP) Source OLTP Systems Data Warehouse Central Repository • Design • Mapping • Extract • Scrub • Transform • Load • Index • Aggregation Meta Data System Monitoring Architected Data Mart End User Workstations • Access & Analysis • Resource Scheduling & Distribution • Replication • Data Set Distribution
Types of Warehousing Solutions • Operational Data Store (ODS) • integrated, current, detailed data for operational activities • Corporate Information Factory (CIF) • integrated, historic, summary and detailed data for company-wide data analysis • Data Mart (DM) • independent, historic, summary data for a small group of business users analyzing a specific business process
There Are Many Options User Workstations Operational Source Systems E x t r a c t i o n S y s t e m s Architected Data Mart Operational Data Store Data Warehouse Independent Data Mart
Solution Choices - ODS • Run your business (Operational Data Store) – Tactical • Perform functions not supported in transaction systems • Perform operational reporting (without impacting “real” system) • Data is currently valued (could be real-time as well) • Detail data analysis capabilities • Subject oriented along the lines of the major entities of the corporation • Integrated (physical unification and cohesiveness of the data • Volatile - can be updated as a normal part of processing • Detailed - contains detailed data only
Solution Choices - DW • Manage your business (Enterprise Warehouse, Data Marts) – Strategic • No business functions are performed (read only) • Aggregations are maintained • Measures and dimensions are defined for slice and dice capabilities • History is maintained for trend analysis • On-line Analytical Processing (OLAP) model
Operational Transaction Systems Data Warehouse Systems Operational Data Store Systems So, When Is Each System Type Used
Operational Source Systems Operational Data Store Data Warehouse Properties Data Mart Detailed Data + Appropriate Summary Summary Information + Appropriate Detail Single Function Summary Detailed Data Nearly Current Point-in-Time Current Point-in-Time Frequently Continually Periodically Periodically Tuned for Production Environment Tuning Not Usually An Issue Tuned for Update Tuned for Query Controlled for Performance Moderate Low May Be Very High Non-Volatile Very Volatile Volatile Non-Volatile What are the differences? Contents Timeliness Updated Performance Needs Volatility of Contents Amount of Data Accessed
Data Warehouse Tools Design/Transform/Extract/Aggregate/Monitor/Manage Suites / Environments Repositories Database & System Monitors DB Design Job Schedulers CASE Replication/Distribution Tools Extract//Transform/Load RDBMS Utilities • Design • Mapping • Extract • Scrub • Transform • Load • Index • Aggregation • Replication • Data Set Distribution • Access & Analysis • Resource Scheduling & Distribution Meta Data System Monitoring Metadata Browsers Data Visualization Data Mining EIS MOLAP/ROLAP
W W A A R R Design Focus Architected Data Mart Data Mart E E Architected Define H H and O O U U Develop Technology Process S S Methodology Architected Knowledge Team E E Data Mart Business P D Knowledge Team L E V A Business N E Sponsor L N Deploy O I Needs Architected P N User Data Mart M G Survey E S N T T A Maintain Architected G S Project Data Mart Initiate T E A G E Data Warehouse Development Methodology Architecture Define
Distribution Sales Product Customer Accounts Marketing Inventory Finance Vendors An Incremental Approach Glossary Common Business Metrics Common Business Rules Common Business Dimensions Common Logical Subject Area ERD Individual Architected Data Marts
Distribution Sales Product Customer Accounts Marketing Inventory Finance Vendors Enterprise Data Warehouse The Eventual Result Architected Enterprise Foundation