280 likes | 503 Views
InfoSphere Warehouse on System z (aka InfoSphere Warehouse for DB2 for z/OS). June 2009. InfoSphere Warehouse for System z - Business Value. Reduces costs More efficient access to existing data More efficient data movement Leverage cheaper zIIP/IFL specialty engines
E N D
InfoSphere Warehouse on System z(aka InfoSphere Warehouse for DB2 for z/OS) June 2009
InfoSphere Warehouse for System z - Business Value Reduces costs • More efficient access to existing data • More efficient data movement • Leverage cheaper zIIP/IFL specialty engines • Utilization of existing processors, people, practices • Potential for consolidation Simplifies operational complexity • Deployment of both operational and warehouse data on a single platform • Leverage existing operational, compliance and security procedures • Leverage existing conventions and skills for data management and database administration Improves quality of service • OLAP Query performance for data in DB2 for z/OS • Can be deployed in an HA mode (more than one image) • Efficient data movement supports near real-time decisions based on core business data managed in DB2 for z/OS • Highest scalability for new application requirements (e.g., operational BI) Add’l value from existing DB2 data • Gain additional competitive advantage and value from operational data through the ability to support near real-time decisions based on core business data managed in DB2 for z/OS. • Advanced tactical analytics with longer term history while minimizing impact on operational system • Addresses need for increased reliability, availability, security, and compliance in a data warehouse
Business Scenarios Customers with operational data on DB2 for z/OS • Extending existing BI infrastructure • Leverage tools for modeling • Improve query performance • Reduce costs with more efficient processing • Add new BI infrastructure on System z • Address security, data latency and data duplication issues • Solution that allows you to keep data on System z
This is not a brand new idea Many z customers already use DB2 for z/OS for warehouse and BI IBM is responding with new DB2 features, new software offerings and improved hardware performance and efficiency Customer commitment to the z platform continues Customers want to protect their significant investment in System z Quality of Service TCO can be reduced through the utilization of existing processors, people, practices TCO may also be achieved through a consolidation approach New BI trends map well to the strengths of DB2 for z/OS The distinction is blurring between warehouse and OLTP databases based on new trends such as Dynamic Warehouse and Operational BI, driving: The need for increased reliability, availability, security, and compliance in a DWH DBMS The need for very current warehouse data and/or collocation of warehouse and operational data Specialty processors and the new z10 provide ways to optimize TCO zIIPs and IFLs are driving down hardware and software costs; DWH/BI can make excellent use of these processors, ultimately driving TCO advantages The new processors are delivering excellent speeds and feeds, making cpu horsepower less of an issue Data Warehousing on DB2 for z/OS – What is driving this?
DB2 for z/OS with Data Sharing Data • DWH node can efficiently access OLTP data • Transformation and data movement in same data sharing group • OLTP workload not impacted • Effective resource management through WLM or IRD BI Application Operational Application OLTP Node DWH Node OLTP Data DHW Data Leverage Data Proximity on System z • Building a data warehouse requires access to operational data, transformations and finally loading into a data warehouse database • With DB2 for z/OS, data warehouse tables may be accessed in the same data sharing group where the operational data already resides • This setup is • efficient • secure (data is not leaving the platform) • easy to manage
What can you do with InfoSphere Warehouse? • Design and modify database physical models • Schema and storage design for DB2 for z/OS • Design and model cubes, dimensions, hierarchies for OLAP analytics • Build and transform physical data structures with the SQL Warehousing tool (SQW) • Designed data flow, data transformation, and control flow operations required to load the data warehouse or data mart • Administration through a browser based console • Access to multidimensional data with no-copy OLAP analytics • Integration with BI tools through standard OLE DB for OLAP (ODBO) and XML for Analysis (XMLA) interfaces
Cubing Services Engine SQW Runtime InfoSphere Warehouse on System z - Architecture MQT Advisor Windows / Linux Eclipse Design Studio • Client Layer • Design and admin client • BI / Reporting tools and apps IE/Firefox Admin Console Cognos 8 BI for System z Excel MDX Linux on System z Partition / IFL WebSphere Application Server Application Server SQL JDBC/DB2 Connect/Hipersockets DB2 for z/OS Data Warehouse Server Cube Metadata MQT Control DB DB2 for z/OS IMS VSAM RDBMS Source Systems
InfoSphere WarehouseThe Integrated Stackfor System z Present Information Transform Information Warehouse Information Excel External Data Sources Cubing Warehouse MDM Acquire Data Operational Applications ELT Replication EAI EII ETL Data Quality Common Meta Data Operational Source Systems Structured/ Unstructured Data
InfoSphere Warehouse on System z MDM Server InfoSphere Data Architect Cognos 8 BI InfoSphere Warehouse IBM Industry Models Information Server Adds core data warehouse and analytics capability to DB2 on system z • Advanced physical database modeling and design • in-database data movement and manipulation capabilities of SQL Warehouse Tool (SQW) • Perform multidimensional reporting and analysis of data with Cubing Service Expand solution to add additional capabilities • Cognos 8 BI • InfoSphere Data Architect • Information Server • MDM Server • Industry Data Models 9
InfoSphere Warehouse on System z Tooling • Design Studio – development IDE • Develop physical data model • Develop data movement flows • Develop OLAP Model • Prepare for deployment • Administration Console – manage the runtime environment • Deploy data movement applications • Schedule, Execute, Monitor flows • Define and manage cube servers • Manage OLAP Metadata • Assign cubes to cube servers 11
Data modeling overview • Design and modify database physical models (schema & storage design, as well as cubes, dimensions, hierarchies) Key Features: • Create a new DB design from scratch • Reverse engineer from an existing Database Explorer connection or from DDL • Create overview diagrams • Modify the schema graphically or in the project tree • Compare DB objects with each other or with objects existing in the database • Analyze design (best practices, and dependencies), Validation • Generate DDL script & Deploy • Impact Analysis • DB2 Storage Modeling: Table Space, Buffer Pool, Partition
Data Flows for in-database data movement and transformation • Eclipse tool that generates SQL instructions, optimized for DB2, to move and transform data within the database • Ease of use, drag and drop UI for fast warehouse building • Integrated with physical data model for source and target tables and other database objects • Data flow for doing transformations: 30+ SQL operators optimized for DB2 • Leverages DB2 functions for transformations 100+ • Customize with User-Defined Functions or custom SQL • Test and debug dataflows • Perform warehouse building in homogeneous “database” environment • Leverage existing database infrastructure • Leverage existing DBA skills, reuse custom SQL code • Native integration for fast BI application deployment • Database modeling and administration • Data Prep for in-database Report, Mining & OLAP • Analyze impact from tables to BI applications • Allow a team of BI designers to work together, and deal with changes quickly
Would you rather type this ? ………………………. SELECT SALES.OU_IP_ID AS STR_IP_ID, SALES.PD_ID AS PD_ID, SALES.MSR_PRD_ID AS TIME_ID, SALES.C_D_MKT_BSKT_TXN_ID AS NMBR_OF_MRKT_BSKTS, SALES.SUM_NBR_ITM AS NUMBER_OF_ITEMS, CASE WHEN SALES.M_BK_PD_SUB_DEPT_NM IN ('BATH AND SHOWER', 'CAMERAS') THEN SALES.BKP_SUM_NBR_ITMXPRC * DECIMAL(MARTS.RAND1N(5) + 123) / 100 ELSE SALES.BKP_SUM_NBR_ITMXPRC * DECIMAL(MARTS.RAND1N(5) + 102) / 100 END AS PRDCT_BK_PRC_AMUNT, CASE WHEN SALES.MIN_CG_PD_DEPT_NM IN ('TEEN BOYS', 'TEEN BOYS JEANS', 'DRESS FORMAL','MEN SHOES') THEN (DECIMAL(68 - MARTS.RAND1N(5)) / 100) * SALES.SUM_CG_NBR_ITMX_PRC WHEN SALES.MIN_CG_PD_DEPT_NM IN ('ELECTRICAL APPLIANCES','ELECTRONICS','COLORED TELEVISIONS','WOMEN SHOES') THEN (DECIMAL(77 - MARTS.RAND1N(5)) / 100) * SALES.SUM_CG_NBR_ITMX_PRC WHEN SALES.MIN_CG_PD_DEPT_NM IN ('HEALTH AND BEAUTY') THEN (DECIMAL(65 - MARTS.RAND1N(5)) / 100) * SALES.SUM_CG_NBR_ITMX_PRC ELSE (DECIMAL(72 - MARTS.RAND1N(5)) / 100) * SALES.SUM_CG_NBR_ITMX_PRC END AS CST_OF_GDS_SLD_CGS, SALES.SUM_NBR_ITMXSTM_PRC AS SALES_AMOUNT FROM SALES) …………
would you rather describe your logic at a higher-level ? A simple “Skills” Star schema
… and have optimized SQL generated for you? INSERT INTO OLAPANL.STAR_FACT_TABLE (ID, COMPANY_ID, TIME_ID, SKILL_DETAILS_ID, NB_SKILLS) WITHINPUT_04 (COMPANY_NAME, TIME, ID, SKILL_CAT, SKILL_DETAILS, SKILL_ID) AS ( SELECT COMPANY_NAME AS COMPANY_NAME, TIME AS TIME, ID AS ID, SKILL_CAT AS SKILL_CAT, SKILL_DETAILS AS SKILL_DETAILS, SKILL_ID AS SKILL_ID FROM TXTANL.IT_SKILLS_ASKED INPUT_0281), IN4_07 (ID, SKILLS_PER_OFFER) AS ( SELECT INPUT_04.ID AS ID, COUNT(*) AS SKILLS_PER_OFFER FROM INPUT_04 GROUP BY INPUT_04.ID) SELECT IN_07.ID AS ID, IN1_07.COMPANY_ID AS COMPANY_ID, IN2_07.TIME_ID AS TIME_ID, IN3_07.SKILL_DETAILS_ID AS SKILL_DETAILS_ID, DOUBLE(1) / DOUBLE(IN4_07.SKILLS_PER_OFFER) AS NB_SKILLS FROM TXTANL.IT_SKILLS_ASKED IN_07, OLAPANL.STAR_COMPANY IN1_07, OLAPANL.STAR_TIME IN2_07, OLAPANL.STAR_SKILL IN3_07, IN4_07 WHERE (IN_07.SKILL_DETAILS = IN3_07.SKILL_DETAILS AND IN_07.COMPANY_NAME = IN1_07.COMPANY_NAME AND IN_07.TIME = IN2_07.TIME_DATE AND IN_07.ID = IN4_07.ID)
Control Flows: Control and Sequencing Key Features: • Control flows to coordinate the execution of several data flows and other activities • Support for execution conditions: On Success, on Failure, always, Variable Comparison • Support for non-database activities like: • Data flows • Datastage jobs • Executables, FTP • SQL Scripts, JCL jobs, Runstats, Reorg, Unload • E-mail, Logging • Iterators • Stored Procedures • Execute activities in Parallel • Modularize using Subprocesses • Generate deployable packages (Warehouse Applications)
What is OLAP (online analytical processing)? • Empowers users with ad hoc access to business information • “What is the profitability for Product A across the Branches X,Y,Z?” • Different modeling and data access compared to online transaction processing (OLTP) applications • Requires grouping, aggregation, and calculation of information across various business dimensions (e.g., time, product, customer, …) • Slice/dice, drill down/up, … Time Products Branches
What is Multidimensional Analysis or OLAP? • Navigation - Gives structure to the schema data through metadata • Dimensions: Product, Geography, Time • Dimensions have attributes: Products have colors, sizes, price ranges • Dimensions have levels: Region->State->City • Aggregation • Express queries in terms of dimensions • Report Aggregate data along business views • Great calculations • Simple aggregations: sums, averages • Time based calculations – 3 month moving averages • Multi-pass calculations – rank, percentage of total
Cubing Services (OLAP) Cubing Services is a Multidimensional Analysis Server that enables OLAP applications access to Terabyte data volumes via industry standard OLAP connectivity Benefits Uses industry standards OLAP Interfaces for access by a wide variety of tools for presentation and reporting Empowers users with ad hoc access to business information. What is the profitability for Product A across the Branches X,Y,Z? Speed of thought access to OLAP data managed in a DB2 Data Warehouse. Thus for your DB2 Warehouse OLAP data there is: OLAP and SQL shared access to the same information Single point of Management Single point of Maintenance Single point of Performance Tuning Enables Access to up to 1 TB of base OLAP Data IBM DB2 Warehouse
Cubing Services - OLAP Analytics and Open Access Reporting Tool Excel Alphablox/Cognos/DataQuant MDX/XMLA Cubing Services MDX MDX/ODBO SQL Linux on System z DB2 Warehouse z/OS
Microsoft Excel 2007 • IBM OLE DB Provider for OLAP • Uses Excel Pivot Table Services
Cognos 8 BI Integration Query Studio Analysis Studio
Cognos 8 BI Integration Report Studio Report in PDF Report Studio Report Definition