Informatica Data Virtualization: The “Foundation” for AGILITY & PRODUCTIVITY
Kerry Holton, Informatica Senior Sales Engineer
Let’s Win Something!!! A copy of “Lean Integration.” Tell me which box is the ONLY thing that data virtualization built on data federation does – and why??? Answer questions along the way… Take some good notes!
To Learn More… Informatica.com > Products > PowerCenter > Data Virtualization Edition Informatica.com > Products > Data Virtualization Sign-Up Expert Roundtables Data Virtualization Corner • http://vip.informatica.com/?elqPURLPage=8668 JOIN & DISCUSS 2000+ Strong “Data Virtualization & Data Services Architecture” Group
Agenda • “2012” – The Year of “BI” Agility • Data Virtualization – Overview, Problem & Need • Key Use Cases • Customer Examples • Data Virtualization in Action • Why Informatica? • Next Steps & Q&A
ICC Director (VP of IM) to Dave Lyle (VP Product Strategy), end of Q3, 2009: “I’m writing you a million dollar check, but you’re not solving my big problem. My big problem isn’t getting the data into the data warehouse. My big problem is … getting the data out!”
“2012”
• Have any of you had this discussion?
• Need for a new BI infrastructure
• Replacing spreadsheets
• Faster data access & reporting
BI will be the top priority for the CIO in 2012!
“Demands by users of business intelligence (BI) applications to "just get it done" are turning typical BI relationships, such as business/IT alignment and the roles that traditional and next-generation BI technologies play, upside down. As business users demand more control over BI applications, IT is losing its once-exclusive control over BI platforms, tools, and applications.” – Boris Evelson, Forrester Research, Blog - “Top 10 BI Predictions for 2012”
Business / BI vs. IT: Our world will be turned upside down…
• Business-focused BI
• $100M Qtr. in 2011
• 10k+ customers
How Long Does it Take to Deliver New Critical Data or Reports to the Business?
The Business Can’t Wait 3-6 Months
• For a Single View of All Enterprise Data
[Diagram: Business Intelligence consumers; integration approaches: ETL, SOA, Hand Coding, ESB/EAI, EII; data sources: Social, Warehouses, NoSQL, Cloud Computing, Unstructured, NACHA, SWIFT, Partner Data, Applications, Databases, HIPAA, …]
HealthNow’s Data Integration Challenges
• To Add 1 Product Attribute to Existing Report – IT Estimated 1700 Hours
• So What Did the Business Do? 30,000 Data Marts Were Created by Shadow IT Teams
• Different Price Info in Each LOB
• 16 Types of Data Sources
• NO REUSE
[Diagram: Business and IT; consumers: BI (Cognos), Portal (WebSphere); sources: 30,000 Data Marts (MS Access), Data Warehouse (DB2), Facets [Benefits, Products] (Sybase ASE), Product Config Mgmt (MS SQL Server)]
The Fundamental Problem(s)…
• It takes too long to explain requirements
• It takes months to change a DW / add new critical data
• It takes many iterations to get the right data / reports
• Changes can break existing integrations & impact apps.
Typical Data Integration Process: Design • Change • Integrate • Unit Test • Validate • Deploy
Business is Involved Too Late
As-Is Value Stream Map (LOT OF WAIT & WASTE)
Trying to Solve it in BI Layer Just Won’t Scale… Why?
• No Reuse
• No Common Data Access Layer
• No Easy Way to Handle Change
• No Data Quality & No Data Consistency
[Diagram sources: Applications, Unstructured Data, Spread Marts, EDW, DATA MART]
What is Needed to Solve these Problems? COMMON ACCESS LAYER ACROSS MANY DATA SOURCES BI Portal Composite Apps Data Consumers FAST, DIRECT ACCESS TO DATA THE BUSINESS TRUSTS Logical Data Objects Data Abstraction CUSTOMER ORDER PRODUCT … DATA ABSTRACTION & REUSE OF SKILLS/LOGIC Enterprise Data Sources SUPPORT ALL USE CASES Logical View of All Underlying Data Think Virtual Machines for DATA! BI / DW MDM SOA
How is the Market Trying to Address the Problems?
Data Virtualization (Built-On Data Federation): Time GAINED by federation is nullified by Time SPENT on more processing
• Addresses specific use cases
• No data movement / no copies / only federation
• Code heavy / not model-based / no reuse
• Not tools for business self-service
• SQL/XQuery-only transformations
• No data profiling / no data quality
It’s like ONE step forward & TWO steps backward
[Diagram: Access, Merge, Virtual View, Deliver between DW and BI; callouts: Limited or Data Source Profiling Only; SQL/XQuery Only Transformations & No Data Quality; Cannot Easily Move to Persistent Store or Reuse]
What Are the Top 3 Key Capabilities for a Project that Needs Data Virtualization? If Performance is a given… Dataset: 600. Source: Informatica Data Virtualization Expert’s Forum, 2011
What Does the Ideal Solution Look Like?
[Diagram: Business (Business Manager; Analyst, Steward) and IT (Developer, Architect) work from Common Metadata across seven numbered steps: MODEL, ACCESS & MERGE, PROFILE IN RT, TRANSFORM IN RT (Advanced Transformations, Data Quality, Data Masking), REUSE INSTANTLY, MOVE OR FEDERATE, SCALE & PERFORM (Optimizations & Caching). Sources: CRM, Call Center, Accounts, DW. Virtual Tables (Customer: Name, Address, Category; Orders) are delivered through a Query Engine (SQL), a WS Server (Web Services) and Batch.]
How Does Informatica Deliver the Ideal Solution?
Data Virtualization = (Data Integration + Data Federation) in ONE Tool
• Single environment for both data integration and data federation
• No data movement / no copies – but easily reuse virtual views for batch
• Early & iterative business (analyst) involvement – self-service
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality
[Diagram: Access, Merge, Virtual View, Deliver between DW and BI; callouts: Advanced Transformations & Data Quality; Analyze & Profile Data & Logic Anytime; Early Business Involvement; Prototype First; Move to DW or Instantly Reuse as SQL / WS]
How Does It Work?
NEW DATA & REPORTS THAT BUSINESS NEEDS & TRUSTS, DELIVERED IN DAYS vs. MONTHS
NEW REQUEST
• Change / add an attribute
• Join new data not in DW
• Create a new report
Queries shown on the slide (EXISTING QUERY, NEW QUERY; INSTANT REUSE):
    SELECT * FROM customer_table
    SELECT * FROM customer_table INNER JOIN support_table ON customer_table.customer_num = support_table.customer_id WHERE customer_name = 'ACME'
    SELECT * FROM SUPPORT
Walkthrough:
• Retrieve historical customer data
• New query for report needing data not in DW
• Query is processed by virtualization layer
• Results retrieved in real-time without data movement
• Data quality rules applied on-the-fly against data
• On-boarding new data does not break integrations
• Trusted blend of historical and operational data delivered
• Complement data architecture with virtualization
• Virtual view can be physically materialized later into DW
[Diagram: logical entities CUSTOMER, SUPPORT, PRODUCT, INVOICE over sources DW, DM, ODS, WEB]
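The join on this slide can be tried out with a minimal sketch. It is not Informatica's engine: two in-memory SQLite databases stand in for the DW and the support data mart, and a TEMP view stands in for the virtual table. The view name `customer_support`, the schema alias `support_dm`, the `ticket` column and the sample rows are all assumptions made for illustration; only `customer_table`, `support_table`, `customer_num`, `customer_id` and `customer_name` come from the slide.

```python
import sqlite3

# Assumption: "main" plays the DW, an attached in-memory database plays the
# support data mart. Neither is a real Informatica component.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS support_dm")

conn.execute(
    "CREATE TABLE main.customer_table (customer_num INTEGER, customer_name TEXT)"
)
conn.execute(
    "CREATE TABLE support_dm.support_table (customer_id INTEGER, ticket TEXT)"
)

# A TEMP view may reference tables in several attached databases, so it can
# act as a simple "virtual table": no rows are copied when it is created.
conn.execute(
    """
    CREATE TEMP VIEW customer_support AS
    SELECT c.customer_num, c.customer_name, s.ticket
    FROM main.customer_table AS c
    INNER JOIN support_dm.support_table AS s
        ON c.customer_num = s.customer_id
    """
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO main.customer_table VALUES (?, ?)",
    [(1, "ACME"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO support_dm.support_table VALUES (?, ?)",
    [(1, "Login failure"), (1, "Billing question"), (2, "Password reset")],
)

# The report query only sees the view; the join runs at query time.
rows = conn.execute(
    "SELECT customer_name, ticket FROM customer_support "
    "WHERE customer_name = ? ORDER BY ticket",
    ("ACME",),
).fetchall()
print(rows)
```

The report query never names the underlying tables, which is the point of the slide: the sources behind the view can change without touching the consumer.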
Informatica Data Virtualization at HealthNow
• “Virtual Table” Common Data Model: MEMBER, CLAIM, ORDER, PRODUCT
• Fast, Direct Data Delivery: 1 week (vs. 3 months)
• Shared Repository
• Instant Reuse: DW, BI, SOA & MDM (SQL, Web Services, Batch)
[Diagram: Business and IT; consumers: BI (Cognos), Portal (WebSphere); previously NO REUSE; sources: 30,000 Data Marts (MS Access), Data Warehouse (DB2), Facets [Benefits, Products] (Sybase ASE), Product Config Mgmt (MS SQL Server)]
What Does Informatica’s Data Virtualization Solution Look Like?
NEW: PowerCenter Data Virtualization Edition
• New PowerCenter Edition for AGILITY & PRODUCTIVITY
• Combines:
• Data integration (PowerCenter SE)
• Data Virtualization (IDS Full Use)
• Data Profiling (IDE Full Use)
• Business-IT Collaboration (Analyst)
• Packaged for simplicity and attractively priced
• Reuses existing skills and resources
[Diagram components: ETL (PC Standard Edition), Partitioning, Data Profiling, Data Federation (Data Services), Developer Tool, Analyst Tool, 2 Adapters (PWX for Relational)]
What Use Cases Are Supported?
1. DW / Business Intelligence (BI): Prototype DW & accelerate new data & reports from months to days [Diagram: Business, IT, Change Request, Deploy to Production; Months vs. Weeks/Days]
2. MDM: Deliver a complete view of master & transactional data in real-time [Diagram: MDM HUB, TRANSACTIONAL SYSTEMS, DATA WAREHOUSE; INCOMPLETE VIEW OF CUSTOMER vs. COMPLETE VIEW OF CUSTOMER; Virtual View]
3. SOA: Deliver the missing data services layer to SOA & applications [Diagram: Applications, Registry, ESB, BPM, Biz. Services, Data Abstraction, Data Sources]
What are the Benefits of Informatica’s Solution? • Provide fast, direct access to critical new data & reports in days vs. months • Enable rapid iterations to results with instant Biz-IT collaboration • Deliver flexibility, ensure reuse & insulate applications from changes COMPLETE, CURRENT & TRUSTED View of All Data, On-Demand
BI, MDM, SOA – HealthNow NY Improves Risk & Pricing Analysis With Data Services
The Challenge
• 16 enterprise databases and over 30,000 Access databases
• Took 1700 man hours to add a new product to portfolio
• Business had to go to 5 different sources for all information related to paid claims
• Continued data growth with over 30,000 claims processed per day
• Data proliferation leading to HIPAA compliance concerns
The Solution
• Logical data models and data services to represent their core data entities – MEMBER, CLAIMS, PROVIDER, ENCOUNTER, LAB RESULTS
• ‘Rate Letter’ project for determination of policy rates and discounts went live in May 2010
• Over 400 Logical data objects and 2 web services being used by around 125 end users
The Benefits
• Speed of data delivery – Implemented first project in around 40 man hours. This would have taken an order of magnitude more in the past
• Complete view of the truth – Business users now access plan rate information from single service
• Better governance – Centrally managed virtual views as opposed to one-off data marts is improving governance of data
[Diagram: BI (Cognos) and Portal (WebSphere) consume Virtual Tables from IDS via SQL and Web Service; sources: Data Marts (MS Access), Data Warehouse (DB2), Facets [Benefits, Products] (Sybase ASE), Product Config Mgmt (MS SQL Server)]
BI, SOA – Large Latin American Bank Improves Governance
The Challenge
• Lack of visibility for proper supervision and regulation of the national financial system
• Real-time analysis and joining of data (Adabas, DB2, SQL Server, Files)
• Persistent data replication even for one-time use
• Huge data volumes (Online 6 TB, DW 14 TB)
• Different reporting tools requesting different data combinations across heterogeneous data sources
The Solution
• Logical data models to represent core business entities (e.g. CUSTOMER)
• Mainframe virtualization (join data from Adabas, DW DB2, Apps., 3rd Party)
• Logical data models and Web services to deliver flexibility and agility to respond to changing business needs
• Creation of logical data objects and physical materialization of virtual views to familiar PowerCenter environment
The Benefits
• Speed of data delivery – implemented first project in around 60 man hours and delivered a new virtual view in < 1 hour
• Better risk/fraud governance (across more than 6000 financial institutions) and compliance with BASEL I, BASEL II and SOX
• Complete single view of the truth – business users can now access consistent customer and plan rate data
• Centralized management and administration of logical data objects
[Diagram: Microsoft Reporting Services and Customized Applications consume Virtual Tables from Data Virtualization via SQL and Web Service; sources: Transactions Tables (Mainframe – Adabas, DB2), Data Warehouse (DB2 LUW), Credit Analysis, Applications, AML (SQL Server), Financial Institutions (Flat Files and Messages)]
BI, MDM – VW Delivers a Complete View of Critical Data On-Demand
The Challenge
• CUSTOMER data in > 30 systems, MDM hub, transaction systems, DW
• Have 80% data but missing critical 20% transactions – WARRANTY, SERVICE
• No authoritative source of CUSTOMER, PRODUCT data, conflicting relationships
• No complete view of CUSTOMER data on-demand is affecting service
• Without complete view of data, can’t meet goal to sell 3x more cars by 2018
The Solution
• Create a common data model for VW owners, prospects, & partners
• Federate data in real-time from > 30 systems & transactional systems
• Provide easy-to-use, browser-based tools for business & IT to collaborate
• Apply reusable DQ rules on-the-fly to CUSTOMER, PRODUCT data
• Instantly reuse data services for SQL or Web services
The Benefits
• Completed DI, DQ, & data services production pilot in < 1 month
• Can leverage operational efficiency & real-time decisions to differentiate
• Delivered accurate, complete view of CUSTOMER data, on-demand
• Lowered costs by increasing productivity & reuse of data services
• Supported strategy to triple sales to 1M vehicles annually, by 2018
[Diagram: BI and Portal reuse Virtual Tables from IDS and IDQ via SQL and Web Service; sources: MDM Hub (Customer, Purchase, Case) (IBM), DW (Service History) (Teradata), PRD [Campaign History] (SAGA/Win), Transactional Systems (Warranty, Service) (Varied)]
The “Keystone” – Business Owns the Data While IT Retains Control
• Role-based tools for Analysts (Web) & IT developers (Eclipse)
• Common metadata lets Analysts & IT collaborate in RT
• Empower business analysts to:
• Define entities & directly access & merge data to create virtual views
• Rapidly profile data sources & logic without more processing
• Quickly find data & rules via business glossary
• Collaborate, test, validate & share results
• Cuts the wait & the waste in the process
[Diagram: Analyst Tool (Web Browser) and Developer Tool (Eclipse) share Common Metadata; the VIRTUAL TABLE feeds a BI Report and a Portal via SQL or Web Service, and the Data Warehouse via Batch ETL]
The 7 Steps to AGILITY & PRODUCTIVITY
[Diagram: Business (Business Manager; Analyst, Steward) and IT (Developer, Architect) work from Common Metadata across seven steps: 1 MODEL, 2 ACCESS & MERGE, 3 PROFILE IN RT, 4 TRANSFORM IN RT (Advanced Transformations, Data Quality, Data Masking), 5 REUSE INSTANTLY, 6 MOVE OR FEDERATE, 7 SCALE & PERFORM (Optimizations & Caching). Sources: CRM, Call Center, Accounts, DW. Virtual Tables (Customer: Name, Address, Category; Orders) are delivered through a Query Engine (SQL), a WS Server (Web Services) and Batch.]
1. Model
• Represent underlying data as business entities (CUSTOMER)
• Provide a common logical view or abstraction of all data
• Import logical model from 200+ modeling tools (ERWIN)
• Use visual and metadata based mapping language
• Instantly reuse logical data object for all applications
[Diagram: Common Data Access Layer – Logical Data Objects CUSTOMER, ORDER, PRODUCT, INVOICE over Unstructured Data, Spread Marts, Applications, EDW, Data marts]
2. Access and Merge
Turn many data sources into ONE with Data Virtualization
[Diagram: logical entities CUSTOMER, SUPPORT, PRODUCT, INVOICE; data types: Master Data, Interactional Data, Analytical Data, Transactional Data, Archived Data; sources: Social, HIPAA, Warehouses, NoSQL, Application, Partner Data, NACHA, Unstructured, Cloud Computing, Database, SWIFT, …]
3. Profile in RT Rich set of integrated profiling capability to find data anomalies and to discover keys and hidden relationships: • Column & Rule Profiling • Midstream or Comparative Profiling • Join & Overlap Analysis • Primary Key / Foreign Key Profiling • Dependency Profiling
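As a rough illustration of what column profiling, primary-key discovery and join/overlap analysis compute, here is a small sketch in plain Python. It is not the Informatica profiler; the function names `profile_column` and `overlap`, the column names and the sample rows are all hypothetical.

```python
from collections import Counter


def profile_column(rows, column):
    """Basic column profile: row count, nulls, distinct values, key candidacy."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        # A column is a primary-key candidate if it has no nulls and no repeats.
        "is_candidate_key": bool(values)
        and len(non_null) == len(values)
        and len(counts) == len(values),
    }


def overlap(left_rows, left_col, right_rows, right_col):
    """Join/overlap analysis: how many key values match between two sources."""
    left = {row[left_col] for row in left_rows}
    right = {row[right_col] for row in right_rows}
    return {
        "matched": len(left & right),
        "left_only": len(left - right),
        "right_only": len(right - left),
    }


# Hypothetical sample data.
customers = [
    {"customer_num": 1, "name": "ACME"},
    {"customer_num": 2, "name": "Globex"},
    {"customer_num": 3, "name": ""},
]
orders = [{"customer_id": 1}, {"customer_id": 1}, {"customer_id": 4}]

print(profile_column(customers, "customer_num"))
print(profile_column(customers, "name"))
print(overlap(customers, "customer_num", orders, "customer_id"))
```

An order that references customer 4, which has no matching customer row, is the kind of anomaly the overlap figures make visible before a join is published.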
4. Transform in RT • Metadata-driven, codeless, graphical environment • Rich, pre-built library of advanced transformation • Integrated Data Quality transformations • Define policies to mask sensitive data in real time
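The last bullet, masking sensitive data in real time through a policy, can be pictured with a short sketch. This is only an assumption about how such a policy could be expressed in plain Python; `mask_keep_last`, `MASKING_POLICY`, `apply_policy` and the `ssn` column are hypothetical names, not Informatica's masking transformations.

```python
def mask_keep_last(value, visible=4, fill="*"):
    """Replace all but the last `visible` characters with a fill character."""
    text = str(value)
    if len(text) <= visible:
        return fill * len(text)
    return fill * (len(text) - visible) + text[-visible:]


# Hypothetical policy: column name -> masking function.
MASKING_POLICY = {"ssn": mask_keep_last}


def apply_policy(row, policy):
    """Mask a row as it is read; the stored source row is never modified."""
    return {
        column: policy[column](value)
        if column in policy and value is not None
        else value
        for column, value in row.items()
    }


source_row = {"name": "ACME", "ssn": "123-45-6789"}
masked_row = apply_policy(source_row, MASKING_POLICY)
print(masked_row)
```

Because the policy is data rather than code, the same rule can be applied to every view that exposes the column.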
5. Reuse Instantly
• Instantly reuse LDOs for any mode/protocol (SQL, WS)
• Single click deployment to batch
• Execution & optimization separate from design-time
• No re-development & re-building of LDOs
[Diagram: METADATA REPOSITORY feeding Batch, SQL and Web services]
6. Move or Federate
Data Integration
• Majority of use cases
• Physical data movement
• Bulk/batch, near real-time, real-time
• Advanced transformations
• Built-in data quality
Data Federation
• Specific use cases
• No data movement / no copies
• Real-time federation
• SQL/XQuery-only transformations
• No data quality / business validation
Single-click deployment to PowerCenter (batch)
[Diagram labels: Extract, Advanced Transform & Quality, Load; Access, Merge, Virtual View, Deliver; DW, BI]
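The difference between federating and moving can be shown with a small sketch, again using SQLite as a stand-in rather than PowerCenter. A view is evaluated at query time (federate), while `CREATE TABLE ... AS SELECT` copies its current result into a physical table (move). The names `orders`, `order_totals`, `dw_order_totals` and `materialize` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

# Federate: the view holds no data and is evaluated when queried.
conn.execute(
    """
    CREATE VIEW order_totals AS
    SELECT customer_id, SUM(amount) AS order_total
    FROM orders
    GROUP BY customer_id
    """
)


def materialize(connection, view_name, table_name):
    """Move: copy the view's current result into a physical table.

    Identifiers cannot be bound as SQL parameters, so only pass trusted names.
    """
    connection.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    connection.execute(
        f'CREATE TABLE "{table_name}" AS SELECT * FROM "{view_name}"'
    )


def total_for(connection, relation, customer_id):
    """Read one customer's total from either the view or the physical copy."""
    return connection.execute(
        f'SELECT order_total FROM "{relation}" WHERE customer_id = ?',
        (customer_id,),
    ).fetchone()[0]


materialize(conn, "order_totals", "dw_order_totals")

# A new order arrives after the copy was taken.
conn.execute("INSERT INTO orders VALUES (2, 25.0)")

federated = total_for(conn, "order_totals", 2)        # sees the new order
materialized = total_for(conn, "dw_order_totals", 2)  # stale until refreshed
print(federated, materialized)
```

The same view definition serves both modes, which is the reuse the slide argues for; the trade-off is freshness against the cost of querying the sources on every request.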
7. Scale & Perform • Leverage the proven, high-performance Informatica engine • Optimized SQL Query engine & graphical Query Plan • High-performance Web services server • Rich set of optimizations & caching mechanisms • Rule Based, Cost Based, Push Down, Early Projection, Early Selection, Semi-Join, Virtual Table & Result Set Caching • Fine grained access control, WS-Security & pass-through security • Database, Schema, Table, Column, Row-Level (v9.5) security
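Of the optimizations listed, result set caching is the simplest to picture. The sketch below is an assumption about the general idea, not Informatica's implementation: a repeated query with the same parameters is answered from memory until a time-to-live expires. `ResultSetCache`, its parameters and the `customer` table are hypothetical.

```python
import sqlite3
import time


class ResultSetCache:
    """Cache query results keyed by SQL text and parameters, with a TTL."""

    def __init__(self, connection, ttl_seconds=60.0, clock=time.monotonic):
        self._conn = connection
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # (sql, params) -> (stored_at, rows)
        self.hits = 0
        self.misses = 0

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: go back to the source and refresh the cache.
        self.misses += 1
        rows = self._conn.execute(sql, params).fetchall()
        self._entries[key] = (now, rows)
        return rows


# Hypothetical source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("ACME", "ACTIVE"), ("Globex", "INACTIVE")],
)

cache = ResultSetCache(conn, ttl_seconds=60.0)
sql = "SELECT name FROM customer WHERE status = ?"
first = cache.query(sql, ("ACTIVE",))   # reads from the source
second = cache.query(sql, ("ACTIVE",))  # served from the cache
print(first, cache.hits, cache.misses)
```

A cached result can be stale for up to the TTL, so the setting is a trade-off between source load and freshness.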
Data Virtualization Built On Data Federation Does 1 Box – Which 1?
[Diagram: Business (Business Manager; Analyst, Steward) and IT (Developer, Architect) work from Common Metadata across seven numbered boxes: MODEL, ACCESS & MERGE, PROFILE IN RT, TRANSFORM IN RT (Advanced Transformations, Data Quality, Data Masking), REUSE INSTANTLY, MOVE OR FEDERATE, SCALE & PERFORM (Optimizations & Caching). Sources: CRM, Call Center, Accounts, DW. Virtual Tables (Customer: Name, Address, Category; Orders) are delivered through a Query Engine (SQL), a WS Server (Web Services) and Batch.]
Do it Right – Avoid Costly Mistakes!
Goals: Enabling Rapid Development • Integrating with Quality • Scaling with Flexibility • Leveraging Investments • Analyzing & Profiling
Costly mistakes
• Many Iterations & Mistakes
• Limited Rules, No Data Quality
• Overburden Data Virtualization
• Re-invent the Wheel
• Maintenance Nightmare
Why they happen
• Re-work, re-deploy & re-train every time
• Non-integrated technologies
• Hand-coding can’t do advanced transforms
• Only source profiling, need extra processing
• 1000s of lines of code
vs. doing it right
• Sustain & Maintain
• Get it Right 1st Time
• Bake-in Quality
• Prototype First & Then Scale
• Re-purpose Logic & Skills
How
• Model & metadata-driven environment
• Profile data AND logic anywhere
• Leverage pre-built logic including quality
• Virtualize or physically materialize in 1 tool
• Naturally extend your infrastructure
[Diagram labels: each option is marked with TIME, COST and, for some, RISK; components shown: EII, SQL, XQuery, Simple Cleansing, Web Service, Business Rules, Web Services, STAGE, ETL, Virtual Table, Optimizations]
Scenario – Big Company ISSUES • Call center talk times increasing = scattered data + many screens • Time wasted in correcting inconsistent & inaccurate customer data • Agents can’t easily & quickly identify what products are owned IMPACT • Can’t easily identify top customers to improve up-sell/cross-sell • Low customer satisfaction & growing customer attrition • High marketing costs without targeted campaigns
Demo – Big Company
• Business needs a new report – NOW vs. months!
• Quickly merge data from multiple systems & cleanse
• Analysts know the data – want some self-service
• Join CUSTOMER (Oracle CRM) & ORDER (file)
• Get ORDER TOTAL for ACTIVE customers
Integrate missing data, do data cleansing “on-the-fly,” validate
Analyst: defines business entity, profiles, defines rules & hands over to IT
IT Architect / Developer: enriches the business entity & publishes for BI tool, portal or batch
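The demo's logic (join CUSTOMER with an ORDER file, cleanse on the fly, total the orders of ACTIVE customers) can be approximated in a few lines of plain Python. This is a sketch under stated assumptions, not the demo itself: a list of dicts stands in for the Oracle CRM table, an in-memory CSV string stands in for the order file, and every name and value below is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the CUSTOMER table in the CRM system (hypothetical rows).
CUSTOMERS = [
    {"customer_id": "C1", "name": "ACME", "status": "ACTIVE"},
    {"customer_id": "C2", "name": "Globex", "status": "INACTIVE"},
    {"customer_id": "C3", "name": "Initech", "status": "active "},  # dirty value
]

# Stand-in for the ORDER flat file (hypothetical contents).
ORDER_FILE = """customer_id,amount
C1,100.00
C1,50.50
C2,75.00
C3,20.00
"""


def clean_status(raw):
    """On-the-fly cleansing rule: trim whitespace and normalise case."""
    return raw.strip().upper()


def order_totals_for_active(customers, order_file_text):
    """Join customers to orders and return ORDER TOTAL per ACTIVE customer."""
    totals = defaultdict(float)
    for record in csv.DictReader(io.StringIO(order_file_text)):
        totals[record["customer_id"]] += float(record["amount"])
    return {
        customer["name"]: totals.get(customer["customer_id"], 0.0)
        for customer in customers
        if clean_status(customer["status"]) == "ACTIVE"
    }


report = order_totals_for_active(CUSTOMERS, ORDER_FILE)
print(report)
```

Without the cleansing rule the third customer would silently drop out of the report, which is why the slide pairs the join with data quality.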
Why Informatica?
ONLY INFORMATICA COMBINES… THE BEST OF “DATA VIRTUALIZATION” (AGILITY) and THE BEST OF “DATA INTEGRATION” (SOPHISTICATION) …INTO ONE SOLUTION THAT REUSES SKILLS
Power of The Platform
“With v9, Informatica advanced its capabilities with on-the-fly data quality and profiling, a model-driven approach to provisioning data services, performance enhancements, cloud integration, common metadata, and role-specific tools.” – The Forrester Wave: Data Virtualization, Q1 2012
“The ability to switch seamlessly and transparently between delivery modes (bulk / batch vs. granular real-time vs. federation) with minimal rework will be key for IT organizations seeking to develop a successful data integration strategy.” – Ted Friedman, VP Distinguished Analyst, Gartner
[Slide graphics: Gartner Magic Quadrant for Data Integration Tools, 2011; Forrester Wave: Data Virtualization, Q1 ’12; 2009-10]
Only Informatica Provides ONE Solution for Data Integration and Federation
• Single environment for both data integration and data federation
• No data movement / no copies – but can easily reuse virtual views for batch
• Early & iterative business (analyst) involvement, efficient collaboration
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality
[Diagram: Access, Transform, Virtual View, Deliver between DW and BI; callouts: Advanced Transformations & Data Quality; Analyze & Profile Data & Logic Anytime; Early Business Involvement; Prototype First; Move to DW or Instantly Reuse as SQL/WS]
Have the Conversation with the Business! New data & reports take too long… Business IT “YOU” can now do it in DAYS! Identify a Critical Project in Your Company Involve the Business Early & Often Bake-In Quality & Support Advanced Logic Demonstrate Business Value Early Self-Service + Data Virtualization = ROI
Next Steps & Q&A Informatica.com > Products > PowerCenter > Data Virtualization Edition Informatica.com > Products > Data Virtualization Sign-Up Expert Roundtables Data Virtualization Corner • http://vip.informatica.com/?elqPURLPage=8668 JOIN & DISCUSS 2000+ Strong “Data Virtualization & Data Services Architecture” Group