1 / 71

Business Intelligence & Data Warehousing Session April 2007

Business Intelligence & Data Warehousing Session April 2007. Robert Stackowiak, Vice President, Business Intelligence, Oracle Corp. Business Intelligence Today. Provides reporting and analyses capabilities Can deliver timely, accurate, and useable information

bandele
Download Presentation

Business Intelligence & Data Warehousing Session April 2007

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Business Intelligence & Data Warehousing SessionApril 2007 Robert Stackowiak, Vice President, Business Intelligence, Oracle Corp.

  2. Business Intelligence Today • Provides reporting and analyses capabilities • Can deliver timely, accurate, and useable information • Enables better strategic and tactical decision making

  3. “In the next five years, Business Intelligence as we know it will disappear from view.  Apart from the important role of analytical BI to find new insight, operational BI will become a part of the core fabric of businesses - embedded into business processes with relevant insight available to everyone in the organization when they need it, where they need it.” Neil Raden, BI Megatrends, Intelligent Enterprise 2006

  4. Spectrum of BI Functionality Visualize Analyze Publish Integrate Data Model Collaborate Monitor Secure & Manage Compare Predict Sense Respond

  5. SOA Management Packs Identity Management/Security End-to-End Business Intelligence

  6. BI Community / LOBs • Information Access • Being a catalyst for business change through rapid time to benefit • Heterogeneous & ubiquitous access • Information consumption represents value • Report centric approach • Change managed through product • Starting point is often BI Applications with well understood model DW Community / IT • Managing information over the longer term • Managing through business change • Single version of the truth • Information management represents value • Long term architected approach • Model centric approach • Starting point is probably a blank sheet of paper Why BI still so Difficult?IT & LOB and Competing Agendas

  7. BI Focus Areas in Key Verticals • Financial • Cross-sell, targeted marketing • Manufacturing • Supply chain analysis / just in time inventory • Retail • Supply chain analysis / inventory control • Media • Revenue across multiple distribution channels • Transportation • Optimized logistics management • Healthcare • Cost containment / quality of care • Civilian Government • Budget spend and reporting to constituents • Military & Intelligence • Logistics management, threat analysis

  8. Solution Development & Presentation Approval & Budget Organization Strategy Assessment & Scope Discovery Selling the Project • Pick a strategic business area • Should have potentially highly visible positive ROI • Build business / IT partnership • Limit scopes, deliver successes • Leverage previous work and data models if possible • Demonstrate business value in 90 – 180 day increments • Continue building on success

  9. So, what is so hard? • Understanding business requirements & prioritizing • Getting through sponsorship & politics • Getting enough information to design and size the platform • Understanding the corporate vision well enough to include planning for future growth and requirements • Finding architects who can uncover business and technical requirements and design solutions • Unforeseen developments

  10. Finding BenefitsSomewhat motivating – often IT led • ROI = Present Value of Benefits - Costs • Increasing ROI for information producers by • Reducing current costs of analysis for LOBs • Empower LOBs with tools and applications and automated analysis rather than manually creating custom reports • Reducing ongoing costs of analysis for IT • Faster and simpler development of analytic applications • Centralized management

  11. Finding BenefitsThe big wins… led by the business • ROI = Present Value of Benefits - Costs • Increasing ROI for information consumers (LOBs) by • Enabling top-line growth • Identify new customers/markets, focus on most profitable customers, … • React faster than the competition • Enabling bottom-line savings • Optimize operations, financial management, … • Less time spent on gathering data; more time on analysis

  12. Finding BI Opportunity???? St. Louis-area baby boomers mirror national trend of moving to smaller homes KRTBN Knight-Ridder Tribune Business News - St. Louis Post-Dispatch - Missouri22 November 2004 Unified Building Codes Offer Nationwide Growth Opportunities for Homebuilding Sector HomebuilderStocks.com Exclusive ArticleMarket Wire Incorporated17 November 2004 Housing caters to aging boomers in Cary, N.C. Knight-Ridder Tribune Business News07 November 2004 Volatile material prices and supplies plague San Antonio homebuilders, buyers. Knight-Ridder Tribune Business News02 November 2004

  13. Oracle BI Products • Integrated Business Intelligence Database • Business Intelligence Functions Integrated in Database • Integrated Business Intelligence Tools • Best-of-Breed, Next Generation Business Intelligence Infrastructure • Integrated Analytic Applications • Enterprise Wide, Industry Specific BI and Corporate Performance Applications

  14. Oracle & “Hot Pluggable”Open and best of breed Portals Analytic Tools Reporting Desktop Tools Any JSR 168 Portal BOBJ, COGN MSTR Actuate, Oracle Apache FOS Excel, Outlook, Lotus Notes Oracle BI Server Security Data Access ETL Oracle MSFT AD iPlanet Novell Others .. Oracle DB2 MS SQL Teradata Red Brick SQL Anywhere XML, Excel Others .. Oracle Informatica Ascential Sunopsis Others ..

  15. Current Trends • Consolidation of data marts • Deployment of enterprise-wide data on lower cost platforms • Leveraging of applications to speed deployment

  16. Vendor B Marketing Vendor A HR Vendor C SFA Vendor D CRM Typical IT Infrastructure Home Grown Systems A, B, C ….. and more. Home Grown Customer Hub Vendor E ERP e.g. Manufacturing, Order Management & Advanced Scheduling Data Warehouse & Business Intelligence Where is the source of truth?

  17. Data Warehouse Consolidation? Global Companies with more than 1,000 users No Don’t Know Yes Source: 2005 ITtoolbox Data Warehouse Survey

  18. Consolidation Considerations • Data definitions and politics • Hybrid schema and complexity • Advanced security • High availability • Information Lifecycle Management • Appearance of new data marts

  19. Analytics Orchestration Master Data Master Data EAI Master Data DW ETL DW DW Transactional Data Analytical Data EAI Evolving Architecture Data Marts Reporting Applications Data Warehouse Sales Marketing Self Service DW Call Center Business Intelligence Financials Providers Eligibility Claims ETL

  20. 2003 Survey 2005 Survey 2001 Survey SBC Teradata 10.50 First Union Informix 4.50 Dialog Proprietary 4.25 Telecom Italia DB2 3.71 FedEx Teradata 3.70 Office Depot Teradata 3.08 AT & T Teradata 2.83 SK C&C Oracle 2.54 NetZero Oracle 2.47 Telecom Italia Informix 2.32 France Telecom Oracle 29.23 AT&T Daytona 26.27 SBC Teradata 24.81 Anonymous DB2/Unix 16.19 Amazon.com Oracle 13.00 Kmart Teradata 12.59 Claria Oracle 12.10 HIRA SybaseIQ 11.94 FedEx Teradata 9.98 Vodafone Teradata 9.91 Yahoo! Oracle 100.38 AT&T Daytona 93.87 KT-IT Group DB2/Unix 49.39 AT&T Daytona 26.71 LGR-Cingular Oracle 25.20 Amazon.com Oracle 24.77 Anonymous DB2/Unix 19.65 UPSS Microsoft 19.46 Amazon.com Oracle 18.55 Nielsen Media SybaseIQ 17.68 Winter VLDB SurveyLargest Database Size*, Decision Support + Overall *Database Size = Data + Indexes

  21. Real Application Clusters • Scalable & Manageable Clusters • Cache Fusion architecture • Vendor Independent Architecture • Linux, Windows, HP, HP Alpha, IBM AIX, Sun Solaris • Integrated ‘clusterware’ in Oracle 10g • Transparent application failover

  22. Amazon.com Financial Analysis Supply Chain Planning Apps Fulfillment Centers Data Warehouse Partners Web Servers Customers Internal End Users Personalization Campaign Generation Email Servers Content Servers

  23. Oracle & High Availability Solutions Grid Clusters Computer Failures UnplannedDowntime ASMFlashbackRMAN & Flash Recovery AreaH.A.R.DData Guard Data Failures Oracle MAA Best Practices System Changes Online ReconfigurationRolling Upgrades PlannedDowntime Data Changes Online Redefinition

  24. Amazon.com DW Growth*Data volumes Data Volume has grown 2x year over year for the past 4 years.Projecting higher growth in upcoming years*: - additional lines of business / product lines supported - huge standard reporting growth with more partners supported

  25. Amazon.com Data Warehouse GrowthUsers & Query growth About 3 concurrent queries per node (8 way parallel queries)

  26. Amazon DW Modular Architecture Oracle10g RAC Amazon’s RAC is so cost-effective they run 2 concurrently and still save money!! 1. Extract from source systems 2. Integrate, transform, and denormalize 3. Query and analyze 4. Data access and publishing Users STG1 (ETL/Staging) ADS1 Query DW (Atomic Data Store) Extract Servers DSS UI Client 16 nodes 4 cpus 8 nodes 4 cpus ETL Manager 2nd pair of identical RAC clusters means ‘no need for backup’ for active online data STG2 (ETL/Staging) ADS2 Query DW (Atomic Data Store) 16 nodes 4 cpus 8 nodes 4 cpus

  27. Oracle Information Appliance Foundation • You determine the potential size and workload type for the data warehouse • You determine possible initial sizing possibilities • IBM, HP, EMC, Panta, and Sun • Platform vendors determine final configuration http://www.oracle.com/solutions/business_intelligence/oiai.html

  28. Customers CxO Process Workers Managers Sales Teams Partners Suppliers Packaged Analytic Applications Custom Analytic Applications Fusion BI Infrastructure and Middleware Blueprint for Greater Business InsightOne Technology for Packaged & Custom Analytic Apps Oracle Hot-Pluggable BI Other Operational & Analytic Sources

  29. Oracle Database EvolutionScalability and Analytics • Cost-based optimizer (Oracle7) • 64-bit SMP and NUMA support (Oracle7) • Static bit-mapped indexing (Oracle7) • Parallel bit-mapped star joins (Oracle8) • Materialized Views (Oracle8i) • Real Application Clusters (Oracle9i) • SQL Analytic Functions (Oracle8i, Oracle9i, Oracle 10g) • OLAP Services (Oracle9i, Oracle 10g) • Data Mining Algorithms (Oracle9i, Oracle 10g)

  30. Oracle ParallelizationIndependent of Partitioning Parallel Query Parallel DDL Parallel DML SQL*Loader

  31. Oracle Multi-Version Read ConsistencyRow Level Locking • Readers don’t lockout updates and updates don’t lockout readers • Reports see only committed data via Multi-Versioning • Queries yield maximum throughput with correct results - no waiting and no dirty reads! • Row locks never escalate - the most scaleable solution available report Budget Table update Rollback Segment Before Image update accurate report

  32. Star Schema Channel Geography Region District Branch Sales Distributor OEM Fact Table: Sales Transactions Time Product Year Quarter Month Type Size Color

  33. Index on COLOR 0 1 0 0 1 0 0 1 0 0 1 0 color = ‘BLUE’ 0 1 0 1 0 0 1 0 1 0 1 0 0 0 1 0 color = ‘RED’ 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 color = ‘GREEN’ 1 0 0 0 1 1 0 0 0 1 0 1 0 1 0 0 size = ‘SMALL’ 0 0 1 0 1 1 0 1 0 1 0 0 0 1 0 1 size = ‘MED’ 1 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 size = ‘LARGE’ 0 1 0 1 0 0 0 0 1 0 1 0 1 0 1 0 Oracle Bit-Mapped Indexes partno color size weight PARTS table SELECT count(*) FROM parts WHERE size = ‘MED’ a AND color = ‘RED’ 001 GREEN MED 98.1 002 REDMED 1241 003 REDSMALL 100.1 004 BLUELARGE 54.9 005 REDMED 124.1 006 GREENSMALL 60.1 ... .... ..... ... Index on SIZE

  34. Oracle Parallel Bitmap Star-Joins • Dramatic Performance Gains • Innovative use of bitmap indexes • Complex Star Schemas • Multiple fact tables • Many dimensions • Unconstrained dimensions • Large dimension tables • Optimized for sparse fact tables • Parallel execution

  35. Materialized ViewsSummary Management • Stored summaries are maintained, for faster warehouse query processing • Optimizer transparently redirects queries to use summary, instead of detail data • Intelligent, robust query rewrite capabilities SALES PER MONTH BY REGION (2 GB) What were sales in the West and South regions for the last three quarters? x SALES (30 GB)

  36. Materialized ViewsAutomatic Query Rewrite SELECT p.brand, r.country, t.month, SUM(s.amt) tot_sales FROM sales s, region r, time t, product p WHERE s.region_id = r.region_id AND s.sdate = t.curdate AND s.prod_code = p.prod_code GROUP BY p.brand, r.country, t.month HAVING SUM(s.amt) > 5000000; Optimizer automatically rewrites query on detailed data ... … to access data in the materialized view SELECT brand, country, month, tot_sales FROM sales_sumry WHERE tot_sales > 5000000;

  37. COUNT ORDERS by PRODUCT AVG ORDERS by REGION SUM ORDERS by QUARTER Materialized ViewsData Refresh Incremental Refresh Insert Update Delete • Maintain the timeliness of stored result to preference • Several options for refreshing data in materialized views • Manage the effect of performance on operational data • No impact on load performance ORDERS Full Refresh Deferred Refresh

  38. Data Consolidation with Oracle Oracle BI User’s tool of choice Microsoft Excel Open standards-based platform for non-Oracle tool users + native Excel support thru add-in Hybrid schema Relational & OLAP databases unite In-place data mining Oracle10g

  39. Ranking functions Window Aggregate functions (moving and cumulative) LAG/LEAD functions Reporting Aggregate functions Statistical Aggregates Linear regression Descriptive Statistics Correlations Cross Tabs Hypothesis Testing Distribution Fitting Pareto Analysis (documented) Statistics & SQL Analytics in Oracle

  40. Compare Profit this Quarter versus Last Quarter What is a Product’s Profit Share of its Category? CD Profit DVD Product USA Audio West SF Market Q1 Q2 Q3 Time Why Oracle OLAP is so FastData Access Method • Data stored in dense arrays • Offset addressing – no joins • More powerful analysis • Better performance

  41. Oracle’s Complete Analysis SolutionOracle Data Mining Option • Key Algorithms • Naïve Bayes, Associations, Adaptive Bayes Network, Clustering algorithms, Support Vector Machines, Nonnegative Matrix Factorization, Decision Trees • Data Mining Tools Available • Oracle Data Miner to build applications • Popular Data Mining tools partners • SPSS Clementine • InforSense • Leverage output in other tools such as OBI EE & RTD

  42. Oracle Database EvolutionManageability • Oracle Enterprise Manager (Oracle7) • Oracle Parallel Server high availability (Oracle7) • Range, Composite Partitioning (Oracle8, Oracle8i) • Adaptive Degree of Parallelism (Oracle8i) • Resource Manager: CPU (Oracle8i) • List Partitioning, Composite Range-List (Oracle9i) • Resource Manager: query governor, automatic queuing, reprioritization (Oracle9i) • XML Database, Virtual Private DB (Oracle9i) • Automatic DB Diagnostics Monitor (Oracle 10g) • Automatic Storage Manager (Oracle 10g)

  43. Oracle Enterprise Manager • Browser-based console manages single server, multiple servers, or computing grid • Multi-user repository

  44. Self Managing & Tuning • Adaptive degree of parallelism • Database Resource Manager • CPU allocation by groups of users • Proactive query governing • Automatic queuing • Dynamic re-prioritization • Automatic Database Diagnostics Monitor • Automatic Shared Memory Management • Mean Time to Recovery Advisor • Segment Advisor • SQL Access Advisor • SQL Tuning Advisor • Single Command Flashback

  45. Automated Storage Manager (ASM) Database 1 • Portable, high performance, cluster file system & volume manager • Partitions disk into uniform MB units • Combines units into files • No intermediate volumes • Units assigned to a file are tracked using indexing techniques • Files spread across disks to balance load • Integrated mirroring File #1 File #2 File #1 Database 2

  46. Add order table Nov 2006 Jan 2006 Feb 2006 Mar 2006 Apr 2006 Drop Local Index Other data is not affected Partitioning • Partition level management • On-line addition and removal of partitions • Data management operations (loading, index builds) • Range, hash, composite range-hash, list, composite range-list • Improved availability • Localized disk failures, backup and recovery

  47. RANGE (sales_date) NOV 2006North NOV1998 OCT 2006 North OCT1998 SEP 2006 North SEP1998 DEC 2005 North DEC1997 NOV 2005 North NOV1997 ... LIST (geography) West West West West West ... South South South South South Composite Range-List Partitioning Range partition across time List partition across another major attribute ...

  48. Table Compression 2-7X Compression Observed • Queries on compressed tables may observe minor performance degradation • Performance impact depends upon the query • Many queries will be faster • Compression reduces IO but increases CPU utilization • For a set of queries, performance should degrade by no more than 5% • Load and direct-path INSERT performance will be slower • Data must be compressed as it is added to the table

  49. Encryption Snooping Tampering Authentication Management Secure Data Protection Stored Data EncryptionProtect ultra-sensitive data Virtual Private Database Oracle Label SecurityEnforce row level security Security andPrivacy Data Security Fine-Grained AuditingIdentify misuse of data access rights Auditing Encrypt All Protocols into the Database Prevent wire tapping and tampering Network Security PKI and LDAP Centralized Management, Strong Authentication Unified User Identity

  50. VPD POLICY Where territory_id = 12 Where product_id - 22 Virtual Private DatabaseTransparent Security inside the database • Server dynamically rewrites SQL • Query modification based on PL/SQL package assigned to an object • “Where” clause appended to SQL statement SELECT * FROM orders Western Sales Rep SELECT * FROM orders orders Product Manager

More Related