1 / 30

Oracle 11g Data Warehousing

Oracle 11g Data Warehousing. IOUG collaborate 08 Paper # 204. Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com. The Pillars of Oracle Data Warehousing. Partitioning Parallel Processing Data Compression Summary Management.

Download Presentation

Oracle 11g Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Oracle 11g Data Warehousing IOUG collaborate 08 Paper # 204 Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com

  2. The Pillars of Oracle Data Warehousing • Partitioning • Parallel Processing • Data Compression • Summary Management

  3. Data Compression • Introduced in Oracle9i Release 2 • Supports compression during bulk load operations only • Data modified using conventional operations not compressed • Improved performance for queries accessing large amounts of data • Fewer IOs • Efficient IO operations • Buffer Cache efficiency • Data Can be compressed at the table level or partition level • Compressing unmodified partitions are common in DW environment • Compressing data saves both storage space and CPU resources • Savings in storage space cascade to standby, backup and development, test and other database copies

  4. Data Compression – 11g New Features • Oracle Database 11g extends compression for OLTP data • Support for conventional DML Operations (INSERT, UPDATE, DELETE) • New algorithm significantly reduces write overhead • Batched compression ensures no impact for most OLTP transactions • Typical compression ratio of 2x to 3x

  5. Using OLTP Table Compression • Requires database compatibility level at 11.1 or greater • New Syntax extends the ‘COMPRESS’ keyword • COMPRESS [FOR {ALL | DIRECT_LOAD} OPERATIONS] • DIRECT_LOAD (DEFAULT) • Refers to Bulk load operations from 10g and prior releases • ALL • OLTP + Direct loads • Enable compression for a new table • CREATE TABLE ( ……………………. COMPRESS FOR ALL OPERATIONS • Enable only direct load compression on existing table • alter table <tablename> compress • only new rows are compressed, existing rows are uncompressed

  6. Data Compression – 11g New Features • Fast RMAN Compression • Compresses the backup contents before writing them to disk or tape • No extra decompression is required during recovery • Performance improvement • 40% faster backup compression versus Oracle Database 10g • Improved Data pump Compression • Metadata compression available in 10g • ALL, DATA_ONLY, METADATA_ONLY, NONE options in 11g • No need to decompress before import • Single step compression of both data and metadata • Complete Data pump functionality available oncompressed file • suitable for fast, incremental daily backups

  7. Partitioning • Partitioning is the Key functionality for managing large volumes of data in Oracle Database • One logical object and Many physical objects • Many features directly and indirectly depend on partitioning • No bitmap indexes or star transformation without partitioning! • Partition exchange is key in providing longer availability of the data warehouse and fast ETL

  8. Oracle Partitioning - 10g and Before

  9. Oracle 11g PartitioningEnhancements • Composite Partitioning enhancements • Data is partitioned along two dimensions • A distinct value pair for the two dimensions uniquely determines the target partitioning • In 11g More types of Composite partitioning options • Enhancements to Partition Management • Virtual column based partitioning • REF partitioning • Interval partitioning • Partition advisor

  10. Composite partitioning enhancements Oracle Partitioning – 11g Enhancements ORDERS ORDERS 200802 Cement ORDERS 200802 200801 Gypsum 200801 CAN CAN 200802 USA USA 200801 LIST-RANGECountry byGL Post Date RANGE-RANGEOrder Date by GL Post Date LIST-LISTCountry by Product Type

  11. Range- Range Composite Partitioning • Data is coming by Invoice date and Queries are by GL_date • ETL team want to partition by Invoice_date to use Partition exchange loading and Reporting team wants to Partition by GL_Date • For the records the Invoice_date will be in one moth and GL_date in another Invoice_date Jan 2008 ... ... Feb 2008 ... ... ... ... May 2008 ... ... ... ... Mar 2008 Jan 2008 Feb 2008 Jun 2008 GL_date

  12. Mar 2008 Range- Range Composite Partitioning • In 10g and before Range partition on one date column and Hash on other • DBAs use other work around to address the issue • In 11g the DBA can make both ETL and Reporting Teams happy Invoice_date • All records with GL_DATE in March 2008 Jan 2008 ... ... Feb 2008 ... ... ... ... May 2008 ... ... ... ... Jan 2008 Feb 2008 Mar 2006 Jun 2008 GL_DATE

  13. Virtual Columns Oracle Database 11g introduces virtual columns • purely virtual, meta-data only • Do not take storage space • Virtual Columns are ‘Column Expressions’ • Derived from other columns in the table • Deterministic functions • Constants • Treated as real columns • Can compute statistics. • Can create Index • Eligible as partitioning key • Improved performance

  14. Virtual Columns • Virtual Columns can be added using CREATE TABLE or ALTER TABLE commands • Eg CREATE TABLE Transaction (Accunt_number number(18) not null, Account_name varchar2(50) not null, .................... .................... Bank_branch number(2) generated always as (to_number(substr(to_char(Account_number),1,3))) partition by list (Bank_branch)

  15. Interval Partitioning • Adding Range Partitions when needed is a challenge for DBAs • New partitions must be created on-time for new data • alter table <tablename> split partition MAX_PARTITION at (TO_DATE(' 2008-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')) into (partition P200804, partition MAX_PARTITION); • “Managing Partitioned Tables While You Sleep” by David Burns is the most downloaded article from IOUG “knowledge library” and it is in “top 5 most downloaded articles” for years • Automating partition management was a request from DBAs for years and Interval partitioning finally does that

  16. Interval Partitioning • Interval Partitioning • This is an extension to range partitioning • Automatically create equi-sized range partitions • Support Composite Partitioning • Interval, Interval-list, Interval-hash, Interval-range • Partitions are created as metadata information • Start partition is created • Other partitions are created when data arrives • Segments are allocated as soon as new data arrives • No need to add new partitions • Local indexes are created and maintained as well

  17. Table SALES_ORDER ... ... ... Jan 2008 Feb 2008 Mar 2008 Jun 2008 Oct 2009 Nov 2009 First segment is created Interval Partitioning CREATE TABLE Sales_Order (order_date DATE, ...)‏ PARTITON BY RANGE (order_date)‏ INTERVAL(NUMTOYMINTERVAL(1,'month')‏ (PARTITION p_first VALUES LESS THAN ('01-FEB-2008'); New segment is automatically allocated when data arrives INSERT INTO Sales_Order (order_date,...) VALUES ('02-JUN-2008',...);

  18. REF Partitioning Partitioning in “Relational Database” • If a Table is partitioned and a child table of that table is not partitioned or partitioned by a different column, then that can create issues for maintenance • How do you truncate an old partition from the parent table when associated records exist in the child table? • Deleting records from the child table will be a maintenance issue Partitioned Parent Table Non-Partitioned Child Table Child Table Partitioned by different partition strategy

  19. Partitioning in “Relational Database” • This issue is not very common in star-schema design following Kimball approach. The fact tables will not be associated with another fact table. • This is an issue in “relational” data warehouse designs following Inmon’s approach • DBAs generally try to solve this issue by partitioning the child table exactly like the parent table using the same partition key. • Many companies follow this as a standard or best practice • The child table will have the PK of the parent table as a Foreign Key but the partition key may be missing • Redundant storage of Parent Table Partition Key • Redundant maintenance

  20. REF Partitioning • Oracle Database 11g introduces REF Partitioning • There is no need for specifically following the partition strategy of the parent table while designing child table • Child table inherits the partitioning strategy of parent table through PK-FK relationship • The partition key of the parent will not be stored redundantly in the child table • Enhanced Performance and Manageability

  21. PartitioningAdvisor • Advises on Partitioning Strategy • Advises on partitioning method, partitioning key and interval • Partition advise integrated with advises for indexes and MVs SQL Advisor SQL Plan Tuning Access Analysis Partition Analysis SQLStructure Analysis Oracle DB

  22. Partition Merging – No change in Behavior • In Oracle 10g and before, when two partitions are merged, data is actually moved. This is different from Partition exchange which is just a data dictionary update without any actual movement of data • There is no change to the behavior of partition merging in 11g – still there is data movement • SQL> drop table test_fct; Table dropped. • SQL> create table test_fct 2 (col1 number) 3 partition by range (col1) 4 (partition P1 values less than (200), 5 partition P2 values less than (300), 6 partition P3 values less than (400)); Table created.

  23. Partition Merging in 11g • SQL> insert into test_fct values (100); 1 row created. • SQL> insert into test_fct values (201); 1 row created. • SQL> insert into test_fct values (305); 1 row created. • SQL> commit; Commit complete.

  24. Partition Merging in 11g • SQL> select subobject_name,object_id, DATA_OBJECT_ID 2 from user_objects 3 where object_name = 'TEST_FCT' 4 and subobject_name is not null; SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID ------------------------------ ---------- -------------- P1 69855 69855 P2 69856 69856 P3 69857 69857 • SQL> alter table TEST_FCT merge partitions p1,p2 into partition p2; Table altered.

  25. Partition Merging • SQL> select subobject_name,object_id, DATA_OBJECT_ID 2 from user_objects 3 where object_name = 'TEST_FCT' 4 and subobject_name is not null; SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID ------------------------------ ---------- -------------- P2 69856 69858 P3 69857 69857 • A new data object id is assigned to the new partition because a new segment is created for it • No change from 10g behavior

  26. Pivot • A great feature for data Warehousing • Common Query requirement to display the data for current month, previous month and same month last year as one record • These values will be stored as different records in the database • The challenge of converting rows into columns in the report • Developers used to do this using workarounds • Pivot function will do the trick • Pivot makes this coding lot easier

  27. Optimizer Statistics • Oracle Database 11g enhancements improve Optimizer Statistics • Oracle Database 11g makes the collection of optimizer statistics faster • The computed optimizer statistics are more thorough, providing better information to the CBO by correlating statistics, such as Number of Distinct Values (NDV) and histograms, on multiple columns. • Oracle Database 11g makes gathering statistics safer. Newly generated statistics can remain in a private ‘statistics workspace’ until they can be validated and published for general use • In Oracle 11g we can set individual thresholds on table level with DBMS_STATS. dba_tab_stat_prefs • New data dictionary view dba_tab_stat_prefs to view these thresholds. • Extended statistics on expressions can be computed in 11g exec DBMS_STATS.GATHER_TABLE_STATS(’schema’, ‘Tablename’, METHOD_OPT => ‘for all columns size 1 columns (upper(last_name)) size 5‘)

  28. Cube Organized Materialized Views • The 11g introduces Cube Organized Materialized Views • This new feature permits standard SQL queries (such as those generated by non-OLAP aware reporting tools) to access OLAP cube data through query rewrite • Cube Organized Materialized Views had many advantages compared to conventional MVs • A Cube can replace hundreds of MVs

  29. Q A &

  30. Thank You • Please complete the evaluation form • Speaker Name : Vincent Chazhoor • Session name : Oracle 11g Data Warehousing • Session# : 204

More Related