1 / 74

How cool is that? Capitalizing on Oracle 9i for data warehousing

Session id: 36482. How cool is that? Capitalizing on Oracle 9i for data warehousing. Mike Ames Oracle DBA SAS Institute. Topics. Performance: Parallelism and Design Capitalizing on of Oracle 9i for DW External Tables Merge & Multiple Inserts Partitioning Materialized View Enhancements

Download Presentation

How cool is that? Capitalizing on Oracle 9i for data warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Session id: 36482 How cool is that? Capitalizing on Oracle 9i for data warehousing Mike AmesOracle DBA SAS Institute

  2. Topics • Performance: Parallelism and Design • Capitalizing on of Oracle 9i for DW • External Tables • Merge & Multiple Inserts • Partitioning • Materialized View Enhancements • Bitmap Join Indexes • SQL for Analysis • Conclusions & Questions

  3. Performance: Parallelism and Design serial operation • At 20MB/sec it takes 14 ½ hours to read 1TB serially • With perfect parallelism and 16 parallel processes it takes about 54 minutes (assuming sustained throughput)

  4. Performance: Parallelism and Design • Amdahl’s Law in a nutshell Speedup (S) = “Efficiency gained by executing a process in parallel” • Formula for Speedup : S = 1/ ( % sequential) +(% parallel/ # processors) + overhead) S = Speedup N = Number of Processors B = % of the process or algorithm that is serial S = 1/ ( B + (1-B/N) + O ) Example: 8 processors & 5% serial operations

  5. Performance: Parallelism and Design • Why is this important? “Dependencies created by design and complexity reduce our ability to parallelize data warehouse operations.” S = 1/ ( 5%) +(95%/ 8) + 0) = 5.9 • Assuming perfect parallelism, a query that takes 30 minutes to execute serially would take just over 5 minutes in parallel. • Keys to parallel performance: • Minimize dependencies • Minimize overhead associated with complexity

  6. Performance: Parallelism and Design • Two things to note: • Incremental speedup by doubling the # processors is dependant on % Sequential • 5% sequential with 4 processors > 20% sequential with 8 processors

  7. Performance: Parallelism and Design realistic expectations

  8. Performance: Parallelism and Design measuring performance • Database performance is generally measured by: • Load Performance • Ability to parallelize • Order & number of operations and complexity to maintain integrity • Ability to leverage RDBMS load facilities • Query Performance • Ability to parallelize • Ability to leverage partitioning • Ability to exploit query re-write & summary data • Number of sort and join operations • Usability • Ability of users to capitalize on data • Level of complexity

  9. Performance: Parallelism and Design critical factor • Database design is critical for performance • Ability to parallelize is constrained by dependencies • Data and referential integrity • Order of operations • Ability to leverage RDBMS features is constrained by design • Loading • Partitioning • Indexing • Query re-write • Materialized Views • End user satisfaction is constrained by complexity • Ease of use • Data quality

  10. Performance: Parallelism and Design • The bottom line: design dictates performance… • Good Design: • Maximizes Parallelism by minimizing dependencies • Minimizes Complexity • Capitalizes on features of the RDMBS (Oracle 9i)

  11. Capitalizing on of Oracle 9i • External Tables • Merge & Multiple Inserts • Partitioning • Materialized View Enhancements • Bitmap Join Indexes • SQL for Analysis

  12. Capitalizing on – External Tables External Tables • Enable you to reference multiple flat files as if they were a table on your database. • Restrictions • Read Only no DML (INSERT, UPDATE, DELETE) • Can’t be used for partition exchange

  13. Capitalizing on – External Tables Loading from flat files: • Old method • Create a Stage Table on your warehouse • Use SQL*Loader to bulk load the table • Read from stage table performing operations to put data into final format. • New method • Create a table that references the external file • Read from the external table performing operations to put data into final format. • Significantly reduces the number of times data has to be moved around and virtually eliminates the need to use SQL*Loader directly.

  14. Capitalizing on – External Tables External Table & Insert /*+Append*/ Example File(s) Target Source • Steps • FTP 02Nov2003_Sales Extract • “Alter Table” add file(s) to location • Perform INSERT /*+Append*/ INTO

  15. Capitalizing on – External Tables CREATE TABLE NOV_SALES_EXTERNAL( PROD_ID NUMBER(6), CUST_ID NUMBER, TIME_ID DATE, CHANNEL_ID CHAR(1), PROMO_ID NUMBER(6), QUANTITY_SOLD NUMBER(3), AMOUNT_SOLD NUMBER(10,2), UNIT_COST NUMBER(10,2), UNIT_PRICE NUMBER(10,2)) ORGANIZATION external ( TYPE oracle_loader DEFAULT DIRECTORY extracts ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII BADFILE log_file_dir:' NOV_SALES_EXTERNAL.bad' LOGFILE log_file_dir:' NOV_SALES_EXTERNAL.log' FIELDS TERMINATED BY "|" LDRTRIM ) location ( '01Nov2003Sales.dat’) )REJECT LIMIT UNLIMITED; Step 0. Create a target Table and an external extract table Step 1. FTP Nov 2nd and 3rd extracts

  16. Capitalizing on – External Tables • Step 2. Alter External Table adding the new files to the location. ALTER TABLE NOV_SALES_EXTERNAL LOCATION ('01Nov2003Sales.dat', '02Nov2003Sales.dat', '03Nov2003Sales.dat'); • Step 3. Insert into target table from external table INSERT /*+ APPEND/ INTO SALES_FACT SELECT * FROM NOV_SALES_EXTERNAL

  17. Capitalizing on – External Tables Sales Extract External Table Sales Table INSERT/*+APPEND*/ INTO SALES SELECT * FROM SALES_EXTRACT_EXTERNAL

  18. Capitalizing on – External Tables • Fixed-length fields are processed faster than delimited fields or records terminated by a string. • Single-character delimiters for record terminators and field delimiters are faster to process than multi-character delimiters. • Using INSERT/*+APPEND*/ with either key lookup functions or joins (for surrogate key lookup) is a great method to load fact tables.

  19. Capitalizing on – merge Merge: • Enables you to perform updates to matched records and inserts of new records. • Leverages parallelism • Is a slick way of performing slowly changing dimension logic.

  20. Capitalizing on – merge Customer Extract External Customer Existing Table Customer Post Merge MERGE INTO CUSTOMERS USING ( CUST_EXT x) WHEN MATCHED THEN UPDATE SET city = x.city… WHEN NOT MATCHED THEN INSERT (CUST_ID…) VALUES(x.cust_id…) New Matched updated

  21. Capitalizing on – merge Typical type 1 slowly changing dimension (SCD) logic Source Extract Update Existing New & Changed Records Compare Generate Key Master Dimension Cross Reference Insert New

  22. Capitalizing on – merge for type 1 SCD Capitalize on MERGE for Type 1 SCDs Steps: • Create your dimension table • Create an external table as your “extract table” that contains all of the columns in your dimension except the surrogate key • Create an Oracle sequence this will be used for the surrogate key of your dimension • Use MERGE to populate your dimension

  23. Capitalizing on – merge for type 1 SCD • Create our Dimension (Target) table Note: cust_key is our the surrogate key; cust_id is the “natural key” or production key; CREATE TABLE CUSTOMER_DIM ( CUST_KEY NUMBER NOT NULL, CUST_ID NUMBER NOT NULL, CUST_NAME VARCHAR2(20) NOT NULL, ZIP CHAR(5) NOT NULL, CITY VARCHAR2(30) NOT NULL, STATE VARCHAR2(40) NULL, CONSTRAINT CUSTOMER_PK PRIMARY KEY (CUST_KEY) );

  24. Capitalizing on – merge for type 1 SCD • Create our extract or “staging” table 2a. Running a new extract is simply a matter of referencing a new file: ALTER TABLE customer_extract LOCATION (‘CIF_NOV_2003.dat‘); CREATE TABLE CUSTOMER_EXTRACT ( CUST_KEY NUMBER, CUST_ID NUMBER, CUST_NAME VARCHAR2(20), ZIP CHAR(5) , CITY VARCHAR2(30) , STATE VARCHAR2(40)) ORGANIZATION EXTERNAL ( TYPE oracle_loader DEFAULT DIRECTORY extracts_dir ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII BADFILE log_file_dir:'customer_extract.bad' LOGFILE log_file_dir:'customer_extract.log' FIELDS TERMINATED BY "|" LDRTRIM ) location ( 'cust_extract.dat' )) REJECT LIMIT UNLIMITED PARALLEL;

  25. Capitalizing on – merge for type 1 SCD • Create our Sequence – this will used for the surrogate key CREATE SEQUENCE CUST_SEQ START WITH 1000 INCREMENT BY 1000;

  26. Capitalizing on – merge for type 1 SCD Capitalizing on – merge Customer Extract External • Use a single MERGE statement to perform our type 1 SCD logic Existing Customer Dimension MERGE INTO CUSTOMER_DIM c USING CUSTOMER_EXTRACT X ON (c.cust_id = x.cust_id) WHEN MATCHED THEN UPDATE SET CUST_FIRST_NAME = X.CUST_FIRST_NAME, CUST_LAST_NAME = X.CUST_LAST_NAME,…. CUST_EMAIL = X.CUST_EMAIL WHEN NOT MATCHED THEN INSERT ( CUST_KEY, CUST_ID,…) VALUES( CUST_SEQ.NEXTVAL, X.CUST_ID,… X.CUST_EMAIL); Updates Matched updated Customer Dimension Post Merge New New

  27. Capitalizing on – merge for type 2 SCD Type 2 Slowly Changing Dimension (SCD) : • Type 2 SCD a technique where a new dimension record is created with a new surrogate key each to reflect the change • We can do this quite simply a single merge statement simply add the change columns to the ON () portion of the merge. MERGE INTO CUSTOMER_DIM c USING CUSTOMER_EXTRACT X ON (c.cust_id = x.cust_id and c.city=x.city and c.state=x.state and c.zip = x.zip) WHEN MATCHED THEN UPDATE SET CUST_NAME = X.CUST_NAME, City = X.CITY, STATE=X.STATE, Zip=X.ZIP…. WHEN NOT MATCHED THEN INSERT ( CUST_KEY, CUST_ID, CUST_NAME, CITY, STATE, ZIP, CREATE_DATE) VALUES( CUST_SEQ.NEXTVAL, /* CUST_KEY */ X.CUST_ID, X.CUST_NAME… TRUNC(SYSDATE));/* CREATE_DATE */ Change Columns Change Pointer Surrogate Key

  28. Capitalizing on – merge for type 2 SCD • A more common approach to Type 2 Slowly Changing Dimension logic is the addition of change pointers for reference data, Unfortunately, this requires a multi-step process. • This is generally performed with a series of insert and update statements or procedural logic but can be accomplished with two merge statements as well. • One to insert new records and update existing (close out) • One to insert new changed records Surrogate Key Change Pointers

  29. Capitalizing on – merge for type 2 SCD 1 2 /* First Merge: Close out existing, Insert New */ MERGE INTO CUST_DIM C USING (SELECT cust_id, cust_name, city, state, zip FROM CUST_EXTRACT MINUS SELECT cust_id, cust_name, city, state, zip FROM CUST_DIM WHERE CURRENT_FLAG = 'Y') X ON (C.CUST_ID = X.CUST_ID AND C.END_DATE=to_date('15-JAN-2099','DD-MON-YYYY')) WHEN MATCHED THENUPDATE SET c.current_flag = 'N' WHEN NOT MATCHED THEN INSERT (CUST_KEY,CUST_ID,CUST_NAME, CITY,STATE,ZIP,START_DATE,END_DATE, CURRENT_FLAG) VALUES( CUST_SEQ.NEXTVAL, X.CUST_ID, X.CUST_NAME, X.CITY, X.STATE, X.ZIP, trunc(SYSDATE), TO_DATE(‘01-JAN-2099','DD-MON-YYYY'), 'Y'); COMMIT; /* Second Merge: Insert new changed record */ MERGE INTO CUST_DIM C USING ( SELECT cust_id, cust_name, city, state, zip FROM CUST_EXTRACT MINUS SELECT cust_id, cust_name, city, state, zip FROM CUST_DIM) X ON (C.CUST_ID = X.CUST_ID AND C.CURRENT_FLAG='Y') WHEN MATCHED THEN UPDATE SETc.END_DATE=trunc(SYSDATE -1) WHEN NOT MATCHED THEN INSERT (CUST_KEY,CUST_ID,CUST_NAME, CITY,STATE,ZIP,START_DATE,END_DATE, CURRENT_FLAG) VALUES( CUST_SEQ.NEXTVAL, X.CUST_ID,X.CUST_NAME,X.CITY, X.STATE,X.ZIP, trunc(SYSDATE), TO_DATE(‘01-JAN-2099','DD-MON-YYYY'), 'Y'); COMMIT; 3 /*+ Final step Date Closeout */ UPDATE CUST_DIM C SET c.END_DATE=trunc(SYSDATE -1) WHERE C.CURRENT_FLAG = 'N' AND c.END_DATE=TO_DATE(‘01-JAN-2099','DD-MON-YYYY'); commit;

  30. Capitalizing on – merge for type 2 SCD Customer Extract External Take note of how a single table design decision has impacted our ability to parallelize. Instead of a single merge statement we now have a three step process. Customer Existing Dimension Customer Post Type 2 Merge (s)

  31. Capitalizing on – multiple inserts Multiple Inserts: • Enables us to conditionally insert into multiple tables in parallel. • All: result set is applied to all conditions • First: result set is applied to the first condition • Leverages parallelism • Is a slick way to segment data and load fact tables.

  32. Capitalizing on – multiple inserts GA_SALES • Multiple Inserts (ALL) • INSERT ALL • WHENstate='GA‘ or state = ‘FL’THEN INTO GA_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • WHENstate = 'FL'THEN INTO FL_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • ELSE INTO ALL_OTHER_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • SELECT prod_id, cust_id,sale_date,sale_amount qty_sold • FROM sales_extract; FL_SALES ALL_OTHER_SALES Query

  33. Capitalizing on – multiple inserts GA_SALES • Multiple Inserts (FIRST) • INSERTFIRST • WHENstate ='GA‘ OR state = ‘FL’ • THEN INTO GA_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • WHENstate = 'FL'THEN INTO FL_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • ELSE INTO ALL_OTHER_SALES • VALUES(prod_id, cust_id,sale_date,sale_amount qty_sold) • SELECT prod_id, cust_id,sale_date,sale_amount qty_sold • FROM sales_extract; FL_SALES ALL_OTHER_SALES

  34. Capitalizing on - Partitioning Types of Partitioning • Range - maps rows to partitions based on ranges of column values • List (New) - enables you to explicitly control how rows map to partitions. • Hash - evenly distributes rows among partitions • Composite • Range-Hash: benefits of range partitioning then further hash distributing the sub-partition. • Partition pruning & Parallel processing • Range List (New for 9i): benefits of range partitioning and further discrete sub-partitioning.

  35. Capitalizing on - Partitioning • Why is partitioning important “partitioning enables you to split large volumes of data into smaller separate buckets that can be managed independently” • Partition Pruning / Elimination • Partition-wise Joins • Parallel DML • Partition Exchanging / Swapping

  36. Capitalizing on Oracle 9i partitioning Jan 2003 • Why Partition? • Partition pruning “Ability to eliminate partitions that don’t satisfy query conditions” Feb 2003 Mar 2003 … SELECT sum(qty_sold) FROM sales WHERE sale_date BETWEEN Feb 1, 2003 and Feb 15, 2003

  37. Capitalizing on Oracle 9i partitioning Why Partition? • Partition-wise joins • Full – Equi-partitioned on the join keys i.e. the two tables are both partitioned on the same key. Hash-Hash is the easiest example. • Partial – Oracle dynamically repartitions based on the reference table. Here accounts and transactions are both hash partitioned by account_id into 32 partitions Note: to achieve equal work distribution, the number of partitions should always be a multiple of the degree of parallelism. Ex. Here we hashed account and transaction into 32 partitions with a degree of parallelism 8

  38. Capitalizing on Oracle 9i partitioning ACCT ID NAME ACCT ID Price QTY Partition-wise Join

  39. Capitalizing on - Partitioning PQ1 Partition Exchanging Stock Extracts PQ2 PQ3 ALTER TABLE stock_sales EXCHANGE PARTITION PQ3 WITH stock_extracts; *Does not work with external tables

  40. Capitalizing on - Partitioning Range Example CREATE TABLE account_balance_range( account_key NUMBER(7) CONSTRAINT acct_nn NOT NULL, branch_key NUMBER(7) CONSTRAINT brch_nn NOT NULL, product_key NUMBER(7) CONSTRAINT prod_nn NOT NULL, snapshot_date DATE CONSTRAINT mnth_nn NOT NULL, state_key CHAR(2) CONSTRAINT stat_nn NOT NULL, ending_bal NUMBER(7,3) CONSTRAINT ebal_nn NOT NULL, average_daily_bal NUMBER(7,3) CONSTRAINT abal_nn NOT NULL, transaction_count NUMBER(7) CONSTRAINT txnc_nn NOT NULL, interest_paid NUMBER(7,3) CONSTRAINT intp_nn NOT NULL, fees_charged NUMBER(7,3) CONSTRAINT feec_nn NOT NULL) PARTITION BY RANGE(snapshot_date)( PARTITION Q1_ACCT_BAL VALUES LESS THAN (TO_DATE('01-APR-2000','DD-MON-YYYY')), PARTITION Q2_ACCT_BAL VALUES LESS THAN (TO_DATE('01-JUL-2000','DD-MON-YYYY')), PARTITION Q3_ACCT_BAL VALUES LESS THAN (TO_DATE('01-OCT-2000','DD-MON-YYYY')), PARTITION Q4_ACCT_BAL VALUES LESS THAN (TO_DATE('01-JAN-2001','DD-MON-YYYY')));

  41. Slick New Feature - Partitioning List Example CREATE TABLE account_balance_list( account_key NUMBER(7) CONSTRAINT acct_nn NOT NULL, branch_key NUMBER(7) CONSTRAINT brch_nn NOT NULL, product_key NUMBER(7) CONSTRAINT prod_nn NOT NULL, snapshot_date DATE CONSTRAINT mnth_nn NOT NULL, state_key CHAR(2) CONSTRAINT stat_nn NOT NULL, ending_bal NUMBER(7,3) CONSTRAINT ebal_nn NOT NULL, average_daily_bal NUMBER(7,3) CONSTRAINT abal_nn NOT NULL, transaction_count NUMBER(7) CONSTRAINT txnc_nn NOT NULL, interest_paid NUMBER(7,3) CONSTRAINT intp_nn NOT NULL, fees_charged NUMBER(7,3) CONSTRAINT feec_nn NOT NULL) PARTITION BY LIST (state_key)( PARTITION northwest VALUES ('OR', 'WA'), PARTITION southwest VALUES ('AZ', 'UT', 'NM'), PARTITION southeast VALUES ('FL','GA','SC','AL','TN','NC'), PARTITION rest VALUES (DEFAULT))’);

  42. Capitalizing on - Partitioning Hash Example CREATE TABLE account_balance_hash ( account_key NUMBER(7) CONSTRAINT acct_nn NOT NULL, branch_key NUMBER(7) CONSTRAINT brch_nn NOT NULL, product_key NUMBER(7) CONSTRAINT prod_nn NOT NULL, snapshot_date DATE CONSTRAINT mnth_nn NOT NULL, state_key CHAR(2) CONSTRAINT stat_nn NOT NULL ending_bal NUMBER(7,3) CONSTRAINT ebal_nn NOT NULL, average_daily_bal NUMBER(7,3) CONSTRAINT abal_nn NOT NULL, transaction_count NUMBER(7) CONSTRAINT txnc_nn NOT NULL, interest_paid NUMBER(7,3) CONSTRAINT intp_nn NOT NULL, fees_charged NUMBER(7,3) CONSTRAINT feec_nn NOT NULL ) PARTITION BY HASH (account_key)( PARTITIONS 16 STORE IN (TS1_DATA, TS2_DATA, TS3_DATA,TS4_DATA);

  43. Capitalizing on - Partitioning Composite Range-Hash Example CREATE TABLE account_bal_range_hash( account_key NUMBER(7) CONSTRAINT acct_nn NOT NULL, branch_key NUMBER(7) CONSTRAINT brch_nn NOT NULL, product_key NUMBER(7) CONSTRAINT prod_nn NOT NULL, snapshot_date DATE CONSTRAINT mnth_nn NOT NULL, state_key CHAR(2) CONSTRAINT stat_nn NOT NULL, ending_bal NUMBER(7,3) CONSTRAINT ebal_nn NOT NULL, average_daily_bal NUMBER(7,3) CONSTRAINT abal_nn NOT NULL, transaction_count NUMBER(7) CONSTRAINT txnc_nn NOT NULL, interest_paid NUMBER(7,3) CONSTRAINT intp_nn NOT NULL, fees_charged NUMBER(7,3) CONSTRAINT feec_nn NOT NULL) PARTITION BY RANGE (snapshot_date) SUBPARTITION BY HASH (account_key) SUBPARTITIONS 8 ( PARTITION Q1_ACCT_BAL VALUES LESS THAN (TO_DATE('01-APR-2000','DD-MON-YYYY')), PARTITION Q2_ACCT_BAL VALUES LESS THAN (TO_DATE('01-JUL-2000','DD-MON-YYYY')), PARTITION Q3_ACCT_BAL VALUES LESS THAN (TO_DATE('01-OCT-2000','DD-MON-YYYY')), PARTITION Q4_ACCT_BAL VALUES LESS THAN (TO_DATE('01-JAN-2001','DD-MON-YYYY')));

  44. Capitalizing on - Partitioning Composite Range-List Example CREATE TABLE account_bal_range_list( account_key NUMBER(7) CONSTRAINT acct_nn NOT NULL, branch_key NUMBER(7) CONSTRAINT brch_nn NOT NULL, product_key NUMBER(7) CONSTRAINT prod_nn NOT NULL, snapshot_date DATE CONSTRAINT mnth_nn NOT NULL, state_key CHAR(2) CONSTRAINT stat_nn NOT NULL, ending_bal NUMBER(7,3) CONSTRAINT ebal_nn NOT NULL, average_daily_bal NUMBER(7,3) CONSTRAINT abal_nn NOT NULL, transaction_count NUMBER(7) CONSTRAINT txnc_nn NOT NULL, interest_paid NUMBER(7,3) CONSTRAINT intp_nn NOT NULL, fees_charged NUMBER(7,3) CONSTRAINT feec_nn NOT NULL) PARTITION BY RANGE (snapshot_date) SUBPARTITION BY LIST (state) SUBPARTITION TEMPLATE( PARTITION northwest VALUES ('OR', 'WA'), PARTITION southwest VALUES VALUES ('AZ', 'UT', 'NM'), PARTITION southeast VALUES ('FL', 'GA','SC','AL','TN','NC'), PARTITION rest VALUES (DEFAULT)))( PARTITION q1_2002 VALUES LESS THAN(TO_DATE('1-APR-2002','DD-MON-YYYY')), PARTITION q2_2002 VALUES LESS THAN(TO_DATE('1-JUL-2002','DD-MON-YYYY')), PARTITION q3_2002 VALUES LESS THAN(TO_DATE('1-OCT-2002','DD-MON-YYYY')), PARTITION q4_2002 VALUES LESS THAN(TO_DATE('1-JAN-2003','DD-MON-YYYY')));

  45. Capitalizing on - Partitioning External Table to Partition using INSERT /*+APPEND*/ File(s) Target Source • FTP 02Nov2003_Sales Extract • “Alter Table” add file(s) to location • Perform INSERT /*+Append*/ INTO partitioned table

  46. Capitalizing on – Partitioning PQ1 Sales Extract External PQ2 INSERT/*+APPEND*/ INTO SALES (PQ3) SELECT * FROM SALES_EXTRACT_EXTERNAL PQ3

  47. Capitalizing on – Partitioning • What to partition: • Fact Tables • Generally Range-Hash composite • Range for some date (partition elimination) • Hash on the driving dimension key (partition-wise join) • Dimension • Hashing on the primary key of dimension tables facilitates full and partial partition-wise joins. • To for a full partition-wise join between a fact and dimension table you need to hash partition on the same key in the same number of buckets. • Materialized Views • Generally Range-Hash composite • Generally mirror the fact tables partition scheme

  48. Capitalizing on – Materialized Views Materialized Views • Enable queries to be re-written to take advantage of pre-calculated summaries thus reducing or eliminating sorts and joins. • Materialized Views can dramatically increase performance of queries when applied judiciously: • Reduce the number of sorts • Reduce the number of joins • Pre-filters data • Can be Indexed and Partitioned • Seamless to end users • Enhancements in 9i include • Removed restrictions enabling them to be leveraged in more situations • Fast refresh is now possible on a materialized views containing the UNION ALL operator.

  49. Capitalizing on – Materialized Views • Materialized Views cont… SELECT T.FISCAL_QUARTER_DESC, C.CUST_CITY, COUNT(*) AS SALE_COUNT, SUM(S.AMOUNT_SOLD) AS SALE_DOLLARS FROM oradata.CUSTOMERS C, oradata.SALES S, oradata.TIMES T WHERE C.CUST_ID = S.CUST_ID AND T.TIME_ID = S.TIME_ID GROUP BY T.FISCAL_QUARTER_DESC, C.CUST_CITY Query is re-written to be resolved from the MV instead of from the base tables

More Related