1 / 21

The Model Clause explained

The Model Clause explained. Tony Hasler, UKOUG Birmingham 2012. Who is Tony Hasler?. Google Tony Hasler! My blog contains all the material from this presentation on the front page Additional blog entry (October) on the model clause. Health Warning about the model clause.

umika
Download Presentation

The Model Clause explained

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Model Clause explained Tony Hasler, UKOUG Birmingham 2012 Tony Hasler, Anvil Computer Services Ltd.

  2. Who is Tony Hasler? Google Tony Hasler! My blog contains all the material from this presentation on the front page Additional blog entry (October) on the model clause Tony Hasler http://tonyhasler.wordpress.com

  3. Health Warning about the model clause • No support for Active Session History (ASH) whilst on the CPU • Limited support for SQL Performance Monitor • No session statistics related to modelling • No trace events for diagnosis (that I can find) • Performance is poor when data spills to disk • Introduced in 10g but no new features in 11g • The main example in chapter 22 (10g) or chapter 23 (11g) in the Data Warehousing Guide (calculating mortgage amortization) is incorrect and always has been (mortgage fact ‘Payment’ should be ‘PaymentAmt’) • Perhaps this is not a flagship component! • However, the model clause: • Enables you to do stuff that cannot otherwise be (easily) done in SQL • The parallelisation capabilities can be useful in some cases Tony Hasler http://tonyhasler.wordpress.com

  4. What is in this talk and what is not • IN: • What the model clause is • What it is used for and what it isn’t • Syntax and semantics of the main features • Recommendations on how to use the model clause • A real life example • Not in due to time constraints: • An exhaustive list of all model clause features • This can be found in the Data Warehousing Guide Chapter 22 (10g) or Chapter 23 (11g) • No demos Tony Hasler http://tonyhasler.wordpress.com

  5. What the model clause is and what is it used for Part of a query block meaning it can appear after any select keyword. Provides spreadsheet like capabilities inside SQL Can perform unimplemented SQL features, e.g. moving medians. => Just because something is inefficient to execute doesn’t mean you don’t need to do it! Iteration is possible, such as with ‘Feedback loops’ (see Example 4   Calculating Using Simultaneous Equations in the Data Warehousing guide) A series of calculations within the model can be sequenced manually or dependencies can be determined by the engine (only use when necessary, not because of laziness) Tony Hasler http://tonyhasler.wordpress.com

  6. Model terms compared with Excel Rows/columns <-> Dimensions Values <-> Measures Formulas <-> Rules Worksheets <-> Partitions Tony Hasler http://tonyhasler.wordpress.com

  7. Comparison of Model clause terms with Excel spreadsheet terms Tony Hasler http://tonyhasler.wordpress.com

  8. Overview of a query block with no model clause Joins and outer join predicates Only scalar functions allowed Logically precedes Only scalar functions allowed WHERE clause: inner join predicates and selection predicates Analytic calculations Logically precedes GROUP BY clause Only scalar functions allowed Logically precedes HAVING clause Scalar and aggregate functions allowed Logically precedes SELECT list Scalar, aggregate, and analytic functions allowed Logically precedes DISTINCT Logically precedes ORDER BY clause Scalar, aggregate, and analytic clauses allowed Tony Hasler http://tonyhasler.wordpress.com

  9. Overview of a query block with a model clause Analytic calculations Joins, join predicates, selection predicates and GROUP BY clause Only scalar functions allowed Logically precedes Scalar and aggregate functions allowed HAVING clause Logically precedes Model PARTITION, DIMENSION and MEASURES clauses Scalar, aggregate, and analytic functions allowed Logically precedes Model RULES clause Independent calculations Logically precedes Only scalar functions on model outputs allowed SELECT list Logically precedes DISTINCT Logically precedes Only scalar functions on model outputs allowed ORDER BY clause Tony Hasler http://tonyhasler.wordpress.com

  10. Recommendations on use of the model clause Use factored subqueries to avoid mixing aggregate functions and analytic functions with the model clause Use only scalar values as inputs to the model clause Perform all calculations in the model clause so that… The final select list is just a simple list of identifiers Be careful using the model clause in a sub query as the CBO assumes that the cardinality of the output of a model clause is the same as the cardinality of the input. Tony Hasler http://tonyhasler.wordpress.com

  11. A simplified version of the real world problem CREATETABLE stock_holdings ( customer_name VARCHAR2(100) ,stock_name VARCHAR2(100) ,business_date DATE ,VALUENUMBER ); Calculate the moving one year average (mean) valueand moving standard deviation, and zscore for each customer, stock, and business date. Tony Hasler http://tonyhasler.wordpress.com

  12. Mathematical terms loosely explained The Average (mean) is a “typical value” The Standard Deviation is a “typical difference between a value and the mean” The Zscore is the number of standard deviations that a particular value deviates from the mean. In other words it is a measure of how unusual a value is. Tony Hasler http://tonyhasler.wordpress.com

  13. Problem 1: Oracle doesn’t support moving intervals of months or years properly!!! WITH q1 AS(SELECTDATE'2012-01-01'+ROWNUM mydate FROM DUAL CONNECTBYLEVEL<=400) ,q2 AS(SELECT mydate ,COUNT(*) OVER( ORDERBY mydate RANGEBETWEENINTERVAL'1'YEARPRECEDINGANDCURRENTROW) cnt FROM q1) SELECT * FROM q2 WHERE mydate =DATE'2013-01-10'; Result: MYDATE CNT 10/01/2013 367 Tony Hasler http://tonyhasler.wordpress.com

  14. Model clause solution (1) WITH q1 AS(SELECTDATE'2012-01-01'+ROWNUM mydate,0 cnt FROM DUAL CONNECTBYLEVEL<=400) SELECT mydate, cnt FROM q1 MODELRETURNUPDATEDROWS DIMENSIONBY(mydate) MEASURES(cnt) RULES (cnt [DATE'2013-01-10'] = COUNT(*) [mydate BETWEENADD_MONTHS(CV()+1,-12)ANDCV()]); Tony Hasler http://tonyhasler.wordpress.com

  15. Model clause solution (2) CREATETABLE business_dates ( business_date PRIMARYKEYNOTNULL ,business_days_in_year NOTNULL ,first_day_in_year NOTNULL ,business_day_number NOTNULL ) ORGANIZATIONINDEX AS WITH q1 AS( SELECTDISTINCT business_date,0AS business_days_in_year, SYSDATEAS first_day_in_year,0AS business_day_number FROM stock_holdings ) SELECT business_date ,business_days_in_year ,first_day_in_year ,business_day_number FROM q1 MODEL DIMENSIONBY(business_date) MEASURES(business_days_in_year,first_day_in_year,business_day_number) RULES( first_day_in_year[ANY] =ADD_MONTHS(CV(business_date)+1,-12), business_days_in_year[ANY] = COUNT(*)[business_date BETWEEN first_day_in_year[CV()] ANDCV()], business_day_number[ANY] = ROW_NUMBER()OVER(ORDERBY BUSINESS_DATE)); Tony Hasler http://tonyhasler.wordpress.com

  16. Execution plan for previous statement ------------------------------------------------------- | Id | Operation | Name | ------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | 1 | LOAD AS SELECT | BUSINESS_DATES | | 2 | SQL MODEL ORDERED | | | 3 | VIEW | | | 4 | HASH UNIQUE | | | 5 | TABLE ACCESS FULL | STOCK_HOLDINGS | | 6 | WINDOW (IN SQL MODEL) SORT| | ------------------------------------------------------- Tony Hasler http://tonyhasler.wordpress.com

  17. Problem 2: avoiding data densification Oracle recommends the use of partitioned outer joins for data densification to simplify analytic functions. The model clause can also be used for data densification In my case, however, this would have multiplied the number of rows enormously as customers tended to hold particular stocks for about 10% of the time. We can use the model clause to avoid this by performing multiple calculations in sequence Tony Hasler http://tonyhasler.wordpress.com

  18. Model clause solution SELECT customer_name,stock_name,business_date,first_day_in_year,VALUE,mov_avg,mov_stdd,zscore FROM stock_holdings sh, business_dates bd WHERE sh.business_date=bd.business_date MODEL PARTITIONBY(customer_name,stock_name) DIMENSIONBY(sh.business_date) MEASURES(VALUE,business_days_in_year, first_day_in_year,0AS mov_avg, 0AS mov_stdd,0AS zscore,0AS mov_sum,0AS mov_cnt) RULES( mov_sum[ANY] =SUM(VALUE)[business_date BETWEEN first_day_in_year[CV()]ANDCV()], mov_cnt[ANY] =COUNT(*) [business_date BETWEEN first_day_in_year[CV()] ANDCV()], mov_avg[ANY] = mov_sum[CV()]/ business_days_in_year[CV()], mov_stdd[ANY] = SQRT(( (VAR_POP(VALUE)[business_date BETWEEN first_day_in_year[CV()] ANDCV()] *mov_cnt[CV()]) +(mov_sum[CV()]* (AVG(VALUE)[business_date BETWEEN first_day_in_year[CV()] ANDCV()] - mov_avg[CV()]))) /business_days_in_year[CV()] ), zscore[ANY] =DECODE(mov_stdd[CV()],0,0,(VALUE[CV()]-mov_avg[CV()])/mov_stdd[CV()]) ); Tony Hasler http://tonyhasler.wordpress.com

  19. Parallel execution plan -------------------------------------------------------- | Id | Operation | Name | -------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | PX COORDINATOR | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | | 3 | BUFFER SORT | | | 4 | SQL MODEL ORDERED | | | 5 | PX RECEIVE | | | 6 | PX SEND HASH | :TQ10000 | | 7 | NESTED LOOPS | | | 8 | PX BLOCK ITERATOR | | | 9 | TABLE ACCESS FULL| STOCK_HOLDINGS | |* 10 | INDEX UNIQUE SCAN | SYS_IOT_TOP_78729 | -------------------------------------------------------- Tony Hasler http://tonyhasler.wordpress.com

  20. Summary The model clause allows you to build your own analytical functions and/or your own analytical windows – amongst other things. The model clause also allows you to parallelise calculations and can be useful even if the calculations are supported without the model clause However, model clause aggregates will be slower than standard analytic functions Performance degrades rapidly when partitions spill to disk. Tony Hasler http://tonyhasler.wordpress.com

  21. Questions Tony Hasler http://tonyhasler.wordpress.com

More Related