
Diagnosing and Improving Query Performance through EXPLAIN plan analysis






Presentation Transcript


  1. Diagnosing and Improving Query Performance through EXPLAIN plan analysis Calisto Zuzarte IBM calisto@ca.ibm.com

  2. Agenda • Basics • Some Not So Good Plans • Analyzing Plans • Diagnostics • Cardinality Estimation correction • Expensive Operations • An Example • Summary

  3. Optimizer Concepts 101 • Evaluates operation order • Joins, predicate application, aggregation • Evaluates the implementation to use: • table scan vs. index scan • nested-loop join vs. sort-merge join vs. hash join • Evaluates parallel join strategies • Co-located join vs. directed join vs. broadcast join vs. repartitioned join • Enumerates alternative plans • Costs the alternatives and chooses the optimum based on estimated cost

  4. Is The Optimizer Perfect ? YES ! • No, not really … we are always improving it • It is a model • There are thousands of variables • It depends on the accuracy of the information provided as input • If you can help provide better information to the optimizer, it has a chance of doing a better job

  5. How does the EXPLAIN facility help ? • Offers clues as to why the optimizer has made particular decisions • Allows the user to maintain a history of problem query access plans during key transition periods • New index additions • Large data updates/additions • RUNSTATS changes • Release to Release migration • Significant DB or DBM configuration changes • Problem determination is easier, and often faster with a reference plan to compare against

  6. Invoking db2exfmt • Create the EXPLAIN tables: ~/sqllib/misc/EXPLAIN.DDL • Populate the EXPLAIN tables • EXPLAIN PLAN FOR <SQL statement> • For dynamic SQL … SET CURRENT EXPLAIN MODE • For static SQL, set the EXPLAIN bind option to YES • Invoke db2exfmt • db2exfmt -d <dbname> -1 -o <filename> (an end-to-end sketch follows below)
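
      For reference, a minimal end-to-end sequence for a dynamic statement might look like the
      following sketch; the database name (sample), the query, and the output file name are only
      illustrative and not from the original slides.

         db2 CONNECT TO sample
         db2 -tf ~/sqllib/misc/EXPLAIN.DDL             # create the EXPLAIN tables (once per database)
         db2 SET CURRENT EXPLAIN MODE EXPLAIN          # capture access plans without executing statements
         db2 "SELECT * FROM calisto.t3 WHERE c2 = 0"   # the statement to be explained
         db2 SET CURRENT EXPLAIN MODE NO
         db2exfmt -d sample -1 -o query1.exfmt         # format the latest captured plan with default options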

  7. Global Information

     DB2_VERSION:       09.05.4
     SOURCE_NAME:       SQLC2G15
     SOURCE_SCHEMA:     NULLID
     SOURCE_VERSION:
     EXPLAIN_TIME:      2010-03-10-15.29.00.983210
     EXPLAIN_REQUESTER: CALISTO

     Database Context:
     ----------------
     Parallelism:          Inter-Partition Parallelism
     CPU Speed:            2.479807e-07
     Comm Speed:           100
     Buffer Pool size:     114500
     Sort Heap size:       20000
     Database Heap size:   3615
     Lock List size:       30000
     Maximum Lock List:    80
     Average Applications: 1
     Locks Available:      1536000

     Package Context:
     ---------------
     SQL Type:           Dynamic
     Optimization Level: 5
     Blocking:           Block All Cursors
     Isolation Level:    Cursor Stability

     STATEMENT 1  SECTION 203
     ------------------------
     QUERYNO:        1
     QUERYTAG:
     Statement Type: Select
     Updatable:      No
     Deletable:      No
     Query Degree:   1

  8. Original and Rewritten Statement

     Original Statement:
     ------------------
     SELECT *
     from V1
     where V1.C1=(select C1 from CALISTO.T3 where C2=0)

     Optimized Statement:
     -------------------
     SELECT Q7.$C0 AS "C1", Q7.$C1 AS "C2"
     FROM (SELECT Q1.C1
           FROM CALISTO.T3 AS Q1
           WHERE (Q1.C2 = 0)) AS Q2,
          (SELECT Q3.C1, Q3.C2
           FROM CALISTO.T2 AS Q3
           UNION ALL
           SELECT Q5.C1, Q5.C2
           FROM CALISTO.T1 AS Q5) AS Q7
     WHERE (Q7.$C0 = Q2.$C0)

  9. Plan Graph

     Access Plan:
     ------------
     Total Cost: 121.438

                   Rows
                  RETURN
                  (   1)
                   Cost
                    I/O
                     |
                   10.96
                  NLJOIN
                  (   2)
                  591.438
                    51
                /---+---\
             1             10.96
          TBSCAN          FILTER
          (   3)          (   4)
          38.8478         512.183
             1              50
             |               |
            121            9900
      TABLE: CALISTO       UNION
            T3            (   5)
                          492.8183
                            50
                      /------+-----\
                   8530           1370
                  TBSCAN         TBSCAN
                  (   6)         (   7)
                  126.422        360.409
                    30             20
                    |               |
                  1530            1370
            TABLE: CALISTO  TABLE: CALISTO
                  T2              T1

  10. EXPLAIN Diagnostic Messages

      Extended Diagnostic Information:
      --------------------------------
      Diagnostic Identifier: 1
      Diagnostic Details:    EXP0148W  The following MQT or statistical view was
                             considered in query matching: "CALISTO"."REPLICATED_DIM_TIME".
      Diagnostic Identifier: 2
      Diagnostic Details:    EXP0148W  The following MQT or statistical view was
                             considered in query matching: "CALISTO"."MQT_FCT_CUR_MONTH".
      Diagnostic Identifier: 3
      Diagnostic Details:    EXP0149W  The following MQT was used (from those
                             considered) in query matching: "CALISTO"."REPLICATED_DIM_TIME".

  11. The RETURN Operator

      1) RETURN: (Return Result)
         Cumulative Total Cost:        489328
         Cumulative CPU Cost:          1.68991e+12
         Cumulative I/O Cost:          56936.9
         Cumulative Re-Total Cost:     249.906
         Cumulative Re-CPU Cost:       1.00777e+09
         Cumulative Re-I/O Cost:       0
         Cumulative First Row Cost:    84081.3
         Cumulative Comm Cost:         7.42953e+07
         Cumulative First Comm Cost:   6.14019e+07
         Estimated Bufferpool Buffers: 4843

         Arguments:
         ---------
         BLDLEVEL: (Build level)
                   DB2 v9.5.0.4
         ENVVAR  : (Environment Variable)
                   DB2_ANTIJOIN = EXTEND
         ENVVAR  : (Environment Variable)
                   DB2_REDUCED_OPTIMIZATION = YES
         HEAPOVER: (Overcommitted on concurrent sortheap usage)
                   FALSE
         HEAPUSE : (Maximum Statement Heap Usage)
                   5808 Pages
         PREPNODE: (Prepare Node Number)
                   1
         PREPTIME: (Statement prepare time)
                   2004 milliseconds
         SHEAPCAP: (Cap on concurrent sortheap usage)
                   100 %
         STMTHEAP: (Statement heap size)
                   200000

  12. Operator Details

      2) HSJOIN: (Hash Join)
         Cumulative Total Cost:        2079.53
         Cumulative CPU Cost:          3.95132e+07
         Cumulative I/O Cost:          126
         Cumulative Re-Total Cost:     2079.53
         Cumulative Re-CPU Cost:       3.95132e+07
         Cumulative Re-I/O Cost:       126
         Cumulative First Row Cost:    2079.53
         Estimated Bufferpool Buffers: 126

         Arguments:
         ---------
         BITFLTR : (Hash Join Bit Filter used)
                   FALSE
         EARLYOUT: (Early Out flag)
                   LEFT
         HASHCODE: (Hash Code Size)
                   24 BIT
         OUTERJN : (Outer Join type)
                   LEFT (ANTI)
         TEMPSIZE: (Temporary Table Page Size)
                   4096

         Predicates:
         ----------
         2) Predicate used in Join
            Relational Operator:     Equal (=)
            Subquery Input Required: No
            Filter Factor:           0.0001

            Predicate Text:
            --------------
            (Q2.C1 = Q1.C1)

         Input Streams:
         -------------
         2) From Operator #3
            Estimated number of rows: 10000
            Number of columns:        3
            Subquery predicate ID:    Not Applicable

            Column Names:
            ------------
            +Q2.C3+Q2.C2+Q2.C1

         4) From Operator #4
            Estimated number of rows: 10000
            Number of columns:        1
            Subquery predicate ID:    Not Applicable

            Column Names:
            ------------
            +Q1.C1

         Output Streams:
         --------------
         5) To Operator #1
            Estimated number of rows: 0.0001
            Number of columns:        3
            Subquery predicate ID:    Not Applicable

            Column Names:
            ------------
            +Q4.C3+Q4.C2+Q4.C1

  13. Object Details

      Name:   TIME_DIM
      Schema: CALISTO
      Number of Columns:             50
      Number of Pages with Rows:     20
      Number of Pages:               20
      Number of Rows:                5120
      Table Overflow Record Count:   0
      Width of Rows:                 590
      Time of Creation:              2009-12-16-23.20.43.601285
      Last Statistics Update:        2010-02-15-03.25.54.394067
      Primary Tablespace:            TEST
      Tablespace for Indexes:        TEST
      Tablespace for Long Data:      NULLP
      Number of Referenced Columns:  2
      Number of Indexes:             2
      Volatile Table:                No
      Number of Active Blocks:       -1
      Number of Column Groups:       0
      Number of Data Partitions:     1
      Average Row Compression Ratio: 0.000000
      Percent Rows Compressed:       0.000000
      Average Compressed Row Size:   0
      Statistics Type:               U

  14. Identifying Expensive Operators

                              3.00579e+6
                               >NLJOIN
                               (  10)
                             1.54341e+06
                                548903
                        /---------+---------\
                3.00579e+6                     1
                  HSJOIN                     IXSCAN
                  (  11)                     (  15)
                1.23169e+06                  311718
                  284686                        0
               /------+------\                  |
        1.01317e+07        215511             1317
          TBSCAN             BTQ         INDEX: CALISTO
          (  12)           (  13)          IDX_TBL_F
          116551         1.07116e+06           Q5
           93464           181971
             |                |
        1.01317e+07        215511
        TABLE: CALISTO      TBSCAN
            TBL_E           (  14)
             Q3           1.07016e+06
                            181971
                               |
                            215511
                        TABLE: CALISTO
                            TBL_D

      • Cardinality is the number above the operator
      • Cost is the number just below the operator number
      • Operator cost is the increase in the cost compared to the previous operator
        (or the sum of all input operator costs)
      • Number of I/Os is just below the cost

  15. Example: Try To Avoid A SORT On The Inner Of A NLJN ?

                               33467.5
                               NLJOIN
                               (  11)
                               15491.5
                               10529.4
                     /-----------+-----------\
                 267.277                    125.216
                   BTQ                       FETCH
                 (  12)                     (  17)
                 884.75                     55.7205
                 376.972                    37.9624
                    |                    /------+-----\
                 13.3638             125.216        3.16068e+08
                  FETCH               RIDSCN     DP-TABLE: CALISTO
                 (  13)               (  18)        TRANSFACT
                 884.666              46.1653          Q10
                 376.972              5.96244
              /---+---\                  |
          4686.66      18722          125.216
          RIDSCN   TABLE: CALISTO      SORT
          (  14)      TIMEDIM         (  19)
          529.869       Q9            46.1651
            68                        5.96244
             |                           |
          4686.66                     125.216
           SORT                       IXSCAN
          (  15)                      (  20)
          529.869                     46.1508
            68                        5.96244
             |                           |
          4686.66                   3.16068e+08
          IXSCAN                   INDEX: CALISTO
          (  16)                   TRANSFACT_IDX
          528.787                       Q10

      • Problem ?
      • "List-prefetch" chosen because of poor clustering
      • Solution ?
      • Index-Only Access ?
      • REORG ?

  16. Example: OR Predicates

      • Problem
        (C1 = 5 AND C2 = 10) OR (C1 = 6 AND C2 = 12)
      • Large table scan to fetch a few rows

               7.96127
               TBSCAN
               (   2)
               3312.15
                 702
                  |
               298854
           TABLE: CALISTO
              BIGTABLE
                 Q1

      • Solution
      • Consider indexes for an Index-ORing plan (an index sketch follows below)

                    7.96127
                     FETCH
                     (   2)
                     230.236
                     6.9672
                  /----+----\
             7.96127        298854
             RIDSCN     TABLE: CALISTO
             (   3)        BIGTABLE
             30.2756          Q1
                4
           /-----+------\
          2              6
        SORT            SORT
        (   4)          (   6)
        15.138          15.138
          2               2
          |               |
          2               6
        IXSCAN          IXSCAN
        (   5)          (   7)
        15.1373         15.1373
          2               2
          |               |
        298854          298854
     INDEX: SAPP29   INDEX: SAPP29
        INDX~0          INDX~0
          Q1              Q1
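
      As a sketch of the solution bullet, an index whose leading columns cover both OR branches
      lets the optimizer build the index-ORing plan shown above; the index name and schema here
      are illustrative, not from the original slides.

         -- Hypothetical index supporting (C1 = 5 AND C2 = 10) OR (C1 = 6 AND C2 = 12)
         CREATE INDEX calisto.bigtable_c1_c2 ON calisto.bigtable (c1, c2);

         -- Refresh statistics so the optimizer can cost the new access path
         RUNSTATS ON TABLE calisto.bigtable WITH DISTRIBUTION AND INDEXES ALL;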

  17. IN Predicates

      • Problem
        C1 IN (10, 100, 500, 10000)
      • Large table scan to fetch a few rows

                  8
               TBSCAN
               (   2)
               3312.15
                 702
                  |
               298854
           TABLE: CALISTO
              BIGTABLE
                 Q1

      • Solution
      • Consider an index with the column
      • For example: an index on (C1, C3)
      • An index on (C2, C1) is also good if the query has an equality predicate on C2
        (a sketch follows below)

                          8
                       NLJOIN
                       (   9)
                       122.22
                         16
                  /-----+-----\
                 4              2
              TBSCAN          FETCH
              (  10)          (  13)
             0.014061         30.3206
                 0               4
                 |          /----+----\
                 4         2          298854
          TABFNC: SYSIBM  IXSCAN   TABLE: CALISTO
              GENROW      (   3)      BIGTABLE
                Q3        30.2756        Q1
                             3
                             |
                           298854
                       INDEX: SAPP29
                          INDX~0
                            Q1
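
      A sketch of the indexes suggested above; the schema and index names are illustrative.

         -- Index whose leading column matches the IN-list column
         CREATE INDEX calisto.bigtable_c1_c3 ON calisto.bigtable (c1, c3);

         -- Alternative if queries also carry an equality predicate on C2
         CREATE INDEX calisto.bigtable_c2_c1 ON calisto.bigtable (c2, c1);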

  18. Low Cardinality Estimates

                              3.00579e-5
                               >NLJOIN
                               (  10)
                             1.54341e+06
                                548903
                        /---------+---------\
                3.00579e-5                3.22724e-6
                  HSJOIN                    TBSCAN
                  (  11)                    (  15)
                1.23169e+06                 311718
                  284686                    264217
               /------+------\                 |
        1.01317e+07        215511         1.01317e+07
          TBSCAN             BTQ         TABLE: CALISTO
          (  12)           (  13)            TBL_F
          116551         1.07116e+06          Q5
           93464           181971
             |                |
        1.01317e+07        215511
        TABLE: CALISTO      TBSCAN
            TBL_E           (  14)
             Q3           1.07016e+06
                            181971
                               |
                            215511
                        TABLE: CALISTO
                            TBL_D

      • Problem
      • Unexpected low cardinality … 3e-22 !!
      • Table scan on the inner of a NLJN
      • Solution
      • Consider Column Group Statistics on columns used in HSJN(11) predicates
      • Consider Indexes on TBL_F

        RUNSTATS ON TABLE CALISTO.TBL_E
          ON ALL COLUMNS AND COLUMNS ((d_year, d_qtr_name))
          WITH DISTRIBUTION AND INDEXES ALL;

  19. Join Underestimation – Overloaded Dimensions

      SELECT MIN(d_date), MAX(d_date) FROM inventory
      • 1998-01-01 to 2002-12-26
      • 5-year spread between MIN and MAX

      SELECT MIN(d_date), MAX(d_date) FROM date_dim
      • 1900-01-02 to 2100-01-01
      • 200-year spread between MIN and MAX

      SELECT Inventory.*
      FROM Inventory, date_dim
      WHERE d_date_sk = inv_date_sk and d_quarter_name = '2000Q1';

      • ESTIMATE: ~1/800th of the fact table, assuming that the date keys corresponding to the
        year 2000 have the same probability of being in the fact table as those for the year 1955
      • REALITY: 1/20th of the fact table (20 quarters)

                            7241.95
                            HSJOIN
                            (   8)
                            3349.26
                            1526.6
                  /---------+--------\
              9091                  7241.95   <<<< UNDERESTIMATED!
             TBSCAN                 NLJOIN
             (   9)                 (  10)
             614.897                2732.68
               603                  923.603
                |              /--------+--------\
              9091         89.5342              80.8848
         TABLE: TPCDS        BTQ                 FETCH
             ITEM           (  11)              (  13)
              Q1            691.788             22.8145
                              655                  3
                               |              /---+---\
                            44.7671       80.8848      5.91947e+06
                            TBSCAN         IXSCAN     TABLE: TPCDS
                            (  12)         (  14)       INVENTORY
                            691.717        15.2153         Q3
                              655             2
                               |               |
                             36592        5.91947e+06
                         TABLE: TPCDS    INDEX: TPCDS
                           DATE_DIM       INV_INVDATE
                              Q2               Q3

                             299498
                            ^HSJOIN
                            (   8)
                            7098.88
                            2814.62
                  /---------+---------\
    BETTER >> 299498                  9091
             NLJOIN                  TBSCAN
             (   9)                  (  18)
             6453.95                 614.897
             2211.62                   603
        /--------+--------\             |
     89.5342            3345.06       9091
       MBTQ              FETCH    TABLE: TPCDS
      (  10)            (  14)        ITEM
      691.788           22.8145        Q1
        655                3
         |            /---+---\
      44.7671      80.8848     5.91947e+06
      TBSCAN       IXSCAN     TABLE: TPCDS
      (  12)       (  14)       INVENTORY
      691.717      15.2153         Q3
        655           2
         |            |
       36592     5.91947e+06
   TABLE: TPCDS  INDEX: TPCDS
     DATE_DIM     INV_INVDATE
        Q2            Q3

      CREATE VIEW V_inventory_date AS
        (SELECT date_dim.*, inv_date_sk
         FROM inventory, date_dim
         WHERE inv_date_sk = d_date_sk);
      ALTER VIEW V_inventory_date ENABLE QUERY OPTIMIZATION;
      RUNSTATS ON TABLE tpcds.v_inventory_date WITH DISTRIBUTION;

  20. SORT Spills

      • The details for a SORT will indicate if the SORT spilled
      • The I/Os below the SORT indicate that there was spilling associated with it
      • Minimize spills by considering indexes and by balancing SORTHEAP, SHEAPTHRES and
        BUFFERPOOL (also discussed later); a configuration sketch follows below

                   SORT
                  (  16)
                6.14826e+06
                1.30119e+06
                     |
                3.65665e+07
                  TBSCAN
                  (  17)
                2.00653e+06
                1.14286e+06
                     |
                3.74999e+07
            TABLE: TPCD.ORDERS
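
      A rough sketch of how the sort-related settings could be inspected and raised from the
      command line; the database name and the values are purely illustrative and should be
      sized against the real workload and its concurrency.

         db2 GET DB CFG FOR tpcd | grep -i sort           # current SORTHEAP and shared sort settings
         db2 UPDATE DB CFG FOR tpcd USING SORTHEAP 40000  # per-sort memory, in 4 KB pages
         db2 UPDATE DBM CFG USING SHEAPTHRES 400000       # instance-wide private sort memory threshold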

  21. Diagnostic Step 1 - Basics • Look for the obvious diagnostic messages • No RUNSTATS on table ? • Look at the global settings • BUFFERPOOL • SORTHEAP • OPTIMIZATION LEVEL • Settings in the RETURN operator • Any anomalies

  22. Diagnostic Step 2 – Cardinality Underestimation • Look for cardinality underestimation clues • Get rid of very small numbers like 3.00579e-5, if possible with column group statistics, when there are multiple predicates applied to a table • Get rid of underestimation because of skew in the fact table join columns • Get rid of underestimation because of overloaded dimensions • Range predicates on dates [DATECOL >= '2010-05-15'] • Try increasing the number of quantile statistics if estimates look off • Try not to put expressions around columns if possible • Re-EXPLAIN the query once you have made changes to try and correct the cardinality estimate (a RUNSTATS sketch follows below) Skew ? Extremely Small Cardinality ? Overloaded Dimensions ? Expressions ?
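
      As a sketch, column group statistics for correlated predicates and a larger number of
      quantiles for range predicates can be requested in a single RUNSTATS; the table name,
      column names and quantile count below are illustrative assumptions.

         -- Column group statistics for correlated local predicates, plus more quantiles
         -- for range predicates such as DATECOL >= '2010-05-15'
         RUNSTATS ON TABLE calisto.sales_fact
           ON ALL COLUMNS AND COLUMNS ((store_id, region_id))
           WITH DISTRIBUTION DEFAULT NUM_QUANTILES 100
           AND INDEXES ALL;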

  23. Diagnostic Step 3 – Operator Costs • Typically, look for the operators that are expensive relative to the total cost • Depending on the operator, there might be different possibilities • If there seems to be nothing you can easily do, consider the next most expensive operator • Let us consider some operators in the next few slides Most Expensive Operators

  24. Anomaly With Merge Join or Nested Loop Join ? • Occasionally the MSJN or NLJN cost may be less than the sum of its inputs !! • T1.C1 … values from 1 to 100 • T2.C1 … values from 100 to 200 • SELECT * FROM T1, T2 WHERE T1.C1 = T2.C1 • With distribution statistics, the optimizer can see that the two value ranges barely overlap, so the join can stop consuming its inputs early and cost less than their sum Distribution Statistics ?
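
      Distribution statistics on the join columns give the optimizer the value ranges it needs
      to model this; a minimal sketch, with the schema name assumed.

         RUNSTATS ON TABLE calisto.t1 WITH DISTRIBUTION AND INDEXES ALL;
         RUNSTATS ON TABLE calisto.t2 WITH DISTRIBUTION AND INDEXES ALL;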

  25. Expensive Table Scan • Are all the columns needed ? … if so, not much you can do • Only a few columns needed ? … is index-only access possible ? • Do the predicates filter significantly ? … does an ISCAN-FETCH make sense ? … are indexes for IN or OR predicates a possibility ? • Is this a fact table with aggregation ? Could we define a Materialized Query Table ? • Are similar local predicates frequently used on this table ? • Could it be partitioned by range ? • Could this be a Multi-Dimensional Clustering (MDC) table ? • Both MDC and range partitioning ? (sketches follow below) Indexes ? MQTs ? MDC ? Range Partitioning ?
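
      A sketch of what these physical-design options might look like for a hypothetical fact
      table; every name, date range and dimension below is made up for illustration and not
      taken from the original slides.

         -- Range partitioning plus multidimensional clustering on a hypothetical SALES fact table
         CREATE TABLE calisto.sales (
           sale_date  DATE          NOT NULL,
           region_id  INTEGER       NOT NULL,
           prod_id    INTEGER       NOT NULL,
           amount     DECIMAL(12,2)
         )
         PARTITION BY RANGE (sale_date)
           (STARTING '2009-01-01' ENDING '2010-12-31' EVERY 3 MONTHS)
         ORGANIZE BY DIMENSIONS (region_id);

         -- An aggregate MQT for frequently repeated aggregation over the same fact table
         CREATE TABLE calisto.mqt_sales_by_region AS
           (SELECT region_id, SUM(amount) AS amount, COUNT(*) AS cnt
            FROM calisto.sales
            GROUP BY region_id)
           DATA INITIALLY DEFERRED REFRESH DEFERRED;
         REFRESH TABLE calisto.mqt_sales_by_region;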

  26. Expensive ISCAN and FETCH • FETCH cost is very high ? OR • List-prefetch plans on the inner of nested-loop joins ? • Look at the CLUSTERRATIO (or CLUSTERFACTOR) • If it is not close to 100 (or 1), consider a REORG against this index if it is the key index used in most queries (this does not apply to Multi-Dimensional Clustering tables) • Is index-only access possible ? • If well-filtering, commonly used predicates are applied at the FETCH, consider adding the column or columns to the index (a sketch follows below) REORG ? Index Only ? Add Columns To An Index ?
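
      A sketch of the two remedies, reusing the TIME_DIM object from the earlier slide purely as
      an illustration; the index names and the column list are assumptions.

         -- Re-cluster the table data along its most heavily used index, then refresh statistics
         REORG TABLE calisto.time_dim INDEX calisto.time_dim_ix1;
         RUNSTATS ON TABLE calisto.time_dim WITH DISTRIBUTION AND INDEXES ALL;

         -- Widen a unique index with INCLUDE columns so frequent queries can be index-only
         CREATE UNIQUE INDEX calisto.time_dim_pk
           ON calisto.time_dim (d_date_sk)
           INCLUDE (d_date, d_quarter_name);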

  27. Expensive SORT • Is there spilling ? • Could you increase SORTHEAP or BUFFERPOOL ? … do not increase SORTHEAP too high if there are concurrent hash joins or sorts in this query, especially in a multi-user environment • Could you create an index to avoid the sort ? (a sketch follows below) • Note that an ISCAN-FETCH may be more expensive • Perhaps an index-only access ? SHEAPTHRES ? SORTHEAP ? Index ? BUFFERPOOL ?
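
      As a sketch, if the sort exists only to satisfy an ORDER BY or GROUP BY on a couple of
      columns, an index in that column order may let the optimizer drop the SORT entirely; the
      table, columns and query pattern here are assumptions.

         -- Assumed query pattern: ... FROM calisto.orders ... GROUP BY o_custkey, o_orderdate
         CREATE INDEX calisto.orders_cust_date ON calisto.orders (o_custkey, o_orderdate);
         RUNSTATS ON TABLE calisto.orders WITH DISTRIBUTION AND INDEXES ALL;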

  28. Expensive SORT (Continued)

      • Do you have a DISTINCT in a subquery ?
      • The optimizer sometimes postpones the duplicate removal
      • Consider using a GROUP BY instead of the DISTINCT (a rewrite sketch follows below)

                         4.90699e+08
                            SORT
                           (   7)
                         7.86401e+07
                         1.58631e+07
                              |
                         4.90699e+08
                           HSJOIN
                           (   8)
                           35685.3
                            34146
                 /-----------+-----------\
            3.25208e+06                 84979
              HSJOIN                    TBSCAN
              (   9)                    (  14)
              12161.1                   19697.9
               12484                     21662
           /----------+----------\         |
        606370               24719       84979
        ^HSJOIN              TBSCAN     CUSTOMER
        (  10)               (  13)
        10764.8              1322.24
         11038                1446
        /------+-------\        |
     606370        7671       39033
     TBSCAN       IXSCAN     PRODUCT
     (  11)       (  12)
     10667.2      61.6409
      11025         13
        |            |
     606370        7671
      FACT     INDEX: TIME_IDX
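
      A sketch of the rewrite with hypothetical table and column names.

         -- Original form: DISTINCT inside the subquery
         SELECT f.amount
         FROM   fact f
         WHERE  f.cust_id IN (SELECT DISTINCT c.cust_id
                              FROM customer c
                              WHERE c.region = 'EAST');

         -- Equivalent form: GROUP BY can encourage earlier duplicate removal
         SELECT f.amount
         FROM   fact f
         WHERE  f.cust_id IN (SELECT c.cust_id
                              FROM customer c
                              WHERE c.region = 'EAST'
                              GROUP BY c.cust_id);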

  29. Expensive Nested Loop Join • A TSCAN or SORT on the inner should be avoided as far as possible • Consider an index • Consider clustering if the SORT is for a list-prefetch plan • Is the outer large while you have an ISCAN-FETCH on the inner ? • Even if each ISCAN-FETCH is not so expensive, it could be executed thousands of times • See the earlier slide on expensive ISCAN and FETCH • Expression on a join predicate ? • Could you avoid the expression on the inner table column so that an index with start-stop keys can be used ? • Could you use generated columns with that expression ? (a sketch follows below) Expressions ? Avoid TSCAN or SORT on the Inner
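
      A sketch of the generated-column idea, assuming a hypothetical join predicate of the form
      UPPER(d.cust_name) = f.cust_name; all object names are illustrative.

         -- Materialize the expression as a generated column on the inner table
         ALTER TABLE calisto.dim_customer
           ADD COLUMN cust_name_uc VARCHAR(100)
           GENERATED ALWAYS AS (UPPER(cust_name));

         -- Populate/validate the generated values for existing rows
         -- (may be required after the ALTER, depending on the DB2 version)
         SET INTEGRITY FOR calisto.dim_customer IMMEDIATE CHECKED FORCE GENERATED;

         -- Index the generated column so the join can use start-stop keys
         CREATE INDEX calisto.dim_customer_uc ON calisto.dim_customer (cust_name_uc);
         RUNSTATS ON TABLE calisto.dim_customer WITH DISTRIBUTION AND INDEXES ALL;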

  30. Expensive Hash Join • Is there spilling ? • Optimizer estimates spilling if • HSJN I/O cost > sum of the I/O of the inputs • Hash Join uses SORTHEAP to keep the hash tables SORTHEAP ? BUFFERPOOL ? SHEAPTHRES ?

  31. Expensive Communication Between Partitions ? • Large Table Queues (TQs) • Consider replicated dimension tables • Could the large dimension table that is commonly joined be partitioned the same way as the fact table ? Replicated Tables Collocate largest Dimension with Fact
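
      A sketch of a replicated MQT for a small dimension table in a partitioned (DPF) database;
      the table names are illustrative and reuse TIME_DIM only as an example.

         -- Replicate a small dimension table to every database partition
         CREATE TABLE calisto.time_dim_rep AS
           (SELECT * FROM calisto.time_dim)
           DATA INITIALLY DEFERRED REFRESH DEFERRED
           DISTRIBUTE BY REPLICATION;
         REFRESH TABLE calisto.time_dim_rep;
         RUNSTATS ON TABLE calisto.time_dim_rep;

         -- Alternatively, distribute a large, commonly joined dimension on the same key as the
         -- fact table so the join is co-located, e.g. CREATE TABLE ... DISTRIBUTE BY HASH (d_date_sk)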

  32. TPCH Query Example

      SELECT nation, o_year, SUM(amount) AS sum_profit
      FROM (SELECT n_name as nation,
                   year(o_orderdate) as o_year,
                   l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
            FROM part, supplier, lineitem, partsupp, orders, nation
            WHERE s_suppkey = l_suppkey
              and ps_suppkey = l_suppkey
              and ps_partkey = l_partkey
              and p_partkey = l_partkey
              and o_orderkey = l_orderkey
              and s_nationkey = n_nationkey
              and p_name like '%coral%'
           ) AS profit
      GROUP BY nation, o_year
      ORDER BY nation, o_year desc

  33. Overall Access Plan

      Access Plan:
      -----------
                                        Rows
                                       RETURN
                                       (   1)
                                          |
                                         225
                                        GRPBY
                                       (   2)
                                          |
                                        36000
                                        MDTQ
                                       (   3)
                                          |
                                         225
                                        GRPBY
                                       (   4)
                                          |
                                         225
                                       TBSCAN
                                       (   5)
                                          |
                                         225
                                        SORT
                                       (   6)
                                          |
                                     2.2704e+07
                                       HSJOIN
                                       (   7)
                          /--------------+--------------\
                     2.2704e+07                        633503
                         DTQ                           HSJOIN
                       (   8)                          (  18)
                          |                        /------+-----\
                     2.2704e+07                 633503           25
                       HSJOIN                   TBSCAN          BTQ
                       (   9)                   (  19)        (  20)
                  /------+------\                  |             |
           5.07707e+07      2.45672e+07         633503           25
             TBSCAN             DTQ          TABLE: TPCD       FETCH
             (  10)           (  11)           SUPPLIER       (  21)
                |                |                         /----+----\
           5.07707e+07      2.45672e+07                  25           25
           TABLE: TPCD        HSJOIN                  IXSCAN     TABLE: TPCD
             PARTSUPP         (  12)                  (  22)       NATION
                          /-------+-------\              |
                    9.5221e+07       2.45672e+07         25
                      TBSCAN           HSJOIN       INDEX: TPCD
                      (  13)           (  14)           N_NK
                         |          /---+---\
                    9.5221e+07  3.80893e+08  1.30986e+08
                    TABLE: TPCD   TBSCAN         BTQ
                      ORDERS      (  15)       (  16)
                                     |            |
                               3.80893e+08     818664
                               TABLE: TPCD     TBSCAN
                                 LINEITEM      (  17)
                                                  |
                                             1.26927e+07
                                             TABLE: TPCD
                                                 PART

  34. Summary • The db2exfmt output provides very detailed information about the access plan • There are some diagnostics provided by DB2 • Correct major cardinality estimation errors • If estimates are reasonable look at operator costs and consider ways to reduce the cost of the plan

  35. Calisto Zuzarte  calisto@ca.ibm.com
