1 / 75

Modern Performance - SQL Server

Modern Performance - SQL Server. Joe Chang www.qdpma.com Jchang6 @ yahoo. About Joe. SQL Server consultant since 1999 Query Optimizer execution plan cost formulas (2002) True cost structure of SQL plan operations (2003?) Database with distribution statistics only, no data 2004

Download Presentation

Modern Performance - SQL Server

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modern Performance - SQL Server Joe Chang www.qdpma.com Jchang6 @ yahoo

  2. About Joe • SQL Server consultant since 1999 • Query Optimizer execution plan cost formulas (2002) • True cost structure of SQL plan operations (2003?) • Database with distribution statistics only, no data 2004 • Decoding statblob/stats_stream • writing your own statistics • Disk IO cost structure • Tools for system monitoring, execution plan analysis See http://www.qdpma.com/ Download: http://www.qdpma.com/ExecStatsZip.html Blog: http://sqlblog.com/blogs/joe_chang/default.aspx

  3. Overview • General SQL Server Performance • Why performance is still important today? • Brute force? • Yes, but … • Special Topics – spectacular fails • Automating data collections • SQL Server Engine • What developers/DBA need to know?

  4. Not in this session • List of rules to be followed blindly • without consideration for the underlying reason • and whether rule actually applies in the current circumstance DBA skill: cause and effect analysis & assessment

  5. Common Themes? • execution plan • Very large (multiple order of magnitude) error in row estimate • Single (execute) of large operation • Might still be tolerable • Multiple (executes) of large operations

  6. selecta.Header,a.CUSIP,a.SecNo,a.Security,a.Symbol,a.Split_rep,a.Sales_Person_Name,cast(sum(a.January)asfloat)as January ,cast(sum(a.February)asfloat)asFebruary ,cast(sum(a.March)asfloat)as March ,cast(sum(a.April)asfloat)asApril ,cast(sum(a.May)asfloat)as May ,cast(sum(a.June)asfloat)asJune ,cast(sum(a.July)asfloat)as July ,cast(sum(a.August)asfloat)asAugust ,cast(sum(a.September)asfloat)as September ,cast(sum(a.October)asfloat)as October ,cast(sum(a.November)asfloat)as November ,cast(sum(a.December)asfloat)as December ,cast(sum(a.Total)asfloat)as Total from( select cast(hdr.Headerasvarchar(100))as Header ,cast(AcctSec.CUSIPasvarchar(100))as CUSIP ,cast(AcctSec.Sec_Noasvarchar(100))asSecNo ,cast(AcctSec.Sec_Desc1 asvarchar(100))asSecurity ,cast(AcctSec.Symbolasvarchar(100))as Symbol ,casewhenRefMonth.[MonthName] ='January'thenfct.Commelse 0 endas January ,casewhenRefMonth.[MonthName] ='February'thenfct.Commelse 0 endas February ,casewhenRefMonth.[MonthName] ='March'thenfct.Commelse 0 endas March ,casewhenRefMonth.[MonthName] ='April'thenfct.Commelse 0 endas April ,casewhenRefMonth.[MonthName] ='May'thenfct.Commelse 0 endas May ,casewhenRefMonth.[MonthName] ='June'thenfct.Commelse 0 endas June ,casewhenRefMonth.[MonthName] ='July'thenfct.Commelse 0 endas July ,casewhenRefMonth.[MonthName] ='August'thenfct.Commelse 0 endas August ,casewhenRefMonth.[MonthName] ='September'thenfct.Commelse 0 endas September ,casewhenRefMonth.[MonthName] ='October'thenfct.Commelse 0 endas October ,casewhenRefMonth.[MonthName] ='November'thenfct.Commelse 0 endas November ,casewhenRefMonth.[MonthName] ='December'thenfct.Commelse 0 endas December ,fct.Commas Total ,AcctEmp.split_rep ,AcctEmp.Sales_Person_Name fromPayoutSystemDW.[dbo].[PS_FactAccountSummary] fct joinPayoutSystemDW.dbo.PS_DimensionRptBusRptBusonfct.DimRptBusID=RptBus.DimRptBusID joinPayoutSystemDW.dbo.PS_DimensionHeaderhdronfct.DimHeaderID=hdr.DimHeaderID joinPayoutSystemDW.dbo.PS_DimensionCurrencycur onfct.DimCurID=cur.DimCurIDandcur.DimCurID= 1 joinPayoutSystemDW.dbo.PS_DimensionAcctEmpAcctEmponfct.DimAcctEmpID=acctemp.DimAcctEmpIDandAcctEmp.Empno= 8125 andAcctEmp.Split_repin('PB54') joinPayoutSystemDW.dbo.PS_DimensionAcctSecAcctSeconfct.DimAcctSecID=AcctSec.DimAcctSecID joinPayoutSystemDW.dbo.PS_DimensionRefBuySellbsonfct.DimRefBuySellID=bs.DimRefBuySellID joinPayoutSystemDW.[dbo].[PS_DimensionAcctOrg] AcctOrgonfct.DimAcctOrgID=AcctOrg.DimAcctOrgIDandAcctOrg.OrgCodein('38C') joinPayoutSystemDW.[dbo].[PS_DimensionAcctClt] asAcctCltonAcctClt.DimAcctCltID=AcctClt.DimAcctCltID andAcctClt.ClientName='BRACY DENNIS M' joinPayoutSystemDW.dbo.PS_DimensionTradeIndtionti.DimTradeIndID=fct.DimTradeIndIDandti.[Trade_Ind_Year] = 2014 joinPayoutSystemDW.dbo.PS_DimensionRefMonthRefMonthonRefMonth.MonthID=ti.Trade_Ind_Month whereRptBus.ReportID= 1 )a groupbya.Header,a.CUSIP,a.SecNo,a.Security,a.Symbol,a.Split_rep,a.Sales_Person_Name

  7. select fct.Commas Total, … From FactAccountSummaryfct join DimensionRptBusRptBus on fct.DimRptBusID = RptBus.DimRptBusID Join DimensionCurrencycur on fct.DimCurID = cur.DimCurID join DimensionRefBuySellbs on fct.DimRefBuySellID = bs.DimRefBuySellID join DimensionAcctOrg] AcctOrg on fct.DimAcctOrgID = AcctOrg.DimAcctOrgID joinDimensionAcctCltasAcctClt onAcctClt.DimAcctCltID=AcctClt.DimAcctCltID

  8. CPU & Memory 2001 versus 2014 DMI 2 PCI-E P P L2 QPI FSB MCH QPI P P QPI QPI 2001 – 4 sockets, 4 cores Pentium III Xeon, 900MHz 4-8GB memory? Xeon MP 2002-4 QPI DMI 2 PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E Each core today is more than 10x over Pentium III (700MHz?) Xeon E7 v2 (Ivy Bridge), 15 cores, 3 QPI 4 x 15 = 60 cores 3TB (96 x 32GB) 24 DIMMs per socket 40 PCI-E gen3 lanes + x4 g2 / socket QPI QPI QPI QPI PCI-E PCI-E PCI-E PCI-E C4 C4 C4 C4 LLC LLC LLC LLC 14 C5 14 C5 14 C5 C5 14 C3 C3 C3 C3 C6 C6 C6 C6 13 13 13 13 C2 C2 C2 C2 C7 12 12 C7 C7 12 12 C7 C1 C1 C1 C1 11 11 C8 11 11 C8 C8 C8 C0 C0 C0 C0 C9 C9 10 10 C9 10 10 C9 MI MI MI MI MI MI MI MI GFX Mem___2013 __ 2014 16GB __ $191 __ $180 32GB __ $794 __ $650 64GB_______$4510 MC x4 x4 x4 x4 DMI PCH

  9. CPU & Memory 2001 versus 2012 DMI 2 PCI-E P P QPI QPI PCI-E PCI-E QPI QPI PCI-E PCI-E L2 C3 C3 LLC LLC C4 C4 C3 C3 LLC LLC C4 C4 QPI C2 C2 C5 C5 C2 C2 C5 C5 FSB C1 C1 C6 C6 C1 C1 C6 C6 C0 C0 C7 C7 C0 C0 C7 C7 MCH MI MI MI MI MI MI MI MI P P QPI QPI 2001 – 4 sockets, 4 cores Pentium III Xeon, 900MHz 4-8GB memory? Xeon MP 2002-4 QPI PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E DMI 2 PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E PCI-E Each core today is more than 10x over Pentium III (700MHz?) Xeon E5 (Sandy Bridge), 8 cores, 2 QPI 4 x 8 = 32 cores total • Westmere-EX 1TB (64x16GB) (3 QPI) Sandy Bridge E5: 768GB (48 x 16GB) (2 QPI) Mem___2013 __ 2014 16GB __ $191 __ $180 32GB __ $794 __ $650 64GB_______$4510

  10. Intel E5 & E7 v2 (Ivy-Bridge) E3 v3 GFX MC x4 x4 x4 x4 DMI PCH

  11. Processor – Core

  12. Microprocessor Pipeline 3GHz 0.33ns clock 1st 2nd BP BP IF IF ID ID RAT RAT ROB ROB Sch Sch Exec Exec Flags Flags Retire Retire 5 ns from start to finish 200MHz BP Microprocessor (core) is (multi-lane) assembly line Each core is superscalar Processor (socket) has multiple cores System has multiple sockets Branch Predict Instruction Fetch Decode Register Allocate & Rename Re-Ordering Buffer Schedule Execute Flags Retire

  13. Micro-architecture Sandy-Bridge

  14. Haswell (Xeon E5/7 v3)

  15. CPU Access Times Core – 3.33GHz 1 CPU cycle = 0.3ns L1 cache – 4 CPU clocks (1ns) L2 cache 12 CPU cycles (4ns?) L3 cache 29+ cycles Local node memory 28 cycles + 49 ns (open page) 28 cycles + 56 ns (random page) Remote node (1-hop) memory 28 + 100ns 2-hop 150-300ns+? Logical 1 Logical 0 L1 I L1 D L2 Unified L3 Slice DRAM

  16. Latency Orders of Magnitude Core – 3.33GHz 1 CPU cycle = 0.3ns L1 cache – 4 CPU clocks (1ns) L2 cache 12 CPU cycles (4ns?) L3 cache 29+ cycles Local node memory 28 cycles + 49 ns (open page) 28 cycles + 56 ns (random page) Remote node (1-hop) memory 28 + 100ns 2-hop 150-300ns+? Core L1 Cache L1 Cache LLC GFX MC x4 x4 x4 x4 DMI PCH

  17. Westmere-EX 8-Socket System QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI QPI Large server systems are very complicated Software developed without consideration for system architecture will likely have severe problems This applies to the OS, SQL Server and the application C4 LLC C5 C4 LLC C5 C3 C6 C3 C6 C2 C7 C2 C7 IOH 0 C1 C8 C1 C8 IOH 1 C0 C9 C0 C9 QPI QPI MC MC MC MC QPI QPI SMB SMB SMB SMB SMB SMB SMB SMB C4 LLC C5 C4 LLC C5 C3 C6 C3 C6 C2 C7 C2 C7 QPI C1 C8 C1 C8 C0 C9 C0 C9 MC MC MC MC C4 LLC C5 C4 LLC C5 C3 C6 C3 C6 QPI C2 C7 C2 C7 C1 C8 C1 C8 C0 C9 C0 C9 MC MC MC MC QPI QPI C4 LLC C5 C4 LLC C5 IOH 2 IOH 3 QPI C3 C6 C3 C6 PCI-E x8 PCI-E x8 PCI-E x8 PCI-E x8 PCI-E x4 ESI C2 C7 C2 C7 PCH C1 C8 C1 C8 C0 C9 C0 C9 MC MC MC MC

  18. Storage 2001 versus 2012/13 QPI 192 GB MCH SSD SSD SSD SSD QPI PCIe x4 PCIe x8 PCIe x8 PCIe x8 PCIe x8 PCIe x8 PCI PCI PCI PCI HDD HDD HDD HDD RAID RAID RAID RAID 10GbE IB RAID RAID RAID RAID HDD HDD HDD HDD HDD HDD HDD HDD 2001 100 x 10K HDD 125 IOPS each = 12.5K IOPS IO Bandwidth limited: 1.3GB/s (1/3 memory bandwidth) 2013 64 SSDs, >10K+ IOPS each, 1M IOPS total possible 10-20GB/s+ IO Bandwidth easy 6.4GB/s on each PCIe G3 x8 SAN vendors – questionable BW http://www.qdpma.com/Storage/Storage2013.html http://www.qdpma.com/ppt/Storage_2013.pptx

  19. SAN Node 1 PCIe Node 2 PCIe Node 1 Node 2 HBA HBA 768 GB 768 GB 1024 GB 1024 GB x8 x8 x8 x8 x8 x8 x8 x8 8 Gb FC SSD SSD or 10Gb FCOE SSD SSD Switch Switch Switch Switch 0.8 GB/s 8 Gb FC SP A SP B SP A SP B 24 GB 24 GB 24 GB 24 GB x4 SAS 2GB/s x4 SAS 2GB/s Data 10 Log 1 Data 16 Data 15 Data 11 Data 14 Data 8 Data 13 Log 3 Data 9 Data 3 Data 4 Data 12 Log 2 Data 5 Log 4 Data 7 Data 1 Data 2 Data 6 Auto-tier pools Hot Spares SSD 10K 7.2K SSD 1 SSD 2 SSD 3 SSD 4 http://sqlblog.com/blogs/joe_chang/archive/2013/05/10/enterprise-storage-systems-emc-vmax.aspx http://sqlblog.com/blogs/joe_chang/archive/2013/02/25/emc-vnx2-and-vnx-future.aspx

  20. Performance Past, Present, Future • When will servers be so powerful that … • Been saying this for a long time • Today – 10 to 100X overkill • 32-cores in 2012, 60-cores in 2014 • Enough memory that IO is only sporadic • Unlimited IOPS with SSD • What can go wrong? Today’s topic

  21. SQL Performance SQL Tablesnatural keys Indexes Statistics& Compile parameters Tables and SQL combined implement business logic Natural keys with unique indexes, not SQL Compile Row estimate propagation errors Query Optimizer Index and Statistics maintenance policy DOP Memory Parallel plans Execution Plan Recompile temp table / table variable 1 Logic may need more than one execution plan? API Server Cursors: open, prepare, execute, close? Storage Engine Index & Stats Maintenance Compile cost versus execution cost? SET NO COUNT Information messages Hardware Plan cache bloat? The Execution Plan links all the elements of performance Index tuning alone has limited value Over indexing can cause problems as well

  22. Factors to Consider SQL Tables Indexes Statistics Query Optimizer Compile Parameters Storage Engine DOP memory Hardware

  23. Special Topics • Data type mismatch • Multiple Optional Search Arguments (SARG) • Function on SARG • Parameter Sniffing versus Variables • Statistics related (big topic) • OR, AND/OR combinations IN/NOT IN, EXISTS • Complex Query with sub-expressions • Parallel Execution Not in order of priority http://blogs.msdn.com/b/sqlcat/archive/2013/09/09/when-to-break-down-complex-queries.aspx

  24. 1a. Data type mismatch DECLARE@namenvarchar(25) = N'Customer#000002760' SELECT * FROM CUSTOMER WHERE C_NAME = @name Table column is varchar Parameter/variable is nvarchar SELECT* FROM CUSTOMER WHERE C_NAME = CONVERT(varchar, @name) .NET auto-parameter discovery? Unable to use index seek

  25. 1b. Type Mismatch – Row Estimate SELECT* FROM CUSTOMER WHERE C_NAME LIKE'Customer#00000276%' SELECT* FROM CUSTOMER WHERE C_NAME LIKEN’Customer#00000276%' Row estimate error could have severe consequences in a complex query

  26. SELECT TOP + Row Estimate Error SELECT TOP 1000[Document].[ArtifactID]FROM[Document](NOLOCK)WHERE[Document].[AccessControlListID_D]IN(1,1000064,1000269)AND EXISTS (SELECT[DocumentBatch].[BatchArtifactID]FROM[DocumentBatch](NOLOCK)INNER JOIN[Batch](NOLOCK)ON[Batch].ArtifactID=[DocumentBatch].[BatchArtifactID]WHERE [DocumentBatch].[DocumentArtifactID]=[Document].[ArtifactID]AND[Batch].[Name]LIKEN'%Value%')ORDER BY[Document].[ArtifactID] Data type mismatch – results in estimate rows high Top clause – easy to find first 1000 rows In fact, there are few rows that match SARG Wrong plan for evaluating large number of rows http://www.qdpma.com/CBO/Relativity.html

  27. Multiple Optional SARG

  28. 2. Multiple Optional SARG DECLARE@Orderkeyint, @Partkeyint = 1 SELECT* FROMLINEITEM WHERE (@Orderkey IS NULL OR L_ORDERKEY = @Orderkey) AND (@PartkeyIS NULL OR L_PARTKEY = @Partkey) AND (@Partkey IS NOT NULL OR @Orderkey IS NOT NULL)

  29. IF block DECLARE@Orderkeyint, @Partkeyint = 1 IF (@OrderkeyISNOTNULL) SELECT* FROMLINEITEM WHERE (L_ORDERKEY = @Orderkey) AND(@PartkeyIS NULL OR L_PARTKEY = @Partkey) ELSE IF (@PartkeyISNOTNULL) SELECT * FROM LINEITEM WHERE(L_PARTKEY = @Partkey) These are actually the stored procedure parameters Need to consider impact of Parameter Sniffing, Consider the OPTIMIZER FOR hint

  30. Dynamically Built Parameterized SQL DECLARE@Orderkeyint,@Partkeyint=1 ,@SQLnvarchar(500),@Paramnvarchar(100) SELECT@SQL= N‘/* Comment */ SELECT * FROM LINEITEM WHERE 1=1‘ ,@Param=N'@Orderkey int, @Partkey int' IF (@OrderkeyISNOTNULL) SELECT@SQL=@SQL+N' AND L_ORDERKEY = @Orderkey' IF (@PartkeyISNOTNULL) SELECT@SQL=@SQL+N' AND L_PARTKEY = @Partkey' PRINT@SQL execsp_executesql@SQL,@Param,@Orderkey,@Partkey IF block is easier for few options Dynamically built parameterized SQL better for many options Consider /*comment*/ to help identify source of SQL

  31. 2b. Function on column SARG SELECTCOUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM WHEREYEAR(L_SHIPDATE) = 1995 ANDMONTH(L_SHIPDATE) = 1 SELECTCOUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM WHEREL_SHIPDATE BETWEEN'1995-01-01'AND'1995-01-31' DECLARE@Startdate date, @Daysint = 1 SELECTCOUNT(*), SUM(L_EXTENDEDPRICE) FROMLINEITEM WHEREL_SHIPDATE BETWEEN@Startdate ANDDATEADD(dd,1,@Startdate)

  32. Estimated versus Actual Plan - rows Estimated Plan – 1 row??? Actual Plan – actual rows 77,356

  33. 3 Parameter Sniffing -- first call, procedure compiles with these parameters execp_Report @startdate = '2011-01-01', @enddate = '2011-12-31' -- subsequent calls, procedure executes with original plan execp_Report @startdate = '2012-01-01', @enddate = '2012-01-07' Assuming date data type Need different execution plans for narrow and wide range Options: 1) OPTIMIZE FOR – one plan for all ranges 2) WITH RECOMPILE – compile on each execute 3) main procedure calls 1 of 2 identical sub-procedures One sub-procedure is only called for narrow range Other called for wide range Skewed data distributions also important Example: Large & small customers

  34. STATISTICS

  35. 4 Statistics • Auto-recompute points • Sampling strategy • How much to sample - theory? • Random pages versus random rows • Histogram Equal and Range Rows • Out of bounds, value does not exist • etc. Statistics Used by the Query Optimizer in SQL Server 2008Eric N. Hanson and YavorAngelov, Contributor: LuborKollar Optimizing Your Query Plans with the SQL Server 2014 Cardinality EstimatorJoseph Sack http://msdn.microsoft.com/en-us/library/dd535534.aspx

  36. Statistics Structure • Stored (mostly) in binary field Scalar values Density Vector – limit 30, half in NC, half Cluster key Histogram Up to 200 steps Consider not blindly using IDENTITY on critical tables Example: Large customers get low ID values Small customers get high ID values http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx

  37. Statistics Auto/Re-Compute • Automatically generated on query compile • Recompute at 6 rows, 500, every 20%? Has this changed? 2008 R2 Trace 2371 – lower threshold auto recomputed for large tables http://support.microsoft.com/kb/2754171

  38. Statistics Sampling • Sampling theory • True random sample • Sample error - square root N • Relative error 1/ N • SQL Server sampling • Random pages • But always first and last page??? • All rows in selected pages

  39. Row Estimate Problems (at source) • Skewed data distribution • Out of bounds • Value does not exist Row estimate errors at source – is classified under statistics topic

  40. Loop Join - Table Scan on Inner Source Estimated out from first 2 tabes (at right) is zero or 1 rows. Most efficient join to third table (without index on join column) is a loop join with scan. If row count is 2 or more, then a fullscan is performed for each row from outer source Default statistics rules may lead to serious ETL issues Consider custom strategy

  41. Compile Parameter Not Exists Main procedure has cursor around view_Servers First server in view_Servers is ’CAESIUM’ Cursor executes sub-procedure for each Server sql: SELECTMAX(ID) FROMTReplWS WHERE Hostname = @ServerName But CAESIUM does not exist in TReplWS!

  42. Good and Bad Plan?

  43. SqlPlan Compile Parameters

  44. SqlPlan Compile Parameters <?xmlversion="1.0"encoding="utf-8"?> <ShowPlanXMLxmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"Version="1.1"Build="10.50.2500.0"> <BatchSequence> <Batch> <Statements> <StmtSimpleStatementText="@ServerNamevarchar(50) SELECT @maxid = ISNULL(MAX(id),0) FROM TReplWS WHERE Hostname = @ServerName" StatementId="1"StatementCompId="43"StatementType="SELECT"StatementSubTreeCost="0.0032843"StatementEstRows="1" StatementOptmLevel="FULL"QueryHash="0x671D2B3E17E538F1"QueryPlanHash="0xEB64FB22C47E1CF2" StatementOptmEarlyAbortReason="GoodEnoughPlanFound"> <StatementSetOptionsQUOTED_IDENTIFIER="true"ARITHABORT="false"CONCAT_NULL_YIELDS_NULL="true"ANSI_NULLS="true" ANSI_PADDING="true"ANSI_WARNINGS="true"NUMERIC_ROUNDABORT="false" /> <QueryPlanCachedPlanSize="16"CompileTime="1"CompileCPU="1"CompileMemory="168"> <RelOpNodeId="0"PhysicalOp="Compute Scalar"LogicalOp="Compute Scalar" EstimateRows="1"EstimateIO="0"EstimateCPU="1e-007“ AvgRowSize="15"EstimatedTotalSubtreeCost="0.0032843"Parallel="0"EstimateRebinds="0"EstimateRewinds="0"> </RelOp> <ParameterList> <ColumnReferenceColumn="@ServerName"ParameterCompiledValue="'CAESIUM'" /> </ParameterList> </QueryPlan> </StmtSimple> </Statements> </Batch> </BatchSequence> </ShowPlanXML> Compile parameter values at bottom of sqlplan file

  45. AND – OR, IN / NOT IN, EXISTS / NOT EXISTS combinations

  46. 5a Single Table OR -- Single table SELECT*FROMLINEITEM WHEREL_ORDERKEY= 1 ORL_PARTKEY=184826

  47. 5a Join 2 Tables, OR in SARG -- subsequent calls, procedure executes with original plan SELECTO_ORDERDATE,O_ORDERKEY,L_SHIPDATE,L_QUANTITY FROMLINEITEM INNERJOINORDERSONO_ORDERKEY=L_ORDERKEY WHEREL_PARTKEY=184826 ORO_CUSTKEY= 137099

  48. 5a UNION (ALL) instead of OR SELECTO_ORDERDATE,O_ORDERKEY,L_SHIPDATE,L_QUANTITY,O_CUSTKEY,L_PARTKEY FROMLINEITEMINNERJOINORDERSONO_ORDERKEY=L_ORDERKEY WHEREL_PARTKEY= 184826 UNION (ALL) SELECTO_ORDERDATE,O_ORDERKEY,L_SHIPDATE,L_QUANTITY,O_CUSTKEY,L_PARTKEY FROMLINEITEMINNERJOINORDERSONO_ORDERKEY=L_ORDERKEY WHEREO_CUSTKEY= 137099 -- AND (L_PARTKEY<> 184826 OR L_PARTKEY IS NULL) -- Caution: select list should have keys to ensure correct rows UNION removes duplicates (with Sort operation) UNION ALL does not -- Hugo Kornelis trick --

  49. 5b AND/OR Combinations SELECT xx FROM A WHERE col1 IN (expr1) AND col2 NOT IN (expr2) SELECT xx FROM A WHERE (expr1) AND (expr2 OR expr3) • Hash Join is good method to process many rows • Requirement is equality join condition • AND/OR, IN NOT IN, EXISTS NOT EXISTS combinations • Query optimizer may not be to determine that equality join condition exists • Execution plan will use loop join, • and attempt to force hash join will be rejected • Re-write using UNION in place of OR • And LEFT JOIN in place of NOT IN More on AND/OR combinations: http://www.qdpma.com/CBO/Relativity3.html

  50. Complex QUERies

More Related