
SQL Server Scaling on Big Iron (NUMA) Systems

Presentation Transcript


  1. TPC-H SQL Server Scaling on Big Iron (NUMA) Systems Joe Chang jchang6@yahoo.com www.qdpma.com

  2. About Joe Chang • SQL Server Execution Plan Cost Model • True cost structure by system architecture • Decoding statblob (distribution statistics) • SQL Clone – statistics-only database • Tools • ExecStats – cross-reference index use by SQL-execution plan • Performance Monitoring • Profiler/Trace aggregation

  3. TPC-H

  4. TPC-H • DSS – 22 queries, geometric mean • 60X range plan cost, comparable actual range • Power – single stream • Tests ability to scale parallel execution plans • Throughput – multiple streams • Scale Factor 1 – Line item data is 1GB • 875MB with DATE instead of DATETIME • Only single column indexes allowed, Ad-hoc
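
The composite metric ties the two runs together with a geometric mean. Roughly, per the TPC-H specification (QI(i,0) and RI(j,0) are the timing intervals of query i and refresh function j in the power run, S is the number of throughput streams, and T_s is the elapsed time of the throughput test in seconds):

\[ \text{Power@Size} = \frac{3600 \cdot SF}{\left( \prod_{i=1}^{22} QI(i,0) \cdot \prod_{j=1}^{2} RI(j,0) \right)^{1/24}} \]
\[ \text{Throughput@Size} = \frac{S \cdot 22 \cdot 3600}{T_s} \cdot SF \]
\[ \text{QphH@Size} = \sqrt{\text{Power@Size} \cdot \text{Throughput@Size}} \]

As a sanity check against the results that follow: for the 2-way Xeon 5680 SF100 entry, sqrt(99,426.3 x 55,038.2) ≈ 73,975, matching the reported QphH of 73,974.6.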

  5. Observed Scaling Behaviors • Good scaling, leveling off at high DOP • Perfect Scaling ??? • Super Scaling • Negative Scaling • especially at high DOP • Execution Plan change • Completely different behavior
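
A minimal way to reproduce these curves on a given box is to time one query at fixed degrees of parallelism and compare elapsed times. A sketch, using TPC-H Q1 and arbitrary MAXDOP values purely as an illustration:

    -- Time one query at a fixed degree of parallelism, then repeat at DOP 2, 4, 8, ...
    SET STATISTICS TIME ON;

    SELECT L_RETURNFLAG, L_LINESTATUS, COUNT(*) AS COUNT_ORDER
    FROM   LINEITEM
    WHERE  L_SHIPDATE <= DATEADD(dd, -90, '1998-12-01')
    GROUP BY L_RETURNFLAG, L_LINESTATUS
    OPTION (MAXDOP 1);   -- serial baseline; change to MAXDOP 8, 16, 32 for the scaling curve

Speedup at each DOP is the serial elapsed time divided by the parallel elapsed time; negative scaling shows up as that ratio dropping when DOP is raised further.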

  6. TPC-H Published Results

  7. TPC-H SF 100GB 2-way Xeon 5355, 5570, 5680, Opt 6176. Among the 2-way Xeon 5570 results, all are close: HDD has the best throughput, SATA SSD the best composite, and Fusion-IO the best power. Westmere and Magny-Cours, both with 192GB memory, are very close.

  8. TPC-H SF 300GB 8x QC/6C & 4x12C Opt. 6C Istanbul improved over 4C Shanghai by 45% on Power, 73% on Throughput, 59% overall. 4x12C 2.3GHz improved 17% over 8x6C 2.8GHz.

  9. TPC-H SF 1000 Oracle RAC, 64 nodes, 128 Xeon 5450 quad-core 3.0GHz processors. Power 782,608, 5.6X higher than Superdome 2 with 64 cores.

  10. TPC-H SF 3TB X7460 & X7560. Nehalem-EX with 64 cores is better than 96 Core 2 cores.

  11. TPC-H SF 100GB, 300GB & 3TB. SF100: 2-way Westmere and Magny-Cours are very close; among the 2-way Xeon 5570 results, all are close: HDD has the best throughput, SATA SSD the best composite, and Fusion-IO the best power. SF300: 8x QC/6C & 4x12C; 6C Istanbul improved over 4C Shanghai by 45% on Power, 73% on Throughput, 59% overall; 4x12C 2.3GHz improved 17% over 8x6C 2.8GHz. SF 3TB: X7460 & X7560; Nehalem-EX with 64 cores is better than 96 Core 2 cores.

  12. TPC-H Published Results • SQL Server excels in Power • Limited by Geometric mean, anomalies • Trails in Throughput • Other DBMS get better throughput than power • SQL Server throughput below Power • by wide margin • Speculation – SQL Server does not throttle back parallelism with load?
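
If that speculation is right, one way to test it would be to cap DOP for the throughput-stream sessions only, leaving the single-stream power run at full parallelism, for example with Resource Governor. A sketch, with a hypothetical login name, group name, and MAX_DOP value:

    -- Run in master. Names and the MAX_DOP value are hypothetical.
    CREATE WORKLOAD GROUP ThroughputStreams WITH (MAX_DOP = 8);
    GO
    CREATE FUNCTION dbo.rg_classifier() RETURNS sysname
    WITH SCHEMABINDING
    AS
    BEGIN
        DECLARE @grp sysname = 'default';
        IF SUSER_SNAME() = 'tpch_stream'   -- hypothetical login used by the stream driver
            SET @grp = 'ThroughputStreams';
        RETURN @grp;
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rg_classifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;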

  13. TPC-H SF100
Processors | GHz | Total Cores | Mem GB | SQL | SF | Power | Throughput | QphH
2 Xeon 5355 | 2.66 | 8 | 64 | 5sp2 | 100 | 23,378.0 | 13,381.0 | 17,686.7
2 Xeon 5570 Fusion-IO | 2.93 | 8 | 144 | 8sp1 | 100 | 72,110.5 | 36,190.8 | 51,085.6
2 Xeon 5570 SATA SSD | 2.93 | 8 | 144 | 8sp1 | 100 | 70,048.5 | 37,749.1 | 51,422.4
2 Xeon 5570 HDD | 2.93 | 8 | 144 | 8sp1 | 100 | 67,712.9 | 38,019.1 | 50,738.4
2 Xeon 5680 | 3.33 | 12 | 192 | 8r2 | 100 | 99,426.3 | 55,038.2 | 73,974.6
2 Opt 6176 | 2.3 | 24 | 192 | 8r2 | 100 | 94,761.5 | 53,855.6 | 71,438.3

  14. TPC-H SF300
Processors | GHz | Total Cores | Mem GB | SQL | SF | Power | Throughput | QphH
4 Opt 8220 | 2.8 | 8 | 128 | 5rtm | 300 | 25,206.4 | 13,283.8 | 18,298.5
8 Opt 8360 | 2.5 | 32 | 256 | 8rtm | 300 | 67,287.4 | 41,526.4 | 52,860.2
8 Opt 8384 | 2.7 | 32 | 256 | 8rtm | 300 | 75,161.2 | 44,271.9 | 57,684.7
8 Opt 8439 | 2.8 | 48 | 256 | 8sp1 | 300 | 109,067.1 | 76,869.0 | 91,558.2
4 Xeon 7560 | 2.26 | 32 | 640 | 8r2 | 300 | 152,453.1 | 96,585.4 | 121,345.6
4 Opt 6176 | 2.3 | 48 | 512 | 8r2 | 300 | 129,198.3 | 89,547.7 | 107,561.2
All of the above are HP results(?). Sun result: Opt 8384, sp1, Power 67,095.6, Throughput 45,343.5, QphH 55,157.5.

  15. TPC-H 1TB
Processors | GHz | Total Cores | Mem GB | SQL | SF | Power | Throughput | QphH
8 Opt 8439 | 2.8 | 48 | 384 | ASE | 1000 | 111,557.0 | 128,259.1 | 123,323.1
8 Opt 8439 | 2.8 | 48 | 384 | 8R2? | 1000 | 108,436.8 | 96,652.7 | 102,375.3
Itanium 9350 | 1.73 | 64 | 512 | O11R2 | 1000 | 139,181.0 | 141,188.1 | 140,181.1
Itanium 9140 | 1.6 | 64 | 2048 | O11g | 1000 | 95,789.1 | 69,367.6 | 81,367.6
128 Xeon 5450 | 3.0 | 512 | 512 | O RAC | 1000 | 782,608.7 | 1,740,122 | 1,166,977

  16. TPC-H 3TB
Processors | GHz | Total Cores | Mem GB | SQL | SF | Power | Throughput | QphH
POWER6 | 5.0 | 64 | 512 | Sybase | 3000 | 142,790.7 | 171,607.4 | 156,537.3
16 Xeon 7460 | 2.66 | 96 | 1024 | 8r2 | 3000 | 120,254.8 | 87,841.4 | 102,254.8
8 Xeon 7560 | 2.26 | 64 | 512 | 8r2 | 3000 | 185,297.7 | 142,685.6 | 162,601.7
SPARC | 2.88 | 128 | 512 | O11R2 | 3000 | 182,350.7 | 216,967.7 | 198,907.5

  17. TPC-H Published Results
Processors | GHz | Total Cores | Mem GB | SQL | SF | Power | Throughput | QphH
2 Xeon 5355 | 2.66 | 8 | 64 | 5sp2 | 100 | 23,378 | 13,381 | 17,686.7
2 Xeon 5570 | 2.93 | 8 | 144 | 8sp1 | 100 | 72,110.5 | 36,190.8 | 51,085.6
2 Xeon 5680 | 3.33 | 12 | 192 | 8r2 | 100 | 99,426.3 | 55,038.2 | 73,974.6
2 Opt 6176 | 2.3 | 24 | 192 | 8r2 | 100 | 94,761.5 | 53,855.6 | 71,438.3
4 Opt 8220 | 2.8 | 8 | 128 | 5rtm | 300 | 25,206.4 | 13,283.8 | 18,298.5
8 Opt 8360 | 2.5 | 32 | 256 | 8rtm | 300 | 67,287.4 | 41,526.4 | 52,860.2
8 Opt 8384 | 2.7 | 32 | 256 | 8rtm | 300 | 75,161.2 | 44,271.9 | 57,684.7
8 Opt 8439 | 2.8 | 48 | 256 | 8sp1 | 300 | 109,067.1 | 76,869.0 | 91,558.2
4 Opt 6176 | 2.3 | 48 | 512 | 8r2 | 300 | 129,198.3 | 89,547.7 | 107,561.2
8 Xeon 7560 | 2.26 | 64 | 512 | 8r2 | 3000 | 185,297.7 | 142,685.6 | 162,601.7

  18. SF100 Big Queries (query time in sec). Xeon 5570 with SATA SSD is poor on Q9, reason unknown. Both Xeon 5680 and Opteron 6176 are a big improvement over Xeon 5570.

  19. SF100 Middle Q (query time in sec). Xeon 5570-HDD and 5680-SSD are poor on Q12, reason unknown. Opteron 6176 is poor on Q11.

  20. SF100 Small Queries (query time in sec). Xeon 5680 and Opteron are poor on Q20. Note the limited scaling on Q2 and Q17.

  21. SF300 Big Queries (query time in sec). Opteron 6176 is poor relative to 8439 on Q9 & Q13, with the same number of total cores.

  22. SF300 Middle Q (query time in sec). Opteron 6176 is much better than 8439 on Q11 & Q19, worse on Q12.

  23. SF300 Small Q (query time in sec). Opteron 6176 is much better on Q2, even with 8439 on the others.

  24. SF1000 Sybase vs. SQL Server. Query time, Sybase relative to SQL Server, both on DL785 48-core.

  25. SF1000 Large Queries

  26. SF1000 Middle Queries

  27. SF1000 Small Queries

  28. SF1000 Itanium - Superdome. Query time, Superdome 2 versus Superdome: 16-way quad-core and 32-way dual-core.

  29. 512-core C2 RAC vs. 64-core It2. Query time, Superdome 2 versus RAC: 16-way quad-core (64 cores) and 64-node 2-way quad-core (512 cores). Oracle RAC has 5.6X higher Power.

  30. SF 3TB – 8×7560 versus 16×7460. Broadly 50% faster overall, 5X+ (5.6X) on one query, slower on 2, comparable on 3.

  31. 64 cores, PWR6 vs. Xeon 7560. Query time, POWER6 relative to X7560. Overall, Xeon 7560 is 30% faster on power, but with wide variations on individual queries, some with POWER6 faster.

  32. SF3000 Big Queries

  33. SF3000 Middle and Small Q

  34. TPC-H Summary • Scaling is impressive on some SQL • Limited ability (and value) in scaling small Q • Anomalies, negative scaling

  35. TPC-H Queries

  36. Q1 Pricing Summary Report
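
For reference, Q1 in T-SQL form (example substitution parameter: 90 days). It is a single scan and aggregate over LINEITEM, the largest table, so it mostly measures raw scan and parallel aggregation throughput:

    -- TPC-H Q1: Pricing Summary Report (DELTA = 90 days shown as an example)
    SELECT  L_RETURNFLAG, L_LINESTATUS,
            SUM(L_QUANTITY)                                       AS SUM_QTY,
            SUM(L_EXTENDEDPRICE)                                  AS SUM_BASE_PRICE,
            SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT))               AS SUM_DISC_PRICE,
            SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT) * (1 + L_TAX)) AS SUM_CHARGE,
            AVG(L_QUANTITY)                                       AS AVG_QTY,
            AVG(L_EXTENDEDPRICE)                                  AS AVG_PRICE,
            AVG(L_DISCOUNT)                                       AS AVG_DISC,
            COUNT(*)                                              AS COUNT_ORDER
    FROM    LINEITEM
    WHERE   L_SHIPDATE <= DATEADD(dd, -90, '1998-12-01')
    GROUP BY L_RETURNFLAG, L_LINESTATUS
    ORDER BY L_RETURNFLAG, L_LINESTATUS;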

  37. Query 2 Minimum Cost Supplier Wordy, but only touches the small tables, second lowest plan cost (Q15)
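
The query in T-SQL form (example substitution parameters: 15, 'BRASS', 'EUROPE'): despite its length it touches only PART, SUPPLIER, PARTSUPP, NATION and REGION, never LINEITEM or ORDERS, with a correlated subquery for the minimum supply cost:

    -- TPC-H Q2: Minimum Cost Supplier (first 100 rows)
    SELECT TOP 100 S_ACCTBAL, S_NAME, N_NAME, P_PARTKEY, P_MFGR, S_ADDRESS, S_PHONE, S_COMMENT
    FROM   PART, SUPPLIER, PARTSUPP, NATION, REGION
    WHERE  P_PARTKEY = PS_PARTKEY
      AND  S_SUPPKEY = PS_SUPPKEY
      AND  P_SIZE = 15
      AND  P_TYPE LIKE '%BRASS'
      AND  S_NATIONKEY = N_NATIONKEY
      AND  N_REGIONKEY = R_REGIONKEY
      AND  R_NAME = 'EUROPE'
      AND  PS_SUPPLYCOST = (SELECT MIN(PS_SUPPLYCOST)      -- correlated on P_PARTKEY
                            FROM   PARTSUPP, SUPPLIER, NATION, REGION
                            WHERE  P_PARTKEY = PS_PARTKEY
                              AND  S_SUPPKEY = PS_SUPPKEY
                              AND  S_NATIONKEY = N_NATIONKEY
                              AND  N_REGIONKEY = R_REGIONKEY
                              AND  R_NAME = 'EUROPE')
    ORDER BY S_ACCTBAL DESC, N_NAME, S_NAME, P_PARTKEY;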

  38. Q3

  39. Q6 Forecasting Revenue Change
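
For reference, Q6 in T-SQL form (example parameters: 1994, 0.06, 24). Like Q1 it is a single scan of LINEITEM, but with a much more selective predicate and a trivial aggregate:

    -- TPC-H Q6: Forecasting Revenue Change
    SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE
    FROM   LINEITEM
    WHERE  L_SHIPDATE >= '1994-01-01'
      AND  L_SHIPDATE <  DATEADD(yy, 1, '1994-01-01')
      AND  L_DISCOUNT BETWEEN 0.06 - 0.01 AND 0.06 + 0.01
      AND  L_QUANTITY < 24;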

  40. Q7 Volume Shipping

  41. Q8 National Market Share

  42. Q9 Product Type Profit Measure
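
For reference, Q9 in T-SQL form (example parameter: 'green'; the spec's EXTRACT(YEAR ...) is written here as YEAR()). It joins LINEITEM to PART, PARTSUPP, SUPPLIER, ORDERS and NATION with an unselective LIKE on part name, making it one of the heaviest join/aggregate queries in the set, which may be why it shows up as an anomaly on some of the systems above:

    -- TPC-H Q9: Product Type Profit Measure
    SELECT NATION, O_YEAR, SUM(AMOUNT) AS SUM_PROFIT
    FROM  (SELECT N_NAME AS NATION,
                  YEAR(O_ORDERDATE) AS O_YEAR,
                  L_EXTENDEDPRICE * (1 - L_DISCOUNT) - PS_SUPPLYCOST * L_QUANTITY AS AMOUNT
           FROM   PART, SUPPLIER, LINEITEM, PARTSUPP, ORDERS, NATION
           WHERE  S_SUPPKEY = L_SUPPKEY
             AND  PS_SUPPKEY = L_SUPPKEY
             AND  PS_PARTKEY = L_PARTKEY
             AND  P_PARTKEY = L_PARTKEY
             AND  O_ORDERKEY = L_ORDERKEY
             AND  S_NATIONKEY = N_NATIONKEY
             AND  P_NAME LIKE '%green%') AS PROFIT
    GROUP BY NATION, O_YEAR
    ORDER BY NATION, O_YEAR DESC;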

  43. Q11 Important Stock Identification Parallel Non-Parallel
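
Q11 in T-SQL form (example parameters: 'GERMANY' and 0.0001; the fraction scales with 1/SF). PARTSUPP is aggregated twice, once for the main GROUP BY and once in the HAVING subquery, and the two branches can get different plan shapes, which appears to be what the parallel vs. non-parallel contrast refers to:

    -- TPC-H Q11: Important Stock Identification
    SELECT PS_PARTKEY, SUM(PS_SUPPLYCOST * PS_AVAILQTY) AS VALUE
    FROM   PARTSUPP, SUPPLIER, NATION
    WHERE  PS_SUPPKEY = S_SUPPKEY
      AND  S_NATIONKEY = N_NATIONKEY
      AND  N_NAME = 'GERMANY'
    GROUP BY PS_PARTKEY
    HAVING SUM(PS_SUPPLYCOST * PS_AVAILQTY) >
           (SELECT SUM(PS_SUPPLYCOST * PS_AVAILQTY) * 0.0001   -- same join repeated for the threshold
            FROM   PARTSUPP, SUPPLIER, NATION
            WHERE  PS_SUPPKEY = S_SUPPKEY
              AND  S_NATIONKEY = N_NATIONKEY
              AND  N_NAME = 'GERMANY')
    ORDER BY VALUE DESC;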

  44. Q12 Random IO?
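
Q12 in T-SQL form (example parameters: 'MAIL', 'SHIP', 1994). The selective predicates are on LINEITEM date columns, and with only single-column indexes allowed, picking up the matching ORDERS rows can degenerate into scattered key lookups, which is presumably what the "Random IO?" note refers to:

    -- TPC-H Q12: Shipping Modes and Order Priority
    SELECT L_SHIPMODE,
           SUM(CASE WHEN O_ORDERPRIORITY = '1-URGENT' OR O_ORDERPRIORITY = '2-HIGH'
                    THEN 1 ELSE 0 END) AS HIGH_LINE_COUNT,
           SUM(CASE WHEN O_ORDERPRIORITY <> '1-URGENT' AND O_ORDERPRIORITY <> '2-HIGH'
                    THEN 1 ELSE 0 END) AS LOW_LINE_COUNT
    FROM   ORDERS, LINEITEM
    WHERE  O_ORDERKEY = L_ORDERKEY
      AND  L_SHIPMODE IN ('MAIL', 'SHIP')
      AND  L_COMMITDATE < L_RECEIPTDATE
      AND  L_SHIPDATE < L_COMMITDATE
      AND  L_RECEIPTDATE >= '1994-01-01'
      AND  L_RECEIPTDATE < DATEADD(yy, 1, '1994-01-01')
    GROUP BY L_SHIPMODE
    ORDER BY L_SHIPMODE;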

  45. Q13 Why does Q13 have perfect scaling?
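
For reference, Q13 in T-SQL form (example parameters: 'special', 'requests'). It is essentially two large hash aggregates: a per-customer count of ORDERS rows through a left outer join, then a count of customers per order count. With no selective filter and a fairly even key distribution, that work divides cleanly across threads, which may be part of the answer:

    -- TPC-H Q13: Customer Distribution
    SELECT C_COUNT, COUNT(*) AS CUSTDIST
    FROM  (SELECT C_CUSTKEY, COUNT(O_ORDERKEY)
           FROM   CUSTOMER LEFT OUTER JOIN ORDERS
                  ON  C_CUSTKEY = O_CUSTKEY
                  AND O_COMMENT NOT LIKE '%special%requests%'
           GROUP BY C_CUSTKEY) AS C_ORDERS (C_CUSTKEY, C_COUNT)
    GROUP BY C_COUNT
    ORDER BY CUSTDIST DESC, C_COUNT DESC;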
