Proactive Index Design using QUBE Courtesy of Tapio Lahdenmäki

Proactive Index Design using QUBECourtesy of Tapio Lahdenmäki Lauri Pietarinen Relational Consulting Session Code: F04 Nov 09, 2010 8:30 – 9:30 AM | Platform: z/OS, LUW

Q U B E Q - Quick U - Upper B - Bound E - Estimate A simple formula for estimating CPU and elapsed time for queries, created by Tapio Lahdenmäki and others at IBM-Finland

Index Basics SELECT INO, CNAME FROM INVOICE WHERE CNO = :CNO AND IDATE > :IDATE • 1,000,000 invoices • Per customer: • Max 10,000 invoices • Max 300 recent invoices 3 4 2 CNO, IDATE, INO, CNAME CNO, IDATE CNO 1 300 T 10,000 T INVOICE INVOICE INVOICE 300 T T = Touch 1,000,000 T 10,000 T 300 T

COL B-Tree Index T A B L E 1 3 7 Continue until last level single page 8 7 12 20 20 39 21 33 Non-leaf pages 39 WHERE COL = 12 WHERE COL BETWEEN 2 AND 10 Leaf pages • Normally 3 levels if 1,000,000 table rows • Number of non-leaf pages much lower than number of leaf pages • Non-leaf pages tend to stay in pool with current hardware • Reasonable (2010): Ignore cost of non-leaf page processing

Recommended Mental Image COL T 1 T 3 TABLE T 7 T T 8 T 12 20 T 21 COL is M column (matching column) Predicate COL BETWEEN 2 AND 10 defines index slice (matching predicate) T = Touch TR = Number of random touches TS = Number of sequential touches TR = TS =

Request Tracking Customers 10s DEADLINE STATUS RPK CNO... 3.6.2005 X X X BO = 15 3.6.2005 X X X Req Req 4.6.2005 X X X Req 20 rows / screen Insurance company 1,000 per day Average: 5 STATUS changes per request Primary key of REQUEST = RPK, foreign keys CNO and BO BO = Branch office (100 branch offices, the largest one covers 3% of requests) STATUS: 1...9 (9 = Closed) DL = Deadline DL = 31.12.2099 BO = Latest BO 99%

Common Transaction SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL FF = 1% FF = 0...3% CNO RPK BO REQUEST 1,000,000 rows, average length 300 bytes FF = Filter Factor: 0...100%

Which One Faster? 3% CNO BO RPK REQUEST 1,000,000 rows Alternative 1 Alternative 2 Read index slice Read 30,000 table rows Pick rows that satisfy STATUS < 9 Sort 300 result rows Read 1,000,000 table rows Pick rows that satisfy both predicates Sort 300 result rows

Sequential Read in 2010 Processor READ CACHE CPU cache RAM Buffer pool • I/O time • DBMS and the disk subsystem read ahead -- lots of pages with one rotation • Not all pages at once -- just trying to stay ahead: when the program needs a page it should be in the buffer pool • If sequential read speed 40 MB/s, I/O time per 4K page 0.1 ms; if 10 rows per page, I/O time per row = 10 us (microseconds) • CPU time • Rule of thumb: CPU time per examined row = 5 us with sequential read • FETCH (move qualifying row to application pgm) may take 50 us of CPU time

Random Read in 2010 Processor READ CACHE CPU cache RAM Buffer pool • Disk I/O time • If needed page not in pool: disk read • If needed page in read cache: I/O time may be 1 ms • Random read from disk drive may take 10 ms • CPU time • Retrieving a row and evaluating it may take 50 us ofCPU time (random read) • FETCH one row may take 50 us of CPU time -- as with sequential read Serious

Random Read from Disk Drive Depends on drive busy Queuing (Q) 3 ms Seek 4 ms Half a rotation 2 ms Transfer 1 ms S = Service time Total I/O time 10 ms One random read keeps a drive busy for 6 ms Q = (u / (1-u)) x S Q = Average queuing time u = Average drive busy S = Average service time 50 random reads a second u = 50 read/s x 0.006 s/read = 0.3 Q = (0.3 /(1- 0.3)) x 6 ms = 3 ms

Disk Drives -- the Bottleneck 2 TB 1 TB 300 GB 145 GB 72 GB 3 GB 1992 2002 2003 2005 2007 2009 • Storage density grows dramatically • Sequential I/O getting faster • Random I/O remains slow (and may even become slower) u 30% to 60% Q 3 ms to 9 ms Random read 10 ms to 16 ms Q = (u / (1-u)) x S Q = Average queuing time u = Average drive busy S = Average service time

Quick Upper-Bound Estimate (QUBE) ET = TR x 10 ms + TS x 0.01 ms + F x 0.1 ms CPU = TR x 50 us + TS x 5 us + F x 50 us ET = Elapsed time (SQL) CPU = CPU time (SQL) TR = Number of random touches TS = Number of sequential touches F = Number of rows returned to program (Fetches)

Alternative 1 REQUEST 1,000,000 rows Index Table Worst input: F = 300 1 M = 1,000,000 TR TS TR TS 1 1 M TS/1000 F/100 TR 1,000 3 10 s ET = ( + ) x 10 ms = 1 + TR TS/10 F + 1 300 100,000 5 s + CPU = ( ) ms / 20 =

Alternative 2A 3% RPK CNO BO REQUEST 1,000,000 rows Index Table Worst input: F = 300 TR TS TR TS 1 30,000 30,000 TS/1000 F/100 TR 30,001 30 3 300 s ET = ( + ) x 10 ms = + TR TS/10 F 2 s + 30,001 300 3,000 + CPU = ( ) ms / 20 =

Alternative 2B C RPK CNO BO 3% REQUEST 1,000,000 rows 3% Index Table Worst input: F = 300 TR TS TR TS 1 30,000 1 30,000 TS/1000 F/100 TR 2 60 3 1 s ET = ( + ) x 10 ms = + TR TS/10 F 0,3 s + 2 300 6,000 + CPU = ( ) ms / 20 =

Basic Question All predicate columns in one index? WHERE STATUS < 9 AND BO = :BO If yes, index is semi-fat BO STATUS 3 1 T T 88 1 REQUEST T 2 3 T 4 70 T 9 1 9 2 9 2 T Matching Screening BO = :BO evaluated in index STATUS < 9 defines index slice Touch table only when WHERE clause true

Better Semi-Fat Index BO STATUS 1 3 T T 2 3 REQUEST T 2 6 T 2 9 3 1 70 4 T 1 88 Matching Matching STATUS < 9 AND BO = 2 defines index slice The index slice contains only qualifying index rows

QUBE for Semi-Fat Index – Your Turn! 0.03% SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL BO, STATUS M FF = 1% M M FF = 3% M F = 300 REQUEST MC = 2 SC = 0 IXONLY = N SORT = Y Index Table TR TS TR TS TR TS/1000 F/100 ET = ( + ) x 10 ms = + TR TS/10 F + + CPU = ( ) ms / 20 =

QUBE for Semi-Fat Index – Solution 0.03% SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL BO, STATUS M FF = 1% M M FF = 3% M F = 300 REQUEST MC = 2 SC = 0 IXONLY = N SORT = Y Index Table TR TS TR TS 1 300 300 TS/1000 TR F/100 0 301 3 3 s + + ) x 10 ms = ET = ( TS/10 F TR 300 30 ms 301 30 + + CPU = ( ) ms / 20 =

Still Too Long - What Next? The problem: 300 random table touches 300 x 10 ms = 3 s Fat index No table touches 20 FETCHes - 20 table touches?

When Do Touches Take Place? Sort very fast today (say, 10 us CPU per row) but... DECLARE CURSOR... OPEN CURSOR FETCH CURSOR ---- while found CLOSE CURSOR ? ? Access path with sort Access path without sort OPEN CURSOR: All result rows FETCH: One result row OR

No Sort, 20 FETCHes SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL, RPK FETCH FIRST 20 ROWS ONLY BO, DL, RPK, STATUS FF = 1% FF = 3% F = 20 MC = 1 SC = 1 IXONLY = N SORT = N Index Table First screen TR TS TR TS 1 19 20 TR TS/1000 F/100 21 0 0 210 ms ET = ( + ) x 10 ms = + TR TS/10 F CPU = ( 21 2 20 2 ms + + ) ms / 20 =

BO, DL, RPK, STATUS Worst-Input Estimates Elapsed time (ET) CPU time Note No index 10 s 5 s BO (non-C) 300 s 2 s BO (C) 1 s 0,3 s Semi-fat BO, STATUS 3 s 0.03 s No Sort First screen Modify pgm 0.2 s 0.002 s

Fat Index with Sort SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL BO, STATUS, RPK, DL, CNO, C1, C2 FF = 1% F = 300 FF = 3% MC = 2 SC = 0 IXONLY = Y SORT = Y Index Table TR TS TR TS 1 300 TR TS/1000 F/100 1 0 3 40 ms ET = ( + ) x 10 ms = + TR TS/10 F 1 30 300 20 ms + + CPU = ( ) ms / 20 =

BO, STATUS, BO, DL, RPK, RPK... STATUS Worst-Input Estimates Elapsed time (ET) CPU time Note No index 10 s 5 s BO (non-C) 300 s 2 s BO (C) 1 s 0,3 s Semi-fat BO, STATUS 3 s 0.03 s No Sort First screen Modify pgm 0.2 s 0.002 s Fat 0.04 s 0.02 s

Too Expensive? BO, STATUS, RPK, DL, CNO, C1, C2 BO • Disk space 1,000,000 rows 80 bytes per row • RAM (Non-leaf pages) • INSERT • UPDATE 1000 new rows per day 5000 STATUS updates per day • DELETE • Index reorg (rebuild) • Something else?

The Cost of Adding an Index DELETE UPDATE INSERT Roughly 10 ms per added row when columns of new index updated Add 10 ms if move Add 10 ms if split Roughly 10 ms per removed row Roughly 10 ms per added row Add 10 ms if split Asynchronous writes Drive busy up Assumptions: Upper index levels in DB cache Leaf page not in DB cache

The Cost of Adding an Index Column DELETE UPDATE INSERT None if adequate distributed free space (!) Roughly 10 ms when only the new column updated Add 10 ms if move Add 10 ms if split None (!) I/O moves the whole page, not a row

So, Too Expensive? BO, STATUS, RPK,DL, CNO, C1, C2 BO • Disk space • RAM (Non-leaf pages) 1,000,000 rows 80 bytes per row • INSERT • UPDATE 1000 new rows per day 5000 STATUS updates per day • DELETE • Index reorg (rebuild) • Something else? No Low cost compared to dramatic reduction in response time and cost of SELECT The only issue Index slice read time increases if index reorg interval too long

Obsolete Underindexing: a common mistake • Do not index volatile columns • STATUS • Max N indexes per table • Max 5 columns per index • etc • INSERT and DELETE fast enough after index added? • TR = 1 per added or removed index row • UPDATE fast enough after index columns added? • TR = 1 or 2 per updated index column • Index reorg requirement OK? • Long index rows (more than 5% of leaf page) • Hot spots (except end of index) • Drive load caused by index maintenance • If dozens of random index row inserts or deletes a second • Index storage cost (disk & RAM) • Diminishing year after year TR RAM e/GB/m e/GB/m

Index BO Was Not Adequate For This SELECT SELECT DL, STATUS, RPK, CNO, C1, C2 FROM REQUEST WHERE STATUS < 9 AND BO = :BO ORDER BY DL Who should have seen this? When?

Summary It can be used to prevent performance problems Qube is a way of thinking about indexes It can be used in conjunction with other tools It can be used to understand and analyze performance problems

Lauri PietarinenRelational Consultinglauri.pietarinen@relational-consulting.com F04 Proactive Index Design using QUBE

Proactive Index Design using QUBE Courtesy of Tapio Lahdenmäki