SQL on Fire! Part 1

SQL on Fire! Part 1 Tips and Tricks around SQL

Agenda • Part I • SQL vs. SQL PL example • Error handling • Tables out of nowhere • Pivoting • Aggregation • Deleting duplicate rows • Alternatives to Min and Max • Part II • Mass deletes • Order processing • Moving rows between tables • Recursion • Merge • Calculating nesting levels Easy Difficult

Motivation – The OLTP Mantra • Reduce Codepath • Reduce Logical and Physical I/O • Reduce Network Traffic • Minimize Lock Contention • Avoid Deadlocks • High performance starts with the application.

SQL vs. SQL PL - Example • DDL • CREATE TABLE emp(empID INTEGER NOT NULLsalary INTEGER,name VARCHAR(20),deptID INTEGER); • CREATE UNIQUE INDEX empIdxPKON emp(empID) INCLUDE (deptID); • ALTER TABLE emp ADD PRIMARY KEY (empID);

SQL vs. SQL PL

SQL vs. SQL PL • Business request:Give everyone in deptID 100 a 5% raise.

SQL vs. SQL PL – First Attempt DECLARE empcur CURSOR FORSELECT * FROM emp FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vempid, vsalary, vname, vdeptid; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; IF deptid = 100 THEN SET vsalary = vsalary * 1.05; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur; END IF;END LOOP emploop; CLOSE empcur; Read whole table Only need few rows

SQL vs. SQL PL - Second Attempt DECLARE empcur CURSOR FORSELECT * FROM emp WHERE deptID = 100 FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vempid, vsalary, vname, vdeptid; SET vsalary = vsalary * 1.05; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur;END LOOP emploop; CLOSE empcur; Retrieve all columns Use only one column

SQL vs. SQL PL - Third Attempt DECLARE empcur CURSOR FORSELECT salary FROM emp WHERE deptID = 100 FOR UPDATE; OPEN empcur;emploop: LOOP FETCH empcur INTO vsalary; SET vsalary = vsalary * 1.05; IF SQLSTATE = ‘02000’ THEN LEAVE emploop; END IF; UPDATE emp SET salary = vsalaryWHERECURRENTOF empcur;END LOOP emploop; CLOSE empcur; Use cursor Trivial logic

SQL vs. SQL PL - Final Attempt UPDATE emp SET salary = salary * 1.05WHERE deptID = 100; Tips:WHERE deptID IN (100, 200)WHERE deptID IN (SELECT deptID FROM dept WHERE deptlocation = ‘Shenzhen’)SET salary = salary * CASE WHEN name = ‘Zhang’ THEN 1.05 ELSE 1.03 END

SQL vs. SQL PL – Aggregation • Return all departments with more than 5 employees:SELECT deptID FROM empGROUPBY deptIDHAVINGCOUNT(*) > 5 • Return the salary of the most highly paid employee by department for departments with more than 5 employees:SELECT deptID, MAX(salary) FROM empGROUPBY deptIDHAVINGCOUNT(*) > 5

Create table if it does not exist BEGIN DECLARE CONTINUE HANDLERFOR SQLSTATE '42710'BEGIN /* Already exists?Ignore */ END;CREATE TABLE T(c1 INT); END Tip 1:There is no need to avoid SQL errorsas long as they are handled… Tip 2:.. except when statement rollback implies undo operation from the log

SQL PL – DDL cleanup BEGINDECLARE CONTINUE HANDLERFOR SQLSTATE '42704'BEGIN /* Does not exist? Ignore */ END;DROP TABLE T1;DROP TABLE T2;DROP SEQUENCE S1; END Tip: Lost errors are very hard to find.So be specific and never ignore generic SQLEXCEPTION.

SQL PL – Is this table empty? IF EXISTS(SELECT 1 FROM emp WHERE dept = 100)THEN .. END IF; Instead of:IF EXISTS(SELECT 1 FROM emp WHERE empid = 5) THEN UPDATE emp SET salary = salary * 1.05 WHERE empID = 5;END IF; Use:UPDATE emp SET salary = salary * 1.05 WHERE empID = 5; Tip:Don’t be afraid of NOTFOUND (SQLSTATE ‘02000’). NOT FOUND not an error

Tables from nowhere • Create tables without INSERT:VALUES (1, 2), (3, 4), …. (10, 11)orVALUES (CAST(? ASINTEGER),CAST(? ASDOUBLE)), (?, ?), … (?, ?) • Use anywhere likeINSERTINTO T VALUES (1, 2), (3, 4), (5, 6)

VALUES for multi row INSERT CREATETABLE T(c1 INT, c2 VARCHAR(10)); INSERTINTO T VALUES (?, ?), (?, ?), (?, ?), … (?, ?); Tip:For mass inserts prepare INSERT statements with: 1, 10, 100, 1000, 10000 rows of parameter markers each. Execute the biggest that fits the remaining load in a loop.

PIVOT and UNPIVOT CREATE TABLE Sales(Year INTEGER, Quarter INTEGER, Results INTEGER);

PIVOT and UNPIVOT CREATE TABLE SalesAgg(year INTEGER,q1 INTEGER,q2 INTEGER,q3 INTEGER,q4 INTEGER);

PIVOT SELECT Year, MAX(CASEWHENQuarter = 1 THEN Results END) AS Q1, MAX(CASEWHEN Quarter = 2 THEN Results END) AS Q2, MAX(CASEWHEN Quarter = 3 THEN Results END) AS Q3, MAX(CASEWHEN Quarter = 4 THEN Results END) AS Q4FROM SalesGROUPBY Year Tip: Use MAX() because it is supported for all comparable types including strings

PIVOT Access Plan: ----------- RETURN ( 1) | GRPBY ( 2) | FETCH ( 3) /----+---\ IXSCAN TABLE: SALES ( 4) | INDEX: SALESIDX

UNPIVOT SELECT Year, Quarter, ResultsFROM SalesAgg AS S,LATERAL(VALUES(1, S.q1), (2, S.q2), (3, S.q3), (4, S.q4)) AS Q(Quarter, Results); Tip:Use LATERAL (or TABLE) to allow correlation of S.q* to the left.

UNPIVOT Access Plan: ----------- RETURN ( 1) | NLJOIN ( 2) /------+-----\ TBSCAN TBSCAN ( 3) ( 4) | | TABLE: SALESAGG TABFNC: GENROW

Aggregation • Problem:DB2 does not support user defined aggregates • Thoughts: • XMLAGG() provides aggregation without loss of data • Use mathematical rules • Examples: • Aggregate concatenation • Aggregate multiplication

XML functions – a primer • XMLAGG() • Aggregates XML values into an XML sequence • XMLELEMENT() • Tags a scalar value and returns XMLe.g. XMLELEMENT(NAME ‘x’ 5) => <x>5</x> • XMLSERIALIZE() • Casts an XML value into a string

Aggregate concatenation CREATETABLE Employee(name VARCHAR(15), dept VARCHAR(15));

Aggregate concatenation

Aggregate concatenation Strip last comma SELECT Dept,SUBSTR(Names, 1, LENGTH(names) -1) FROM(SELECT Dept,REPLACE (REPLACE (XMLSERIALIZE (CONTENTXMLAGG(XMLELEMENT(NAME a, name)ORDERBY name) ASVARCHAR(60)), '<A>', ''), '</A>', ',') AS NamesFROM Employee GROUPBY Dept) AS X; Replace end tags with commas Strip start tags XML to VARCHAR Aggregate in order of names Tag a name to become XML

Aggregate multiplication CREATETABLE probabilities( model VARCHAR(10), event VARCHAR(10), percent FLOAT);

Aggregate multiplication • a = EXP(LOG(a))X* Y = EXP(LOG(X * Y))X * Y = EXP(LOG(X) + LOG(Y)) • PROD(Xi)i=1..n = EXP(SUM(LOG(Xi)i=1..n)

Aggregate multiplication SELECTmodel, DEC(EXP(SUM(LOG(percent))), 8, 7) AS percentFROM probabilitiesWHERE event IN (‘Engine 1’, ‘Fin’)GROUPBY model

Retrieving MAXimum row CREATE TABLE emp(name VARCHAR(10),dept VARCHAR(10),salary INTEGER); CREATEINDEX emp_ind ON emp(dept, salary DESC); • Standard using selfjoin: • SELECT name, salary FROM empWHERE salary = (SELECT MAX(salary) FROM emp WHERE dept = ‘SQL Compiler’AND dept = ‘SQL Compiler’);

Retrieving MAXimum row SELECT name, salary FROM empWHERE dept='SQL Compiler'ORDER BYsalaryDESCFETCH FIRST ROW ONLY;

Retrieving MAXimum row Access Plan: ------------ RETURN ( 1) | FETCH ( 2) /----+---\ IXSCAN TABLE: EMP ( 3) | INDEX: EMP_IND

Delete duplicate rows CREATETABLE Inventory(Item INTEGER, Quantity INTEGER, InvDate DATE); CREATEUNIQUEINDEX InvIdx ON Inventory(Item ASC, InvDate DESC);

Delete duplicate rows

Delete duplicate rows - classic • DELETEFROMInventoryASD WHERE (Item, InvDate)NOTIN (SELECT Item, InvDate FROMInventory IWHERED.Item = I.Item ORDER BY InvDate DESCFETCH FIRST ROW ONLY);

Delete duplicate rows - improved • Using OLAPDELETE FROM (SELECT row_number() OVER(PARTITIONBYItemORDERBYInvDateDESC) AS rn FROM Inventory) WHERE rn > 1; Tip: Remove ORDER BY clause if it doesn’t matter which duplicates get eliminated.

Delete duplicate rows Access Plan: ------------ RETURN ( 1) | DELETE ( 2) /---+---\ FETCH TABLE: INVENTORY ( 3) /---+---\ FILTER TABLE: INVENTORY ( 4) | IXSCAN ( 5) | INDEX: INVIDX

Conclusion • Exploit SQL to: • Increase concurrency • Reduce I/O • Reduce code-path • Make the application more readable • SQL provides powerful support • Tip: • Part II with even meaner examples after the break

Serge Rielau SQL on Fire! Part 1 IBM srielau@ca.ibm.com

SQL on Fire! Part 1

SQL on Fire! Part 1

Presentation Transcript

- Statistical Analysis 2 -

Why Test?

Test Administrators

The Test is Yet to Come!

COLOR TEST COLOR TEST COLOR TEST COLOR TEST

How to Write a Test Plan?

Mardi le 5 mars

USB 2.0 Production Testing

Test Data Generation

Test Data Generation

System Test Evaluation

Progression of Test Categories

ITRS Test & Test Equipment

Test Anxiety

We need t o test XYZ

Chi-square test or c 2 test

Training Course Outlines - Digital

Braze Test

SQL on Fire! Part 1

SQL on Fire! Part 1

Presentation Transcript

- Statistical Analysis 2 -

Why Test?

Test Administrators

The Test is Yet to Come!

COLOR TEST COLOR TEST COLOR TEST COLOR TEST

How to Write a Test Plan?

Mardi le 5 mars

USB 2.0 Production Testing

Test Data Generation

Test Data Generation

System Test Evaluation

Progression of Test Categories

ITRS Test &amp; Test Equipment

Test Anxiety

We need t o test XYZ

Chi-square test or c 2 test

Training Course Outlines - Digital

Braze Test

ITRS Test & Test Equipment