1 / 49

Characteristic Functions

Characteristic Functions. Characteristic Functions. (from fin_data table in Sybase Sample Database) Have: Year quarter code amount 2001 Q1 e1 198 2001 Q2 e1 204 2001 Q3 e1 214 2001 Q4 e1 231. Want: Year Code Q1Amt Q2Amt Q3Amt Q4Amt

holly
Download Presentation

Characteristic Functions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Characteristic Functions

  2. Characteristic Functions (from fin_data table in Sybase Sample Database) Have: Yearquartercodeamount 2001 Q1 e1 198 2001 Q2 e1 204 2001 Q3 e1 214 2001 Q4 e1 231 Want: YearCodeQ1AmtQ2AmtQ3AmtQ4Amt 2001 e1 198 204 214 231

  3. Characteristic Functions Have: Normalized Yearquartercodeamount 2001 Q1 e1 198 2001 Q2 e1 204 2001 Q3 e1 214 2001 Q4 e1 231 Want: Un-normalized YearCodeQ1AmtQ2AmtQ3AmtQ4Amt 2001 e1 198 204 214 231

  4. Characteristic Functions Similar to Excel Database Format Yearquartercodeamount 2001 Q1 e1 198 2001 Q2 e1 204 2001 Q3 e1 214 2001 Q4 e1 231 Pivot Table YearCodeQ1AmtQ2AmtQ3AmtQ4Amt 2001 e1 198 204 214 231

  5. Possible Solutions(without using Characteristic Functions) Self Join as in type 4 queries

  6. Possible Solution – Self Join Use a self join as in type 4 queries Start with just Q1 and Q2, code E1, Year 2001 YearcodeQ1AmtQ2Amt 2001 e1 198 204 Write the SQL query

  7. Possible Solution – Self Join YearcodeQ1AmtQ2Amt 2001 e1 198 204 SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt FROM fin_data F1, fin_data F2 WHERE F1.year = 2001 // Only 2001 AND F2.year = 2001 AND F1.code = ‘e1’ // Only financial code e1 AND F2.code = ‘e1’ AND F1.quarter = ‘Q1’ // Get Q1 amount AND F2.quarter = ‘Q2’ // Get Q2 amount Code

  8. Possible Solution – Self Join Expand to all years YearcodeQ1AmtQ2Amt 1999 e1 101 93 2000 e1 153 149 2001 e1 198 204

  9. Possible Solution – Self Join Expand to all years YearcodeQ1AmtQ2Amt 1999 e1 101 93 2000 e1 153 149 2001 e1 198 204 Code SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt FROM fin_data F1, fin_data F2 WHERE F1.year = F2.year// Same Year AND F1.code = ‘e1’ // Only financial code e1 AND F2.code = ‘e1’ AND F1.quarter = ‘Q1’ // Get Q1 amount AND F2.quarter = ‘Q2’ // Get Q2 amount

  10. Possible Solution – Self Join Expand to all four quarters YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 2000 e1 153 149 157 163 2001 e1 198 204 214 231

  11. Possible Solution – Self Join All four quarters YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 2000 e1 153 149 157 163 2001 e1 198 204 214 231 Code SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt, F3.amount AS Q3Amt, F4.amount AS Q4Amt FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4 WHERE F1.year = F2.year // Same Year AND F2.year = F3.year AND F3.year = F4.year AND F1.code = ‘e1’ // Only financial code e1 AND F2.code = ‘e1’ AND F3.code = ‘e1’ AND F4.code = ‘e1’ AND F1.quarter = ‘Q1’ // One record for each quarter AND F2.quarter = ‘Q2’ AND F3.quarter = ‘Q3’ AND F4.quarter = ‘Q4’

  12. Possible Solution – Self Join Problems Coding: Suppose we wanted months instead of quarters… Performance: Suppose fin_data had 100,000 records instead of 84… We need a better solution!

  13. Possible Solutions(without using Characteristic Functions) SubQueries

  14. Possible Solution – SubQueries Use a subquery as a field in the select clause Start with just Q1 and Q2, code E1, Year 2001 YearcodeQ1AmtQ2Amt 2001 e1 198 204 Write the SQL query

  15. Possible Solution – SubQueries Use a subquery as a field in the select clause Code SELECT F1.year, F1.code, F1.amount AS Q1Amt, ( SELECT F2.amount as Q2Amt FROM fin_data F2 WHERE F2.quarter = ‘Q2’ AND F2.code = ‘e1’ AND F2.year = 2001 ) FROM fin_data F1 WHERE F1.quarter = ‘Q1’ AND F1.code = ‘e1’ AND F1.year = 2001 YearcodeQ1AmtQ2Amt 2001 e1 198 204

  16. Possible Solution – SubQueries Expand to all years YearcodeQ1AmtQ2Amt 1999 e1 101 93 2000 e1 153 149 2001 e1 198 204

  17. Possible Solution – SubQueries Expand to all years Code SELECT F1.year, F1.code, F1.amount AS Q1Amt, ( SELECT F2.amount as Q2Amt FROM fin_data F2 WHERE F2.quarter = ‘Q2’ AND F2.code = ‘e1’ AND F2.year = F1.year ) FROM fin_data F1 WHERE F1.quarter = ‘Q1’ AND F1.code = ‘e1’ //AND F1.year = 2001 YearcodeQ1AmtQ2Amt 1999 e1 101 93 2000 e1 153 149 2001 e1 198 204 This is now a correlated subquery

  18. Possible Solution – SubQueries Expand to all four quarters YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 2000 e1 153 149 157 163 2001 e1 198 204 214 231

  19. Possible Solution – SubQueries All four quarters YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 2000 e1 153 149 157 163 2001 e1 198 204 214 231 Code SELECT F1.year, F1.code, F1.amount AS Q1Amt, ( SELECT F2.amount as Q2Amt FROM fin_data F2 WHERE F2.quarter = ‘Q2’ AND F2.code = ‘e1’ AND F2.year = F1.year ), ( SELECT F2.amount as Q3Amt FROM fin_data F2 WHERE F2.quarter = ‘Q3’ AND F2.code = ‘e1’ AND F2.year = F1.year ), ( SELECT F2.amount as Q4Amt FROM fin_data F2 WHERE F2.quarter = ‘Q4’ AND F2.code = ‘e1’ AND F2.year = F1.year ) FROM fin_data F1 WHERE F1.quarter = ‘Q1’ AND F1.code = ‘e1’

  20. Possible Solution – Self Join Coding: Again, Suppose we wanted months instead of quarters… Problems Performance: Our example requires the effective execution of 10 queries • The F1 query is run once to return the year, code and Q1Amt columns • The F2 query is run 9 times return the Q2, Q3, and Q4 amounts for 1999, 2000 and 2001 If our table had 100,000 rows F2 would run 300,000 times! We still need a better solution!

  21. Possible Solution – Temporary Table Strategy Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt Insert Query to add records and fill in year, code, Q1Amt Update Query to add Q2Amt Update Query to add Q3Amt Update Query to add Q4Amt

  22. Possible Solution – Temporary Table Create a Table Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt CREATE TABLE QAmt ( year char(4), code char(2), Q1Amt numeric(9), Q2Amt numeric(9), Q3Amt numeric(9), Q4Amt numeric(9) );

  23. Possible Solution – Temporary Table DROP a Table There will be times when you want to get rid of a table you have created. Perhaps you made an error when you created it. Or, you may simply be through using it. To get rid of a table you DROP it from the database. DROP TABLE QAmt

  24. Possible Solution – Temporary Table INSERT Query Insert Query to add records and fill in year, code, Q1Amt INSERT INTO QAmt (year, code, Q1Amt) SELECT year, code, amount FROM fin_data WHERE quarter = 'Q1' AND code = 'e1'

  25. Possible Solution – Temporary Table Delete Query There may also be times when you want to keep a table, but get rid of all the records in the table… to empty it. If, for example, you find a problem with your INSERT query, you may want to empty the table before you run the corrected INSERT query. To empty a table you DELETE all the records FROM the table. To empty the QAmt table you: DELETE FROM QAmt

  26. Possible Solution – Temporary Table After INSERT Query Check the results: SELECT * FROM QAmt YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 (NULL) (NULL) (NULL) 2000 e1 153 (NULL) (NULL) (NULL) 2001 e1 198 (NULL) (NULL) (NULL)

  27. Possible Solution – Temporary Table Update Query Update Query to add Q2Amt UPDATE QAmt Q SET Q2Amt = ( SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q2‘ ) We do two more update queries to fill in Q3Amt and Q4Amt YearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 (NULL) (NULL) 2000 e1 153 149 (NULL) (NULL) 2001 e1 198 204 (NULL) (NULL)

  28. Possible Solution – Temporary Table Is this really different At this point you may have noticed that the temporary table approach looks a lot like the subquery approach. In fact, it is virtually the same. It seems conceptually simpler because the temporary table allows us break down the problem into separate and distinct subtasks. We can complete part of the solution, check the results, complete some more, etc. For this reason, you will see it used. It has some legitimate applications, but this isn’t one of them. This point becomes even more evident if we use either the self-join approach or the subquery approach to populate the table in one step.

  29. Possible Solution – Temporary Table One Step Using Self Join INSERT INTO QAmt (year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt) SELECT F1.year, F1.code, F1.amount, F2.amount, F3.amount, F4.amount FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4 WHERE F1.year = F2.year AND F2.year = F3.year AND F3.year = F4.year AND F1.code = 'e1' AND F2.code = 'e1' AND F3.code = 'e1' AND F4.code = 'e1' AND F1.quarter = 'Q1' AND F2.quarter = 'Q2' AND F3.quarter = 'Q3' AND F4.quarter = 'Q4'

  30. Possible Solution – Temporary Table One Step Using SubQuery INSERT INTO QAmt (year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt) SELECT year, code, amount, (SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q2' ), (SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q3' ), (SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q4' ) FROM fin_data Q WHERE quarter = 'Q1' AND code = 'e1'

  31. Possible Solution – Temporary Table Problems So… using a temporary table really isn’t a distinct approach at all. It can make the solution conceptually simpler by allowing us to divide the solution into discrete steps, but it does nothing to reduce overall coding complexity or to improve performance. In addition, it introduces potential problems with regard to updates made between the time the table is created and the time it is used (currency). If you really do want a snapshot of the data at a particular time then this can be useful. In general, it’s undesirable You can overcome the currency problem by creating a view instead of a table but, so far, you’re still basically using either a self-join or a subquery method. If you want to create a View you’re better off using Characteristic Functions to do it! We still need a better solution!

  32. Better Solution Characteristic Functions

  33. Characteristic Functions Distribute amount to one of four new columns: yearcodequarteramountQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 Q1 101 101 0 0 0 1999 e1 Q2 93 0 93 0 0 1999 e1 Q3 129 0 0 129 0 1999 e1 Q4 145 0 0 0 145 2000 e1 Q1 153 153 0 0 0 2000 e1 Q2 149 0 149 0 0 2000 e1 Q3 157 0 0 157 0 2000 e1 Q4 163 0 0 0 163 etc. Strategy

  34. Characteristic Functions yearcodequarteramountQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 Q1 101 101 0 0 0 1999 e1 Q2 93 0 93 0 0 1999 e1 Q3 129 0 0 129 0 1999 e1 Q4 145 0 0 0 145 etc. Strategy and GROUP BY year, summing the amounts in the new columns for each year. yearfin_codeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 etc.

  35. Characteristic Functions In the previous examples, the reason we needed four copies of the table or four different queries is that the determination of which rows to include was made in the WHERE clause. WHERE quarter = ‘Q1’ allows us to put only the amounts for Q1 in the Q1Amt column. That’s good. But… it also removes the possibility of filling in the Q2Amt column in the same query. So Q2Amt must be filled in using a separate copy of the table or a separate subquery with a WHERE clause that allows us to see the amounts in the rows WHERE quarter = ‘Q2’ WHERE can be Bad

  36. Characteristic Functions Characteristic Functions allow us to control which rows get represented in a particular column without using a WHERE clause. With Characteristic Functions the determination of whether amount is represented in Q1Amt or Q2Amt or Q3Amt or Q4Amt is made in the SELECT clause.

  37. Characteristic Functions • A Characteristic Functions is an expression that evaluates to: • 1 if a field should be represented in a particular column • 0 if it should not. For example, a characteristic function to determine whether an amount should be represented in Q1Amt would return 1 if quarter is ‘Q1’, 0 if it isn’t. We might call this particular characteristic function CF1. 1 or 0

  38. Characteristic Functions Assume for a moment that we have created CF1. We could then use it to control whether amount is represented in Q1Amt by defining Q1Amt as CF1* amount. SELECT year, code, quarter, amount, CF1, CF1 * amount AS Q1Amt FROM fin_data WHERE code = ‘e1’ Note: this obviously won’t run yet since we haven’t actually created CF1 at this point. Would yield: yearcodequarteramountCF1Q1Amt 1999 e1 Q1 101 1 101 1999 e1 Q2 93 0 0 1999 e1 Q3 129 0 0 1999 e1 Q4 145 0 0 2000 e1 Q1 153 1 153 2000 e1 Q2 149 0 0 etc.

  39. Characteristic Functions We then create CF2 to return 1 if quarter contains ‘Q2’, 0 otherwise. We use CF2 in the same SELECT clause to control whether amount is represented in Q2Amt SELECT year, code, quarter, amount, CF1, CF1 * amount AS Q1Amt, CF2, CF2 * amount AS Q2Amt FROM fin_data WHERE code = ‘e1’ Would yield: yearcodequarteramountCF1Q1AmtCF2Q2Amt 1999 e1 Q1 101 1 101 0 0 1999 e1 Q2 93 0 0 1 93 1999 e1 Q3 129 0 0 0 0 1999 e1 Q4 145 0 0 0 0 2000 e1 Q1 153 1 153 0 0 2000 e1 Q2 149 0 0 1 149 etc.

  40. Characteristic Functions We then GROUP BY year and SUM the Q1Amt and Q2Amt for the year SELECT year, MAX(code), SUM (CF1 * amount) AS Q1Amt, SUM (CF2 * amount AS Q2Amt FROM fin_data WHERE code = ‘e1’ GROUP BY year Would yield: yearcodeQ1AmtQ2Amt 1999 e1 101 93 2000 e1 153 149 2001 e1 198 204

  41. Characteristic Functions • Recall that CF1 evaluates to: • 1 if quarter is ‘Q1’, • 0 otherwise. • Both Sybase and SQL Server contain a CASE statement that can be used to implement CF1 as • (CASE WHEN quarter = ‘Q1’ THEN 1 ELSE 0 END) Implementing CF1

  42. Characteristic Functions SELECT year, code, quarter, amount, (CASE WHEN quarter = ‘Q1’ THEN 1 ELSE 0 END) as CF1, (CASE WHEN quarter = ‘Q2’ THEN 1 ELSE 0 END) as CF2 FROM fin_data WHERE code = ‘e1’ Yields: Year code quarter amount CF1 CF2 1999 e1 Q1 101 1 0 1999 e1 Q2 93 0 1 1999 e1 Q3 129 0 0 1999 e1 Q4 145 0 0 2000 e1 Q1 153 1 0 2000 e1 Q2 149 0 1 CF1 and CF2

  43. Characteristic Functions SELECT year, code, quarter, amount, (CASE WHEN quarter = ‘Q1’ THEN 1 ELSE 0 END) * amount AS Q1Amt, (CASE WHEN quarter = ‘Q2’ THEN 1 ELSE 0 END) * amount AS Q2Amt FROM fin_data WHERE code = ‘e1’ Yields: Year code quarter amount Q1Amt Q2Amt 1999 e1 Q1 101 101 0 1999 e1 Q2 93 0 93 1999 e1 Q3 129 0 0 1999 e1 Q4 145 0 0 2000 e1 Q1 153 153 0 2000 e1 Q2 149 0 149 Using CF1 & CF2

  44. Characteristic Functions SELECT year, MAX(code) AS fin_code, SUM((CASE WHEN quarter = ‘Q1’ THEN 1 ELSE 0 END) * amount) AS Q1Amt, SUM((CASE WHEN quarter = ‘Q2’ THEN 1 ELSE 0 END) * amount) AS Q2Amt FROM fin_data WHERE code = ‘e1’ GROUP BY year Group By Year • Yields: • yearfin_codeQ1AmtQ2Amt • 1999 e1 101 93 • 2000 e1 153 149 • 2001 e1 198 204

  45. Characteristic Functions SELECT year, MAX(code) AS fin_code, SUM((CASE WHEN quarter = ‘Q1’ THEN 1 ELSE 0 END) * amount) AS Q1Amt, SUM((CASE WHEN quarter = ‘Q2’ THEN 1 ELSE 0 END) * amount) AS Q2Amt, SUM((CASE WHEN quarter = ‘Q3’ THEN 1 ELSE 0 END) * amount) AS Q3Amt, SUM((CASE WHEN quarter = ‘Q4’ THEN 1 ELSE 0 END) * amount) AS Q4Amt FROM fin_data WHERE code = ‘e1’ GROUP BY year All 4 quarters • Yields: • yearfin_codeQ1AmtQ2AmtQ3AmtQ4Amt • 1999 e1 101 93 129 145 • 2000 e1 153 149 157 163 • 2001 e1 198 204 214 231

  46. Characteristic Functions yearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 1999 e2 403 459 609 632 1999 e3 1437 2033 2184 2145 1999 e4 623 784 856 1043 1999 e5 381 402 412 467 1999 r1 1023 2033 2998 3014 1999 r2 234 459 601 944 2000 e1 153 149 157 163 2000 e2 643 687 898 923 etc. Can you modify the query to show all codes, not just e1?

  47. Characteristic Functions SELECT year, code, SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt, SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt, SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt, SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt FROM fin_data GROUP BY year, code ORDER BY year, code Can you modify the query to show all codes, not just e1? yearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 1999 e2 403 459 609 632 1999 e3 1437 2033 2184 2145 1999 e4 623 784 856 1043 etc.

  48. Characteristic Functions Result: We create a “Pivot” with only a single pass through the table and without resorting to programming! yearcodeQ1AmtQ2AmtQ3AmtQ4Amt 1999 e1 101 93 129 145 1999 e2 403 459 609 632 1999 e3 1437 2033 2184 2145 1999 e4 623 784 856 1043 1999 e5 381 402 412 467 1999 r1 1023 2033 2998 3014 1999 r2 234 459 601 944 etc.

  49. Characteristic Functions Timing for Join Timing for SubQuery Timing for Temporary Table Timing for Characteristic Function Add Slide

More Related