1 / 27

SQL/OLAP

SQL/OLAP. Sang-Won Lee Let’s e-Wha ! Email: swlee@ewha.ac.kr URL: http://home.ewha.ac.kr/~swlee Jul. 12th, 2001. Contents. Introduction to OLAP and SQL Issues Current OLAP Solutions SQL/OLAP Future OLAP Trends. OLAP. On-Line Analytical Processing E.F. Codd coined the term “OLAP”([1])

orde
Download Presentation

SQL/OLAP

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SQL/OLAP Sang-Won LeeLet’se-Wha!Email: swlee@ewha.ac.krURL: http://home.ewha.ac.kr/~swleeJul. 12th, 2001 ISO/IECJTC1 SC32

  2. Contents • Introduction to OLAP and SQL Issues • Current OLAP Solutions • SQL/OLAP • Future OLAP Trends ISO/IECJTC1 SC32

  3. OLAP • On-Line Analytical Processing • E.F. Codd coined the term “OLAP”([1]) • Multi-dimensional data model • vs. On-Line Transaction Processing • vs. Data warehouse ISO/IECJTC1 SC32

  4. Data Warehouse Architecture ISO/IECJTC1 SC32

  5. Multi-dimensional Data Model • Sales(prod-id,store-id,time-id,qty,amt) Dimension: Product, Store, Time Hierarchy: • Product -> Category -> Industry • Store->City -> State -> Country • Date -> Month -> Quarter -> Year ISO/IECJTC1 SC32

  6. Multi-dimensional Data Model(2) • Operations • roll-up/drill-down • slice/dice • pivot • ranking • comparisons • drill-across • etc. • Example • for each state show me top 10 products based on total sales • what is the percentage growth of Jan-99 total sales over total Jan-98? • for each product show me the quantity shipped and sold ISO/IECJTC1 SC32

  7. Database Back in the OLAP Game- History of SQL Evolutions in 1990s(OLAP Area) - • Requirements from industries(‘95 ~ ‘96) • R. Kimball, “Why Decision Support Fails and How to Fix it?” ([2]); see also [3], [4] • Reactions from researchers(‘96) • Jim Gray et al., “Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab and Sub Totals,” ([7,8]) • Chatziantoniou, K. Ross, “Querying Multiple Features in Relational Databases,”([9]) • Commercial DBMSs and SQL standards(‘98 ~ ) • commercial products: e.g. Oracle, “Analytical Functions for Oracle8i”, Oct., 1999 • SQL standards • ANSI X3H2-96-205(R3): Super Sets(The Cube and Beyond) • ANSI NCITS H2-99-154: Introduction to OLAP • see also [6] ISO/IECJTC1 SC32

  8. OLAP Operations • Many business operations was hard or impossible to express in SQL • multiple aggregations • comparisons(with aggregation) • reporting features • Be prepared for serious performance penalty • Client and middle-ware tools provide the necessary functionality • OLAP server: ROLAP vs. MOLAP ISO/IECJTC1 SC32

  9. Cross Tab By Color Chevy Ford RED WHITE BLUE By Make Sum Multiple Aggregations • Create a 2-dimensional spreadsheets that shows sum of sales by maker as well as model of car • Each subtotal requires a separate aggregate query SELECT color, make, sum(amt) FROM sales GROUP BY color, make union SELECT color, sum(amt) FROM sales GROUP BY color union SELECT make, sum(amt) FROM sales GROUP BY make union SELECT sum(amt) FROM sales ISO/IECJTC1 SC32

  10. Comparisons • Examples: • last year’s sales vs. this year’s sales for each product • requires a self-join VIEW: create or replace view v_sales as select prod-id, year, sum(qty) as sale_sum from sales group by prod-id, year; QUERY: select cur.year cur_year, cur.sale_cur_sales, last.sum last_sales from v_sales curr, v_sales last where curr.year=(last.year+1) ISO/IECJTC1 SC32

  11. Reporting Features • It was too complex to express • rank(top 10) and N_tile(“top 30%” of all products) • median, mode, … • running total, moving average, cumulative totals ISO/IECJTC1 SC32

  12. Reporting Features(2) • Examples: • a moving average(over 3 day window) of total sales for each product for 2000 VIEW: create or replace view v_sales as select prod-id, time-id, sum(qty) as sale_sum from sales group by prod-id, time-id; QUERY: select end.time, avg(start.sale_sum) from v_sales start, v_sales end where end.time >= start.time and end.time <= start.time+2 group by end.time ISO/IECJTC1 SC32

  13. OLAP Servers Processing MD queries efficiently ISO/IECJTC1 SC32

  14. ROLAP OLAP Client OLAP Client OLAP Client To map warehouse schema into a MD model OLAP Engine meta-data Relational Database (Star or Snowflake Schema) ISO/IECJTC1 SC32

  15. ROLAP(2) • Example: Oracle Discoverer 4i leverages Oracle 8i • 8i - biggest SQL improvements in a decade! • more powerful analysis using new analytic functions • sharing query redirection(rewrite) using MVs • 100% automated summary management ISO/IECJTC1 SC32

  16. MOLAP • A multidimensional database(MDDB) stores data in a series of array structures, indexed to provide optimal access time to any element in the array. • Example: Oracle Express stores arrays of data ISO/IECJTC1 SC32

  17. Propose SQL Constructs • Multiple aggregations • Gray et. al., “Cube and Roll-Up”[6,7] • Comparison • Chatziantoniou and Ross, “Group By Column Variable”[8] SELECT subscriber, r.login-time FROM log GROUP BY subscriber: r SUCH THAT r.spent-time = max(spent_time) • Reporting • Redbrick provides SQL extensions in RISQL • rank, tertile, ratio-to-report etc ISO/IECJTC1 SC32

  18. Aggregate Group By Sum (with total) By Color RED Cross Tab WHITE BLUE By Color Chevy Ford RED WHITE Sum BLUE By Make Sum The Data CUBE Relational Operator Generalizes Group By and Aggregates The Data Cube and The Sub-Space Aggregates 1990 FORD 1991 CHEVY 1992 1993 By Year By Make By Make & Year RED WHITE BLUE By Color & Year By Make & Color Sum By Color source:[6] ISO/IECJTC1 SC32

  19. Getting Sub-totals: ROLLUP Operation SELECT year, brand, SUM(qty) FROM salesGROUP BY ROLLUP (year, brand); YEAR BRAND SUM(qty) 1996 Ford 2501996 Honda 300 1996 Toyota 450 1997 Ford 300 … 1996 1000 1997 1200 2200 ISO/IECJTC1 SC32

  20. Getting Cross-tabs: CUBE Operation SELECT year, brand, SUM(amount) FROM salesGROUP BY CUBE (year, brand); YEAR BRAND SUM(AMOUNT) 1996 Ford 250 ...1996 Toyota 4501997 Ford 300 ...1997 1200 2200 Ford 550 Honda 650 Toyota 1000 ISO/IECJTC1 SC32

  21. Flexible Grouping: GROUPING_SETS Operator SELECT year, brand, color, SUM(qty) FROM salesGROUP BY GROUPING_SETS ((year, brand), (brand,color),()); YEAR BRAND COLOR SUM(QTY) 1996 Ford 2501996 Honda 300 1996 Toyota 4501997 Ford 3001997 Honda 3501997 Toyota 550 Ford Blue 400 Ford Red 150 Honda Blue 650 Toyota Red 700 Toyota White 300 2200 Year, Brand Brand, Color Grand total ISO/IECJTC1 SC32

  22. LAG Operator SQL> SELECT timekey, sales 2 LAG(sales, 12) OVER 3 (ORDER BY timekey) AS sales_last_year, 4 (sales - sales_last_year) AS sales_change 5 FROM sales; TIMEKEY SALES SALES_LAST_YEAR SALES_CHANGE 98-1 1100 - - ….. …… ... 99-1 1200 1100 100 99-2 1500 1450 50 99-3 1700 1350 250 99-4 1600 1700 -100 99-5 1800 1600 200 99-6 1500 1450 50 99-7 1300 1250 50 99-8 1400 1200 200 ISO/IECJTC1 SC32

  23. MOVING Average SELECT time-id, avg(sum(qty)) over (order by time-id RANGE INTERVAL ‘2’ DAY PRECEDING ) as mvg_avg_sales from sales group by time_id ; ISO/IECJTC1 SC32

  24. SQL/OLAP • Why enhance the RDBMS for OLAP calculations? • Performance • Scalability • Simpler SQL development • Productivity ISO/IECJTC1 SC32

  25. Database Back in the OLAP Game • Materialized views • Index techniques: e.g. bitmap (join) index • Partitioning: e.g. range/hash/list • Query optimization: e.g. star query optimization • ...... ISO/IECJTC1 SC32

  26. Future OLAP Trends To be or not to be? • OLAP API: • OLE DB for OLAP • JOLAP ISO/IECJTC1 SC32

  27. References [1] E.F. Codd et al., “Providing OLAP(On-line Analytical Processing) to User-Analysts: An IT Mandate,” Available from Arborsoft’s Web Site(http://www.arborsoft.com) [2] R. Kimball, “Why Decision Support Fails and How to Fix it?” SIGMOD Record, Sep.,1995 [3] R. Kimball, “The Problem with Comparisons,” DBMS Magazine, Jan., 1996(also available from http://www.rkimball.com/html/articles.html) [4] R. Kimball, “SQL Roadblocks and Pitfalls,” DBMS Magazine, Feb., 1996(also available from http://www.rkimball.com/html/articles.html [5] R. Winter, “Database Back in the OLAP Game,” Intelligent Enterprise Magazine, Dec., 1998,(available from http://www.intelligententerprise.com) [6] R. Winter, “SQL-99’s New OLAP Functions,” Intelligent Enterprise Magazine, Jan., 2000,(available from http://www.intelligententerprise.com) [7] Jim Gray et al., “Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab and Sub Totals,” Proceedings of International Conferences on Data Engineering, p. 152 - 159, 1996 [8] Jim Gray et al., “Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab and Sub Totals,” Data Mining and Knowledge Discovery Journal, Vol. 1, No. 1, 1997 [9] D. Chatziantoniou, K. Ross, “Querying Multiple Features in Relational Databases,”, Proc. Of VLDB Conf., 1996 ISO/IECJTC1 SC32

More Related