1 / 27

Explicit and Implicit LIST Aggregate Function for Relational Databases

Explicit and Implicit LIST Aggregate Function for Relational Databases. Witold Litwin Université Paris 9 Dauphine mailto: Witold.litwin@dauphine.fr. Summary. New Aggregate Function Transforms a set of values into single one Char type A basic long time need Should be highly useful. Plan.

matsu
Download Presentation

Explicit and Implicit LIST Aggregate Function for Relational Databases

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Explicit and Implicit LIST Aggregate Function for Relational Databases Witold LitwinUniversité Paris 9 Dauphine mailto:Witold.litwin@dauphine.fr

  2. Summary • New Aggregate Function • Transforms a set of values into single one • Char type • A basic long time need • Should be highly useful

  3. Plan • Motivating Examples • Explicit LIST • Implicit LIST • Conlusion • Further Work

  4. Motivating Example 1 • The Supplier-Part (SP) table of the best-known S-P database

  5. Motivating Example 1 • The classical query : select SP.[S#], Sum(SP.Qty) AS [Total Qty] from SP group By SP.[S#]; S# Total Qty S1 1300 S2 700 S3 200 S4 900 • How to get also the individual quantities ?

  6. Motivating Example 2 • A database of persons having: • Multiple Hobbies • Multiple preferred Restaurants • Many Friends • Best design: • four 4-NF tables • P (SS #, Name), H (SS#, Hobby), R (SS#, Rest), F (SS#, Friend)

  7. Database

  8. Fragment

  9. Query Select Name, Friends, Restaurants, Hobbies, of Person ‘SS1’ SQL : select P.[SS#], P.Name, F.Friend, R.Rest, H.Hobby from ((P INNER JOIN F ON P.[SS#] = F.[SS#]) INNER JOIN H ON P.[SS#] = H.[SS#]) INNER JOIN R ON P.[SS#] = R.[SS#] where P.[SS#] ="ss1" ;

  10. Result Usable ???

  11. General Problem • Current RDBs manage tables in 1NF • All attributes are single-valued (atomic values) • Example 1 ; We wished • Single-valued attribute : • SUM(QTY) • Multi-valued attribute • Individual quantities • The result would not be 1NF

  12. General Problem • RDB manages tables in 1NF • All attributes are single-valued • Example 2 ; We wished : • Single-valued attributes : • S#, Name • Multi-valued attributes (multi-sets): • Hobby, Rest, Friend • The result is normalized to 1NF • {(ss1,Witold, x, y, z) : x  Hobby, y  Rest, z  Friend } • The table is not in 4NF • Subject to well-known anomalies

  13. Solutions • Design RDBS for 0NF tables • A revolution • 0NF RDBS will not be here for years • Aggregate set or multi-set values into atomic values • An evolution • All RDBS already do it using: • SUM, AVG, COUNT… • perhaps with GROUP BY • We need a new aggregate leaving the entire set visible • E.g: (multi)-set of values X => (single) list of values X

  14. Local CulinaryExample • The set-valued attribute: • (Schwarz, Wälder; Kirchen, Chocoladen, Torte) • The aggregated attribute: • Schwarzwälderkirchenchocoladentorte • Local specialty, try it !

  15. Explicit LIST function Select S#, sum (Qty) AS [Total Qty], LIST(Qty) AS Histogram from SP group by S#;

  16. P Name Friend Rest Hobby SS1 Witold Alexis, Christopher, Ron, Jim, Donna, Elisabeth, Dave, Peter, Per-Ake, Thomas Bengal, Cantine Paris 9, Chef Wu, Ferme de Condé, Miyake, Louis XIII,  Mela, North Beach Pizza, Pizza Napoli, Sushi Etoile Bike, Classical Music, Good food, Hike, Movie, Science Fiction, Ski, Swim, Tennis, Wine Explicit LIST function select P.SS#, Name, LIST (DISTINCT (Friend)), LIST (DISTINCT (Rest)), LIST (DISTINCT (Hobby)) from P, F, R, H where P.SS# = F.SS# and F.SS# = R.SS# and R.SS# = H.SS# and P.SS# ="ss1" group by P.SS#, Name ;

  17. Explicit LIST function • Simulated actual output using MsAccess forms with list boxes • Form with three subforms • No SQL query used

  18. P# Total Qty Per supplier p1 600 s1 300 s2 300 p2 1000 s1 200 s2 400 s3 200 s4 200 p3 400 s1 400 p4 500 s1 200 s4 300 p5 500 s1 100s4 400 p6 100 s1 100 Explicit LIST function select P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as [Per supplier] from SP group by P#;

  19. Implicit LIST function • For any single-valued A : • A = LIST (A) • Any non-aggregated attribute in an SQL query has to be in the GROUP BY clause • Now, any non-aggregated perhaps composite attribute A from a single table and not in GROUP BY clause is implicitly under • LIST (DISTINCT (A)) • Queries may become less procedural

  20. Implicit LIST function select P#, SUM (Qty) as [Total Qty], S#,Qty from SP group by P# having ‘S# QTY’ like ‘*s4*’; • Implicit LIST is LIST (S#, QTY)

  21. Implicit LIST function • Query Select S.*, P#, Qty From S, SP Where S.S# = SP.S# Repeats all the data of the supplier S in every resulting tuple • 6 times for S1: its Name, City, Status • Query Select S.*, P#, Qty From S, SP Where S.S# = SP.S# Group By S.S# Does it only once per supplier • Less redundancy

  22. Implicit Joins and From • Equijoins following the referential semantic links or integrity may be implicit • MsAccess, SQL Server… • FROM clause content can be inferred from the attribute names • Even less procedural formulation may result: select P.SS#, Name, Friend, Rest, Hobby group by P.SS#, Name ;

  23. Implementation Issues • Should be easy for the RDBS owner • Any RDB already processes the aggregates Already done hiding the list Should also be shown

  24. Implementation Issues • For explicit LIST, foreign function interface may suffice • Oracle, DB2, Yukon… • See related work in the paper for current (limited) proposals • Oracle & iAnywhere (core code) • Not for the implicit LIST • Access to core code is necessary

  25. Conclusion • LIST is a new aggregate function • Aggregates a multi-valued attribute into a single value • Responds to a long-standing fundamental RDBS user need - 30 years ? • Should be rather easy to implement • Future work should start with the implementation • Using foreign functions for explicit LIST

  26. Research Support • European Commission ICONS Project • no. IST-2001-32429. • Microsoft Research

  27. Thank You for Your Attention Witold LitwinUniversité Paris 9 Dauphine mailto:Witold.litwin@dauphine.fr

More Related