210 likes | 323 Views
SD-SQL Server: a Scalable Distributed Database. Soror SAHRI http://ceria.dauphine.fr/ soror/soror.html soror.sahri@dauphine.fr. Objective. Most of DBSs are parallel : SQL Server, Oracle,..etc. DBSs do not provide scalable data partitioning. Application of the SDDS technology to DBSs.
E N D
SD-SQL Server: a Scalable Distributed Database Soror SAHRI http://ceria.dauphine.fr/soror/soror.html soror.sahri@dauphine.fr
Objective • Most of DBSs are parallel : SQL Server, Oracle,..etc. • DBSs do not provide scalable data partitioning. • Application of the SDDS technology to DBSs.
SD-SQL Server Architecture User/Application User/Application SD-SQL Server Managers SD-SQL client SD-SQL server SD-SQL peer D1 Di D2 Node DBs Split SQL Server SQL Server SQL Server
SD-SQL server SD-SQL Server Architecture Segments of a Scalable Table DB1 DB2 DB3 …… Segment Segment Segment SD-SQL peer SD-SQL client Scalable Distributed View Scalable Distributed View ……
The SD-SQL server • manages the scalable tables, • and the split mechanism. • The split is launched by a trigger which verifies the capacity (the maximal size in tuples) of a scalable table. • If a scalable table exceeds its capacity, it will be splitted into segments. • It uses for this management the meta-tables SD-C, SD-RP, SD-Size and SD-Site.
The SD-SQL server:Split mechanism DB1 DB2 Segment Segment ……… Split Split SD_C SD_RP SD_C SD_RP Meta-tables Meta-tables SD_SITE SD_SIZE SD_SITE SD_SIZE SQL … SQL Server 1 SQL Server 2
The SD-SQL client • manages the scalable distributed views. What ‘s a scalable distributed view? • Distributed & Partitioned Views allow the data in a large table to be split into smaller member tables in distributed servers. • The data is partitioned between the member tables based on ranges of data values. • Horizontal partitioning. • The data ranges for each member table are defined in a CHECK constraint specified on the partitioning key. • A view that uses UNION ALL to combine selects of all the member tables into a single result set is then defined.
The SD-SQL client Distributed Partitioned Union allView ………… Db1.Segment Db2. Segment
The SD-SQL client:View Adjustment Find all the scalable tables in Q Check each scalable view Adjust the outdate ones Pass Q to DBS for usual execution
The SD-SQL client:View Adjustment Example: Q: SELECT * FROM Server2.DB_1.dbo.Customer, T 1.Find the scalable tables in Q: Customer 2.Check the correctness of the view usingthe SD-RP and C-Image meta-tables. • Count Customer segments in SD-RP as n. • Check n and n’ the number in C-Image meta-table. 3. If n’< n then Adjust the view. 4.Execute Q.
The SD-SQL peer • An SD-SQL peer has the role of an SD-SQL sever and an SD-SQL client. • It contains then all meta-tables and scalable commands of an SD-SQL server and an SD-SQL client.
User/Application • The application manipulates scalable tables • They perform the usual SQL schema manipulations and queries in particular: • Create a scalable table • Alter a scalable • Drop a scalable table • Select, Insert, Update and Delete
The achieved work: SD-SQL server • The Split mechanism • Split between 2 servers • Split between several servers (5 servers) • Management of a scalable table: • Create a scalable table • Alter a scalable table • Drop a scalable table • Create a scalable view • Management of Indexes on a scalable table
The achieved work: SD-SQL client • The View Adjustment • Query Processing, we allow a large number of queries: • Most of the SQL commands (aggregations…) in the SELECT clause. • Most of the SQL commands (joins…) in the WHERE • Several objects in the FROM clause: tables, views… • Tables with aliases • Scalable view at several levels (3 levels)
Performance Measurements • Use of the SkyServer Database from the website: http://research.microsoft.com/~gray/SDSS • Make of PhotoObj table of SkyServer DB as a scalable table. PhotoObj is a table of 158,426 tuples (about 260 MB)
Performance Measurements SD-SQL Server Segment split time
Performance Measurements SD-SQL Server query (Q1) execution time (Q1)SELECT TOP 10 objid FROM PhotoObj WHERE objid NOT IN (SELECT objid FROM _Photoobj)
Performance Measurements SD-SQL Server query (Q2) execution time (Q2)SELECT COUNT (*) FROM T1
Performance Measurements Comparison between SQL Server and SD-SQL Server execution time of query (Q2) (Q3)SELECT COUNT (*) FROM PhotoObj PhotoObj contains 1 segment of _PhotoObj
Conclusion • Scalable Tables are now Reality • For SD-SQL Server at least at present. • Dynamic data partitioning. • Splitting time is practical • Small for small segments, larger for larger segments…. • Time to test a scalable view or to adjust it is negligible. • The prototype is available in CERIA Lab.
Future Works • More Performance Measurements • more comparisons between SQL Server and SD-SQL Server • Measurements with more than 2 partitioned segments • Generalize the scalable tables management to scalable databases management