480 likes | 640 Views
Thesis Presentation – CERIA Laboratory. Design and Implementation of a Scalable Distributed Database System: SD-SQL Server. Soror SAHRI Soror.Sahri@dauphine.fr http://ceria.dauphine.fr/soror/soror.html. 1. Introduction. 2. State of The Art. P L A N. 3. SD-SQL Server Architecture.
E N D
Thesis Presentation – CERIA Laboratory Design and Implementation of a Scalable Distributed Database System: SD-SQL Server Soror SAHRI Soror.Sahri@dauphine.fr http://ceria.dauphine.fr/soror/soror.html
1. Introduction 2. State of The Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Facts Objectve Facts • Most of DBSs have distributed/parallel versions • SQL Server, Oracle, DB2… • DBSs do not provide dynamically scalable tables • All require manual repartitioning when tables scale-up
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Facts Objective Objective • Scalable Distributed Partitioning ofRelational Tables Scalable Distributed Database SystemSD-SQL Server
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Parallel DBMSs SDDSs Parallel DBMSs • Oracle 10g [LB05] [LB05]K, Loney & B, Bryla. Oracle Database 10g, DBA Handbook
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Parallel DBMS SDDSs SDDSs • An SDDS is a new class of data structures • Specific for multicomputers, P2P, Grids… • Why ? • SDDSs provide many scalable distributed partitioning schemes • LH*, RP*, k-RP*, LH*RS… • These schemes can serve as the basis for an SD-DBS architecture
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images SD-SQL Server? • SD-SQL Server is a Scalable Distributed Database System (SD-DBS) • SD-SQL Server uses the reference architecture • Proposed by Pr. Litwin, Pr. Schwartz & Pr. Risch • 2nd Intl. Workshop on Cooperative Internet Computing, 2002 • SD-SQL Server is based on the RP* SDDS principles • SD-SQL Server runs on Microsoft SQL Server 2000
User/Application User/Application sd_insert sd_alter_table SD-SQL ServerManagers SD-SQLserver SD-SQLpeer SD-SQLclient SD-SQLserver LinkedSQLServers Split NDBs P C S S N2 Ni Ni+1 N1 Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Gross Architecture [Litwin & Sahri. WDAS 2004] • The SD-SQL Server originality ? • The automatic extension of the scalable tables • of their NDBs • of their SD-SQL Server nodes
…… Node1 Node2 Node3 Nodei DB1NDB DB2NDB DB1NDB DB2NDB DB1NDB DB2 SDB DB1 SDB MDB Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images The Nodes, NDBs & SDBs
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables • A scalable (distributed) table is a collectionof segments • Segments are SQL tables • A scalable table has, initially, only one primary segment • At some server or peer NDB • All the segments of a scalable table have the same scheme
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables: Meta-data • Each scalable table has meta-data: • The segment capacity • The actual partitioning of the scalable table • The check constraint of each segment A check constraint defines the Min and Max for each segment • These meta-data are stored in the meta-tables • excluding the check constraints
N1.DB1 N2.DB1 N3.DB1 Ni.DB1 Primary N1.DB1 N1.DB1 RP N2.DB1 N3.DB1 1000 Size Ni.DB1 Nodes Meta-Tables Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables: Meta-data DB1 SDB ……. T Scalable Table
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables: Splitting • The number of segments in a scalable table is variable • A segment that overflows splits • A split occurs when an insert overflows the segmentcapacity • Every split produces one or more new segments for a scalable table
p=INT(b/2) C( S)= { c: c < h = c (b+1-p)} C( S1)={c: c > = c (b+1-p)} b+1-p p S S1 S Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables: Splitting Check Constraint? b+1 b SELECT TOP P * INTO Ni.Si FROM S ORDER BY C ASC SELECT TOP P * WITH TIES INTO Ni.S1 FROM S ORDER BY C ASC
N1 N2 N3 N3 Ni sd_create_node_database N4 NDBDB1 NDBDB1 NDBDB1 NDBDB1 NDBDB1 NDBDB1 DB1 SDB DB1 SDB DB1SDB sd_insert sd_insert sd_insert T ScalableTable Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Scalable Tables: Splitting sd_create_node sd_create_node_database …….
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Images • AnImagehides the scalable table partitioning • An image is an SQL Server distributed updateable partitioned view of the table • An SQL Server Union-all view with check constraints • An image resides on client or peer NDBs
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Image Types • Primary image • Resides at the creation node • Has the name of the scalable table • Secondary images • Reside at other client or peer NDBs of the SDB • Have a specific name, other than that of the table • To avoid name conflict
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Image Adjustment • An image presents the actual partitioning of its scalable table • Defines the partitioning as known to the client • It do not address any new segments resulted from a split • Are dynamically adjustable by the client • When a query to the image comes in • Image checking • Image adjustment if necessary
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Image Adjustment • Get the number of segments presented in the image, N1 • Get the number of segments of the scalable table, N2 • Compare N1 and N2: • If N1<N2 then Image Adjustment • Alter the partitioned view definition
N1.DB1 N2.DB1 N3.DB1 N4.DB1 PrimaryImage Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Overview Nodes, NDBs & SDBs Scalable Tables Images Images: Example DB1 SDB T Image T Scalable Table CREATE VIEW TAS SELECT * FROM N2.DB1.SD._N1_T CREATE VIEW TAS SELECT * FROM N2.DB1.SD._N1_TUNION ALL SELECT * FROM N3.DB1.SD._N1_T UNION ALL SELECT * FROM N4.DB1.SD._N1_T
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management INSERTsd_insert • CREATE TABLEsd_create_table Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Principles [Litwin, Schwartz & Sahri. IASTED-DBA 2006] • The application interface manipulates scalable tables through SD-SQL Server commands • The SD-SQL Server commandsstart with ‘sd_’ to distinguish from SQL Server commands for static tables
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Nodes Management • Node Creation • sd_create_node‘Dell1’/* Serverby default */ • sd_create_node‘Ceria’, ‘client’ • Node Alteration • sd_alter_node‘Ceria’, ‘ADD server’/* Becomes peer*/ • Node Removal • sd_drop_node‘Ceria’
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work SDB & NDB Management • SDB Creation • sd_create_scalable_database ‘SkyServer’, ‘Dell1’, ‘Server’,2 /* Creates the primary SkyServer NDB as well at Dell1*/ • SDB Alteration • sd_create_node_database‘SkyServer’, ‘Ceria’, ‘Client’ • SDB Removal • sd_drop_scalable_database‘SkyServer’
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Scalable Tables Management • Scalable Table Creation • sd_create_table‘PhotoObj (Objid BIGINT PRIMARY KEY..)’, 10000 • Scalable Table Alteration • sd_alter_table ‘PhotoObj ADD t INT’, 1000 • sd_create_index‘run_index ON Photoobj (run)’ • sd_drop_index‘PhotoObj.run_index’ • Scalable Table Removal • sd_drop_table‘PhotoObj’
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Images Management • Secondary Image Creation • sd_create_image‘Ceria’, ‘PhotoObj’ • Secondary Image Removal • sd_drop_image'PhotoObj’
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Scalable Queries Management USE SkyServer /* SQL Server command */ • Scalable Update Queries • sd_insert‘INTO PhotoObj SELECT * FROM Ceria.Skyserver-S.dbo.PhotoObj’ • Scalable Search Queries • sd_select‘* FROM PhotoObj’ • sd_select‘TOP 5000 * INTO PhotoObj1 FROM PhotoObj’, 500
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Image Binding Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Command Processing [Litwin, Schwartz & Sahri. WDAS 2006] • Let Q a scalable query using the PhotoObj image: • sd_select‘COUNT (*) FROM PhotoObj’ Find Images in Q PhotoObj Image Adjustment Execution of Q
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Concurrency • SD-SQL Server processes every command as SQL distributed transaction at Repeatable Read isolation level • Much less blocking than at Serializable Level • SD-SQL Server performs the split asynchronously with the insert that triggered it • It launches the actual splitting as an asynchronous job called splitter
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Concurrency • Splits use exclusive locks on segments and on tuples in RP meta-table. • Shared locks on other meta-tables: Primary, NDB meta-tables • Scalable queries use basically shared locks on meta-tables and any other table involved
Splitter sd_alter_table Dell3 RP Dell2 Dell1 PhotoObj Dell1.SkyServer Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Concurrency: example X Exclusive Lock Waiting Shared Lock X Exclusive Lock Exclusive Lock
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Experimental Environment • 6 Machines Pentium IV 1.7 GHz • RAM: 780 Mb & 1 Gb • Operating System: Windows 2K Server • Ethernet Network: max bandwidth of 1 Gb/s • Use of SQL Analyzer for editing queries • Use of SQL Profiler to take measurements
Principles Nodes Management SDBs & NDBs Management Scalable Tables & Images Management Scalable Queries Management Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work The SkyServer Benchmark • We use SkyServer database as benchmark • Provided and installed at Ceria by Dr. Gray • SkyServer brings the entire database of the Sloan Digital Sky Survey, SDSS • We use of thePhotoObj table as an example scalable table • In our experiments, PhotoObjhas 158,426 tuples (about 260 MB) • Originally, it has 14 M tuples
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Splitting Measurements [Litwin, Sahri & Schwartz. WDAS 2004] Splitting of PhotoObj scalable table into 2, 3, 4 and 5 segments according to different capacities
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Image Adjustment (Q1):sd_select‘TOP 10 objid FROM PhotoObj WHERE objid not in (SELECT objid FROM PhotoObj WHERE objid <= @objidMax’ Query (Q1) execution time
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Internal Processing Concurrency Experiments Comparison between SD-SQL Server and SQL Server [Litwin, Sahri & Schwartz. BNCOD 2006] (Q2):sd_select‘COUNT (*) FROM PhotoObj’ Execution time of (Q2) on SQL Server and SD-SQL Server
1. Introduction 2. State of the Art P L A N 3. SD-SQL Server Architecture 4. SD-SQL Server Application Interface 5. Implementation of SD-SQL Server 6. Performance Measurements 7. Conclusion & Future Work
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Conclusion Future Work Conclusion • Scalable distributed databases with scalable tables are now a reality with SD-SQL Server • No more manual repartitioning • Unlike in any other DBS we know about • The performance analysis proves • Efficiency of our design • Immediate utility of SD-SQL Server
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Conclusion Future Work Future Work • More performance measurements • With the SDSS queries • With the SkyServer benchmark of 80 Gb size • Error processing • Management of fault tolerance • Use of the high availability methods
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Conclusion Future Work Future Work • Application on other DBMSs • Oracle, DB2, etc. • Use of the SD-SQL Server principles on P2P systems or Grid Computing • Use of SD-SQL Server as core component of a virtual repository of eGov documents
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Conclusion Future Work Thanking • Work partly supported by • CEE Project eGov • MS Research • CEE Project ICONS
Introduction State of the Art SD-SQL Server Architecture SD-SQL Server Application Interface Implementation of SD-SQL Server Conclusion & Future Work Conclusion Future Work Thank you for your Attention Soror SAHRI Soror.Sahri@dauphine.fr http://ceria.dauphine.fr/soror/soror.html