170 likes | 297 Views
ROOT I/O for SQL databases. Sergey Linev, GSI, Germany. What is SQL?. SQL – Structured Query Language SQL is language to create, modify and retrieve data from Relational Database Management System (RDBMS) Standard since 1986, several revisions in 1992 – SQL2, 1999 – SQL3
E N D
ROOT I/O for SQL databases Sergey Linev, GSI, Germany
What is SQL? • SQL – Structured Query Language • SQL is language to create, modify and retrieve data from Relational Database Management System (RDBMS) • Standard since 1986, several revisions in 1992 – SQL2, 1999 – SQL3 • Supported in most commercial and open source RDBMS S.Linev ROOT I/O for SQL databases
SQL support in ROOT • Abstract interface via three classes: • TSQLServer – query execution • TSQLResult – result of single SELECT query • TSQLRow – content of single row • Implemented for MySQL, Oracle, PostgreSQL • TTreeSQL class provides TTree interface to database tables. Allow tree drawing and table modifications S.Linev ROOT I/O for SQL databases
Objects store in SQL database • Write code yourself • pro: probably, best performance • contra: development, maintenance • Take existing SQL I/O framework • pro: fast to implement • contra: second I/O scheme for your classes • Convert to binary buffer and write as BLOB • pro: uses ROOT I/O facility • contra: no access to data without ROOT S.Linev ROOT I/O for SQL databases
TFile as interface to database • Main features of TFile: • Subdirectories structure and keys list • Schema evolution • Pro of TFile interface to SQL database: • learning – you do not need to learn • using – same I/O code can be used • developing – code can be tested with standard or xml file format • Contra: • fixed table design • performance S.Linev ROOT I/O for SQL databases
How it works • New TBufferSQL2 class implemented: • inherited from TBuffer class • redefines all virtual methods for basic data types and objects I/O • When writing object: • splits object data on parts, belonging to different class members and produces SQL INSERT queries • When reading object: • generates SELECT queries and extract class member values from obtained tables subset S.Linev ROOT I/O for SQL databases
Class table structure • For each class version separate table • TBox_ver2, TGraph_ver4 and so on • Class members stored in separate columns: • basic data types • fixed-size array of basic types • object, pointer on object (object id) • parent class (version) • special treatment for TObject and TString • Anything else will be stored in “raw” format in special table like TGraph_streamer_ver4 S.Linev ROOT I/O for SQL databases
TSQLFile class • Inherited from TFile • Full support of schema evolution • Beside TTree all other ROOT classes are supported • Additional configuration possible for: • usage of table indexes • transactions • MySQL table types selection • suffixes for column names S.Linev ROOT I/O for SQL databases
Example with simple class class TBox: public TObject, public TAttLine, public TAttFill { Double_t fX1; Double_t fY1; Double_t fX2; Double_t fY2; ClassDef(TBox, 2); }; { TSQLFile f(“mysql://host.domain/test”, “create”,”user”,”pass”); for (int n=1;n<=10;n++) { TBox* b = new TBox(n,n*2,n*3,n*4); b->Write(Form(“box%d”,n)); } } SELECT * FROM TBox_ver2 TObject_ver1 Created tables: • TBox_ver2 • TObject_ver1 • TAttLine_ver1 • TAttFill_ver2 • KeysTable • ObjectsTable • Configurations S.Linev ROOT I/O for SQL databases
Common tables • Three common tables • KeysTable – list of all keys in the file • ObjectsTable – list of all objects id • Configurations – TSQLFile config KeysTable ObjectsTable Configurations S.Linev ROOT I/O for SQL databases
More complicated example TGraph_streamer_ver4 { TSQLFile f(“mysql://host.domain/test”, “update”,”user”,”pass”); TGraph* gr = new TGraph(10); for (int n=0;n<10;n++) gr->SetPoint(n, n+1, (n+1)*(n+1)); gr->Write(“gr”); } TGraph_ver4 parent classes NULL pointer TList_streamer_ver5 ObjectsTable S.Linev ROOT I/O for SQL databases
Support of custom streamers • Important, while lot of ROOT and some user classes has custom streamers • Data, produced by custom streamer, directly written to _streamer_ tables like TList_streamer_ver5 • Special case when custom streamer reads data, written by standard I/O • Even in custom streamer user can split data into different columns of “normal” class table S.Linev ROOT I/O for SQL databases
External access to SQL tables • Easy navigation with simple SELECT statements • One raw in table corresponds to one object • Class name and version for each object can be found in ObjectsTable • TSQLFile::MakeSelectQuery() produce SELECT statement, which aggregates data of object from different tables in one S.Linev ROOT I/O for SQL databases
Example with TBox class Query, produced by f->MakeSelectQuery(TBox::Class()): SELECT t1.`obj:id`, t2.UniqueId, t2.Bits, t2.ProcessId, t3.fLineColor, t3.fLineStyle, t3.fLineWidth, t4.fFillColor, t4.fFillStyle, t1.fX1, t1.fY1, t1.fX2, t1.fY2 FROM TBox_ver2 AS t1 LEFT JOIN TObject_ver1 AS t2 USING(`obj:id`) LEFT JOIN TAttLine_ver1 AS t3 USING(`obj:id`) LEFT JOIN TAttFill_ver1 AS t4 USING(`obj:id`) TObject TAttLine TAttFill TBox S.Linev ROOT I/O for SQL databases
SQL I/O performance • Two aspects: • Time and CPU usage on user host • Quality and number of SQL statements // TFile f(“test.root”,”recreate”); // TXMLFile f(“test.xml”,”recreate”); TSQLFile f(“mysql://host.domain/test”, “recreate”,”user”,”pass”); TClonesArray clones("TBox", 10000); for(int n=0;n<10000;n++) new (clones[n]) TBox(n+1,n+2,n+3,n+4); clones.Write("clones0", TObject::kSingleKey); gBenchmark->Start(“Write"); clones.Write("clones",TObject::kSingleKey); gBenchmark->Show(“Write"); // TFile f(“test.root”,”recreate”); // TXMLFile f(“test.xml”,”recreate”); TSQLFile f(“mysql://host.domain/test”, “read”,”user”,”pass”); TClonesArray* clon = 0, *clon0 = 0; f.GetObject("clones0", clon0); gBenchmark->Start(“Read"); f.GetObject("clones", clon); gBenchmark->Show(“Read"); S.Linev ROOT I/O for SQL databases
Performance measurement * MySQL 4.1 on Fedora Core 4. Pure text queries are used. With ODBC or native MySQL++ client factor 2 to 3 can be gained ** Query text length was limited to 50 KB and long queries were split *** Oracle 10g on SuSE 8. New TSQLStatement class (not yet in ROOT) was used to gain factor 25 in writing compare to standard ROOT S.Linev ROOT I/O for SQL databases
Conclusion • new TSQLFile provides new possibilities for usage of SQL database in ROOT • Tested with MySQL 4 and Oracle 10, can be adopted for other RDBMS • To be done: • enhancement of ROOT TSQLServer classes • new TSQLStatement class is required • investigation of ODBC usage in ROOT • performance optimization S.Linev ROOT I/O for SQL databases