210 likes | 300 Views
AraRat -. Simple and Safe SQL Queries with C++ Templates. M.Sc. Seminar - Keren Lenz Supervisor - Dr. Yossi Gil July 1 st 2007. Motivation. A relational DB with an Employee table Field 1: EMPNUM (integer) Field 2: DEPT ( smallint ) Field 3: FIRST_N (string)
E N D
AraRat- Simple and Safe SQL Queries with C++ Templates M.Sc. Seminar - Keren Lenz Supervisor - Dr. Yossi Gil July 1st 2007
Motivation • A relational DB with an Employee table • Field 1: EMPNUM (integer) • Field 2: DEPT (smallint) • Field 3: FIRST_N (string) • Field 4: LAST_N (string) • Field 5: SALARY (double) • Mission: refer to the DB from a high level language application.
SQL Queries – The Old waycan you spot the bugs? char* get_employees(int dept, char* first) { bool first_cond = true; string s(“SELECT FIRST_N, LAST_N FROM EMPLOYEES “); if (dept > 0){ // valid dept number s.append( “WHERE DEPT = ‘ “); s.append(itoa(dept)); s.append(“’”); first_cond = false; } if (first == null) return s; if (first_cond) s.append(“WHERE “); else s.append(“AND”); s.append(“FIRST_N= ‘ “); s.append(first); s.append(“’”); return s; } 1. Misspelled Name 3. Type Mismatch 2. Syntax error 4. Security Vulnerability , 5. Code Coverage ,6. Maintnance cost
Short and elegant code SQL Queries - AraRat’s Way Schema awareness – No misspelling 1 char* get_employees(shortdept, char* first) { 2 DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]); 3 if(first != null) e /= (FIRST_N == first); 4 if(dept > 0) e /= (DEPT == dept); 5 returne.asSQL(); 6 } Type safe Natural C++ Syntax Automatic generation of queries – no syntax errors Security Vulnerability, Code Coverage , Maintenance cost Problems Solved
AraRat’s Overview • GOAL : safe production of SQL queries. • AraRat = Ara + Rat: • Ara is Augmented Relational Algebra. • Rat is Relational AlgebraTemplates. • In a sense, Rat is Ara’s compiler.
Ara’s Operators RA Operator Ara Operator SQL equivalent selection R/c select * from R where c projection R[f1,f2] select f1,f2 from R union R1+R2 R1 union R2 difference R1-R2 R1 - R2 natural join R1*R2 R1 join R2 left join R1<<R2 R1 left join R2 right join R1>>R2 R1 right join R2 rename b(a) a as b
Ara's Principles • Queries are composed via Query Objects • C++ first class objects. • A query object has: • Type – Encodes scheme of the resulting relation • Same set of fields => same type • Content – the procedure for constructing the query
Primitive Query Objects • A Primitive QO is defined for each DB table • Option 1: Manually, using macros • Option 2: Automatically by the DB2ARA tool • Represent a query that returns all fieldsSELECT * FROM <RELATION_NAME> DEF_F(EMPNUM) DEF_F(DEPT) DEF_F(FIRST_N) DEF_F(LAST_N) DEF_F(SALARY) DEF_R(EMPLOYEE, (EMPNUM/integer, FIRST_N/String, LAST_N/String, DEPT/SmallInt, SALARY/double));
Composite Query Objects • Constructed using Ara's operators • The operands: • QOs (either primitive or composite) • fields • variables • literals DEF_V(e,EMPLOYEE[FIRST_N,LAST_N] / (DEPT > 3));
A QO Can ... • ... be stored in a variable • ... receive the asSQL()message • ... be used in an Ara expression • ... be passed to the TUPLE_T macro • returns the type of tuples that the query return DEF_V(e,EMPLOYEE); e = EMPLOYEE / (DEPT == 3); string s = e.asSQL(); DEF_V(e1,e[FIRST_N, LAST_N]); e1 /= (EMPNUM < 100); TUPLE_T(e)** res = new TUPLE_T(e)*[100];
Using Ara’s Syntax #include"rat" // Global RAT declarations and macros #include"employee.h" // Primitive query objects and other DEF_F(FULL_N); DEF_F(ID); intmain(intargc, char* argv[]) { conststring s = ( (EMPLOYEE / (DEPT > 3 && SALARY < 3.14)) [FIRST_N, LAST_N, FULL_N(cat(LAST_N, ", ", FIRST_N)), ID(EMPNUM)] ).asSQL(); // ... execute the SQL query in s using e.g., ADO. return0; }
So Far… Ara provides an elegant way for composing queries… How does it do that? Do we need to change the C++ compiler ?
Template Programming • C++ templates are Turing-complete • Can compute anything at compile-time template <int N> struct Factorial { enum { value = N * Factorial<N - 1>::value }; }; template <> struct Factorial<0> { enum { value = 1 }; }; // Factorial<4>::value == 24 // Factorial<0>::value == 1 void foo() { int x = Factorial<4>::value; // == 24 int y = Factorial<0>::value; // == 1 }
A Compile-time List struct Nil { static const int length = 0; }; template<class F, class R=Nil> struct List { typedef F First; typedef R Rest; static const int length = 1 + Rest::length; }; typedef List<int, List<short, List<double, List<float, Nil> > > > typesList;
Implementation–Take I • Both scheme and content are compile-time constants: • Problem: • Every query has its own type • Query objects are immutable template<typenameFIELDS, typenameCOND …> classQueryObject { public: typedefFIELDS fields; typedefCOND cond; … }; QueryObject< LIST(EMPNUM, DEPT, FIRST_N, LAST_N), GREATER(DEPT, 3) > q;
Implementation–Take II • Dual query representation • Only the result scheme is encoded at compile time • Allows: e /= (DEPT < 3); • Problem: • Projected-out fields are not part of the type • Cant be used in selection conditions template<typenameFIELDS > classQueryObject { public: typedefFIELDS fields; R_TREE * t; }; Stores the evaluation procedure. This isa run-time value
Implementation–Take III • Projection Resilience • Allows SELECT criteria with fields that were projected out template<typenameFIELDS , typename DICTIONARY> classQueryObject{ public: typedefFIELDS fields; R_TREE * t; typedefDICTIONARY dict; ... }; Saves only Active Fields that dictate the result relation type Saves all the FIELDs we had so far DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]); e /= (DEPT == dept);
Type Checking template<typename Q, typename E> Qoperator/(Q& q, const E& e); • Make sure all of E’s FIELDS exist in Q’s dictionary • Fetch E’s FIELD types from Q. • Enables using the same field name with distinct types in different tables • Resolve e’stype. • It must be bool • Incorporate e’scontent, e.t, into q.t • Changes q’scontent but not it’s type • e has no compile time representation in the q
Type Equivalence • Rat builds a Type for every query • This type relies on the FIELDs in the return relation • Problem: Schema FIELD order • Solution: __COUNTER__ • a unique ID for every FIELD in order to sort and achieve relation equivalence.
Further Research • Extend expressive power • Updates to the database • Allow GroupBy • Challenge: nested relations • Embedding of other little languages in C++ • XML • Challenge: recursive types • AraRat in Java • Require compiler modifications • Relational Algebra calculus for collections of objects.