280 likes | 297 Views
Simple and Safe SQL Queries with C++ Templates via the ARARAT System. By Yossi Gil and Keren Lenz Submitted to ICSE’07 Presented by Nimrod Partush. Part I: Introduction. Motivation. The mission: Incorporating DB into the high level language.
E N D
Simple and Safe SQL Queries with C++ Templates via the ARARAT System By Yossi Gil and Keren Lenz Submitted to ICSE’07 Presented by Nimrod Partush
Motivation • The mission: Incorporating DB into the high level language. • Traditional method: Compose a string, send it to the DB engine, hope for the best. • You’re actually writing programs in the SQL language. • No typing. Very easy to shoot yourself in the leg. • Overall a dreaded task. • The Result: Runtime errors, Injection attacks and overall Frail software. Simple and Safe SQL queries with C++ Templates
SQL Programming – A Grave State of Affairs • Shooting gallery: Find the bug. char* get_employees(int dept, char* first) { bool first_cond = true; string s(“SELECT * FROM EMPLOYEES “); if (dept > 0){ // valid dept number s.append( “WHERE DEPTNUM = ‘ “); s.append(itoa(dept)); s.append(“’”); first_cond = false; if (first == null) return s; if (first_cond) s.append(“WHERE “); else s.append(“AND”); s.append(“FIRST_N= ‘ “); s.append(first); s.append(“’”); return s; } 1. Misspelled Name 3. Type Mismatch 2. Syntax error 4. Security Vulnerability , 5. Code Coverage ,6. Maintnance cost Simple and Safe SQL queries with C++ Templates
ARARAT’s Solution Feature 8: Variables can be defined to hold Relations. They even have types and deal with Type Equivalence problems Feature 1: Short and elegant code Feature 9: The same Field name can be used in different Schema with different types. Feature 4: Modular composition of queries Feature 5: Relations and their Schema are identifiers in the C++ code 1 char* get_employees(shortdept, char* first) { 2 DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]); 3 if(first != null) e /= (FIRST_N == first); 4 if(dept > 0) e /= (DEPTNUM == dept); 5 returne.asSQL(); 6 } Feature 6: Typing Feature 3: Natural C++ Syntax Feature 7: Relation “Projection Resistance” Feature 2: OOP Security Vulnerability, Code Coverage , Maintenance cost Problems Solved Simple and Safe SQL queries with C++ Templates
ARARAT’s Overview • The ARARAT system is designed for safe production of SQL queries. • ARARAT = ARA + RAT: • ARA is Augmented Relational Algebra. • RAT is Relational AlgebraTemplate. • In a sense, RAT is ARA’s compiler. Simple and Safe SQL queries with C++ Templates
PART II: ARA a.k.a Augmented Relational Algebra
Reliance on the RA Metaphor • The less verbose RA syntax integrates better in the programming language. • RA allows modular composition of queries. • Elegant statements: dbcon << (EMPLOYEE[SALARY] / (DEPTNUM == 3)); Simple and Safe SQL queries with C++ Templates
RA in ARA notation Simple and Safe SQL queries with C++ Templates
Performing Queries • Queries are executed via Query Objects which embody the desired result relation • A QO’s type is comprised of the scheme of it’s query result. • It’s contents is an abstract encoding of the query which will be translated to a SQL statement by at runtime. 1 char* get_employees(shortdept, char* first) { 2 DEF_V(e,EMPLOYEE[FIRST_N,LAST_N]); 3 if(first != null) e /= (FIRST_N == first); 4 if(dept > 0) e /= (DEPTNUM == dept); 5 returne.asSQL(); 6 } Feature: Modular composition of queries Feature: Variables can be defined to hold Relations. They even have types and deal with Type Equivalence problem Simple and Safe SQL queries with C++ Templates
Composing Query Objects ARA Language: Define two field names which were not defined in the input scheme, to be used in renaming inside the query. 1 #include"rat" // Global RAT declarations and macros 2 #include"employees.h" // Primitive query objects and other 3 // definitions for the ‘‘employees’’ example 4 DEF_F(FULL_N); 5 DEF_F(ID); 6 intmain(intargc, char* argv[]) { 7 conststring s = ( 8 (EMPLOYEE / (DEPTNUM > 3 && SALARY < 3.14)) 9 [FIRST_N, LAST_N, 10 FULL_N(cat(LAST_N, ", ", FIRST_N)), ID(EMPNUM)] 11 ).asSQL(); 12 // ... execute the SQL query in s using e.g., ADO. 13 return0; 14 } ARA Language: Selection ARA Language: Condition ARA Language: Projection ARA Language: Renaming Simple and Safe SQL queries with C++ Templates
ARA’s Syntax • Like all languages, ARA has a grammar: ARA provides us with an eloquent way to compose queries. Does this mean we have to integrate it into C++ compiler? Simple and Safe SQL queries with C++ Templates
PART III: RAT What makes it all work.
Reliance on Template Programming • RAT uses template programming to make the C++ compiler carry out it’s work. • Such as compile time type checking • Complex tasks such as symbol table management • Delegating the SQL type system to the compiler Modifies the C++ language without modifying the C++ compiler! Simple and Safe SQL queries with C++ Templates
Template Programming • Let’s compute in compile time: template <int N> struct Factorial { enum { value = N * Factorial<N - 1>::value }; }; template <> struct Factorial<0> { enum { value = 1 }; }; // Factorial<4>::value == 24 // Factorial<0>::value == 1 void foo() { int x = Factorial<4>::value; // == 24 int y = Factorial<0>::value; // == 1 } Template Recursive Invocation Template Specialization Simple and Safe SQL queries with C++ Templates
RAT vs. SEMT vs. Expression Templates • SEMT (By J. Gil and Z. Gutterman, Presented at COOTS’98) is a library package that performs Compile Time Symbolic Derivation with C++ Templates. • Expression Templates (By Todd Veldhuizen) is a C++ technique for passing expressions as function arguments. This allows syntax like: DEPARTMENT.select(DIVNUM == 2) Simple and Safe SQL queries with C++ Templates
Query Representation • We shall construct a simple query’s QO • Let’s start with the basics • EMPLOYEE will then be (EMPLOYEE / (DEPTNUM < 3))[F_NAME,L_NAME] 1 template<typenameFirst, typenameRest> 2 classRel { public: 3 typedefFirst F; 4 typedefRest R; 5 ... 6 }; Template Programming allows creating new types in a “The whole is a sum of it’s parts” manner Rel<EMPNUM,Rel<DEPTNUM,Rel<FIRST_N,LAST_N>>> EMPLOYEE; Simple and Safe SQL queries with C++ Templates
Query Dual Representation We know that that these are compile time values aimed at type checking, etc. 1 template<typenameFirst, typenameRest> 2 classRel { public: 3 typedefFirst F; 4 typedefRest R; 5 R_TREE * t; 5 ... 6 }; A new member Stores the evaluation procedure. This isa run-time value Every relation has this member therefore It doesn’t change the type • We will soon see the importance of this distinction. • Hint: e /= (DEPTNUM < 3) Simple and Safe SQL queries with C++ Templates
Scalar Representation • Scalars also have a Dual representation. • So our query’s scalar expression will feature run time representation compile time representation 1 class S {public: 2 const S_TREE *t; // Expression tree of S 3 typedef ... TYPS; // Compile time representation of t 4 typedef ... FLDS; // List of fields used in t 5 ... } (EMPLOYEE / (DEPTNUM < 3))[F_NAME,L_NAME] (DEPTNUM < int) (DEPTNUM < 3) DEPTNUM Simple and Safe SQL queries with C++ Templates
Query Composition 1 template<typename Relation, typename Scalar> 2Relationoperator/(const Relation rel, 3 const Scalar scalar) { 4 ... 5 }; • Stage 1: Make sure all of scalar’s FIELDS exist in rel. • Stage 2: Fetch scalar’s FIELD types from rel. • Stage 3: Resolve scalar’s type.perform type checks and make sure it evaluates to bool • Stage 4: Incorporate the scalar into the query’s content but not it’s type Take notice: The return type hasn’t changed! This enables using the same field name, possibly with distinct types, in different tables! After this stage, scalar has no compile time representation. Simple and Safe SQL queries with C++ Templates
Query Composition List of initialized and un-initialized fields of the projection criteria 1 template<typename Relation, typename VOCABULARY > 2Relation’operator[](const Relation rel, VOCABULARY v) { 3 ... 4 }; • Stage 1: Check v for naming correctness. • Renamed fields don’t already exist in rel • other fields exist in rel • Stage 2: Incorporate the projection into the query’s data (run-time). • Stage 3: Change the return type to fit the new schema and return (compile-time). typeof((EMPLOYEE / (DEPTNUM < 3))[F_NAME,L_NAME]) == Rel<FIRST_N,LAST_N> Take notice: The return type is now changed Simple and Safe SQL queries with C++ Templates
Projection Resilience • ARARAT allows SELECT criteria with fields that were projected out 1 template<typenameFirst, typenameRest> 2 classRel { public: 3 typedefFirst F; 4 typedefRest R; 5 R_TREE * t; 6 typedefFirst SymTableF; 7 typedefRest SymTableR; 8 ... 9 }; Saves only Active Fields that dictate the result relation type Saves all the FIELDs we had so far Simple and Safe SQL queries with C++ Templates
RAT vs. SEMT vs. Template Expressions – Round 2 Simple and Safe SQL queries with C++ Templates
Type Equivalence • RAT builds a Type for every query • This type relies on the FIELDs in the return relation • Problem: Schema FIELD order • Solution: __COUNTER__ • a unique ID for every FIELD in order to sort and achieve relation equivalence. Simple and Safe SQL queries with C++ Templates
PART IV: Conclusions אררט? זה טוב ליהודים?
Advantages • Dynamic generation of queries • Unlike SQLJ, SchemeSQL and Embedded SQL. • Generic solution • Unlike SQL DOM. • Minimal changes to the compiler • Unlike LINQ. • No external tools dependence Simple and Safe SQL queries with C++ Templates
Disadvantages • SQL expressive power still not fully achieved • No GROUP BY for instance. • Type equivalence problem. • Can’t union EMPLOYEE[NAME(char*)] and MANAGER[NAME(char*)]. • No reference to important DB issues • Security • Permissions • Concurrency • ARARAT only applies to C++ Simple and Safe SQL queries with C++ Templates
A Special Thanks To Keren Lenz