300 likes | 397 Views
Updating Issues in SBQL. Presentation prepared for OMG Object Database Technology Working Group OMG TECHNICAL MEETING, Brussels, Belgium June 25th-29th, 2007 by Prof . Kazimierz Subieta Polish-Japanese Institute of Information Technology, Warsaw, Poland subieta@pjwstk.edu.pl
E N D
Updating Issues in SBQL Presentation prepared for OMG Object Database Technology Working Group OMG TECHNICAL MEETING, Brussels, BelgiumJune 25th-29th, 2007 by Prof. Kazimierz Subieta Polish-Japanese Institute of Information Technology, Warsaw, Poland subieta@pjwstk.edu.pl http://www.ipipan.waw.pl/~subieta SBA/SBQL pages:http://www.sbql.pl
Topics • Approaches to updates via a query language • SBQL as integrated QL/PL • Updating in integrated QL/PL – issues • SBQL imperative statements • SBQL program control statements • SBQL procedures and methods • SBQL virtual updateable views • SBQL transactions
Approaches to updates via a query language • No approach: updates via methods in some host programming language - ODMG OQL • Bottom-up: a side-effect-free query language is extended with updating capabilities: • SQL, PL/SQL, T-SQL, XQuery, Hibernate,… • To some extent: SQL-99, SQL-2003 (extensions of SQL-92) • In some cases (PL/SQL, T-SQL, SQL-99, SQL-2003) the approach leads to a new programming language. • Top-down: Integrated query and programming language. • Several DataBase Programming Languages (DBPL), e.g. SBQL; • No definite border line between querying and programming • A unified and universal conceptual and semantic frame for queries and programs involving queries, including procedures, functions, classes, types, methods, views, etc.
Integrated QL/PL – decisions (1) • The SBA/SBQL solution relies on adopting a run-time mechanism of PLs and introducing necessary improvements. • The main syntactic decision is the unification of PL expressions and queries- no conceptual difference: • 2+2 • (x+y)*z • Employeewhere salary = 1000 • (Employeewhere salary = (x+y)*z).surname • All such expressions/queries can be used as: • Arguments of imperative statements (update, insert, delete, …) • Actual parameters of procedures, functions or methods • Return from a functional procedure (method).
Integrated QL/PL – decisions (2) • Queries should be prepared to return references to objects • No more value-oriented approach – no algebras, calculi, formal logic,… • References returned by queries can be used as: • Left sides of assignments • Arguments of delete statements • Out and inout parameters (call-by-reference, strict-call-by-value) • … • Some imperative statements are to be designed as macroscopic. • Orthogonal persistence: a unified typing system for queries addressing persistent (shared) and volatile (non-shared) entities • No difference in access to persistent and volatile data • Total internal identification: to return references each database or program entity, which could be separately retrieved, updated, inserted, deleted, etc., should possess a unique internal identifier.
Total internal identification • SBQL object (ODRA): each object, attribute, sub-attribute, pointer, etc. has a unique internal identifier.
Updating in integrated QL/PL – issues • Imperative constructsbased on queries (macroscopic): variable declarations, assignments, create, insert, delete, etc. • Program control statementsbased on queries: if, loops, for each, etc. • Procedures and functionsbased on query constructs • New issues for strong typing • Parameter passing methods based on queries (in and out parameters) • Local procedure/function objects (based on a unified typing system) • Functions with macroscopic output (SQL-like views) • Classes and methods • Transactionsand transaction processing – for updating shared resources • Virtual updateable O-O views(consistent updating of virtual objects) • Events and triggers • Interoperability issues, in particular: • Updating relational databases via object-oriented queries • Gateways to/from OO PLs, e.g. Java.
Integrated QL/PL - naming, scoping, binding • Integrated QL/PL requires careful designing of naming, scoping and binding mechanisms • The common PLs’ approach is that scopes are organized in an environment stack with the “search from the top” rule. • Some extensions to the structure of stacks used in PLs are necessary. • Query operators, imperative programming constructs and procedures (functions, methods, views, etc.) are defined in terms of the three internal data structures: • Environment stack (for scoping and binding names) • Query result stack (for storing temporary and final query results) • Object store (for storing all persistent and volatile data entities) • For strong typing and query optimizations the stacks must also exist in static (compile time) versions. • They store and process type signatures.
Variable (object) declaration • In ODRA any variable (object) must be declared. • The declaration must be visible to the environment against which a given query is executed. • The variable declaration has the following syntax: name: type [cardinality] • For instance: x:integer; Emp: EmpType [0..*]; typePersonTypeisrecord{name: string;age: integer;}; georg:PersonType; Variables can be declared as persistent(shared, on a server), temporal(session’s), local(to a procedure, function or method). • The concept of cardinality (as in UML) instead of „collections”.
Object creation • Objects are created by the create operator. • It is checked according to types and cardinality. • Syntax: create[where] name(query); • Semantics: • The operator is macroscopic (one statement can create many objects) • Parameterized by a place indicator (where) - permanent, temporal, local • Can be used to create each kind of objects (simple, complex, pointer). • Requires appropriate variable declaration. • Simple object creation createamount(2500); • The persistency status depends on the context; createpossibleMeetingDate(2007-06-04 union 2007-09-12); create localfullName( (EmpwhereworksIn.Dept.name = “adv”).(fName + “ “ + lName)); • Pointer creation: createpermanenthighPayed( ref (Empwheresal > 3000) );
Complex object creation • To create a complex object the query must return structures with named fields or a reference to a complex object. • The reference will be automatically dereferenced; • ref for creating pointers; • If the argument query returns a bag, many objects are created. • Create a shared complex Emp object create permanentEmp( “Tom” asfName, “Jones” aslName, 2500 assal, ref (DeptwheredName = “adv”) asworksIn, ( ref (DeptwheredName = “pr”) union ref (DeptwheredName = “retail”) ) groupasprevJobPlace );
Assignments and queries • In PLs, the assignment operator has the (possible) syntax : lvalue := rvalue; • lvalue and rvalue are expressions • lvalue must return a reference to an entity (e.g. to a variable) • rvalue returns a new value assigned to the entity (deref is enforced) • Types of lvalue and rvalue must coincide. • Can both lvalue and rvalue be queries (returningcollections)? • Can lvalue return a reference to a complex object and what in such a case rvalue shoud return (the substitutability problem)? • How to define the concept of „complex value”? • Can lvalue be a reference to a pointer what in such a case rvalue should return?
Assignments in SBQL lquery := rquery; • We do not allow macroscopic assignments: • lquery must return a single reference • rquery must return a single value (with automatic deref enforced). • Other solutions are inconsistent • Instead, we allow to nest assignments into control statements: foreachEmpwherejob = ”programmer” do{ sal := sal +100; job := ”engineer”; }; • The solution is like SQL update, but SBQL is more orthogonal: foreach (avg(Emp.sal)asajoin(Empwheresal<a) as e) do{ e.sal := a + 100; e.job :=”programmer”; };
Assignments to complex objects • SBQL supports this feature. • It requires the definition of „complex value” that can be calculated at the right side of the assignment. • We have defined it through the concept of binder, i.e. an entity n(x), where n is a name, x is any (perhaps complex) value. • Operators as and groupas – creating binders • Operator ref – prevents dereferencing (EmpwherelName=„Jnes”) := ( “Tom” asfName, “Jones” aslName, 2500 assal, ref (DeptwheredName= “adv”) asworksIn, ( ref (DeptwheredName= “pr”) union ref (DeptwheredName= “retail”) ) groupas prevJobPlaces );
Assignments to pointers (links) • Requires a reference to an object as the right hand operand. (EmpwhereeNbr = 4419).worksIn := ref (DeptwheredName = “adv”); • ref can be omitted due to type inference. • Assignments to binary links: • Any binary (two-way) link is considered as twin pointers semantically constrained; e.g. worksIn and employs. • Assignment to one of them triggers a corresponding operation on its twin. • See the ODMG standard, C++ binding.
Insertion lQuery :< rQuery;lQuery :<< rQuery; • Inserts an object into another object. • The result of lQuery is a reference to a complex object. • The result of rQuery is a bag of references to objects being inserted. • Insertion is type and cardinality checked. • Insert new prevJobPlace pointer object into Doe’sobject: (EmpwherelName=“Doe”):< createprevJobPlace( ref(DeptwheredName=“pr”)); • A variant of insertion – create and insert operator (EmpwherelName=“Doe”):<< prevJobPlace( ref(DeptwheredName=“pr”));
Deletion deletequery; • Removes objects from the store. • The operator is macroscopic. • Concerns all kinds of run-time program or database entities. • The result of operand query have to be a reference or a bag of references. • Can be used to delete each kind of objects (simple, complex, pointer). • Type checking concerns the cardinality after deletion. • Delete location London from the Marketing department. delete (DeptwheredName = ”Marketing”). (locasxwherex = ”London”).x;
Program control statements • We implemented typical statements known from many PLs: if querythenstatement1elsestatement2 if querythenstatement while query dostatement dostatementwhile (query) for( istmnt;cquery;incstmnt ) dostatement • query and cquery must return a boolean value. • Example: ifcount( Empwherehireyear = 2006) - count( Empwherehireyear = 2005) > 100 thenreport :<< note(“employment increase achieved”); elsereport :<< note(“employment increase not achieved”); • statement ::= {statement_list}
For Each statement foreachquerydostatement • Iterates through elements of a collection determined by a query. • query is evaluated first, it should return a bag. • For each bag element r its internal environment nested(r) is calculated and pushed at the top of the environment stack. • After statement execution the environment nested(r) is destroyed. • Example: Increase by 100 the salary of employees having salary below the average. • Without “iteration variable”: foreachEmpwheresal < avg(Emp.sal) dosal:= sal + 100; • With “iteration variable”: foreach (Empwheresal < avg(Emp.sal)) asedoe.sal:=e.sal + 100;
SBQL procedures and methods • Procedures are special complex objects. • Inside modules - treated as global procedures. • Inside classes – treated as methods and called in the instance context. • Inside views – treated as local to views. • Encapsulate arbitrary complex computation. • Local objects and actual parameters are invisible from outside. • Can be parameterized by parameters and/or by the state. • Little distinction between procedures and functional procedures. • A functional procedure call is a query, but with possible side effects. • The result of a functional procedure is typed, similarly to other PLs. • Syntax of procedure declaration: name([parameter_list]):[returntype] {statement_list} • Return can be determined by any query, according to returntype. • Typical stack-based semantics • Any recursive calls are supported, with no special declaration.
Parameters of procedures and methods • The parameter passing technique implemented in ODRA is known as strict-call-by-value: • Actual parameter determined by a query is evaluated before the procedure execution. • The result is stored at the procedure activation record as a binder (named value). • The method combines call-by-value and call-by-reference in a very general fashion. • It allows the programmer to pass as a parameter the result of any complex query that combines atomic values, references, auxiliary names, structures, bags, sequences, etc. • Parameter declaration syntax: name: type[cardinality] • If the cardinality is not specified the default [1..1] is assumed.
SBQL: example of a procedure • Procedure ChangeDept moves the specified employees to the specified department; returns the number of the moved employees. • Let Kim become the boss of all designers working so far for Lee: procedureChangeDept( E: EmpType[0..*]; D: DeptType ): integer { delete ( Dept . employs ) whereEmpinE; for eachEasedo{ D :<< employs( ref e ); e . worksIn := ref D }; returncount(E); }; ifChangeDept( Empwherejob = “designer” and (worksIn.Dept.boss.Emp.lName) = “Lee”; Deptwhere (boss.Emp.lName) = “Kim” ) = 0 thenprintString(”No effect”);
Updating via procedure return • SBQL functions may return references, which can be then used in imperative statements: • Procedure EmpSalBoss returns references to names of employees earning less than 2000, to their salaries and to their boss names: procedureEmpSalBoss(): record{e: ref string; s: ref integer; b: ref string}[0..*] { return (Empwheresal < 2000). (lNamease, salass, (worksIn.Dept.boss.Emp.lName) asb); }; • Updating through the return is possible: for each (EmpSalBosswheree = ”Doe” andb = ”Lee”)dos:= s + 100; • Hence SBQL functions may work as updateable views. • In general, such updates lead to inconsistency(view updating problem). • We have developed and implemented special updateable views.
SBQL virtual updateable views • SQL views have limitations that restrict their applications: • Limited power of a view definition facilities (far below the full algorithmic power); • Limited data model (only relational tables); • Limited view updating (updating of virtual tables is prohibited or severely restricted); • Performance can be compromised by the use of views. • instead of trigger views of Oracle, SQL Server and DB2 • Relax the third limitation. • SBQL views: • No limit concerning the algorithmic power; • No limit concerning the datamodel; • No limit concerning the semantics of view updating; • Powerful optimization methods of queries involving views.
SBQL views - generalities • Some applications may require updating of virtual data: • Updates of virtual data are to be mapped into updates of stored data. • Typically (SQL) these updates are made by side effects of view invocations. • In SBQL we take another point of view. • Our method is based on overloading generic updating operations (create, delete, update, insert, …) acting on virtual objects by invocation of procedures that are written by the view definer. • The procedures have full algorithmic power. • Full transparency of virtual objects: they cannot be distinguished from stored objects by any programming option. • High-level view definition, full algorithmic power, any datamodel, view updating anomalies controlled by the programmer, the optimization potential.
View example • Delivers virtual objects named EmpBoss, withattribute name (of an employee) and bossName (of his/her boss). • Updating may concern bossName => a corresponding employee is moved to the department managed by the new boss. • Move Doe to the department managed by Lee: • viewEmpBossDef{ • virtual objectsEmpBoss: record{e:refEmp;}[0..*]{ returnEmpase; }; • viewnameDef{ virtual objectsname: record{en: string;}{ • returne.lNameasen;}}; • on_retrieve: string { returnen; } }; • viewbossNameDef{ virtual objectsbossName: record{bn: string;}{ • returne.worksIn.Dept.boss.Emp.lNameasbn; }; • on_retrieve: string { returnbn; }; • on_update(newBoss: string){ • e.worksIn := ref (Deptwhere (boss.Emp.lName) = newBoss); }}} • (EmpBosswherename = ”Doe”).bossName := ”Lee”;
Transactions in SBQL • Shared objects must obey the transactional semantics. • In SBQL we have introduced transactions in a different form in comparison to ODMG (and other proposals). • The basic assumption is that each transaction must possess an identity in a source code and during runtime. • Transactions are similar to procedures – they have a name, parameters and local objects. • Syntactic difference concerns some keywords. • Semantic difference concerns ACID properties. • Nested transactions are possible. • In runtime transactions are represented by objects of a special class • Managed by SBQL. • DBA has rights to get some privileges to access to these objects. • This is especially important in distributed database environments and protocols such as 2PC.
Conclusions • Any programming environment must support updates • This concerns a new object-oriented database standard • In SBA/SBQL we follow seamless integration of querying and programming capabilities • Delegating updates to a host PL → impedance mismatch • Ad hoc extending queries with updates → limitations and inconsistencies • Unification of PL expressions and queries • For updating, queries must return references • Classical stack-based semantics w.r.t. queries • Queries can be used as components of imperative statements, as parameters of procedures and as a return from a functional procedure • Procedures, functions, types, classes, methods, updateable views, transactions, etc. are abstractions that can be based on queries • They should be components of a new OO database standard.
Acknowledgement • This work is supported by the European Commission 6-th Framework Programme, Project VIDE - VIsualize all moDel drivEn programming, IST 033606 STP • VIDE Participant List (in random order) • SAP AG (Germany) • SOFTEAM (France) • Institute for Information Systems at the German Research Center for Artificial Intelligence (Germany) • IESE Fraunhofer (Germany) • Polish-Japanese Institute for Information Technology (Poland, coordinator) • FIRST Fraunhofer (Germany) • TNM Software GmbH (Germany) • Bournemouth University (United Kingdom) • Rodan Systems S.A. (Poland) • ALTEC (Greece)