Object Database Semantics: the Stack-Based Architecture. Presentation prepared for Object Database Technology Users and Vendors Roundtable OMG Technical Meeting, Burlingame, CA, December 10-14, 2007 by Prof. Kazimierz Subieta
Object Database Semantics: the Stack-Based Architecture Presentation prepared for Object Database Technology Users and Vendors Roundtable OMG Technical Meeting, Burlingame, CA, December 10-14, 2007 by Prof. Kazimierz Subieta Polish-Japanese Institute of Information Technology, Warsaw, Poland subieta@pjwstk.edu.pl www.ipipan.waw.pl/~subieta SBA/SBQL pages: www.sbql.pl
Topics • Human and machine semantics – the motivation for SBA • Machine aspects of database semantics • Semantic quarks of OO store models • Functionality, semantics & theories • SBA – an approach to formal semantics • Major topics that SBA deals with • Some examples • Current SBA implementation in the ODRA system • SBA and interoperability
Targets of database semantics • Human (database designer, programmer, database administrator) • Humans perceive the semantics informally • What matters most is practice, training and efficiency of work • The informal semantics addressed to humans is influenced by many side factors, including beliefs, aesthetics, opinions of authorities, etc. • Machine (systems that have to accomplish the semantics through interpreters, compilers, mappers, etc.) • Machine semantics is always fully formal and deterministic • Human semantics is a guide for developing machine semantics • However, the machine semantics eventually determines the human semantics
Human and machine semantics • The programmer’s understanding of database semantics and the machine’s interpretation of database semantics should coincide. • The programmer must understand database structures on a high abstraction level • The programmer must understand semantics of queries addressing database structures on a high abstraction level, too • No essential details concerning database structures and query semantics (e.g. updating) can be neglected or treated as „implementation issues”. • However, programmers need not be aware of all the machine semantics details • Programmers use a database programming environment intuitively • The machine should precisely follow their thinking and imagination
Database models and SBA • Database semantics depends on the assumed data model • „Data model” is an ideological rather than technical notion • People either believe in it or not • No mathematics, theory or experience can justify a particular ideology for all future cases and applications • Ideologies can be wrong, based on misleading superficial rhetoric • The relational database model is an ideology, supported by (very limited) mathematical theories • OO is an ideology, too • SBA is a theory supporting OO database models • SBA is incomparably more powerful than relational theories • However, SBA is neutral to database models • It can be applied to object-oriented, XML, relational and other models
Machine aspects of database semantics • Machines deal with data structures rather than with data models • For this reason in SBA we talk about data store models • i.e. formal concepts determining organization of data structures • A store model is purely formal, it does not involve an „ideology” • Features of a data model are reflected in data structures indirectly • Some features of a store model appear as the result of orthogonality criteria, beyond the data model • Database queries and programs formally address formal data structures
Semantic quarks of OO store models • There are many object-oriented store models that are significantly different and possess incompatible features • Smalltalk, C++, Java, C#, CORBA, ODMG, SQL-99, XML, RDF, … • The models tend to be complex and non-intuitive • The same notions are understood differently • Is it possible to unify them on some common ground? • SBA reduces the models to „semantic quarks”: object identifiers, atomic values and object names • They are used to build object store models practically with no limits • Some principles (known for 40 years): • Object relativism: each component of an object is an object • Total identification: each run-time entity (e.g. object) should possess an internal identifier (used as a reference to the entity)
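The three quarks and the two principles above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the formal SBA store model: the names `Obj`, `Ref`, `walk` and `store` are hypothetical, and identifiers are simply integers drawn from a counter.

```python
from dataclasses import dataclass, field
from itertools import count

_oids = count(1)  # hypothetical source of internal identifiers


@dataclass(frozen=True)
class Ref:
    """A reference value: holds the identifier of another object."""
    target: int


@dataclass
class Obj:
    """One stored object built from the three quarks: identifier, name, value.

    The value is an atomic value, a Ref, or a list of sub-objects.
    """
    name: str
    value: object
    oid: int = field(default_factory=lambda: next(_oids))


def walk(obj):
    """Yield obj and all of its components (object relativism)."""
    yield obj
    if isinstance(obj.value, list):
        for sub in obj.value:
            yield from walk(sub)


dept = Obj("Dept", [Obj("dName", "Sales")])
emp = Obj("Emp", [Obj("lName", "Doe"), Obj("sal", 1000),
                  Obj("worksIn", Ref(dept.oid))])

# Total identification: every object, at any nesting level, is reachable by id.
store = {o.oid: o for root in (dept, emp) for o in walk(root)}
```

Because each component is itself an `Obj` with its own identifier, a reference such as `worksIn` can point at any object in the store, whatever its nesting level.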
Functionality of queries – where is the limit? • All theories devoted to the relational model assume some limited role of queries • E.g. the relational algebra: even 2+2 is beyond its power • SQL: limits of the relational algebra are not reasonable • SQL-99 has the power of universal programming languages, far more than the „relationally complete” languages can do • A limited role of queries in database programming is assumed in all proposals concerning OO query languages • models, theories, OODBMS, ODMG standards, … • SBA abandons this philosophy
SBA: no limit for applications of QLs • Any functionality may require queries • Queries take the role of expressions of programming languages • Hence a new attitude to semantic description, in particular to the theories that are the basis for semantics • SBA removes the border between querying and programming • SBA theory is a continuation of programming language theory rather than of database theories.
Semantics of query & programming languages • Usually, semantics is intuitively explained rather than formally specified • This is typical for programming languages • Formal specification of semantics is complex, boring, full of difficult concepts, frequently containing bugs and unspecified parts • Intuitive specification of semantics causes problems for standards: • Ambiguous specification => many incompatible implementations • Contradictory specification => no correct implementation • No reasoning on redundancy, equivalence or incompleteness of language constructs • No formal semantics => poor query optimization, difficulties with strong typing, and other problems • Even the smallest semantic problem is a very big problem • Especially for standardization aiming at code portability
Precision, Functionality and Universality • Precision • Simple, non-ambiguous model of data structures being queried • Non-ambiguous semantics of query and programming constructs • Functionality and Universality • A formal object model should cover (almost) all features of the current object models, including UML, CORBA, XML, Java, C++, WSDL • A complete query and programming language addressing the model • What does „complete” mean? • Complete = practically universal: the power of PLs + interoperability + performance + client/server + transactions + database abstractions + … • Mistake of current database models & theories: neglecting updates • Non-redundancy • Keep the model and the language as lean as possible
SBA – an approach to formal semantics • Formal models of data stores: • Based on the semantic quarks and assumed principles • Models M0, M1, M2 and M3 cover basic features of the majority of object models, including complex objects, classes, inheritance, roles and encapsulation • Other features can be introduced by small variations of M1-M3 • Functionality of SBQL and its programming capabilities address all features of M1-M3 • Semantics of SBQL is expressed in a way specific to programming languages • SBA non-algebraic operators – known from relational languages, but defined in the spirit of programming languages
Approaches to formal semantics • There are many approaches, in particular: • Relational/object algebras (or calculi) • 1st order logic • Denotational semantics • Operational semantics • etc. • In the specification of semantics three aspects are important: • Formal specification • Supporting all imaginable data structures and query functionalities • Communication with developers who must understand the semantics to implement it • Academic people usually address only the first aspect • For other aspects only operational semantics is adequate
Abstract implementation as semantic specification • It is a kind of operational semantics based on an abstract machine that accomplishes query/program processing • The SBA abstract machine introduces three well-known structures: • object store, • environment stack (thus SBA), • query result stack. • These structures are fundamental for precise semantic description of everything that may happen in database query/programming languages. • Classical query operators, such as selection, projection, joins and quantifiers, can be generally and precisely specified • Updating constructs, programming abstractions, database abstractions, strong typing, etc. can be expressed in terms of abstract implementation.
SBA – power through orthogonality • SBA discovers and employs semantic quarks of query languages • Primitive queries: literals and names • Semantics of a complex query is built from semantics of its parts • Environment and result stacks as a mechanism of query composition • Semantics based on object references rather than object values • Qualities of the orthogonality and semantic quarks • Powerful theory and reasoning on features of languages • Easier implementation • Much shorter programmers’ manuals • Powerful query optimization methods and a strong typing system • Much easier teaching and developing general principles • Supporting inventions and new ideas
The idea of SBA • Unification of PL expressions and queries: • 2, “Smith” • salary, x, Employee • 2+2 , (x+y)*z • Employee where salary = 1000 • (Employee where salary = (x+y)*z).surname • All such expressions/queries used as: • arguments of imperative statements, • parameters of procedures, functions or methods • a return from a functional procedure (from a method) • Expressions/queries + programming capabilities used for: • Procedures, functions, classes, types, inheritance, roles, … • Virtual updatable views • Various other forms of database abstractions (triggers, business rules, constraints, transactions,…)
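A toy version of such an abstract machine can be sketched in Python to show how the three structures cooperate. This is a simplified illustration, not the SBA specification: the class `Machine`, the tuple encoding of queries and the function `nested` are assumptions made for this sketch, complex objects are plain dicts mapping names to lists of values, and results are values rather than object references.

```python
def nested(element):
    """Names made visible by an element: the components of a complex object."""
    return element if isinstance(element, dict) else {}


class Machine:
    def __init__(self, store):
        self.envs = [store]  # environment stack; the base section binds root names
        self.qres = []       # query result stack

    def bind(self, name):
        """Bind a name by searching the environment stack from the top."""
        for section in reversed(self.envs):
            if name in section:
                return list(section[name])
        return []

    def evaluate(self, query):
        kind = query[0]
        if kind == "lit":                      # primitive query: literal
            self.qres.append([query[1]])
        elif kind == "name":                   # primitive query: name
            self.qres.append(self.bind(query[1]))
        elif kind == "eq":                     # algebraic operator
            self.evaluate(query[1])
            self.evaluate(query[2])
            right, left = self.qres.pop(), self.qres.pop()
            self.qres.append([left == right])
        elif kind in ("where", "dot"):         # non-algebraic operators
            self.evaluate(query[1])
            out = []
            for element in self.qres.pop():
                self.envs.append(nested(element))   # open a new scope
                self.evaluate(query[2])
                part = self.qres.pop()
                self.envs.pop()                     # close the scope
                if kind == "dot":
                    out.extend(part)                # projection/navigation
                elif part == [True]:
                    out.append(element)             # selection
            self.qres.append(out)
        else:
            raise ValueError(f"unknown query kind: {kind}")

    def run(self, query):
        self.evaluate(query)
        return self.qres.pop()


store = {"Emp": [{"lName": ["Doe"], "sal": [1000]},
                 {"lName": ["Kim"], "sal": [2000]}]}

# (Emp where sal = 1000).lName
query = ("dot",
         ("where", ("name", "Emp"), ("eq", ("name", "sal"), ("lit", 1000))),
         ("name", "lName"))
result = Machine(store).run(query)
```

The `where` and `dot` branches differ only in how they combine the partial results, which is the sense in which selection and projection share one general definition based on pushing a new section onto the environment stack.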
Major topics that SBA deals with (1) • General architecture of query processing • Abstract models of object stores • Syntax, semantics and pragmatics of query languages • Semantics of algebraic and non-algebraic operators • Classes, methods and static inheritance in query languages • Dynamic object roles and dynamic inheritance in query languages • Processing of irregular data structures (semi-structured data) • Transitive closures and fixed-point equations • Imperative (updating) constructs • Procedures, functions and methods
Major topics that SBA deals with (2) • Parameter passing for procedures, functions & methods • Encapsulation • Virtual updatable views • Types, interfaces, schemas and metamodels • Static (semi-) strong type checking of queries and programs • Query optimization (rewriting, indices, caching, …) • Query processing and optimization in distributed systems • Data-intense grids and P2P networks: integration of distributed, heterogeneous, fragmented and redundant resources • Aspect-oriented databases • OMG MDA and executable UML + OCL
SBQL queries • Get all information on departments for employees named Doe: (Emp where lName = “Doe”).worksIn.Dept • Get the name of Doe’s boss: (Emp where lName = “Doe”).worksIn.Dept.boss.Emp.lName • Names and cities of employees working in departments managed by Kim: (Dept where (boss.Emp.lName) = “Kim”).employs.Emp.(lName, if exists(address) then address.city else “No address”) • For each employee get the name and the percent of the annual budget of his/her department that is consumed by his/her monthly salary: Emp.(lName as n, (((if exists(sal) then sal else 0) as s).((s * 12 * 100)/(worksIn.Dept.budget))) as percentOfBudget)
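For readers more used to general-purpose languages, the last query can be paraphrased in Python. This is a rough paraphrase on hypothetical sample data, not output of an SBQL engine; the dict layout and the function name `percent_of_budget` are assumptions of this sketch.

```python
sales = {"dName": "Sales", "budget": 240000}
emps = [
    {"lName": "Doe", "sal": 2000, "worksIn": sales},
    {"lName": "Poe", "worksIn": sales},  # the optional sal is absent
]


def percent_of_budget(employees):
    """For each employee: name and share of the department's annual budget."""
    rows = []
    for emp in employees:
        s = emp.get("sal", 0)  # if exists(sal) then sal else 0
        pct = (s * 12 * 100) / emp["worksIn"]["budget"]
        rows.append({"n": emp["lName"], "percentOfBudget": pct})
    return rows


rows = percent_of_budget(emps)
```

The SBQL form expresses the same computation as a single expression, with the auxiliary names `n`, `s` and `percentOfBudget` introduced by `as`.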
SBQL programs (ODRA) • For each person having no salary give the minimal salary in his/her department: for each (Emp where not exists(sal)) as e { e.changeSal( min(e.works_in.Dept.employs.Emp.sal) ) } • A method: changeSal(newSal: real): boolean { if (not exists(self.sal)) { sal: real[0..1]; self :< create sal(newSal); } else { if (self.sal > newSal) return false; else self.sal := newSal; } return true; }
SBA/SBQL in recent (pending) projects • ODRA (Object Database for Rapid Applications) - queries, imperative constructs, programming abstractions, classes, types, methods, inheritance, modules, query optimization,... • European project eGov Bus. Integrating distributed resources under the control of various European governmental institutions. • SBQL as an embedded QL for application programming in Java. • SBQL as self-contained DBPL for application programming. • Virtual repository based on SBQL virtual updatable OO views • European project VIDE - developing a visual programming language for the OMG MDA. • OCL and other concepts related to Executable UML are implemented • XML2XML mapper based on SBQL (more powerful than XSLT)
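The control flow of the method and of the loop can be mirrored in Python as follows. This is a sketch for comparison only: the classes `Dept` and `Emp`, the use of `None` for an absent salary and the helper `fill_missing_salaries` are assumptions, and the guard for a department with no known salary is an addition of this sketch.

```python
class Dept:
    def __init__(self, name):
        self.name = name
        self.employs = []  # employees of this department


class Emp:
    def __init__(self, l_name, dept, sal=None):
        self.l_name = l_name
        self.dept = dept
        self.sal = sal     # None models the absent optional attribute
        dept.employs.append(self)

    def change_sal(self, new_sal):
        """Create or raise the salary; refuse to lower an existing one."""
        if self.sal is None:
            self.sal = new_sal
        elif self.sal > new_sal:
            return False
        else:
            self.sal = new_sal
        return True


def fill_missing_salaries(emps):
    """Give each employee without a salary the minimal salary in the department."""
    for e in [x for x in emps if x.sal is None]:
        known = [c.sal for c in e.dept.employs if c.sal is not None]
        if known:  # guard added here: min of an empty collection is undefined
            e.change_sal(min(known))
```

In the SBQL version the selection of employees, the navigation to colleagues and the aggregate are all ordinary queries used as a loop range and as a method argument.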
Current functionality of ODRA • Object model: complex objects, collections, associations, classes, inheritance, polymorphism, types and schemata • ODRA IDE • SBQL queries: many algebraic and all non-algebraic operators, transitive closures, function, procedure and method calls • Typing system: semi-strong static type checking, dynamic type checks • Imperative (updating) statements and control statements • Procedures, functions with parameters, recursive • Virtual Updatable Views • Transactions • Multiple-client/multiple server architecture • Accessing Java Libraries • Accessing Web Services • Wrapper to Relational DB • XML Importer and Exporter • RDF Wrapper • Web Services Front End • ODRA Web API (JSP + SBQL) • ODRA JOBC • ODRA Indexing • ODRA Access Control
Interoperability with RDBMS • The ORM problem is not (cannot be?) properly solved on the ground of current object-oriented technologies, including Java. • If the mapping between the models is complex, then: • Performance can be unacceptable for very large databases (the SQL query optimizer has no chance to work). • Updating leads to non-trivial view updating problems. • Practically, only limited mappings are acceptable • The problem is much easier on the ground of SBA, due to virtual updatable views: • Because SBA is neutral to data models, it is possible to see a relational database as a primitive object database queried by SBQL • Then, SBQL virtual updatable views make it possible to map the relational database to an object database with full algorithmic power • This technology is implemented in ODRA.
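The general idea of an updatable view over relational data can be hinted at with Python's standard `sqlite3` module. The sketch below is not the ODRA wrapper and does not reproduce SBQL view semantics; the table `emp`, the class `EmpView` and the function `emps_where_lname` are hypothetical. It only shows a virtual object whose reads and updates are both translated into SQL on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, lname TEXT, sal REAL)")
conn.executemany("INSERT INTO emp (id, lname, sal) VALUES (?, ?, ?)",
                 [(1, "Doe", 1000.0), (2, "Kim", 2000.0)])


class EmpView:
    """Virtual object standing for one row of the emp table."""

    def __init__(self, connection, emp_id):
        self._conn = connection
        self._id = emp_id

    @property
    def sal(self):
        # Reading the attribute is mapped to a SELECT on the base table.
        row = self._conn.execute(
            "SELECT sal FROM emp WHERE id = ?", (self._id,)).fetchone()
        return row[0]

    @sal.setter
    def sal(self, value):
        # Updating the virtual object is mapped to an UPDATE on the base table.
        self._conn.execute(
            "UPDATE emp SET sal = ? WHERE id = ?", (value, self._id))


def emps_where_lname(connection, lname):
    """Selection pushed down to SQL, so the SQL optimizer can still work."""
    rows = connection.execute(
        "SELECT id FROM emp WHERE lname = ?", (lname,)).fetchall()
    return [EmpView(connection, r[0]) for r in rows]


doe = emps_where_lname(conn, "Doe")[0]
doe.sal = 1500.0
```

Pushing the selection into the SQL statement, rather than materializing all rows as objects first, corresponds to the performance concern raised above about very large databases.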
Interoperability with popular programming languages: Java, C++, Ruby, etc. • A difficult problem with a lot of (partial) solutions: • Bindings in the style of embedded SQL or ODMG • An interface from a PL to SBQL in the style of ODBC/JDBC • A generic gateway from SBQL to libraries written in other languages • Generic middleware based e.g. on CORBA or Web Services • Generic middleware based on a virtual repository • The MDA case: CIM => PIM => PSM => code • If the transformations between the models are to be done automatically, the problem is difficult • Manual transformations? • Using native syntax (Java, Ruby, etc.) to query some external resources (e.g. OO databases) – a very challenging problem. • No general solution.
Conclusions • To make a high quality standard for object-oriented databases, the specification of semantics is a must, • to avoid the fate of SQL-99 and ODMG standards • SBA offers a unique method of query language construction and semantic specification. • SBA is a holistic database theory; it doesn’t give up any (even the most advanced) feature of current practical OO database QL/PL. • Michi Henning, ZeroC: „No standard should be approved without a reference implementation. … • No one is brilliant enough to look at a specification and be certain that it does not contain hidden flaws without actually implementing it.” • SBA has been implemented more than 10 times, for different systems and purposes. • PJIIT can support OMG with a reference implementation of a new object database standard.