290 likes | 467 Views
ORMs and Hibernate. Glenn Paulley, Director, Engineering http://iablog.sybase.com/paulley. What is object-relational mapping?. Object-oriented programming technologies are typically used to implement business logic Relational databases are used for persistent data storage
E N D
ORMs and Hibernate Glenn Paulley, Director, Engineering http://iablog.sybase.com/paulley
What is object-relational mapping? • Object-oriented programming technologies are typically used to implement business logic • Relational databases are used for persistent data storage • Impedance mismatch between the two paradigms: objects vs. relations • Estimated that 30-40% of a JDBC application involves coercing data from tuples to object instances and back again • ORM toolkits are designed to address this impedance mismatch • 61 different ORM toolkits are listed in Wikipedia for Java, C++, Delphi, Ruby, .NET, PHP, Python, Perl
What is object-relational mapping? • To exploit object behaviour fully, data-access from within an object-oriented programming language should offer: • Separation of concerns • Information hiding • Inheritance • Change detection • Uniqueness capability • Database independence
Change detection • ORM toolkits require mechanisms to track changes to objects made by the application • When a transaction is complete, write the changes to the database within an atomic transaction • Need appropriate guarantees to prevent lost updates
Uniqueness • Mappings are usually a correspondence between a row in a normalized table and a class instance • Specified using metadata • For example, a row of the Employee table will correspond to an instance of the Employee object within the application • Mappings are often not isomorphic • Sophisticated ORMs such as Hibernate and LINQ permit object models that differ substantially from the underlying relational store • Object-oriented language features offer greater semantic flexibility in application design than 1NF values from a relational database • Need to establish a correspondence between an in-memory object and a database row • Must be independent of how the object was acquired: a database query, or navigating a reference to another object • Predicated on the existence of primary keys in the database
Database independence • Many ORM toolkits attempt to offer database independence, so that applications can be ported from one DBMS to another • Create common APIs and models to interact with a variety of DBMS platforms • Useful with mobilized applications where the consolidated database is one DBMS, and local databases are different
Why are ORMs useful? • Eliminates tedious, repetitive code that instantiates object instances from tuples using a SELECT statement and a CURSOR • Insulates, to some extent, the application developer from vendor-specific SQL extensions • Permits the application developer to exploit object-orientation and model and manipulate the application view differently from the relational model • Data manipulation can be done at the object level, rather than (only) at a SQL statement level
Challenges of ORMs • ORM toolkits introduce an additional level of complexity to the application • Example: Java Hibernate 3.2.6 is • 266 packages, 1938 classes, 18,680 functions, over 118K LOC • Can be difficult to debug, perform performance analysis • Most frameworks suffer from a lack of appropriate tools • Performance analysis is problematic because the application’s behaviour is not tied directly to specific interactions with the database • Complex mappings may cause very complex SQL queries to be generated • Can be difficult for the application developer to understand what caused their construction
Complex SQL (LINQ generated) SELECT [Project9].[ContactID] AS [ContactID],[Project9].[C1] AS [C1],[Project9].[C2] AS [C2],[Project9].[ContactID1] AS [ContactID1],[Project9].[SalesOrderID] AS [SalesOrderID], [Project9].[TotalDue] AS [TotalDue] FROM ( SELECT [Distinct1].[ContactID] AS [ContactID], 1 AS [C1], [Project8].[ContactID] AS [ContactID1], [Project8].[SalesOrderID] AS [SalesOrderID], [Project8].[TotalDue] AS [TotalDue], [Project8].[C1] AS [C2] FROM (SELECT DISTINCT [Extent1].[ContactID] AS [ContactID] FROM [DBA].[Contact] AS [Extent1] INNER JOIN [DBA].[SalesOrderHeader] AS [Extent2] ON EXISTS (SELECT cast(1 as bit) AS [C1] FROM ( SELECT cast(1 as bit) AS X ) AS [SingleRowTable1] LEFT OUTER JOIN (SELECT [Extent3].[ContactID] AS [ContactID] FROM [DBA].[Contact] AS [Extent3] WHERE [Extent2].[ContactID] = [Extent3].[ContactID] )AS [Project1] ON cast(1 as bit) = cast(1 as bit) LEFT OUTER JOIN (SELECT [Extent4].[ContactID] AS [ContactID] FROM [DBA].[Contact] AS [Extent4] WHERE [Extent2].[ContactID] = [Extent4].[ContactID] ) AS [Project2] ON cast(1 as bit) = cast(1 as bit) WHERE ([Extent1].[ContactID] = [Project1].[ContactID]) OR (([Extent1].[ContactID] IS NULL) AND ([Project2].[ContactID] IS NULL)) ) ) AS [Distinct1] LEFT OUTER JOIN (SELECT [Extent5].[ContactID] AS [ContactID], [Extent6].[SalesOrderID] AS [SalesOrderID], [Extent6].[TotalDue] AS [TotalDue], 1 AS [C1] FROM [DBA].[Contact] AS [Extent5] INNER JOIN [DBA].[SalesOrderHeader] AS [Extent6] ON EXISTS (SELECT cast(1 as bit) AS [C1] FROM ( SELECT cast(1 as bit) AS X ) AS [SingleRowTable2] LEFT OUTER JOIN (SELECT [Extent7].[ContactID] AS [ContactID] FROM [DBA].[Contact] AS [Extent7] WHERE [Extent6].[ContactID] = [Extent7].[ContactID] )AS [Project5] ON cast(1 as bit) = cast(1 as bit) LEFT OUTER JOIN (SELECT [Extent8].[ContactID] AS [ContactID] FROM [DBA].[Contact] AS [Extent8] WHERE [Extent6].[ContactID] = [Extent8].[ContactID] )AS [Project6] ON cast(1 as bit) = cast(1 as bit) WHERE ([Extent5].[ContactID] = [Project5].[ContactID]) OR (([Extent5].[ContactID] IS NULL) AND ([Project6].[ContactID] IS NULL)) ) ) AS [Project8] ON ([Project8].[ContactID] = [Distinct1].[ContactID]) OR (([Project8].[ContactID] IS NULL) AND ([Distinct1].[ContactID] IS NULL)) ) AS [Project9] ORDER BY [Project9].[ContactID] ASC, [Project9].[C2] ASC
Equivalent SQL query select Extent6.ContactID, 1 as C1, 1 as C2, Extent6.ContactID as ContactID1, Extent6.SalesOrderID as SalesOrderID, Extent6.TotalDue as TotalDue from DBA.SalesOrderHeader as Extent6 order by Extent6.ContactID asc Can your query optimizer get there?
Challenges of ORMs • Object-instance-at-a-time navigation through the object model can result in multiple, separate interactions to the database server to retrieve the rows to create the objects • Termed the N+1 Selects problem • Application developer must tradeoff prefetching applicability in various scenarios • Objects are not partially instantiated; all object attributes are required for the constructor • Transactional semantics are complex once caching is introduced • Transactional semantics often differ across DBMSs, even with the identical isolation level • Developers must (still) be aware of the potential for lost updates
Challenges of ORMs • The SQL dialect supported by ORM toolkits is typically a very restricted subset of ANSI SQL • ORM toolkits often support only limited, straightforward DDL for schema creation/modification • Not recommended for production applications
Important aspects of ORM toolkits • Mapping specification • Query language • Persistence • Class inheritance • Fetch strategies • Caching • We look and see how these are implemented in Hibernate, a popular open-source ORM toolkit for Java applications
Brief introduction to Hibernate • Open-source, LGPL Java ORM toolkit • Originally developed by Christian Bauer, Gavin King, and a worldwide team of developers • Now maintained by a team at JBoss (Redhat) led by Steve Ebersole • Ported to the .NET environment (C#), called Nhibernate • http://hibernate.org
Hibernate mapping specification CREATE TABLE "GROUPO"."SalesOrders" ( "ID" integer NOT NULL DEFAULT autoincrement ,"CustomerID" integer NOT NULL ,"OrderDate" date NOT NULL ,"FinancialCode" char(2) NULL ,"Region" char(7) NULL ,"SalesRepresentative" integer NOT NULL ,CONSTRAINT "SalesOrdersKey" PRIMARY KEY ("ID") ) ALTER TABLE "GROUPO"."SalesOrders“ ADD FOREIGN KEY "FK_SalesRepresentative_EmployeeID" ("SalesRepresentative“) REFERENCES "GROUPO"."Employees" ("EmployeeID") ALTER TABLE "GROUPO"."SalesOrders“ ADD FOREIGN KEY "FK_FinancialCode_Code" ("FinancialCode") REFERENCES "GROUPO"."FinancialCodes" ("Code“) ON DELETE SET NULL ALTER TABLE "GROUPO"."SalesOrders“ ADD FOREIGN KEY "FK_CustomerID_ID" ("CustomerID") REFERENCES "GROUPO"."Customers" ("ID“)
Hibernate mapping specification Hibernate mapping file SalesOrders.hbm.xml: <?xml version="1.0"?> <!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd"> <!-- Generated Mar 3, 2009 11:59:55 AM by Hibernate Tools 3.2.2.GA --> <hibernate-mapping> <class name="SalesOrders" table="SalesOrders"> <comment>sales orders that customers have submitted to the sporting goods company</comment> <id name="id" type="int"> <column name="ID" /> <generator class="assigned" /> </id> <many-to-one name="employees" class="Employees" fetch="select"> <column name="SalesRepresentative" not-null="true" /> </many-to-one> <many-to-one name="financialCodes" class="FinancialCodes" fetch="select"> <column name="FinancialCode" length="2" /> </many-to-one> <many-to-one name="customers" class="Customers" fetch="select"> <column name="CustomerID" not-null="true" /> </many-to-one> <property name="orderDate" type="date"> <column name="OrderDate" length="10" not-null="true" /> </property> .............[snip] </class> </hibernate-mapping>
Java class implementation // default package // Generated Mar 3, 2009 1:39:06 PM by Hibernate Tools 3.2.2.GA import java.util.Date; import java.util.HashSet; import java.util.Set; /** * SalesOrders generated by hbm2java */ public class SalesOrders implements java.io.Serializable { private int id; private Employees employees; private FinancialCodes financialCodes; private Customers customers; private Date orderDate; private String region; private Set salesOrderItemses = new HashSet(0); public SalesOrders() { } public SalesOrders(int id, Employees employees, Customers customers, Date orderDate) { this.id = id; this.employees = employees; this.customers = customers; this.orderDate = orderDate; }
HQL: Hibernate Query Language • Subset of ANSI query specification with support for • DISTINCT • GROUP BY, simple aggregation • INNER, LEFT- and RIGHT-OUTER JOIN • Quantified subqueries • Supports dot-notation for many-to-one, one-to-one associations, for example: select s.id, s.orderDate, s.region, s.customers.givenName, s.customers.surname from SalesOrders s where s.orderDate between '2001-03-16' and '2001-03-26' and s.region <> 'Central' order by s.orderDate
HQL: Hibernate Query Language • Hibernate’s HQL supports SELECT, inner and outer JOIN, WHERE, HAVING, simple GROUP BY, UNION, ORDER BY, self-joins with different correlation names • HQL does not support recursion, common table expressions, window functions, derived tables, other set operators, table functions, array or structured types, APPLY/OUTER APPLY, CROSS JOIN, GROUP BY CUBE/ROLLUP/GROUPING SETS, FULL OUTER JOIN select distinct c.givenName from Customers c left join c.salesOrderses where c.city in (select c2.city from Customers c2 where c2.companyName like '%Power%')
HQL: Hibernate Query Language • HQL is augmented by “dialects” that implement specific methods to modify the SQL generated by Hibernate before execution on the server • HQL has optional support for (implemented by a specific dialect): • UNION ALL (for entity-type hierarchies) • LIMIT (SELECT TOP N), OFFSET • IDENTITY, GUID data types • Syntax to declare an updateable cursor and locking mode • Case-insensitive string comparisons
Object persistence in Hibernate • Saving objects • Once an object is created or modified, it must be saved explicitly and then the transaction must be committed: session.save(<object name>); tx.commit(); • Loading objects • The Hibernate session interface offers several load() methods for loading objects into memory from database tuples: public Object load(Class theClass, Serializable id) public Object load(String entityname, Serializable id) public Object load(Object object, Serializable id) • Other load() methods permit specification of a lock mode, ie Select for Update
Object persistence in Hibernate • A refresh() method is implemented to reload objects from the database • Useful for when attributes are modified by database triggers upon INSERT or UPDATE • Highly error-prone • Updating objects • Hibernate manages changes to persistent objects transparently and automatically • If an attribute is altered, the appropriate Hibernate session will queue the change for writing to the database using SQL • One can force changes to be written at a certain point using the flush() method, controlled by isDirty() and setFlushMode()
Class inheritance • Hibernate offers a variety of built-in techniques to handle different normalizations of entity-type hierarchies: • Single table with discriminator value • Multiple tables fully normalized into BCNF • A hybrid model consisting of a mixture of the two • If represented as different objects in a mapping, an ETH requires careful construction and different equals() and hashcode() implementations
FETCH strategies • A major pain point is the N+1 SELECTs problem • Navigation through the object model iteratively causes additional rows to be retrieved from the database, using independent SQL requests • “client-side” join; performance tends to be extremely poor due to the additional latency • Adaptive, client-side prefetching and SQL rewriting may be of benefit in these scenarios: see Ivan Bowman’s PhD thesis on Scalpel • Alternatively, in the mapping one may specify the selection method: either “lazy” or “eager” • One must tradeoff this method will global application behaviour, or override it on a case-by-case basis using HQL-specific syntax
Concurrency control • Hibernate relies on the database’s locking scheme for concurrency control • ANSI isolation levels 0-3 are supported directly; 1 or 2 is recommended (READ COMMITTED and REPEATABLE READ) • DBMS that support snapshot isolation require tweaks to their Hibernate dialect implementation • Lock mode for individual HQL statements or instantiations of objects can be specified directly, ie Customer c = (Customer) session.get(Customer.class, 101, LockMode.Upgrade); • Hibernate has builtin support for optimistic concurrency control • Can use either version numbers (added to the schema of the table), or base change control on all of the values in the row
Caching and concurrency control • Like many runtime environments, Hibernate supports built-in caching controls to speed-up database interaction • Two levels of caching are supported: • Level 1: Persistence context cache • Lifespan is a transaction or a conversation without sharing. • Guarantees scope of the object and object identity. Mandatory. • Level 2: Pluggable, scope is process or cluster (shared) • Configurable on a class-by-class basis • Selectable concurrency control strategies: Transactional, Read-write, non-strict read-write, read-only • EHCache, OpenSymphony, SwarmCache, JBoss Cache
Research opportunities • Robust, industrial-strength debugging frameworks • Identify how a particular database interaction was caused by what specific application program behaviour • Analysis of concurrency control behaviour to determine application correctness, for example with respect to lost updates • Identifying common classes of SQL queries for specific sets of optimizations • Place in the dialect layer, in the DBMS itself, or in a separate tool that can support the optimization of multiple inter-related statements • Mechanisms to support additional types of mappings and more complex schemas • Support for recursion
Questions? • Watch my blog for continuing articles on supporting ORM toolkits with SQL Anywhere http://iablog.sybase.com/paulley References: • Christian Bauer and Gavin King (November 2007). Java Persistence with Hibernate (revised edition of Hibernate in Action). Manning Publications, New York, New York. ISBN 1-932394-88-5. Seventh printing. • Dave Minter and Jeff Linwood (2005) Pro Hibernate 3. Apress Books, Berkeley, CA. • ACM Queue 6(3), May/June 2008, “Bridging the Object-Relational Divide”.