Persistence MSO 08/09, Chapter 11, WP
Persistence • to persist = to last, to endure • In a business application you often need your data to last even if you sometimes have to shut down the application. • Simple: save the data; when the application is up again, it reloads the data. • But this is both critical and challenging!
Challenges • Reliability: my data should really be saved • Data integrity (later) • Ability to roll back • Performance: I want to persist terabytes of data, with fast query and update • Concurrency: I have concurrent clients accessing the data
Objectives of chapter 11, learning: • several object-persistence formats • about mapping objects to your persistence formats • optimizing relational databases (Slide labels: Implementation, Design)
How to 'persist' my data? • Save it in files. • Simple • But a business app often requires more: • Querying specific parts of the data • Fast queries • Backup and rollback • Access control • Save it in a database, managed by a DBMS.
Typical Architecture (of business app) • Presentation / User Interface • Application Logic / PD Layer (in our setup this layer is in OO) • Persistence / DAM Layer • Database • Design issue: which persistence approach should we use?
Recommended architecture in Dennis et al • The DAM takes care of the interaction with the DBMS (e.g. to save and load). • It relieves PD classes from having to implement save and load themselves. • It makes the PD independent of the underlying DBMS. • Diagram: the Problem Domain (PD) layer holds Person (Name, Age); the Data Access Management (DAM) layer holds PersonDAM (save, delete, load), which talks to the DBMS.
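The PD/DAM split on this slide can be sketched in Java roughly as follows. This is a minimal sketch, not the book's code: the interface shape, the class InMemoryPersonDAM, and the use of Name as the lookup key are assumptions made for illustration, and a map stands in for the real DBMS.

```java
import java.util.HashMap;
import java.util.Map;

// PD class: knows nothing about how it is stored.
class Person {
    final String name;
    final int age;
    Person(String name, int age) { this.name = name; this.age = age; }
}

// DAM interface: the only place that talks to the storage back end.
// Using the name as key is an assumption made for this sketch.
interface PersonDAM {
    void save(Person p);
    Person load(String name);   // returns null if no such person is stored
    void delete(String name);
}

// Hypothetical stand-in back end: a map instead of a real DBMS.
class InMemoryPersonDAM implements PersonDAM {
    private final Map<String, Person> rows = new HashMap<>();
    public void save(Person p) { rows.put(p.name, p); }
    public Person load(String name) { return rows.get(name); }
    public void delete(String name) { rows.remove(name); }
}

class DamDemo {
    public static void main(String[] args) {
        PersonDAM dam = new InMemoryPersonDAM();
        dam.save(new Person("Octo", 50));
        System.out.println(dam.load("Octo").age); // prints 50
        dam.delete("Octo");
        System.out.println(dam.load("Octo"));     // prints null
    }
}
```

Because the PD code only sees the PersonDAM interface, swapping the map for a real database should only require a different implementation class.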
Database ... traditionally is relational • This means that you store data in the form of relations, aka tables.
Basic elements of a table (figure labels) • primary key • foreign key • attribute • row • column
Referential Integrity • Foreign and primary keys should be consistent: every foreign key value must refer to an existing primary key value in the referenced table.
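As an illustration of what "consistent" means here, the following sketch lists foreign key values that point at no primary key. The class and method names and the sample keys are hypothetical; a real RDBMS enforces this constraint itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReferentialIntegrity {
    // Returns the foreign key values that have no matching primary key.
    // A null foreign key is treated as "no reference" and is skipped.
    static List<String> danglingKeys(Set<String> primaryKeys, List<String> foreignKeys) {
        List<String> dangling = new ArrayList<>();
        for (String fk : foreignKeys) {
            if (fk != null && !primaryKeys.contains(fk)) {
                dangling.add(fk);
            }
        }
        return dangling;
    }

    public static void main(String[] args) {
        Set<String> customerIds = new HashSet<>(Arrays.asList("C1", "C2"));
        List<String> orderCustomerRefs = Arrays.asList("C1", "C3", "C2");
        System.out.println(danglingKeys(customerIds, orderCustomerRefs)); // prints [C3]
    }
}
```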
RDBMS • Relational Database Management System: refers to an implementation of the relational database concept. • They come with strong features: • Powerful query language (SQL) • Access control • Referential integrity • High performance • ACID (Atomicity, Consistency, Isolation, Durability) transactions • Proven technology • ...
So, how to persist objects in an RDB? • An RDB has no notion of inheritance. • Still, we can map a class diagram to an ER diagram (next slide). This will tell us how to map our class structures to tables. • But ... when we save an object with save(patient), this is not the same as inserting a row. We may have to save the object's substructures and its induced relations as well. • The same goes for load and delete.
Mapping to ER diagram • Class diagram: Person (Name : String, Age : int); Patient (ID : String, Insurance : String), a subclass of Person; Symptom (Code : String, Name : String); association Patient 0..* suffers 0..* Symptom. • ER diagram: Person (Name, Age) 1 SubObjectOf 0..1 Patient (ID, Insurance); Patient 0..* Suffers 0..* Symptom (Code, Name). • So, how do you implement save(Patient p) and Patient load()?
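To make the question concrete, here is a minimal sketch of what save(Patient p) has to do under the ER mapping above: one object ends up as rows in several tables. Tables are modelled as in-memory lists, and the key choices (Name as Person key, ID as Patient key, a PersonName foreign key column) are assumptions made for illustration. The sketch only inserts; updates and duplicates are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class Symptom {
    final String code, name;
    Symptom(String code, String name) { this.code = code; this.name = name; }
}

class Person {
    final String name;
    final int age;
    Person(String name, int age) { this.name = name; this.age = age; }
}

class Patient extends Person {
    final String id, insurance;
    final List<Symptom> suffers = new ArrayList<>();
    Patient(String name, int age, String id, String insurance) {
        super(name, age);
        this.id = id;
        this.insurance = insurance;
    }
}

// Hypothetical DAM; each list stands in for one table of the ER mapping.
class PatientDAM {
    final List<String[]> personTab = new ArrayList<>();   // (Name, Age)
    final List<String[]> patientTab = new ArrayList<>();  // (ID, Insurance, PersonName)
    final List<String[]> symptomTab = new ArrayList<>();  // (Code, Name)
    final List<String[]> suffersTab = new ArrayList<>();  // (PatientID, SymptomCode)

    void save(Patient p) {
        // The inherited part goes to the Person table, the rest to Patient.
        personTab.add(new String[] { p.name, String.valueOf(p.age) });
        patientTab.add(new String[] { p.id, p.insurance, p.name });
        // The many-to-many association becomes rows in the Suffers table.
        for (Symptom s : p.suffers) {
            if (!hasSymptom(s.code)) {
                symptomTab.add(new String[] { s.code, s.name });
            }
            suffersTab.add(new String[] { p.id, s.code });
        }
    }

    private boolean hasSymptom(String code) {
        for (String[] row : symptomTab) {
            if (row[0].equals(code)) return true;
        }
        return false;
    }
}

class SavePatientDemo {
    public static void main(String[] args) {
        Patient p = new Patient("Patrick", 5, "P-1", "none");
        p.suffers.add(new Symptom("S1", "Panic disorder"));
        p.suffers.add(new Symptom("S2", "Cough"));
        PatientDAM dam = new PatientDAM();
        dam.save(p);
        System.out.println(dam.suffersTab.size()); // prints 2
    }
}
```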
Various DB persistence alternatives • Persisting in an RDBMS: you have to build the DAM yourself. • Persisting in an ORDBMS: support for inheritance, so a thinner DAM. • Persisting in an OODBMS: in principle no DAM needed. • Persisting in an RDBMS + ORM (Object-Relational Mapping): gives a virtual OODBMS.
ORDB(MS) • It's a classical RDB extended with some OO concepts: • User Defined Type (comparable to a class) • REF: allows object navigation, e.g. x.attr1.attr • SET (of REFs): allows a more direct representation of a one-to-many relation • Inheritance • SQL 1999 comes with all these extensions. • Many vendors of traditional RDBMS already support (parts of) SQL 99 (DB2, Oracle, PostgreSQL, Microsoft).
ORDB, example with SQL 99
CREATE TYPE Person AS OBJECT ( Name CHAR(20), Age INT )
CREATE TYPE Symptom AS OBJECT ( Code CHAR(10), Name CHAR(20) )
CREATE TYPE Patient UNDER Person ( ID CHAR(12), Insurance ..., Suffers SET (REF Symptom) )
ORDB, example with SQL 99
CREATE TABLE Persons OF TYPE Person
CREATE TABLE Symptoms OF TYPE Symptom
CREATE TABLE Patients OF TYPE Patient
INSERT INTO Persons VALUES (Person('Octo', 50))
...
INSERT INTO Patients VALUES (Patient('Spons Bob', 3, SET(1465)))
SELECT p.Name FROM Persons p
So, mapping OO-PD to ORDBMS ... • In Dennis et al: Rule 1 .. 9b p340 • Assuming ORDBMS that does not support inheritance • Read it yourself • I'll give you a simplified set of 'rules' • We'll assume an ORDBMS that supports (single) inheritance • After all, SQL 99 already includes single inheritance
Simplified mapping rules • Map your class diagram to an ER diagram (as before). But there is no need to factor out inheritance; this is already supported in our ORDB. • Map an entity to a table over a User Defined Type (UDT). A UDT can express inheritance. • You have more options when implementing a relation: • via REF / OID • via SET of REFs
Converting to ER diagram • Class diagram: Person (Name : String, Age : int); Patient (ID : String, Insurance : String), a subclass of Person; Symptom (Code : String, Name : String); association Patient 0..* suffers 0..* Symptom. • ER diagram: Person (Name : String, Age : int), with Patient (ID, Insurance) kept as its subtype; Patient 0..* Suffers 0..* Symptom (Code, Name).
From ER diagram to ORDB tables • ER diagram: Person (Name : String, Age : int); Patient (ID, Insurance); Patient 0..* Suffers 0..* Symptom (Code, Name).
CREATE TYPE Person AS OBJECT ( Name CHAR(20), Age INT )
CREATE TYPE Symptom AS OBJECT ( Code CHAR(10), Name CHAR(20) )
CREATE TYPE Patient UNDER Person ( ID CHAR(12), Insurance ..., Suffers SET (REF Symptom) )
But keep in mind you still have to build your DAM!
OODB • db4o, for Java and .NET • Available under the GPL and under a commercial license • Stable, and just 0.5 MB!
ObjectContainer db = Db4o.openFile(Util.DB4OFILENAME);
try {
    ... // do something with db4o
} finally {
    db.close();
}
From the db4o "Formula One Tutorial"
Examples of using db4o
Person p = new Person("Octo", 50);
Person q = new Patient("Spons Bob", 3);
Patient r = new Patient("Patrick", 5);   // declared as Patient so that r.suffers is accessible
db.set(p); db.set(q); db.set(r);
r.suffers.add(PanicDisorder);
// query by class
ObjectSet result1 = db.get(Patient.class);
// query by example
ObjectSet result2 = db.get(new Person(null, 5));
// native query
List<Person> result3 = db.query(new Predicate<Person>() {
    public boolean match(Person p) { return p.age >= 5; }
});
So ... • Notice that when using an OODB we don't need a DAM; we just call: db.set(object1); results = db.get(prototype);
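To see why no DAM is needed with this style of API, here is a toy in-memory store with the same flavour of calls (store any object, fetch by class, fetch by predicate). It is not db4o and does not reproduce db4o's API or behaviour; ToyObjectStore and its methods are hypothetical, and nothing is written to disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Person {
    final String name;
    final int age;
    Person(String name, int age) { this.name = name; this.age = age; }
}

class Patient extends Person {
    Patient(String name, int age) { super(name, age); }
}

// Hypothetical toy store: objects are kept as they are, with no mapping code.
class ToyObjectStore {
    private final List<Object> objects = new ArrayList<>();

    // Store an object once.
    void set(Object o) {
        if (!objects.contains(o)) objects.add(o);
    }

    // All stored instances of the given class, including subclass instances.
    <T> List<T> get(Class<T> type) {
        return query(type, x -> true);
    }

    // All stored instances of the given class that satisfy the predicate.
    <T> List<T> query(Class<T> type, Predicate<T> match) {
        List<T> result = new ArrayList<>();
        for (Object o : objects) {
            if (type.isInstance(o)) {
                T t = type.cast(o);
                if (match.test(t)) result.add(t);
            }
        }
        return result;
    }
}

class ToyStoreDemo {
    public static void main(String[] args) {
        ToyObjectStore db = new ToyObjectStore();
        db.set(new Person("Octo", 50));
        db.set(new Patient("Spons Bob", 3));
        db.set(new Patient("Patrick", 5));
        System.out.println(db.get(Patient.class).size());                   // prints 2
        System.out.println(db.query(Person.class, x -> x.age >= 5).size()); // prints 2
    }
}
```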
Hibernate • Is an ORM (Object-Relational Mapping) solution: • Allows you to use an ordinary RDB to persist objects • Automates the back and forth transformation (between objects and tables) • No DAM is needed either: it gives you the illusion of using an OODB. • But you have to tell Hibernate how your classes are mapped to different tables. • Very popular choice at the moment.
Describe the mapping to tables
<?xml version="1.0"?>
<!DOCTYPE ... "... hibernate-mapping-3.0.dtd">
<hibernate-mapping>
  <class name="Person" table="PersonTab">
    ...
    <property name="Name" column="Name"/>
    <property name="Age" column="Age"/>
    <joined-subclass name="Patient" table="PatientTab">
      ...
      <set name="suffers" ...>
        <key column="patientId"/>
        <one-to-many class="Symptom"/>
      </set>
    </joined-subclass>
  </class>
</hibernate-mapping>
Then you can directly save and load
Session session = HibernateUtil.getSessionFactory().getCurrentSession();
session.beginTransaction();
Patient p = ...
session.save(p);
session.getTransaction().commit();
Patient q = (Patient) session.load(Patient.class, 101);
List result = session.createQuery("from Patient as r where r.insurance != null").list();
If MI / SI mismatch is an issue ... • Problem: my application uses an OO language with multiple inheritance (MI), but my persistence technology only supports single inheritance (SI). • Map MI to SI: • by representing inheritance as association, • or by flattening. • Your DAM will have to implement the back and forth mapping between the MI PD and the SI OODB. But perhaps we should avoid this kind of mismatch, unless we can find a tool that can do the mapping automatically.
Example of approach 1 (inheritance as association) • PD classes: Person (Name, Age); Chat (Nickname); Patient (ID, Insurance); ChatBot; AutomatedPrg. • OODB classes: PersonODB (Name, Age); ChatODB (Nickname); PatientODB (ID, Insurance); ChatBotODB; AutomatedPrgODB. • Associations shown: two 1..1 subObjOf associations, under an XOR constraint.
Example of approach 2 (flattening) • PD classes: Person (Name, Age); Chat (Nickname); Patient (ID, Insurance); ChatBot; AutomatedPrg. • OODB classes: PersonODB (Name, Age); PatientODB (ID, Insurance, Nickname); ChatBotODB (Nickname); AutomatedPrgODB.
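The two approaches can be sketched in Java for the Patient case, reading the diagrams as "Patient inherits from both Person and Chat" (an interpretation of the flattened slides, not something they state outright). Java itself has no multiple inheritance of classes, so PatientPD below is a hypothetical stand-in for the MI object, and MiToSiMapper plays the role of the DAM's back and forth mapping.

```java
// Stand-in for the MI problem-domain object: in an MI language, Patient would
// inherit Name and Age from Person and Nickname from Chat.
class PatientPD {
    final String name;
    final int age;
    final String nickname;
    final String id;
    final String insurance;
    PatientPD(String name, int age, String nickname, String id, String insurance) {
        this.name = name; this.age = age; this.nickname = nickname;
        this.id = id; this.insurance = insurance;
    }
}

// SI persistence classes.
class PersonODB { String name; int age; }
class ChatODB { String nickname; }

// Approach 1: keep one parent by inheritance, turn the other into an association.
class PatientODB extends PersonODB { String id; String insurance; ChatODB chat; }

// Approach 2: flatten, i.e. copy the second parent's attributes into the subclass.
class PatientFlatODB extends PersonODB { String id; String insurance; String nickname; }

class MiToSiMapper {
    static PatientODB toAssociation(PatientPD p) {
        PatientODB o = new PatientODB();
        o.name = p.name; o.age = p.age; o.id = p.id; o.insurance = p.insurance;
        o.chat = new ChatODB();
        o.chat.nickname = p.nickname;
        return o;
    }

    static PatientFlatODB toFlat(PatientPD p) {
        PatientFlatODB o = new PatientFlatODB();
        o.name = p.name; o.age = p.age; o.id = p.id; o.insurance = p.insurance;
        o.nickname = p.nickname;
        return o;
    }

    // The way back, from the association form to the PD object.
    static PatientPD fromAssociation(PatientODB o) {
        return new PatientPD(o.name, o.age, o.chat.nickname, o.id, o.insurance);
    }

    public static void main(String[] args) {
        PatientPD p = new PatientPD("Patrick", 5, "star", "P-1", "none");
        System.out.println(toAssociation(p).chat.nickname); // prints star
        System.out.println(toFlat(p).nickname);             // prints star
    }
}
```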
Optimizing your RDB • Problems pointed out in the example table: duplicated information, wasted space
Normalization • Decomposing your initial DB schemes into a new set of schemes that: • is optimal, e.g. no unnecessary duplication and no wasted space • still lets us reconstruct the original schemes • You may want to recall the DB course. • Here we'll discuss: • 1st Normal Form (1NF) • 2NF • 3NF
1NF • A table T is in 1NF if: • T has no attribute a2 that is actually a duplication of another attribute a1. • T has no empty cell. • Example on the slide: table Order, which is not in 1NF.
1NF transformation • The original table Order is split into a new table Order and a table ProductOrder.
2NF • A table T is in 2NF if: • it is in 1NF • no nonkey attribute in T depends on only a part of T's primary key. • Example: in the old table ProductOrder, Desc depends only on PrdNr (PrdNr → Desc), so the table is split into a new ProductOrder and a table Product.
3NF transformation • A table T is in 3NF if: • it is in 2NF • every nonkey attribute in T depends directly on T's primary key, not via another nonkey attribute. • Example: the old table Order is split into table Order and table Customer.
Transformation • Note that ideally we do not normalize/transform existing tables. Ideally we apply normalization at the design phase, on our DB schemes, which form the structural design of an RDB. • Before: ProductOrder : (OrdNr, PrdNr, Desc), where (OrdNr, PrdNr) is the primary key and PrdNr → Desc • After: ProductOrder : (OrdNr, PrdNr) and Product : (PrdNr, Desc)
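The ProductOrder decomposition above can be illustrated on in-memory rows. This is only a sketch under assumed names (OldRow, Normalizer) and made-up sample data: it splits (OrdNr, PrdNr, Desc) rows into the two new schemes and shows that a join reconstructs the original rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// One row of the old scheme ProductOrder : (OrdNr, PrdNr, Desc).
class OldRow {
    final int ordNr, prdNr;
    final String desc;
    OldRow(int ordNr, int prdNr, String desc) {
        this.ordNr = ordNr; this.prdNr = prdNr; this.desc = desc;
    }
}

class Normalizer {
    // Product : (PrdNr, Desc). Each description is now stored once per product.
    static Map<Integer, String> productTable(List<OldRow> old) {
        Map<Integer, String> product = new TreeMap<>();
        for (OldRow r : old) product.put(r.prdNr, r.desc);
        return product;
    }

    // ProductOrder : (OrdNr, PrdNr), as {ordNr, prdNr} pairs.
    static List<int[]> productOrderTable(List<OldRow> old) {
        List<int[]> rows = new ArrayList<>();
        for (OldRow r : old) rows.add(new int[] { r.ordNr, r.prdNr });
        return rows;
    }

    // Joining the two new tables on PrdNr gives the original rows back.
    static List<OldRow> join(List<int[]> productOrder, Map<Integer, String> product) {
        List<OldRow> rows = new ArrayList<>();
        for (int[] po : productOrder) {
            rows.add(new OldRow(po[0], po[1], product.get(po[1])));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<OldRow> old = Arrays.asList(
            new OldRow(1, 10, "Wine"), new OldRow(2, 10, "Wine"), new OldRow(2, 20, "Cheese"));
        Map<Integer, String> product = productTable(old);
        System.out.println(product.size()); // prints 2
        System.out.println(join(productOrderTable(old), product).size()); // prints 3
    }
}
```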
3NF • The old table Customer is split into table Customer and table State, because State → Tax.
So ... • After 3NF, space usage is optimized: • No wasted empty cells • No unnecessarily duplicated information • But now information is scattered over multiple tables • We rely on joins, which cost time! • E.g. to know which customers order wine we have to query on: Customer join Order join ProductOrder join Product
Optimizing data access speed • Clustering (see book) • Indexing (below) • Denormalization • Example on the slide: a Customer-State index on table Customer
Denormalization • Putting back some redundancy to make certain things faster. • Remember we did this 2NF normalization: the old table ProductOrder was split into a new ProductOrder and a table Product. Denormalizing goes back in the other direction.
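The idea behind an index such as the Customer-State index can be sketched as follows: build a lookup structure once, then answer "which customers are in state X" without scanning the whole table. The Customer fields, the class names, and the hash map are assumptions for illustration; real DBMS indexes are typically tree-based and maintained by the DBMS itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Customer {
    final int id;
    final String state;
    Customer(int id, String state) { this.id = id; this.state = state; }
}

class StateIndex {
    // Build the index once: state -> customers in that state.
    static Map<String, List<Customer>> build(List<Customer> table) {
        Map<String, List<Customer>> index = new HashMap<>();
        for (Customer c : table) {
            List<Customer> bucket = index.get(c.state);
            if (bucket == null) {
                bucket = new ArrayList<>();
                index.put(c.state, bucket);
            }
            bucket.add(c);
        }
        return index;
    }

    // Without an index: inspect every row of the table.
    static List<Customer> scan(List<Customer> table, String state) {
        List<Customer> result = new ArrayList<>();
        for (Customer c : table) {
            if (c.state.equals(state)) result.add(c);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> table = Arrays.asList(
            new Customer(1, "UT"), new Customer(2, "NY"), new Customer(3, "UT"));
        Map<String, List<Customer>> index = build(table);
        System.out.println(index.get("UT").size());  // prints 2
        System.out.println(scan(table, "UT").size()); // prints 2
    }
}
```

The price is the usual one: the index takes extra space and has to be kept up to date on every insert, update, and delete.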
Estimating storage size • Scheme of ProductOrder: OrdNr : INT(10), PrdNr : INT(10), Desc : VARCHAR(160) • Average row size: 100 bytes • Number of rows now: 10^6 • Growth: 100 new orders per day • Size now: 100 MB • After 3 years: 110 MB • This is a crude estimation.
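The arithmetic behind these figures can be written out as a small calculation. It assumes one new row per order, 365 days per year, and 1 MB = 10^6 bytes; these assumptions are not stated on the slide, and the method name is hypothetical.

```java
class StorageEstimate {
    // Estimated table size in bytes after the given number of years.
    // Assumes one new row per order and a constant average row size.
    static long estimateBytes(long rowsNow, long avgRowBytes, long newRowsPerDay, int years) {
        long futureRows = rowsNow + newRowsPerDay * 365L * years;
        return futureRows * avgRowBytes;
    }

    public static void main(String[] args) {
        long now = estimateBytes(1_000_000L, 100L, 100L, 0);
        long later = estimateBytes(1_000_000L, 100L, 100L, 3);
        System.out.println(now / 1_000_000.0 + " MB");   // prints 100.0 MB
        System.out.println(later / 1_000_000.0 + " MB"); // prints 110.95 MB
    }
}
```

So 10^6 rows of 100 bytes give 100 MB now, and three years of growth add 109,500 rows, or roughly 11 MB, which matches the slide's figure of about 110 MB.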