OmniBase object database • Source Tracking System for Dolphin Smalltalk • David Gorišek, ESUG, Bled 2003
OmniBase • Multi-user persistent object system for Smalltalk • It is not a database in the traditional sense, i.e. there is no declarative query language like SQL or OQL • All data access and data manipulation is done in Smalltalk • Has transactions with ACID properties • Atomicity – changes to objects are made persistent only after transaction commit • Consistency – a transaction transforms the database from one consistent state to another • Isolation – one transaction is always isolated from another transaction • Durability – once a transaction commits, its updates cannot be changed by a parallel transaction
Concurrent access to objects • OmniBase maintains data consistency by using multiversion concurrency control (MVCC) • Every update creates a new object version with its own version number • Each transaction sees a snapshot of data as it was when the transaction began • Advantages • No need for setting read locks • Writers never wait for readers • Readers never wait for writers • Readers never wait for readers • Long transactions can be used • Both pessimistic and optimistic concurrency control are possible for write transactions
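A minimal sketch of the snapshot behaviour described on this slide. It uses only the explicit-transaction messages that appear later in the deck (`newTransaction`, `root`, `at:put:`, `at:`, `commit`). It assumes `db` is an already opened database, and the key 'counter' is hypothetical. The comments restate what the slide claims and are not verified output.

```smalltalk
"Sketch only: assumes db is an open OmniBase database."
| setup reader writer seenByReader seenByLater |

"Store an initial value under a hypothetical key."
setup := db newTransaction.
setup root at: 'counter' put: 1.
setup commit.

"Per the slide, the reader's snapshot is fixed when its transaction begins."
reader := db newTransaction.

"A writer commits a newer value while the reader is still open.
 No read lock is involved, so neither side waits for the other."
writer := db newTransaction.
writer root at: 'counter' put: 2.
writer commit.

"The reader is expected to keep working against its own snapshot,
 while a transaction started after the commit works against the newer version."
seenByReader := reader root at: 'counter'.
seenByLater := db newTransaction root at: 'counter'.
```

Comparing `seenByReader` with `seenByLater` in a workspace is a quick way to check the snapshot rule against a real database.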
Runs on multiple platforms • Dolphin Smalltalk – primary development environment • VisualWorks – complete port with some limitations • Squeak – ported by Cees de Groot with some help from the Squeak community • IBM VAST – complete port done by the author, but there is currently no further development • ST/X – ported by C. David Schaffer • Easily portable to any dialect with a file system supporting file locking
Mature code • First version of OmniBase was released 5 years ago for the free version of Dolphin Smalltalk 2.1 • Meant to be used as an embedded database for Dolphin applications • Used at more than 100 sites today
Easy schema migration • OmniBase maintains its own schema in the database where class definitions are versioned • Adding an inst var in the image will automatically add it in the database; old persistent instances will have it set to nil • If an inst var is missing in the image, its value will be lost upon load • If inst vars get reordered, OmniBase will order them appropriately upon load
Easy to use • To create a database use: db := OmniBase createOn: 'c:\myDB' • To open an existing database use: db := OmniBase openOn: 'c:\myDB'
Easy to use • All data access happens inside a transaction using the following 3 steps: • 1. begin transaction • 2. get, create or change persistent objects • 3. commit or rollback transaction • Transactions can be used either explicitly, or implicitly where they are associated with the active Smalltalk process/thread
Two ways of using transactions • Explicitly: txn := db newTransaction. txn root at: 'EMP001' put: Employee new. txn commit. • Implicitly: [ OmniBase root at: 'EMP001' put: Employee new ] evaluateAndCommitIn: db newTransaction
What is a ‘root’ object • Root object is the global database entry point • All objects stored in the database must be reachable from the root in order not to get garbage collected. This is also called persistence by reachability • Root object is a dictionary by default but it can be changed to any other object if there is a need to do so
Making objects persistent • [ OmniBase root at: 'misc' put: (OrderedCollection newPersistent add: 'a string'; add: Date today; add: 100 @ 200; add: Time now; add: #aSymbol; yourself) ] evaluateAndCommitIn: txn
Updating persistent objects • [ coll := OmniBase root at: 'misc'. coll add: 'another string'; markDirty ] evaluateAndCommitIn: txn
Object clustering • A cluster is a unit of persistence in OmniBase • Object cluster is a group of objects which are serialized into a series of bytes and stored together in OmniBase • An object cluster gets its own object ID (oid) • An object ID is used for interobject relationships between persistent objects in OmniBase
Example of an object cluster • Instance of a class Employee • Inst vars: name, addresses, birthDate • name: a String e.g. ‘Talk Small’ • addresses: a Collection ( an Address, an Address, … ) • birthDate: a Date • When the employee object is made persistent it will be serialized together with its inst var values into a single object cluster • When an employee is fetched from the database there will be a single data access to get all of its addresses and other inst var values
Making objects persistent • Problem: • A transaction needs to be notified when an object inside a persistent object cluster gets changed • Example: adding an address to the Employee object doesn’t change the employee object itself, only its collection object • A change like the one above should be propagated to the transaction because the whole cluster needs to be stored upon commit
Making objects persistent • Solution 1: • Add a trigger mechanism which will mark main objects as dirty when something changes • Solution 2: • Make a cluster out of every container object • Use immutability support where objects are marked as read-only upon load • When an update attempt raises an exception, mark them as dirty
Indexing objects • Indexing of objects in OmniBase is similar to the way one would index objects inside a Smalltalk image • The problem is that Smalltalk hash dictionaries don’t perform well when the amount of data is too large • Another problem is that Smalltalk hash dictionaries cannot handle parallel updates • OmniBase therefore adds another type of dictionary – a b-tree dictionary for indexing persistent objects
Indexing objects • Creating a b-tree dictionary for indexing: [ OmniBase root at: 'employee_PK_index' put: (OmniBase newBTreeDictionary: 20) ] evaluateAndCommitIn: txn • A b-tree dictionary uses a b-tree structure underneath for fast indexing of associations. A dictionary key can be any of the following: Integer, String, Date, or any other object implementing the method #asBTreeKeyOfSize:
Indexing objects • Adding objects to index dictionary: [ index := OmniBase root at: 'employee_PK_index'. index at: 'EMP001' put: Employee new ] evaluateAndCommitIn: txn • An object is automatically made a persistent cluster when added to a b-tree dictionary.
Retrieving objects using an index • Example: [ index := OmniBase root at: 'employee_PK_index'. emp := index at: 'EMP001' … ] evaluateIn: txn
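The cluster problem from the "Making objects persistent" slides also applies to an object fetched through an index: changing something inside the employee's cluster does not by itself notify the transaction. The sketch below combines messages already shown in these slides (`at:`, `markDirty`, `evaluateAndCommitIn:`). It rests on several assumptions: `db` is an open database, `addresses` is a hypothetical accessor for the inst var of that name, the accessor answers an already initialized collection, and `Address` is a hypothetical class. Whether `markDirty` is required in every configuration is not stated in the slides.

```smalltalk
"Sketch only: db, the addresses accessor and the Address class are assumptions."
[ | index emp |
  index := OmniBase root at: 'employee_PK_index'.
  emp := index at: 'EMP001'.

  "Only the inner collection changes here, not the employee object itself."
  emp addresses add: Address new.

  "Flag the employee's cluster so it is stored again on commit,
   mirroring the markDirty call on the 'Updating persistent objects' slide."
  emp markDirty ]
    evaluateAndCommitIn: db newTransaction
```

The employee is fetched from the b-tree dictionary rather than directly from the root because, per the indexing slide, adding an object to a b-tree dictionary makes it a persistent cluster of its own.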
Advanced b-tree example • B-tree dictionaries are sorted by default, so range queries are possible: index keysFrom: 'EMP00' to: 'EMP09' do: [:eachKey | … ].
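As a small illustration of the range query above, the sketch below gathers the matching keys into an ordinary in-image collection inside a read-only `evaluateIn:` block. It assumes `db` is an open database and that the index from the earlier slides exists. The slides do not say in what form keys are handed to the block or whether the range bounds are inclusive, so the sketch only collects whatever it is given.

```smalltalk
"Sketch only: assumes db is open and 'employee_PK_index' was created earlier."
| txn keys |
txn := db newTransaction.
keys := OrderedCollection new.

[ | index |
  index := OmniBase root at: 'employee_PK_index'.
  "Visit every key in the range in sorted order and remember it."
  index keysFrom: 'EMP00' to: 'EMP09' do: [:eachKey | keys add: eachKey ] ]
    evaluateIn: txn.

"keys now holds the visited keys and can be inspected in a workspace."
keys size
```

Under ordinary string ordering 'EMP001' sorts between 'EMP00' and 'EMP09', so the employee stored on the earlier slide would be expected to fall inside this range.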
Advanced b-tree examples • When doing batch processing one may want to process one object at a time, each in its own transaction: index transactionDo: [:txn2 :eachEmployee | self process: eachEmployee ].
Advanced b-tree examples • A b-tree dictionary can also act as a cursor: index goTo: 'KEYPREFIX'. [ (association := index next) isNil ] whileFalse: [ ].
Indexing to do • Add a declarative model on top of a b-tree dictionary to support more SQL-like queries • Add automatic update mechanism which will update indexes when an object attribute changes • Better support for secondary indices
A little about object locking and concurrency • As mentioned, a transaction sees a snapshot of data as it was at the beginning of the transaction • If a transaction takes a long time and needs to update an object which was changed in the meantime, it will need to restart itself to get the newest version for update • Using object locks one can ensure that no one else is allowed to change the object while it is locked
More about locking • B-tree dictionaries are multiversioned too • Each transaction sees a snapshot of a dictionary as it was at the beginning of the transaction. Even if another transaction adds 1 million objects to the dictionary in parallel, each of these two transactions will have its own view of the dictionary • To prevent two transactions from updating the same association at the same time, dictionary key locks are used
What the users want • Better indexing (and query) support • On-line backup; this is easy to implement with MVCC • DB administration tools • On-line garbage collection • Client/Server version over TCP/IP
Further resources • http://www.gorisek.com • http://swiki.cdegroot.com/omnibase • http://www.whysmalltalk.com