ObjectStore

ObjectStore Martin Wasiak

ObjectStore Overview • Object-oriented database system • Can use normal C++ code to access tuples • Easily add persistence to existing applications • Critical operation: retrieving data from a tuple must be as fast as possible

Why Object-oriented? • The needs of CAD, ECAD and some other engineering applications differ from those served by traditional database systems. • These applications store large amounts of data that are inefficient to retrieve using traditional relational database systems.

CAD & CAM Applications • Store data objects that are connected to one another forming complicated networks. • Each object represents a part that contains some attributes and is also connected to another part (object).

What to Optimize? • Most CAD apps have to traverse a list of those objects such as a list of vertices in a 3-D CAD program. • Another example: traversing a network of objects representing a circuit and carrying out computation along the way.

Bottlenecks! • Example: • SELECT p1.weight, p2.weight, p3.weight FROM Pipes p1, Pipes p2, Pipes p3 WHERE p1.left_pipe_id=p2.right_pipe_id AND p2.left_pipe_id=p3.right_pipe_id AND p1.pipe_id=5 AND p1.contents='Water' • Inefficient if we want to find the total weight of a pipeline composed of thousands of parts: written this way, every additional pipe needs another self-join.

More Bottlenecks… • Our pipeline can be thought of as a long linked list, so doing joins on it is simply inefficient. • In C++ we can write a simple loop to traverse a list or a tree-type structure with many pointers. • So instead of a join we simply dereference a pointer!
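
The pointer-chasing loop described above can be sketched as follows. This is a minimal illustration in plain C++, not ObjectStore code: the `Pipe` struct, its `next` field, and `total_weight` are hypothetical names chosen to mirror the columns of the Pipes table in the earlier query.

```cpp
#include <cassert>

// Hypothetical in-memory form of one row of the Pipes table.
struct Pipe {
    int pipe_id;
    double weight;
    Pipe* next;  // assumed to replace the left_pipe_id/right_pipe_id join
};

// Sum the weight of every pipe reachable from start by following pointers.
// Each step is a single pointer dereference rather than a join.
double total_weight(const Pipe* start) {
    double total = 0.0;
    for (const Pipe* p = start; p != nullptr; p = p->next) {
        total += p->weight;
    }
    return total;
}
```

Calling `total_weight` on the first pipe of a chain walks the whole pipeline in one pass, however many parts it contains.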

Note About Pointers • Pointers stored on disk hold actual virtual memory addresses, not logical references within the file. • This means that when an object is paged into memory, ObjectStore tries to map it at the same address it occupied the last time it was loaded.

Dereferencing Pointers • ObjectStore sets page permission to “no access” if a record is not in memory. • Client tries to access the page it has no access to. • Hardware detects an access violation and reports a memory fault to ObjectStore. • ObjectStore loads the record into memory and sets the page permission to read-only. • Client tries to dereference the record and succeeds.
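
The fault-then-load sequence above can be modelled in software. The sketch below is only a simulation: a real system relies on the memory-management hardware and an operating-system fault handler, whereas here the check is an ordinary `if`. The class `SimulatedCache` and all of its members are hypothetical names, and each "page" holds a single integer for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class Access { None, ReadOnly };

// Simulated client cache: one protection flag and one value per page.
class SimulatedCache {
public:
    explicit SimulatedCache(std::vector<int> disk)
        : disk_(std::move(disk)),
          memory_(disk_.size(), 0),
          access_(disk_.size(), Access::None) {}

    // Read a page. The first touch of a page takes a simulated fault.
    int read(std::size_t page) {
        if (access_.at(page) == Access::None) {
            handle_fault(page);
        }
        return memory_[page];  // the retried access now succeeds
    }

    int faults() const { return faults_; }

private:
    // Stand-in for the fault handler: load the record, then grant read access.
    void handle_fault(std::size_t page) {
        memory_[page] = disk_[page];
        access_[page] = Access::ReadOnly;
        ++faults_;
    }

    std::vector<int> disk_;       // contents on the server
    std::vector<int> memory_;     // contents mapped at the client
    std::vector<Access> access_;  // per-page permission
    int faults_ = 0;
};
```

Reading the same page twice faults only once, which is the point of the scheme: after the first load, a dereference costs no more than an ordinary memory access.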

More Dereferencing • What if the DB is bigger than the VM pool? • Dynamically assign address space to the DB. • What if the address of a record in the DB is already in use? • A tag table keeps track of all objects in the database. • It is used to relocate pointers.
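
Pointer relocation can be illustrated with a small sketch. The slide says only that a tag table tracks objects and is used to relocate pointers, so everything concrete here is an assumption: the `PageImage` layout, the `pointer_slots` list standing in for tag information, and the simplification that every pointer targets a region that moved by the same offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page image: raw 64-bit words, some of which are pointers.
struct PageImage {
    std::uint64_t saved_base;                // address the page was mapped at when written
    std::vector<std::uint64_t> words;        // page contents
    std::vector<std::size_t> pointer_slots;  // assumed tag info: indexes of pointer words
};

// Rewrite every tagged pointer when the page is mapped at new_base instead
// of saved_base. Non-pointer words are left untouched.
void relocate(PageImage& page, std::uint64_t new_base) {
    const std::uint64_t delta = new_base - page.saved_base;  // unsigned wraparound is intended
    for (std::size_t slot : page.pointer_slots) {
        page.words[slot] += delta;
    }
    page.saved_base = new_base;
}
```

Without the tag information there would be no way to tell a pointer word from an integer that happens to look like an address, which is why the words to patch are listed explicitly.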

Client Caching • Client side caching is used to eliminate the need to page over network and speed up performance. • Server keeps track of all objects present in client caches. • What if a client tries to modify an object that exists in another client’s cache? • Callback message is sent to the client to check whether the object is locked.
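
A toy model of the callback check might look like the sketch below. The `Server` class and its methods are hypothetical. The slide states only that a callback checks whether the object is locked; the outcome modelled here (refuse the write while another client holds a lock, otherwise drop the other cached copies) is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

using ClientId = int;
using ObjectId = int;

class Server {
public:
    // Record that client c holds object o in its cache.
    void note_cached(ClientId c, ObjectId o) { holders_[o].insert(c); }

    // Record whether client c currently has object o locked.
    void set_locked(ClientId c, ObjectId o, bool locked) {
        if (locked) {
            locked_.insert(std::make_pair(c, o));
        } else {
            locked_.erase(std::make_pair(c, o));
        }
    }

    // Simulated callback round: returns true if writer may modify o.
    bool request_write(ClientId writer, ObjectId o) {
        std::set<ClientId>& holders = holders_[o];
        for (ClientId c : holders) {
            if (c != writer && locked_.count(std::make_pair(c, o)) != 0) {
                return false;  // another client is using the object
            }
        }
        holders.clear();         // other clients give up their copies
        holders.insert(writer);  // the writer now holds the only copy
        return true;
    }

    bool is_cached(ClientId c, ObjectId o) const {
        auto it = holders_.find(o);
        return it != holders_.end() && it->second.count(c) != 0;
    }

private:
    std::map<ObjectId, std::set<ClientId>> holders_;
    std::set<std::pair<ClientId, ObjectId>> locked_;
};
```

The design choice worth noting is that the server asks before invalidating: a cached copy can survive across transactions, and is only given up when some other client actually needs to write it.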

Defining Relations • Relations are defined using pointers. • Pointers are kept in both directions to facilitate updates.
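
Keeping pointers in both directions means every update has to touch both ends. The sketch below shows one way to do that by hand in plain C++; it is not ObjectStore's relationship facility, and the `Employee`, `Department`, and `assign` names are hypothetical.

```cpp
#include <cassert>
#include <set>

struct Department;

struct Employee {
    Department* dept = nullptr;  // forward pointer
};

struct Department {
    std::set<Employee*> employees;  // inverse pointers
};

// Move e into department d (or out of any department if d is null),
// updating both directions so they never disagree.
void assign(Employee& e, Department* d) {
    if (e.dept != nullptr) {
        e.dept->employees.erase(&e);
    }
    e.dept = d;
    if (d != nullptr) {
        d->employees.insert(&e);
    }
}
```

Routing every change through one function such as `assign` is what keeps the two directions consistent; updating either pointer directly would leave a dangling inverse.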

Associative Queries • Query against all_employees: • os_Set<employee*>& overpaid_employees = all_employees [: salary >= 100000 :]; • Query against employees of dept. d: • d->employees [: salary >= 100000 :]; • Nested queries: • all_employees [: dept->employees [: name == 'Fred' :] :];
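
For readers without the `[: ... :]` query syntax, roughly equivalent filters can be written in standard C++. This is a sketch using `std::copy_if` and `std::any_of` on ordinary vectors, not ObjectStore collections; the struct layouts and function names are hypothetical, and the nested query is read as "employees whose department contains someone named Fred".

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

struct Department;

struct Employee {
    std::string name;
    int salary;
    Department* dept;
};

struct Department {
    std::vector<Employee*> employees;
};

// Counterpart of: all_employees [: salary >= 100000 :]
std::vector<Employee*> overpaid(const std::vector<Employee*>& all) {
    std::vector<Employee*> out;
    std::copy_if(all.begin(), all.end(), std::back_inserter(out),
                 [](const Employee* e) { return e->salary >= 100000; });
    return out;
}

// Counterpart of: all_employees [: dept->employees [: name == 'Fred' :] :]
std::vector<Employee*> works_with_fred(const std::vector<Employee*>& all) {
    std::vector<Employee*> out;
    std::copy_if(all.begin(), all.end(), std::back_inserter(out),
                 [](const Employee* e) {
                     if (e->dept == nullptr) return false;
                     const std::vector<Employee*>& peers = e->dept->employees;
                     return std::any_of(peers.begin(), peers.end(),
                                        [](const Employee* p) { return p->name == "Fred"; });
                 });
    return out;
}
```

These versions scan every element. The query syntax on the slide leaves the system free to use an index instead, which a hand-written loop does not.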

Versions • ObjectStore supports version control. • Allows teams to check out read-only and read-write objects for an extended period of time. Also called a “long transaction.” • Example: on a new CPU design, some people can work on the ALU while others work on the LSU (load/store unit) at the same time.

Performance • Relational database schemas are normalized and queries usually involve joins of different tables. • ObjectStore queries generally involve embedded collections or paths through objects. • In addition, indexes can be created over those paths. • The problem reduces to how to traverse a linked list as fast as possible!

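Warm and Cold Cache Results • Cold cache is an “empty” cache: nothing from the database is cached at the client when the test starts. • Warm cache is a non-empty cache: it already holds data from an earlier run.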

QuickStore (Part 2) • Similar to ObjectStore. • Both try to load objects into the same memory addresses as before, since the pointers on disk reflect the actual memory pages. • There are a few differences. • QS implements a buffer manager based on a simplified version of the clock algorithm.
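
For reference, the textbook clock (second-chance) replacement policy can be sketched as below. The slide says only that QuickStore uses a simplified version of clock, so this is the generic algorithm rather than QuickStore's buffer manager; the `ClockBuffer` class and its members are hypothetical, and pages are identified by plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ClockBuffer {
    struct Frame {
        int page = -1;
        bool used = false;
        bool referenced = false;
    };

public:
    explicit ClockBuffer(std::size_t frames) : frames_(frames) {
        assert(frames > 0);  // the sweep below needs at least one frame
    }

    // Touch a page. Returns true on a hit, false if the page had to be loaded.
    bool access(int page) {
        for (Frame& f : frames_) {
            if (f.used && f.page == page) {
                f.referenced = true;
                return true;
            }
        }
        // Miss: sweep the hand until a frame without a second chance is found.
        while (true) {
            Frame& f = frames_[hand_];
            hand_ = (hand_ + 1) % frames_.size();
            if (!f.used || !f.referenced) {
                f.page = page;
                f.used = true;
                f.referenced = true;
                return false;
            }
            f.referenced = false;  // spend this frame's second chance
        }
    }

    bool contains(int page) const {
        for (const Frame& f : frames_) {
            if (f.used && f.page == page) return true;
        }
        return false;
    }

private:
    std::vector<Frame> frames_;
    std::size_t hand_ = 0;
};
```

Clock approximates least-recently-used replacement with one bit per frame, which keeps the bookkeeping on each access cheap.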

More QuickStore • Also built using C++. • Storage provided by the EXODUS storage manager (ESM). • In the paper QS is compared to E and QS-B. • E uses software to emulate hardware paging. • QS-B uses bitmaps to keep track of pointers, which takes extra space.

Depth-first Traversal Test • Cold times on a small database in which each object is built from components, and each component contains atomic parts. • t1: depth-first traversal including atomic parts. • t6: same as t1, but excluding atomic parts.
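
The difference between the two traversals can be sketched with a small recursive function. This follows the slide's description only: the `Component` and `AtomicPart` structs and the `traverse` function are hypothetical, and the actual benchmark structures are richer than this.

```cpp
#include <cassert>
#include <vector>

struct AtomicPart {
    int id = 0;
};

struct Component {
    std::vector<Component*> children;  // sub-components
    std::vector<AtomicPart> parts;     // atomic parts owned by this component
};

// Depth-first traversal that returns the number of objects visited.
// visit_atomic = true corresponds to t1, false to t6.
long traverse(const Component& c, bool visit_atomic) {
    long visited = 1;  // the component itself
    if (visit_atomic) {
        visited += static_cast<long>(c.parts.size());
    }
    for (const Component* child : c.children) {
        visited += traverse(*child, visit_atomic);
    }
    return visited;
}
```

Because t6 skips the atomic parts, it touches far fewer objects than t1 over the same hierarchy.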

Conclusion • OS and QS use virtual memory hardware to facilitate loading of objects from disk to memory. • VERY efficient for CAD/CAM applications which rely heavily on traversals of complicated networks of objects. • ObjectStore and QuickStore add persistence to C++ programs.


ObjectStore Database System By C. Lamb, G. Landis, J. Orenstein, D. Weinreb