Data Storage with the POOL persistency framework
• Motivation
• Strategy
• Storage model
• Storage operation
• Summary
Giacomo Govi
PPARC-LCG, CERN IT/DB
GridPP7 June 30 - July 2, 2003
Motivation
• Provide storage and retrieval of C++ objects
• No intrusion into the experiments' data models
• Support for various types of data
• Event data, detector data, analysis data
• Different volumes
• Different access patterns
• Persistency technology may change over time
• Different technologies may be used at the same time
Avoid binding to a single choice
• Physics software should be independent of the underlying data storage technology
Strategy
• Hide all technology details from the clients
• Clients deal with objects or object references
• Leave the transient data representation free of ‘knowledge’ about persistency
Each technology can be handled transparently
• Run-time binding of transient data to the underlying technology
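The run-time binding idea can be sketched as follows. This is a minimal illustration, not POOL's actual API: `StorageBackend`, `FileBackend`, `RelationalBackend` and `makeBackend` are hypothetical names invented for this example. Client code holds only the abstract interface, and the concrete technology is picked by a string at run time.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical backend interface: clients only ever see this type.
class StorageBackend {
public:
    virtual ~StorageBackend() = default;
    virtual std::string technology() const = 0;
};

// Two stand-in technologies (assumptions for illustration only).
class FileBackend : public StorageBackend {
public:
    std::string technology() const override { return "file"; }
};

class RelationalBackend : public StorageBackend {
public:
    std::string technology() const override { return "rdbms"; }
};

// Bind to a technology by name at run time; unknown names are rejected.
std::unique_ptr<StorageBackend> makeBackend(const std::string& name) {
    if (name == "file") {
        return std::unique_ptr<StorageBackend>(new FileBackend());
    }
    if (name == "rdbms") {
        return std::unique_ptr<StorageBackend>(new RelationalBackend());
    }
    throw std::invalid_argument("unknown storage technology: " + name);
}
```

Because the choice is a run-time value, the same client binary could in principle switch technology through configuration rather than recompilation.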
Strategy (cont’d)
• Objects maintain their state when made persistent
• Allow for queries, selections and independent element access
• Backend layers built on the technology
• Use object features when supported: need to be instructed
• Split into primitives if there is no object support: need full access to member data
• Need for an object description: the “dictionary”
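A rough sketch of why a dictionary is needed when the backend has no object support: the backend can only split an object into primitives if something tells it where each member lives. Everything here (`Hit`, `FieldInfo`, `hitDictionary`, `splitIntoPrimitives`) is hypothetical and hand-written for illustration; it does not reproduce the LCG dictionary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical client class; note that it contains no persistency code.
struct Hit {
    int channel;
    double energy;
};

enum class Kind { Int, Double };

// One dictionary entry: member name, primitive kind, byte offset in the object.
struct FieldInfo {
    std::string name;
    Kind kind;
    std::size_t offset;
};

// Hand-written description of Hit (an assumption; a real system would generate it).
std::vector<FieldInfo> hitDictionary() {
    std::vector<FieldInfo> dict;
    dict.push_back(FieldInfo{"channel", Kind::Int, offsetof(Hit, channel)});
    dict.push_back(FieldInfo{"energy", Kind::Double, offsetof(Hit, energy)});
    return dict;
}

// Split an object into named primitive values using only the dictionary.
std::map<std::string, double> splitIntoPrimitives(const void* object,
                                                  const std::vector<FieldInfo>& dict) {
    std::map<std::string, double> row;
    const char* base = static_cast<const char*>(object);
    for (const FieldInfo& field : dict) {
        if (field.kind == Kind::Int) {
            int value = 0;
            std::memcpy(&value, base + field.offset, sizeof value);
            row[field.name] = value;
        } else {
            double value = 0.0;
            std::memcpy(&value, base + field.offset, sizeof value);
            row[field.name] = value;
        }
    }
    return row;
}
```

The splitting function never names `Hit`; it works from the description alone, which is the property that keeps the data model free of persistency code.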
Storage scheme
• Define a model for an object storage system:
• Identify commonalities among different technologies
• Adapts to any technology with direct record access
• Need to know the record identifier in advance
• RDBMS: more or less traditional
• Primary key must be uniquely determined before writing
Persistency model
[Figure: transient side (objects & references) and persistent side (objects, object IDs, DBs); C++ pointer >> object ID]
Storage functions
• Write objects
• Return a unique identifier of their ‘address’ in the database (Token)
• Read back / modify / delete stored objects
• Locate objects in the database using the Tokens
• Support for object associations
• Provide a transparent way to navigate object references
• Available: Root I/O backend
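The write/read/delete cycle around a token can be sketched with a toy in-memory store. The `Token` layout (database, container, entry) and the `ToyStore` class are assumptions made for this example, not the real POOL Token; the point is only that writing returns an address and every later operation goes through it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical token: enough information to find one record again.
struct Token {
    std::string database;
    std::string container;
    std::uint64_t entry;

    bool operator<(const Token& other) const {
        return std::tie(database, container, entry) <
               std::tie(other.database, other.container, other.entry);
    }
};

// Toy store: payloads are plain strings standing in for serialized objects.
class ToyStore {
public:
    // Writing returns the token that identifies the new record.
    Token write(const std::string& db, const std::string& cont,
                const std::string& payload) {
        std::uint64_t& next = nextEntry_[std::make_pair(db, cont)];
        Token token{db, cont, next};
        ++next;
        records_[token] = payload;
        return token;
    }

    // Returns nullptr when no record exists for the token.
    const std::string* read(const Token& token) const {
        std::map<Token, std::string>::const_iterator it = records_.find(token);
        return it == records_.end() ? nullptr : &it->second;
    }

    // Returns true if a record was actually removed.
    bool remove(const Token& token) { return records_.erase(token) > 0; }

private:
    std::map<std::pair<std::string, std::string>, std::uint64_t> nextEntry_;
    std::map<Token, std::string> records_;
};
```

Entry numbers are assigned per container before the record is stored, mirroring the requirement that the record identifier be known in advance.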
Components breakdown
[Figure: component diagram split into a client side and a POOL side; labelled components: Client, Ref<T>, DataService, Cache, PersistencyService, Storage Service, LCG Dictionary]
Dictionary generation
[Figure: dictionary generation flow; labels: .xml, .h, GCC-XML, Code Generator, ROOTCINT, LCG dictionary code, CINT dictionary code, LCG dictionary, CINT dictionary, Gateway, I/O, Reflection, Data I/O, Other Clients; part of the chain is marked as technology dependent]
Data Access through Reference
• References are implemented as smart pointers
• Maintain access to the embedded class members
• Provide services to handle persistency
• Take care of the memory clean-up
[Figure: Ref<T>; labels: Access to persistency service, Reference in the object cache, Dereference, Pointer to object]
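A smart-pointer reference of this kind can be sketched as a lazy-loading wrapper. `LazyRef` and `Track` are hypothetical names, and the loader callback stands in for whatever the persistency service would do; this is not the interface of POOL's `Ref<T>`.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical reference: fetches the object on first dereference only.
template <typename T>
class LazyRef {
public:
    using Loader = std::function<std::shared_ptr<T>()>;

    explicit LazyRef(Loader loader) : loader_(std::move(loader)) {}

    // Member access is forwarded to the embedded object.
    T* operator->() { return load().get(); }
    T& operator*() { return *load(); }

    bool isLoaded() const { return static_cast<bool>(object_); }

private:
    // Load once, then reuse; shared_ptr takes care of the memory clean-up.
    const std::shared_ptr<T>& load() {
        if (!object_) {
            object_ = loader_();
        }
        return object_;
    }

    Loader loader_;
    std::shared_ptr<T> object_;
};

// Hypothetical client class used in the usage example.
struct Track {
    double pt;
};
```

A possible use: construct `LazyRef<Track>` with a lambda that reads the object from storage, then call `ref->pt`; the lambda runs on the first access and not again.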
Data Service object cache
[Figure: access by smart pointer; labels: Ref<T>, Cache Ref, Data Service, object cache entries (Object: <pointer>, Token: <…>), Token (Pointer, Storage type, Object type, Persistent Reference), File Catalog, Persistency Service, Cache]
Data operation: WRITE
[Figure: the Client starts a transaction, Ref<A>, Ref<B> and Ref<C> in the Object Cache are marked for write, and the transaction is committed; labelled components: Data Service, PersistencyService, Storage Service]
    cache->transaction().start(...);
    refA.mark_write(placement);
    ...
    refC.mark_write(placement);
    cache->transaction().commit();
Data operation: READ/UPDATE/DELETE
[Figure: the Client starts a transaction, Ref<A> is read, Ref<B> is marked for update, Ref<C> is marked for delete, and the transaction is committed; labelled components: Object Cache, Data Service, PersistencyService, Storage Service, Tokens]
    cache->transaction().start(...);
    refA->myMethod();
    refB.mark_update();
    refC.mark_delete();
    cache->transaction().commit();
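The mark-then-commit pattern used for write, update and delete can be sketched with a toy cache that queues operations and applies them only at commit. `ToyCache` and its methods are hypothetical and much simpler than the calls shown on the slides (no placement argument, string payloads instead of objects).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical cache: marks are queued and reach storage only on commit.
class ToyCache {
public:
    void start() { pending_.clear(); }

    void markWrite(const std::string& id, const std::string& value) {
        pending_.push_back(Pending{Op::Write, id, value});
    }
    void markUpdate(const std::string& id, const std::string& value) {
        pending_.push_back(Pending{Op::Update, id, value});
    }
    void markDelete(const std::string& id) {
        pending_.push_back(Pending{Op::Delete, id, std::string()});
    }

    // Apply every queued operation in the order it was marked.
    void commit() {
        for (const Pending& p : pending_) {
            if (p.op == Op::Delete) {
                storage_.erase(p.id);
            } else {
                storage_[p.id] = p.value;
            }
        }
        pending_.clear();
    }

    bool contains(const std::string& id) const { return storage_.count(id) > 0; }
    std::string value(const std::string& id) const { return storage_.at(id); }
    std::size_t storedCount() const { return storage_.size(); }

private:
    enum class Op { Write, Update, Delete };
    struct Pending {
        Op op;
        std::string id;
        std::string value;
    };

    std::vector<Pending> pending_;
    std::map<std::string, std::string> storage_;
};
```

Nothing is visible in storage between `start()` and `commit()`, which is the behaviour the transaction bracket in the slide snippets suggests.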
Follow Object Associations
[Figure: an object entry (<pointer>, Token) holds a reference made of a Link ID and an Entry ID (<number>); a local lookup table in each file maps Link ID to Link Info (DB/Cont. name, ...)]
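The per-file lookup table can be sketched as follows, assuming a stored reference is just a (link ID, entry ID) pair and the table maps each link ID to a database and container name. All type and function names (`LinkInfo`, `StoredReference`, `ResolvedAddress`, `LinkTable`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// What a link ID stands for (assumed layout).
struct LinkInfo {
    std::string database;
    std::string container;
};

// What is stored inside an object for each association.
struct StoredReference {
    std::uint32_t linkId;
    std::uint64_t entryId;
};

// The full address recovered by combining a reference with the table.
struct ResolvedAddress {
    std::string database;
    std::string container;
    std::uint64_t entryId;
};

// Hypothetical per-file table: each distinct target is recorded once.
class LinkTable {
public:
    // Return the existing link ID for this target, or register a new one.
    std::uint32_t intern(const LinkInfo& info) {
        for (std::size_t i = 0; i < links_.size(); ++i) {
            if (links_[i].database == info.database &&
                links_[i].container == info.container) {
                return static_cast<std::uint32_t>(i);
            }
        }
        links_.push_back(info);
        return static_cast<std::uint32_t>(links_.size() - 1);
    }

    // Throws std::out_of_range if the link ID is not in the table.
    ResolvedAddress resolve(const StoredReference& ref) const {
        const LinkInfo& info = links_.at(ref.linkId);
        return ResolvedAddress{info.database, info.container, ref.entryId};
    }

    std::size_t size() const { return links_.size(); }

private:
    std::vector<LinkInfo> links_;
};
```

Storing two small numbers per association instead of repeated database and container names is one plausible motivation for such a table.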
Summary
• The POOL framework provides persistency services with a generic storage technology
• The POOL model can be applied to other technologies based on database files, collections and objects within collections
• POOL allows clients to choose technologies according to their needs
• Root I/O backend implemented
• Proof-of-concept prototype of an RDBMS backend started