
POOL Data Storage, Cache and Conversion Mechanism

This project aims to save and restore physics data independently of the underlying storage technology, ensuring full object connectivity. It addresses data of diverse nature and size (O(10^6) to O(10^13) Bytes/experiment/year) and hides all technology- and cache/persistency-specific details from clients. Major components include the Transient Data Cache, the Client Persistency Service, and the Storage Service, among others. The design emphasizes object caching through references and object rendering compatibility across back-ends. A common transaction model handles database operations. The storage mechanism focuses on migrating objects and maintaining reference handling, while an object/token system optimizes object layout, enabling efficient data retrieval and management. Documentation and interface design are key aspects of the project.


Presentation Transcript


  1. POOL Data Storage, Cache and Conversion Mechanism • What we want • Main components and contributors • Main design issues and component walkthrough • Documentation • Performance. CAT team: M. Frank, G. Govi, I. Papadopoulos. Applications Area Review 2003, October 21, 2003

  2. Motivation • Save and restore physics data independent of the underlying data storage technology • Keep full connectivity between objects • Address data sources of different nature • Event data, detector data, statistical data, … • The data sizes: O(10^6) to O(10^13) Bytes/experiment/year • The access patterns differ • Hide any technology details from the clients • Hide all cache/persistency specific details CHEP 2003 March 22-28, 2003

  3. Breakdown of Major Components • Transient Data Cache • Client Persistency Service • Storage Service(s) • RootStorage • RDBMS Prototype. Contributors: G. Govi 0.5 FTE, I. Papadopoulos 0.5 FTE, M. Frank 0.5 FTE, CAT Team ?? FTE

  4. Cache Access Through References • References know about the Data Cache • Two operation modes: clear at checkpoint, or auto-clear with reference count (with/without object deletion) • References are implemented as smart pointers • Use the cache manager for “load-on-demand” • Use the object key of the cache manager. [Diagram: dereferencing a Ref<T> goes through the object reference in the cache manager to obtain the pointer to the object]
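The reference mechanism on this slide can be illustrated with a minimal sketch. The class names (`SimpleCache`, `Ref`) and the string object key are stand-ins for exposition, not the real POOL interfaces:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical cache manager keyed by an opaque object key.
class SimpleCache {
public:
    // Register an object under a key (stands in for a real load from storage).
    void put(const std::string& key, std::shared_ptr<void> obj) {
        store_[key] = std::move(obj);
    }
    std::shared_ptr<void> lookup(const std::string& key) const {
        auto it = store_.find(key);
        return it == store_.end() ? nullptr : it->second;
    }
    void clear() { store_.clear(); }  // "clear at checkpoint" mode
private:
    std::map<std::string, std::shared_ptr<void>> store_;
};

// Smart-pointer-like reference: remembers its cache and object key, and
// resolves to the object through the cache manager on each dereference.
template <typename T>
class Ref {
public:
    Ref(SimpleCache* cache, std::string key)
        : cache_(cache), key_(std::move(key)) {}
    T* operator->() const { return get(); }
    T& operator*() const { return *get(); }
    T* get() const {
        auto p = cache_->lookup(key_);        // look up via the cache manager
        return static_cast<T*>(p.get());      // null if not (or no longer) cached
    }
private:
    SimpleCache* cache_;
    std::string key_;
};
```

In a real load-on-demand implementation the failed look-up would trigger a read from storage (as on slide 6); here a cleared cache simply yields a null pointer.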

  5. Data Access • Clients access data through references (Ref<T>) • A Data Service manages each Object Cache • Different contexts: event data, detector data, other. [Diagram: several clients, each holding a Ref<T>, in front of separate Data Service / Object Cache pairs]

  6. Access to the Data • A client tries to access object data through a Ref<T>; the cache look-up will be unsuccessful because the requested object is not present. Sequence: (1) read(…) on the Data Service, (2) look-up in the Data Cache, (3) load request to the Persistency Service (technology dispatcher), which forwards to the Conversion Service and Storage Service for common handling, (5) Register-Object-References.
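The technology dispatcher in this sequence can be sketched as a registry of storage-specific loaders. The names and the string payload are illustrative assumptions, not the real POOL API:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// A loader is a storage-technology-specific read function.
using Loader = std::function<std::string(const std::string& objectId)>;

// Illustrative persistency service: dispatches a load request to the
// storage service registered for the requested technology.
class PersistencyService {
public:
    void registerStorage(const std::string& tech, Loader loader) {
        loaders_[tech] = std::move(loader);
    }
    // Forward the load request to the matching storage back-end.
    std::string load(const std::string& tech, const std::string& objectId) {
        auto it = loaders_.find(tech);
        if (it == loaders_.end()) return "";  // unknown technology
        return it->second(objectId);
    }
private:
    std::map<std::string, Loader> loaders_;
};
```

This keeps the client and data cache entirely ignorant of which back-end (ROOT, RDBMS, …) actually serves the object, as the slide requires.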

  7. Storing objects • Start transaction: cache.startTransaction(...) • Mark objects for write: Ref<T>.mark_write(placement) ... Ref<T>.mark_write(placement) • Commit transaction: cache.endTransaction(..., COMMIT) • The Data Service hands marked objects to the Persistency Service (technology dispatcher) for common handling; the Conversion Service maps the objects and the Storage Service writes them.

  8. Common Transaction Model • Implemented by the persistency service package • Main client is the data cache • Acts on all open databases • Transaction handling: start, commit, rollback • Pros and cons • Can’t “forget” an open file • Not for free (CPU, I/O etc.)
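The transaction model of slides 7 and 8 can be sketched as follows; marked objects are buffered and only reach storage on COMMIT, while ROLLBACK discards them. Class and member names are illustrative, not the actual POOL interfaces:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class TxAction { COMMIT, ROLLBACK };

// Minimal data-cache sketch with start/commit/rollback semantics.
class DataCache {
public:
    void startTransaction() { pending_.clear(); open_ = true; }
    // Corresponds to Ref<T>.mark_write(placement) on slide 7.
    void markWrite(const std::string& object) {
        if (open_) pending_.push_back(object);
    }
    void endTransaction(TxAction action) {
        if (action == TxAction::COMMIT)
            stored_.insert(stored_.end(), pending_.begin(), pending_.end());
        pending_.clear();   // rollback simply drops the marked objects
        open_ = false;
    }
    std::size_t storedCount() const { return stored_.size(); }
private:
    bool open_ = false;
    std::vector<std::string> pending_;  // marked, not yet written
    std::vector<std::string> stored_;   // committed to "storage"
};
```

The real service acts on all open databases at once, which is why an open file cannot be “forgotten”, at the price of some CPU and I/O overhead.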

  9. The Storage Mechanism • The underlying model assumptions • Migrating objects to/from the persistent medium • Object mapping • Reference handling • References are objects, not primitives • Need setup: reference to data cache • ROOT: callback for base class (Streamer)

  10. Model Assumptions • Class type. [Diagram: the Data Cache talks to a StorageMgr (DiskStorage); data are addressed by storage type and DB name (database), container name (container of objects), and item ID (object within the container)]

  11. Object Rendering • Objects must maintain personality when persistent • Allow for queries, selections and independent element access • If technology supports objects… • Want to make use of such features • These technologies must be instructed how to do it • Need object dictionary • If technologies support only primitives • Split objects into primitives [until reasonable level] • Need full access to object member data [member offset, type] • Constructor and destructor with defined signature • Need object dictionary
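The "split into primitives" case can be illustrated with a toy dictionary: each class is described by member name, primitive type and byte offset, so a primitives-only back-end can store the object field by field. This mirrors the idea on the slide; the structures are hypothetical, not the SEAL dictionary API:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One dictionary entry per data member: name, primitive type, byte offset.
struct MemberInfo {
    std::string name;
    std::string type;
    std::size_t offset;
};

// Example payload class (standard layout, so offsetof is well defined).
struct Track { double pt; int charge; };

// Dictionary entry for Track, as a code generator could emit it.
static const std::vector<MemberInfo> trackDict = {
    {"pt",     "double", offsetof(Track, pt)},
    {"charge", "int",    offsetof(Track, charge)},
};

// Derive the primitive "columns" of an object from its dictionary.
std::vector<std::string> columnNames(const std::vector<MemberInfo>& dict) {
    std::vector<std::string> names;
    for (const auto& m : dict) names.push_back(m.name + ":" + m.type);
    return names;
}
```

With member offsets and types available, the storage layer can read and write each primitive independently, which is what enables queries and independent element access on the persistent side.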

  12. SEAL Dictionary: Reminder. [Diagram: dictionary generation processes .h headers through GCC-XML (.xml) and a code generator into LCG dictionary code, and through ROOTCINT into CINT dictionary code; a gateway links the LCG dictionary and the CINT dictionary; other clients use the LCG dictionary for reflection and data I/O, while the CINT dictionary serves technology-dependent I/O]

  13. Column-Wise Object Layout • Objects must be split into primitives • Done by ROOT • Column-wise data support • For POOL (using ROOT) • Objects are primitives • But POOL can also split • Eases support for “stupid” back-ends (RDBMS) • Store more complex object models

  14. Object Token • Follow object associations. [Diagram: a pointer resolves via a token holding an entry ID and a link ID (1); the link ID indexes a local lookup table in each file (2), whose link info (DB/container name, …) locates the target (3, 4)]

  15. The “Link” Table • Optimize size of persistent data • Keep full functionality • Contains all information to resurrect an object • Storage type • Database name • Container name • Object type • Local to every database • Limited size, scalable
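The token/link-table pair of slides 14 and 15 can be sketched as follows: a token carries only a small link ID plus an entry ID, and the file-local link table resolves the link ID into the full location. Names are illustrative, not the real POOL classes:

```cpp
#include <cassert>
#include <map>
#include <string>

// Everything needed to "resurrect" an object, kept once per distinct target.
struct LinkInfo {
    std::string storageType;
    std::string dbName;
    std::string containerName;
    std::string objectType;
};

// The persistent reference: compact, since the bulky info lives in the table.
struct Token {
    int linkId;   // index into the file-local link table
    int entryId;  // entry inside the container
};

class LinkTable {
public:
    int intern(const LinkInfo& info) {
        // Reuse an existing entry for an already-seen target, keeping the
        // table small and the persistent tokens compact.
        for (const auto& [id, li] : table_)
            if (li.dbName == info.dbName && li.containerName == info.containerName)
                return id;
        int id = static_cast<int>(table_.size());
        table_[id] = info;
        return id;
    }
    const LinkInfo& resolve(int linkId) const { return table_.at(linkId); }
private:
    std::map<int, LinkInfo> table_;
};
```

Because the table is local to each database and deduplicated, its size stays bounded by the number of distinct referenced containers, not by the number of stored references.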

  16. Documentation • Interface design documents • Published paper • Some doxygen-generated documentation • Many examples (too many?) • Code is still changing quite rapidly • It’s a bit early for a stable reference manual • A good reference manual will still have to be addressed • May be doxygen based. Could be worse, but it is certainly not the most glorious point on the agenda

  17. Software dependencies. [Diagram: the Data Cache & Storage Manager have compile, link & run dependencies on the SEAL Framework, SEAL Dictionary and SEAL Plugin Mgr, and run-time dependencies on the back-ends: the ROOT backend (ROOT) and the RDBMS backend (ODBC)]

  18. CPU Performance (write) • CPU usage as a function of #events, #tracks/event (event size), and #events/transaction • Be careful: don’t compare apples and oranges • Small events & many transactions => POOL looks bad (framework overhead). [Numbers by Ioannis; plots for many events/many tracks and many tracks/few events; event model: Event with 1..n Tracks]

  19. CPU Performance (read) • CPU usage as a function of #events, #tracks/event (event size), and #events/transaction. [Numbers by Ioannis; plots for many events/many tracks, small events/few tracks, and many tracks/few events; event model: Event with 1..n Tracks]

  20. CPU per Event and Track (read). [Numbers by Ioannis; plots for the many-tracks case]

  21. File size • Disk usage as a function of #events, #tracks/event (event size), and #events/transaction • No big difference to ROOT (no surprises). [Numbers by Ioannis; event model: Event with 1..n Tracks]

  22. Further Development • No open milestones • Further development depends on whether someone has time • Plenty to do: RDBMS integration; the ROOT backend works but is not perfect; optimizations (speed, CINT usage etc.) • My share will go down (POOL integration in Gaudi for the next ½ year), i.e. we are left with 2 × 0.5 FTE • Maintenance will go up as bugs start flowing in

  23. Conclusions • Cache and storage manager work well • …at least we survived the first release(s) • The solution was picked up by three out of four experiments • We have not seen dramatic penalties & overheads • No plans for many new developments • Focus on consolidation and optimization
