
Persistency Project (POOL) Status and Work Plan Proposal



  1. Persistency Project (POOL) – Status and Work Plan Proposal
     Dirk Düllmann, dirk.duellmann@cern.ch
     PEB Meeting, July 16th 2002

  2. Persistency Project Timeline
     • Started officially 19 April
     • Initially staffed with 1.6 FTE
       • Zhen Xie/CMS (100%) – MySQL scalability and reliability tests
       • DD (60%) – completed discussions of the requirement list and deployment plans with the experiments
     • Persistency Workshop 5–6 June at CERN
       • More requirement and implementation discussions
       • Draft work package breakdown and release sequence proposed
       • Additional experiment resource commitments received
     • Since the beginning of July
       • Real design discussions in the work packages have started; active participation is now ~5 FTE
       • Daily meetings with the core component WPs; twice-weekly meetings for the Catalog WP
       • Collection and meta data discussions are just starting now
     • Project h/w and s/w infrastructure being defined and becoming available

  3. Experiment Deployment Plans
     • Summary of the first round of discussions with the experiments
     • Numbers refer to minimal requirements for a first release (rather than to constraints imposed by the design/implementation)
     • Timescale for first release – September (CMS); others are interested but more relaxed about the date
     • Volume – 10–50 TB (CMS)
     • Files – several 100k (ALICE)
     • Distribution – O(10) sites (CMS)
     • Number of population jobs – 10k (CMS)
     • Use of Refs per event (CMS/ATLAS/LHCb)
       • LHCb: O(100) refs per event; CMS: 100–1000; ALICE: not relying on Refs so far
     • Recovery from job failures (CMS, ATLAS)
       • CMS: the existing setup allows simply re-issuing the same job
     • Append to existing files
       • LHCb: each file is written by exactly one job
       • ALICE: a single event spans several files

  4. Experiment Focus of Interest (prioritised by number of votes)
     • RootIO integration with a grid-aware catalog (ALL)
       • Transparent navigation (ATLAS/CMS/LHCb); ALICE: maybe, but only in some places (a navigation sketch follows this slide)
       • EDG (ALL), AliEn (ALICE), Magda (ATLAS)
     • MySQL as the RDBMS implementation until the first release (ALL)
     • Consistency between streaming data and meta data (CMS/ATLAS)
       • At application-defined checkpoints during a job
     • Early separation of persistent and transient dictionary (ATLAS/LHCb)
     • Initial release supports persistency for non-TObjects (ATLAS/LHCb)
       • Without changes to user class definitions
     • Support for shallow (catalog-only) data copies (CMS)
       • Formerly known as cloned federations
     • Support for deep (extracted part-of-object-hierarchy) copies (CMS)
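To make "transparent navigation" concrete, here is a minimal C++ sketch of the idea; the class and method names (Ref and its lazy loading) are illustrative assumptions, not the actual POOL interface. Dereferencing a Ref resolves the target through the file catalog and faults the object into memory on first use.

```cpp
// Hypothetical sketch of navigational access via Refs; all names are
// illustrative, not the real POOL API.
#include <iostream>
#include <string>

struct Track { double pt = 0.0; };

// A Ref holds a location token (file GUID, container name, object id)
// and resolves it lazily on first dereference.
template <typename T>
class Ref {
public:
  Ref(std::string fileGuid, std::string container, long id)
    : m_fileGuid(std::move(fileGuid)),
      m_container(std::move(container)), m_id(id) {}

  // operator-> triggers the catalog lookup and object read on first use.
  const T* operator->() {
    if (!m_object) m_object = load();
    return m_object;
  }

private:
  const T* load() {
    // A real implementation would (1) ask the grid-aware file catalog
    // for the physical file name, (2) open the file via the storage
    // manager and (3) read object m_id from the container. Stubbed here.
    std::cout << "resolving " << m_fileGuid << "/" << m_container
              << "#" << m_id << "\n";
    static T object{};
    return &object;
  }

  std::string m_fileGuid, m_container;
  long m_id;
  const T* m_object = nullptr;
};

int main() {
  Ref<Track> track("guid-1234", "Tracks", 7);
  std::cout << "pt = " << track->pt << "\n"; // object faulted in here
}
```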

  5. Practical Project Organisation
     • Stay close to the initial RTAG component model
       • Work packages are (nearly) completely aligned with it
     • Constraints:
       • At least two significant contributions (>30% FTE) per work package, to ensure independent work
       • Core component developers co-located at CERN
     • Some interface/design discussion will need to take place outside the co-located CERN work packages
       • By email / phone meetings
       • In small workshops at CERN

  6. Proposed Work Package Split
     • Earlier drafts were shown at the Persistency Workshop and also later by Torre
     • Catalog and Grid Integration: Zhen Xie (50%), Maria Girone (50%), Mathias Steinecke
     • Storage Manager & Refs: Markus Frank (50%), Giacomo Govi (50%), Fons Rademakers
     • Dictionary & Conversion Services: Craig Tull, Stefan Roiser (50%), Victor Perevoztchikov (30%)
     • Collections and Meta Data: David Malon (30%), Chris Laine (50%), Julius Hrivnac (20%), Sasha Vaniachine (20%), RD Schaffer (30%)
     • Common Services, Integration and Testing: Giacomo Govi (50%), Maria Girone (50%), Zhen Xie (50%)
     • Some commitments made by the experiments are still under discussion

  7. File Catalog – First Results
     • Scalability and performance tests for the native MySQL catalog are close to completion
       • No big surprises compared to the Persistency Workshop reports
       • Nearly flat scaling up to 200 clients from 10 nodes
       • 0.2–0.3 ms per catalog update (depending on client load)
       • Clear impact of batching several updates into one transaction (sketched after this slide)
     • Some instabilities observed
       • The database had to be recreated after a schema corruption
       • Server lockup after a file-system-full condition
     • Both need to be understood in more detail before any larger-scale production deployment
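The batching point is easy to illustrate. Below is a hedged sketch using the MySQL C API; the table and column names (file_catalog, guid, pfn) and connection parameters are made-up examples, and a transactional table engine (e.g. InnoDB) is assumed. Grouping many inserts into one transaction amortises the commit cost that otherwise dominates per-update latency.

```cpp
// Sketch: batch N catalog inserts into a single MySQL transaction.
// Schema and credentials are illustrative; assumes a transactional
// table engine such as InnoDB.
#include <mysql/mysql.h>
#include <cstdio>

int main() {
  MYSQL* conn = mysql_init(nullptr);
  if (!mysql_real_connect(conn, "catalog-host", "pool", "secret",
                          "filecatalog", 0, nullptr, 0)) {
    std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
    return 1;
  }

  mysql_query(conn, "START TRANSACTION");
  for (int i = 0; i < 100; ++i) {           // 100 updates per commit
    char sql[256];
    std::snprintf(sql, sizeof(sql),
                  "INSERT INTO file_catalog (guid, pfn) "
                  "VALUES ('guid-%d', '/data/file%d.root')", i, i);
    if (mysql_query(conn, sql)) {
      mysql_query(conn, "ROLLBACK");        // abort the whole batch
      std::fprintf(stderr, "insert failed: %s\n", mysql_error(conn));
      return 1;
    }
  }
  mysql_query(conn, "COMMIT");              // one commit for the batch
  mysql_close(conn);
  return 0;
}
```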

  8. Catalog Scaling Results (M. Girone/Z. Xie)

  9. EDG/Globus Integration
     • Close contact with EDG WP2 about Giggle (the EDG/Globus Replica Location Service)
     • Need to understand software integration with the EDG-provided middleware
       • Software dependencies, release cycle, supported platforms
     • POOL will assume a standard EDG testbed installation for client and server machines
     • Plan to repeat the catalog scalability and reliability tests with Giggle as the backend (a backend abstraction is sketched after this slide)
       • Currently some issues, since the EDG software is not yet released for RedHat 7.2
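Swapping the native MySQL catalog for Giggle suggests a backend-neutral catalog interface. The sketch below is a hypothetical illustration of that idea (IFileCatalog and its methods are assumptions, not the real POOL API): both backends answer the same GUID-to-physical-file-name queries, so the tests can be rerun unchanged against either.

```cpp
// Hypothetical backend-neutral catalog interface; names are illustrative.
#include <memory>
#include <string>

class IFileCatalog {
public:
  virtual ~IFileCatalog() = default;
  // Resolve a logical file GUID to a physical file name.
  virtual std::string lookupPFN(const std::string& guid) = 0;
  // Register a new logical/physical file pair.
  virtual void registerFile(const std::string& guid,
                            const std::string& pfn) = 0;
};

// Native MySQL test catalog (queries stubbed out here).
class MySQLCatalog : public IFileCatalog {
  std::string lookupPFN(const std::string& guid) override {
    return "/data/" + guid + ".root";
  }
  void registerFile(const std::string&, const std::string&) override {}
};

// EDG/Globus Giggle (Replica Location Service) backend, also stubbed.
class GiggleCatalog : public IFileCatalog {
  std::string lookupPFN(const std::string& guid) override {
    return "rls://replica-server/" + guid;
  }
  void registerFile(const std::string&, const std::string&) override {}
};

// The scalability test picks its backend at run time.
std::unique_ptr<IFileCatalog> makeCatalog(const std::string& backend) {
  if (backend == "edg") return std::make_unique<GiggleCatalog>();
  return std::make_unique<MySQLCatalog>();   // default: native MySQL
}
```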

  10. Basic S/W Infrastructure Setup
     • Persistency is the first LCG project
       • Will need to operate before some of the basic LCG services are fully agreed
       • Need to participate in setting up the basic s/w infrastructure (e.g. the repository directory structure)
     • Need to build on a few basic services
       • Error reporting and exception hierarchy
       • UUID generation (see the sketch after this slide)
       • Component loading and configuration
       • Configuration control, build and automated test system
     • For now, start as pragmatically (and simply) as possible
       • Extract implementations of all of the above from existing projects
       • Expect changes as soon as wider-scale agreement is achieved in LCG
       • Will require additional re-factoring work, e.g. after the V0.0.1 release
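As an example of the "extract from existing projects" approach, UUID generation does not need to be written from scratch. A minimal sketch, assuming the widely available e2fsprogs libuuid (link with -luuid); whether POOL actually adopted this particular library is not stated here.

```cpp
// Minimal UUID generation via libuuid (an existing, extractable
// implementation); compile with: g++ uuid_demo.cpp -luuid
#include <uuid/uuid.h>
#include <cstdio>

int main() {
  uuid_t id;
  uuid_generate(id);        // 128-bit universally unique identifier
  char text[37];            // 36 characters plus terminating NUL
  uuid_unparse(id, text);   // e.g. "1b4e28ba-2fa1-11d2-883f-b9a761bde3fb"
  std::printf("new GUID: %s\n", text);
  return 0;
}
```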

  11. Work Package Progress
     • Core Components – StorageMgr
       • Draft header files and UML diagrams have been circulated
       • Gaudi Introspection has been made independent from GAUDI
       • Received example code implementing persistency for foreign (non-TObject) classes (illustrated after this slide)
       • Expect to conclude the first round of interface discussions this week, then start implementing component prototypes
     • Collections & Meta Data
       • Design discussions between the people involved have started but are still at an early stage
       • Aim for first interface proposals during the next two weeks
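For readers unfamiliar with the term, "foreign class" persistency means storing a plain C++ class that neither inherits from TObject nor carries ROOT macros. The sketch below shows the idea using today's ROOT API (TFile::WriteObject); the exact 2002-era mechanism in the received example code may have differed.

```cpp
// Hedged sketch: persisting a "foreign" (non-TObject) class with ROOT.
// A dictionary for Hit must be generated beforehand (e.g. with
// rootcling and a LinkDef file) so ROOT knows the class layout.
#include "TFile.h"

// No TObject base class, no ClassDef macro: the user class definition
// stays untouched, as requested by ATLAS/LHCb.
struct Hit {
  double x = 0.0, y = 0.0, z = 0.0;
};

int main() {
  TFile f("hits.root", "RECREATE");
  Hit h{1.0, 2.0, 3.0};
  f.WriteObject(&h, "hit0");  // streamed via the dictionary, not Streamer()
  f.Close();
  return 0;
}
```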

  12. POOL Hardware Resources
     • Hardware requirements discussed with other groups in IT; proposal distributed by Bernd Panzer
     • Now–December
       • 2 disk server machines to be set up as catalog servers: one with the native MySQL catalog / meta data server, one installed as an EDG testbed node acting as replica location server
       • 10 client nodes for concurrency tests, installed as testbed nodes once EDG for RH7.2 is available
     • September–December
       • Stability and distribution tests (decoupled from the development setup)
       • 2 additional disk servers for the decoupled stability and distribution tests
       • 3 disk servers for ROOT data
     • November–December
       • More resources for experiment framework integration tests, drawn from experiment data challenge allocations as required by the experiments

  13. Proposed Release Strategy
     • Aim: complete the first internal release cycle as soon as possible
     • Start by integrating the core components
       • StorageMgr, FileCatalog, Reflection, CacheMgr & Refs
       • Result: a transparent navigational store is operational
     • Other (more external) components will already start design and implementation
       • But will only be integrated in later releases
       • In particular, Collections and MetaData are essential for the final product but less tightly coupled
     • Only the first external release will have all components integrated
       • Usable for expert users (e.g. no automatic installation)

  14. Proposed Release Sequence
     • R 0.0.1 – Basic Navigation
       • All core components for navigation exist and interoperate: StorageMgr, Refs & CacheMgr, Dictionary (r/o), FileCatalog
       • Some remaining simplifications: assume TObject on read/write – simplified conversion
     • R 0.0.2 – Collections
       • First collection implementation integrated: support for implicit and explicit collections on either RDBMS or RootIO
       • Persistency for foreign classes working: non-TObject classes without the need for user code instrumentation
       • EDG/Globus FileCatalog integrated
     • R 0.0.3 – Meta Data & Query (external)
       • Meta data annotation and query added, for event, event collection and file based meta data

  15. Summary
     • Real development work has (just) started
       • More than half of the people who signed up for the project are actively working towards a very first release
       • Consultancy, and in some areas code modifications, from the ROOT team
       • Expect some inefficiencies because of vacations and the project being a test case for new LCG s/w components and procedures
     • The September release date and the list of requested features seem incompatible
       • Need to limit the influx of new requirements and feature requests
       • Propose to aim for relevant release content rather than an overly aggressive release date
       • Propose to reschedule the V0.0.3 release for November, assuming that the committed resources become available

  16. Root Storage Manager Alternatives
     • Alternative 1: user objects are kept in a Root tree
       • Root trees are typed (as, e.g., RDBMS tables are): trees need to be created by the SM, and objects need to be placed in the right tree
       • Optimised for sequential access
       • Should allow foreign (non-TObject) persistency
     • Alternative 2: user objects are kept as keyed objects
       • Could allow more efficient random access
       • But some scalability issues: are all keys read when a directory is opened? Could probably be fixed
       • Do keyed foreign objects work already? If not, could they be made to work?
     • CMS is actively prototyping the second approach for their COBRA Root prototype
     • The POOL project will concentrate on the first approach and re-synchronise after the first development release
     • (Both alternatives are sketched after this slide)
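The two alternatives map directly onto the ROOT API, as the hedged sketch below shows (written against today's ROOT; a dictionary for the illustrative Event class is assumed). Alternative 1 fills entries of a typed TTree; alternative 2 writes each object under its own key, which is where the "all keys are read when a directory is opened" concern comes from.

```cpp
// Sketch of both storage alternatives; Event is illustrative and needs
// a ROOT dictionary.
#include "TFile.h"
#include "TString.h"
#include "TTree.h"

struct Event { int run = 0, number = 0; };

int main() {
  TFile f("store.root", "RECREATE");
  Event ev;

  // Alternative 1: objects as entries of a typed tree; the tree must be
  // created by the storage manager and is optimised for sequential access.
  TTree tree("Events", "event container");
  tree.Branch("event", &ev);                 // the branch "types" the tree
  for (int i = 0; i < 3; ++i) { ev.number = i; tree.Fill(); }
  tree.Write();

  // Alternative 2: one key per object in a directory; better random
  // access, but the directory's key list grows with every object.
  for (int i = 0; i < 3; ++i) {
    ev.number = i;
    f.WriteObject(&ev, Form("event_%d", i)); // keyed object
  }
  f.Close();
  return 0;
}
```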

  17. Storage Manager
     • Several half-day brainstorming sessions with Markus Frank last week, covering:
       • Cache manager interface and implementation issues
       • The role of the context
       • Integration with the dictionary and conversion services
       • Token structure for Root/RDBMS tables (a sketch follows this slide)
         • Basic model: file – container – object
         • A container may be a Root tree, a Root directory, or an RDBMS table
       • In more detail: the flow of control inside the SM during read, write and commit operations
     • First UML diagrams have been circulated
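To fix the idea of the file – container – object model, here is a purely illustrative token structure (the field names are assumptions; the real POOL token layout was still under discussion at this point):

```cpp
// Illustrative token for the file-container-object addressing model;
// not the actual POOL definition.
#include <string>

struct Token {
  std::string fileGuid;   // which file, resolved via the file catalog
  std::string container;  // a Root tree, Root directory, or RDBMS table
  long        entry;      // object index (tree entry / key / table row)
  std::string className;  // type info for dictionary-driven conversion
};
```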
