
Persistency Framework CORAL, POOL, COOL status and plans

Physics Services Support. Persistency Framework CORAL, POOL, COOL status and plans. Andrea Valassi (IT-PSS), on behalf of the Persistency Framework team. LHCC, 19th November 2007. Thanks to the experiments for their input for this talk! Outline: Introduction; Main achievements in 2007


Presentation Transcript


  1. Physics Services Support
     Persistency Framework CORAL, POOL, COOL status and plans
     Andrea Valassi (IT-PSS), on behalf of the Persistency Framework team
     LHCC, 19th November 2007
     Thanks to the experiments for their input for this talk!

  2. Outline
     • Introduction
     • Main achievements in 2007
     • PF usage in the experiments
     • Plans for 2008
     • Conclusions
     A. Valassi – Persistency Framework – 2

  3. Persistency Framework Components
     • CORAL
       • Abstraction of access to relational databases
       • Support for Oracle, MySQL, SQLite, FroNtier
     • POOL
       • Technologically-neutral hybrid data storage
       • Streaming of objects (e.g. to ROOT or RDBMS)
       • Object metadata catalogs (e.g. in relational databases)
     • COOL
       • Conditions data management
       • Conditions object metadata (interval of validity, version)
       • Conditions object data payload (user-defined attributes)
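As a rough illustration of the COOL data model listed on this slide (metadata as an interval of validity, plus a user-defined payload), here is a minimal Python sketch. It is not the COOL API: `ConditionsObject`, `ToyConditionsStore`, `store` and `find` are hypothetical names, versions and tags are left out, validity times are plain integers, and a single channel is assumed.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class ConditionsObject:
    """One conditions object: validity interval [since, until) plus payload."""
    since: int
    until: int
    payload: dict  # user-defined attributes


class ToyConditionsStore:
    """Hypothetical single-version store for one channel (not the COOL API)."""

    def __init__(self):
        self._since = []    # sorted 'since' values, used for binary search
        self._objects = []  # objects in the same order as self._since

    def store(self, obj):
        # Keep both lists sorted by 'since'.
        i = bisect_right(self._since, obj.since)
        self._since.insert(i, obj.since)
        self._objects.insert(i, obj)

    def find(self, t):
        """Return the object valid at time t, or None if there is none."""
        i = bisect_right(self._since, t) - 1
        if i >= 0 and t < self._objects[i].until:
            return self._objects[i]
        return None


store = ToyConditionsStore()
store.store(ConditionsObject(0, 10, {"temperature": 20.5}))
store.store(ConditionsObject(10, 20, {"temperature": 21.0}))
valid = store.find(12)
print(valid.payload if valid else None)
```

The binary search on the sorted `since` values stands in for the indexed lookup a relational backend would perform.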

  4. Component Interaction

  5. Persistency Framework and AA
     • Interaction with other projects
       • SPI (external libs, configuration, nightly builds)
       • SEAL (base libraries, will soon be replaced)
       • ROOT (object streaming in POOL; PyCool)
       • GRID middleware (LFC for CORAL authentication)
       • 3D (relational data deployment and distribution)
     • New in 2007
       • New configuration and build system
         • Move from SCRAM to CMT
         • Nightly builds and QMTEST tests based on CMT
       • Support for new platforms
         • SLC4 on 64-bit Linux (lxplus)
         • MacOSX/Intel (no Oracle yet)

  6. Main achievements in 2007 (1)
     • CORAL
       • LFC-based authentication and authorization
       • Python bindings (PyCoral) and database copy tool
       • PyCoral and LFC work in cooperation with RRCAT, India
       • Connection pooling
       • Improved thread safety (ATLAS online requirement)
       • Support for stored procedures
     • POOL
       • Major reimplementation of collections
       • Using CORAL database connectivity and authentication

  7. Main achievements in 2007 (2)
     • COOL
       • Major API and schema changes in COOL 2.0
         • Portable data types (e.g. cool::Int64) for 64-bit platforms
         • Tag locking functionality
         • Channel metadata management
       • Dynamic replication tool
       • Schema evolution tool and tests
       • Improved DB authentication and replica lookup
       • Performance optimizations in COOL 2.1 and 2.2
         • Multi-channel bulk insertion and several query use cases
       • COOL is now deployed at T0 and several T1 sites (with Streams distribution, see 3D presentation)

  8. LHCb feedback and PF usage
     • POOL
       • For event data (SIMU, DIGI, DST, tags…)
         • ROOT backend and XML catalog
         • But RAW data will be in flat files (no POOL or ROOT)
       • LHCb requests to remove the SEAL dependency
       • LHCb does not request any new functionality in POOL
     • COOL (and CORAL via COOL)
       • For conditions data (online and offline)
         • Oracle at the pit, T0 and T1 (with Streams replication)
         • Evaluating LFC-based authentication in CORAL
         • SQLite files for MC production

  9. ATLAS feedback and PF usage (1)
     • POOL
       • For event data (RDO, ESD, AOD, tags…)
         • ROOT backend
       • Separation of transient/persistent definitions simplifies schema evolution and is used to improve performance
       • New POOL collections were mainly developed by ATLAS
     • CORAL (directly)
       • For the detector description (geometry database)
         • Oracle master, SQLite (was MySQL) for data distribution
       • For the online configuration and trigger databases
         • MySQL server and proxies
       • Motivation for most of the CORAL developments in 2007

  10. ATLAS feedback and PF usage (2)
      • COOL (and CORAL via COOL)
        • For conditions data (extensive online/offline use)
          • Oracle at online, T0 and T1 (with Streams replication)
          • COOL replication to SQLite (MC) and MySQL (HLT)
          • Largest data volume is from DCS (>300 GB/year)
          • ATLAS tools exist to transfer data from PVSS to COOL
        • ATLAS requests are regularly discussed at the weekly COOL meetings
          • Functionalities (e.g. for channel and tag management)
          • Performance (e.g. retrieval of tagged data)
        • Worries about limited manpower and time spent on debugging non-core platforms (Windows) and non-COOL issues (SEAL, CORAL…)

  11. CMS feedback and PF usage
      • POOL (and CORAL via POOL-ORA)
        • POOL-ORA is the basis of all conditions data modeling and storage in CMS
        • Using the Oracle, SQLite and FroNtier backends
        • Switch to streaming to BLOB columns in POOL-ORA to optimize performance (switch is transparent via Reflex)
        • Work in progress with POOL team on schema evolution
        • Worries about continuity of development/support due to expected changes in the development team
      • CORAL (directly)
        • To read conditions data from the Oracle online db (then written via POOL-ORA into the offline db)

  12. Plans for next year
      • CORAL
        • Move SEAL functionalities into CORAL
          • Will then need to be picked up by POOL and COOL
        • CORAL proxy server development
      • POOL
        • Schema evolution in POOL-ORA
      • COOL
        • Further performance optimizations
        • Enhancements for channel and tag management

  13. Move SEAL functionalities into CORAL
      • Motivation
        • SEAL support and maintenance are not staffed
        • Several problems in multi-threading environments
          • e.g. from 2nd CORAL thread that closes stale connections
          • Multi-threading outside the original SEAL design scope
      • Main components to replace
        • Component model and dynamic loading of plugins

  14. CORAL proxy server
      • Motivation
        • Secure authentication and authorisation
          • Authenticate using Grid certificates on the proxy in spite of missing database vendor support for X.509 certificates
        • Scalability for many connections
          • Serve several (mostly idle) CORAL connections to the proxy by fewer active connections to the database
      • Interests both users and service managers
        • IT physics database and security teams
          • Better load management, hide DB ports behind firewall
        • ATLAS and CMS
          • Possible addition of data caching functionality
          • A MySQL-based proxy is currently used in ATLAS online
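The scalability argument on this slide (many mostly idle client connections served by fewer active database connections) can be sketched in a few lines of Python. This is only an illustration of the multiplexing idea, not the CORAL proxy design: `ToyMultiplexer`, `borrow` and `run_clients` are hypothetical names, in-memory SQLite connections stand in for the real database, and authentication and caching are ignored.

```python
import queue
import sqlite3
import threading
from contextlib import contextmanager


class ToyMultiplexer:
    """Hypothetical stand-in for a proxy: shares a few backend connections."""

    def __init__(self, n_backend):
        self._idle = queue.Queue()
        for _ in range(n_backend):
            # check_same_thread=False lets worker threads take turns on a
            # connection; the queue guarantees one user at a time.
            self._idle.put(sqlite3.connect(":memory:", check_same_thread=False))

    @contextmanager
    def borrow(self):
        conn = self._idle.get()   # blocks while all backend connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for the next client


def run_clients(n_clients, n_backend):
    """Run n_clients one-query clients over n_backend shared connections."""
    mux = ToyMultiplexer(n_backend)
    results, used = [], set()
    lock = threading.Lock()

    def client(client_id):
        with mux.borrow() as conn:
            value = conn.execute("SELECT ? * 2", (client_id,)).fetchone()[0]
            with lock:
                results.append(value)
                used.add(id(conn))

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results), len(used)


results, n_used = run_clients(n_clients=20, n_backend=2)
print(len(results), "client requests served; backend connections used:", n_used)
```

A client holds a backend connection only while a query is running, so the number of open database connections is bounded by `n_backend` regardless of how many clients connect.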

  15. CORAL proxy server - a possible scenario

  16. POOL-ORA schema evolution
      • Deal with class definition changes
        • e.g. add or remove attributes of a class
        • e.g. change the type of an attribute of a class
        • e.g. move data member from/into a base class
      • Deal with changes in the storage layout
        • e.g. move C-array data into inline columns
      • Work is already in progress
        • Tools for users with schema modification privileges
        • Mainly for CMS (no persistent/transient separation)

  17. COOL enhancements and optimizations
      • Feature enhancements
        • Partial tag locks, easier use of channel names…
        • Most requests from ATLAS, some from LHCb too
      • Performance optimizations
        • Main pending issue: retrieval of tagged IOVs
        • Many use cases mean many queries to optimize
          • Similar problems (query time increases as table size increases) with similar solutions (query rewrite, indexes…)
          • Consolidating code to factor out commonalities
        • Local scalability tests to try to prevent problems from showing up later at T0 and T1 sites

  18. Manpower
      • Small contributions from the experiments
        • ATLAS: POOL (0.5) and COOL (0.2)
          • COOL contributions also from user community and DBAs
        • CMS: CORAL (0.1) – not counting Frontier team
        • LHCb: POOL (0.1) and COOL (0.2)
      • Main contribution from CERN IT-PSS
        • POOL/CORAL (3.5) and COOL (0.8)
        • Three key CORAL developers from IT-PSS will leave in 2008
          • Transfer expertise to new hires (one, possibly two)
          • Reallocate tasks to smaller PF team

  19. Comments from the previous review
      • Both remarks are still valid and relevant
        • Reduction and turnover in the CORAL/POOL team
          • “Possible manpower crisis” foreseen in the 2006 report
        • SEAL replacement is in the PF work plan for 2008

  20. Conclusions
      • ATLAS, CMS and LHCb are relying on the PF for their event and/or conditions data
      • Development plans for 2008
        • Many items on the CORAL and COOL work plan
        • POOL is mostly in maintenance mode
      • Overall manpower reduction
        • Low but relatively stable contributions for COOL
        • Team reduction and turnover for CORAL/POOL

  21. Reserve slides

  22. SV Single-channel browse IOVs
      [Figure label: IOV valid at t=20 (inefficient lookup – two columns)]
      • Example: get all IOVs in t=[20,30] for channel 5
        • SV (single version): there is only one version at any time t
        • SC (single channel): just select ChannelId=5
      • Problem up to and including COOL 2.0.0
        • Retrieval time is longer for IOVs at the end of the IOV table
        • ( Since<=20 AND 20<Until ) OR ( 20<Since AND Since<=30 )
      • Fixed in COOL 2.1.0
        • Optimize lookup of first IOV
          • As in fix for SV SC single-IOV find
        • Two separate SQL queries
          • MAX(Since) WHERE Since<20
          • Since = maxSince (from query1)
      • New strategy in COOL 2.2.0
        • Merge two queries in a single SQL query (use subqueries)
        • Needed for SV MC case
      For more details: “task #3675”
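The two lookup strategies on this slide can be compared on a toy table. The sketch below uses Python's `sqlite3` rather than Oracle, and the table layout (`iov` with `channel_id`, `since`, `until`) and function names are assumptions made for illustration, not the actual COOL schema or SQL. It also assumes that IOVs within a channel do not overlap, have no gaps, and that the last IOV is open-ended; it uses `since <= t1` in the subquery so that both queries agree when an IOV starts exactly at `t1`.

```python
import sqlite3

INF = 2**62  # stand-in for an open-ended 'until'


def make_iov_table(conn):
    """Toy single-version IOV table: contiguous IOVs of length 10 per channel."""
    conn.execute(
        "CREATE TABLE iov (channel_id INTEGER, since INTEGER, until INTEGER, "
        "PRIMARY KEY (channel_id, since))"
    )
    rows = [
        (ch, s, s + 10 if s < 90 else INF)
        for ch in range(1, 8)
        for s in range(0, 100, 10)
    ]
    conn.executemany("INSERT INTO iov VALUES (?, ?, ?)", rows)


def browse_two_columns(conn, channel, t1, t2):
    """Condition quoted on the slide: tests both Since and Until."""
    sql = (
        "SELECT since, until FROM iov WHERE channel_id = ? AND "
        "((since <= ? AND ? < until) OR (? < since AND since <= ?)) "
        "ORDER BY since"
    )
    return conn.execute(sql, (channel, t1, t1, t1, t2)).fetchall()


def browse_max_since(conn, channel, t1, t2):
    """Single query with a MAX(since) subquery; range test on Since only."""
    sql = (
        "SELECT since, until FROM iov WHERE channel_id = ? AND since <= ? "
        "AND since >= COALESCE("
        "(SELECT MAX(since) FROM iov WHERE channel_id = ? AND since <= ?), ?) "
        "ORDER BY since"
    )
    return conn.execute(sql, (channel, t2, channel, t1, t1)).fetchall()


conn = sqlite3.connect(":memory:")
make_iov_table(conn)
print(browse_two_columns(conn, 5, 20, 30))
print(browse_max_since(conn, 5, 20, 30))
```

The second form restricts only the `since` column, which is the leading part of the index after `channel_id`, so the scan can start at the first relevant IOV instead of filtering on `until`. The `COALESCE` fallback covers the case where no IOV starts at or before `t1`.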

  23. SV Multi-channel browse IOVs
      • Example: get all IOVs in t=[20,30] in channels 1-99
        • SV (single version): there is only one version at any time t
        • MC (multi channel): ( 1<=ChannelId AND ChannelId<=99 )
        • Special case: all channels (no selection on ChannelId)
      • Problem up to and including COOL 2.1.1
        • Retrieval time is longer for IOVs at the end of the IOV table
        • ( Since<=20 AND 20<Until ) OR ( 20<Since AND Since<=30 )
        • Same problem as for single-channel case in COOL 2.0.0
      • Fixed in COOL 2.2.0
        • Optimize lookup of first IOV for each channel as in fix for SC case
          • With Max(Since) subquery
        • Loop over selected channels via a join on the IOV and channel tables
        • Execution plan (table order in join) depends on first value used (“bind variable peeking”): fix it using hints
          • /*+ LEADING(c i) USE_NL(c i) */
      Was a showstopper for ATLAS distributed readback tests at T1 sites in Q2 2007
      For more details: “task #4402”
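A multi-channel version of the same idea, with a per-channel MAX(Since) subquery driven by a join on a channel table, might look as follows. Again this is a `sqlite3` sketch under assumptions: the `channels` and `iov` tables, their columns and the function names are hypothetical, IOVs per channel are taken to be contiguous with an open-ended last interval, and the Oracle-specific `LEADING`/`USE_NL` hint is omitted because it has no meaning outside Oracle.

```python
import sqlite3

INF = 2**62  # stand-in for an open-ended 'until'


def make_tables(conn):
    """Toy channel and IOV tables; each channel's IOVs start at a different offset."""
    conn.execute("CREATE TABLE channels (channel_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE iov (channel_id INTEGER, since INTEGER, until INTEGER, "
        "PRIMARY KEY (channel_id, since))"
    )
    for ch in range(1, 121):
        conn.execute("INSERT INTO channels VALUES (?)", (ch,))
        starts = list(range(ch % 10, 100, 10))
        rows = [(ch, s, e) for s, e in zip(starts, starts[1:] + [INF])]
        conn.executemany("INSERT INTO iov VALUES (?, ?, ?)", rows)


def browse_mc_two_columns(conn, ch_lo, ch_hi, t1, t2):
    """Condition quoted on the slide, applied to a channel range."""
    sql = (
        "SELECT channel_id, since, until FROM iov "
        "WHERE channel_id BETWEEN ? AND ? AND "
        "((since <= ? AND ? < until) OR (? < since AND since <= ?)) "
        "ORDER BY channel_id, since"
    )
    return conn.execute(sql, (ch_lo, ch_hi, t1, t1, t1, t2)).fetchall()


def browse_mc_max_since(conn, ch_lo, ch_hi, t1, t2):
    """Join channels to IOVs; a correlated MAX(since) finds each channel's first IOV."""
    sql = (
        "SELECT i.channel_id, i.since, i.until "
        "FROM channels c JOIN iov i ON i.channel_id = c.channel_id "
        "WHERE c.channel_id BETWEEN ? AND ? AND i.since <= ? "
        "AND i.since >= COALESCE("
        "(SELECT MAX(i2.since) FROM iov i2 "
        " WHERE i2.channel_id = c.channel_id AND i2.since <= ?), ?) "
        "ORDER BY i.channel_id, i.since"
    )
    return conn.execute(sql, (ch_lo, ch_hi, t2, t1, t1)).fetchall()


conn = sqlite3.connect(":memory:")
make_tables(conn)
old = browse_mc_two_columns(conn, 1, 99, 20, 30)
new = browse_mc_max_since(conn, 1, 99, 20, 30)
print(len(old), len(new), old == new)
```

In Oracle, the hint quoted on the slide would pin the join order (channels first, then IOVs via nested loops) so that the plan does not depend on the first bind values seen.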

  24. Deployment in LHCb
      • Computing model
        • Reconstruction at T0/T1
        • Only MC prod at T2
      • COOL stores only conditions data for event reconstruction
        • Oracle at PIT, T0, T1 with replication via Streams
        • Geometry and conditions for MC sent to T2 as SQLite file
      • Online db master at PIT
        • Replicated forward to T0 and T1 via Streams
        • Data from PVSS processes
      • Offline db master at T0
        • Replicated back to PIT and forward to T1 via Streams
        • Data computed in offline calibration/alignment jobs
      (Marco Clemencic, COOL meeting 3 July 2006)

  25. Deployment in ATLAS
      • Largest COOL data set comes from DCS
        • Via the PVSS2COOL data transfer (1.5 GB/day)
        • From the online RAC in the T0 computer centre
        • For offline reconstruction and detector experts
      • Many options open for T2 replication
        • Many use cases (simulation, calibration, analysis)
        • Static/dynamic replication to SQLite/MySQL, Frontier
      (Florbela Viegas, CHEP 2007)
