An Overview of ATLAS Databases and Database Access (Geometry/Conditions) in Athena
Elizabeth Gallas - Oxford
ATLAS-UK Distributed Computing Tutorial
Edinburgh, UK – March 21-22, 2011
Outline
• Motivation: Databases
• Overview of ATLAS Databases
• Databases of Athena-based analysis interest
  • Geometry Database
  • Conditions Database
  • And how they are made accessible on the grid
• Some tips for users
• Summary and Conclusions
Motivation: Database use in ATLAS
• ATLAS "data" falls into 2 broad categories:
  • Event-wise data: stored in files (RAW, ESD, AOD, TAG, ...)
    • Know something about themselves, but also carry 'metadata' pointers to the bigger picture
  • Non-event-wise data: stored in databases
    • Enable construction of the 'bigger picture'
    • Important information needed at our fingertips, usually by diverse clients
• Database Management Systems (DBMS) provide:
  • Persistent storage for large and small collections of data of varied complexity, in data structures that provide access flexibility
  • A powerful query language for data entry, modification, and retrieval
  • Transaction management: the appearance of isolation, while allowing multi-user simultaneous access
Overview – Oracle usage in ATLAS
Oracle is used extensively at every stage of data taking, processing, and analysis. Some of the more common applications:
• Configuration
  • PVSS – Detector Control System (DCS) configuration and monitoring
  • Trigger – trigger configuration (online and simulation data)
  • OKS – configuration databases for the TDAQ
  • Geometry – detector description
• File and job management
  • T0 – Tier-0 processing
  • DQ2/DDM – distributed file and dataset management
  • Dashboard – monitoring of jobs and data movement on the ATLAS grid
  • PanDA – workload management: production and distributed analysis
• Conditions data (non-event data for offline analysis)
  • Conditions Database
  • [POOL files in DDM (referenced from the Conditions DB)]
• "Metadata" == data about data
  • AMI (ATLAS Metadata Interface) – dataset metadata
  • COMA (COnditions MetadatA) – configuration/conditions metadata
  • TAGs (not an acronym) – event-level metadata
What does your Athena job need?
Every Athena job needs:
• Data (events)
• Database (geometry, conditions)
• Efficient I/O (sometimes across a network), CPU
• (A purpose and) a place for output
Much like any of us, whose needs are food, water, love, and a place for output.
The next slides give more details about Geometry and Conditions:
• What they contain
• How Athena accesses them
• How they are distributed for access on the grid
• User interfaces, documentation, and help
Geometry Database
• Relational DB: primary numbers for the ATLAS detector description
  • All data for building the GeoModel description in a single place
• Primary numbers stored in data tables (leaves), organized by subsystem (branches)
• Tagging (versioning) at various levels
  • Locked tags define a distinct detector description
  • Globally tagged/locked at higher levels, and associated with software releases
  • The evolution of geometry tags is set up such that each new tag is compatible with older releases
• Location and distribution:
  • Master copy in an Oracle server at CERN
  • Up to now: a copy of the entire database is dumped into an SQLite file and delivered to sites using DB Release technology with each software release
  • Future: a more diverse distribution model is being tested (Frontier)
  • Update (Vakho Tsulaia) in the upcoming Software/Computing workshop
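As a hedged illustration, selecting a specific geometry tag in Athena job options looks roughly like the sketch below; the tag name is a made-up example (real jobs usually autoconfigure it, see the tips slides), and the configuration modules reflect release-era usage.

    # Job-option sketch: pin the detector description to a geometry tag.
    # The tag name below is illustrative, not a recommendation.
    from AthenaCommon.GlobalFlags import globalflags
    globalflags.DetDescrVersion = 'ATLAS-GEO-16-00-00'  # example tag

    from AtlasGeoModel import SetGeometryVersion  # passes the tag to GeoModelSvc
    from AtlasGeoModel import GeoModelInit        # instantiates the GeoModel description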
Geometry DB Browser
http://atlas.web.cern.ch/Atlas/GROUPS/OPERATIONS/dataBases/DDDB
"Conditions"
"Conditions" is a general term for information that is not event-wise, reflecting the conditions or state of a system. Conditions are valid for an 'interval of validity' (IOV) ranging from very short to infinity. IOVs can be expressed as a range in timestamps or in Run/LumiBlock numbers.
Any conditions data needed for offline processing and/or analysis must be stored in the ATLAS Conditions Database (aka COOL) or in its referenced POOL files (DDM).
[Diagram: many sources (LHC, ZDC, TDAQ, DCS, OKS, DQ) feed the ATLAS Conditions Database.]
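To make the IOV idea concrete, here is a purely conceptual Python sketch (not the ATLAS API): conditions are stored with [since, until) validity intervals and looked up by event time.

    import bisect

    # Conceptual IOV lookup sketch; values are illustrative.
    INFINITY = float('inf')

    conditions = [  # (since, until, payload)
        (0,    100,      {'hv': 1500.0}),
        (100,  250,      {'hv': 1480.0}),
        (250,  INFINITY, {'hv': 1510.0}),
    ]

    def lookup(event_time):
        """Return the payload whose IOV contains event_time, else None."""
        starts = [since for since, _, _ in conditions]
        i = bisect.bisect_right(starts, event_time) - 1
        if i >= 0 and event_time < conditions[i][1]:
            return conditions[i][2]
        return None

    print(lookup(120))  # -> {'hv': 1480.0}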
Conditions DB infrastructure in ATLAS
• Relies on considerable infrastructure: COOL, CORAL, Athena (developed by ATLAS and CERN IT): a generic schema design that can store, accommodate, and deliver a large amount of data for a diverse set of subsystems
• IOV ('interval of validity') DB in relational DB tables
  • Data organized into folders and foldersets
    • By schema (subdetector)
    • By instance (for real data and MC)
  • Stores data 'inline', but can hold references to external POOL files (managed by DDM)
• Athena: Conditions DB data maps to transient C++ objects, which are accessible to Athena at run time through the Transient Store
• COOL tag (version): distinct sets of conditions, making specific computations reproducible
• Used at many stages of data taking and analysis: from online calibrations, alignment, and monitoring, to offline processing, more calibrations, further alignment, reprocessing, and analysis, through to luminosity and data quality
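In Athena, a job typically requests the folders it needs via IOVDbSvc job options; a minimal sketch is below, where the schema alias, folder path, and tag name are illustrative assumptions.

    # Job-option sketch: request a conditions folder so IOVDbSvc
    # serves its payload to the job through the transient store.
    from IOVDbSvc.CondDB import conddb
    conddb.addFolder('DCS_OFL', '/LHC/DCS/FILLSTATE')          # illustrative folder
    conddb.addOverride('/LHC/DCS/FILLSTATE', 'MyCoolTag-00')   # hypothetical tag override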
Conditions: User interfaces
• Command-line interface (AtlCoolConsole):
  • https://twiki.cern.ch/twiki/bin/view/Atlas/AtlCoolConsole
• Conditions TAG Browser:
  • https://atlas-coolbrowser.web.cern.ch/atlas-coolbrowser/
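For scripted access there is also the COOL Python binding (PyCool); a minimal read-back sketch follows, in which the connection string, folder path, and channel number are assumptions, not a prescription.

    from PyCool import cool

    # Minimal PyCool read sketch (names are illustrative).
    dbSvc = cool.DatabaseSvcFactory.databaseService()
    db = dbSvc.openDatabase('sqlite://;schema=mycool.db;dbname=COMP200', True)  # read-only
    folder = db.getFolder('/MYDET/MYFOLDER')
    obj = folder.findObject(42, 0)   # (point inside the IOV, channel 0)
    print(obj.payload())             # the inline payload record
    db.closeDatabase()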
Oracle Distribution of Conditions data
[Simplified diagram: an online CondDB and the offline master CondDB in the computer centre, separated by an isolation cut, with calibration updates flowing in, the Tier-0 farm reading the master, and Tier-1 replicas serving the outside world.]
• Oracle stores a huge amount of essential data 'at our fingertips'
  • But ATLAS has many, many, many fingers, which may be looking for anything from the oldest to the newest data
• Conditions in Oracle: master copy at Tier-0
  • Replicated to many Tier-1 sites
• Jobs running at Oracle sites (direct access) perform well
• But direct Oracle access on the grid from remote sites is problematic:
  • Even after tuning, direct access requires many back-and-forth network transactions, so the RTT (Round Trip Time) multiplies: SLOW (e.g., a few thousand queries at ~100 ms RTT already cost minutes of wall time in network latency alone)
  • Cascade effect: jobs hold connections longer, preventing new jobs from starting
• So we use alternative technologies, especially over the WAN (Wide Area Network): 'caching' Conditions from Oracle when possible
Technologies for Conditions "caching"
• "DB Release": a system of files containing all the data 'needed'
  • Used in reprocessing campaigns and for MC processing/analysis
  • Includes:
    • SQLite replicas: a "mini" Conditions DB with specific folders, an IOV range, and a COOL tag (a 'slice': a small subset of all the rows in the Oracle tables); see the job-option sketch after this slide
    • The associated POOL files and a PFC (file catalog)
• "Frontier": store query results in a web cache
  • Developed by Fermilab (used by CDF, further refined for CMS)
  • Basic idea: Frontier/Squid servers located at or near the Oracle RAC
    • Negotiate transactions between grid jobs and the Oracle DB
    • Reduce the load on Oracle by caching the results of repeated queries
    • Reduce the latency observed when connecting to Oracle over the WAN
  • Additional Squid servers at remote sites help even more
  • Used by default for user analysis jobs (picture on the next slide)
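A hedged job-option sketch of pointing one folder at a local SQLite slice instead of the default source; the file name, dbname, and folder path are illustrative assumptions.

    # Job-option sketch: read one folder from a local SQLite 'slice'.
    from IOVDbSvc.CondDB import conddb
    conddb.blockFolder('/MYDET/MYFOLDER')  # ignore the default source for this folder
    conddb.addFolder('', '<db>sqlite://;schema=myslice.db;dbname=COMP200</db> /MYDET/MYFOLDER')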
Conditions DB access via Frontier
Frontier for distributed database access: used by default for user analysis jobs.
Main components:
• Frontier server
  • Communicates directly with the Oracle server
  • Includes data caching
  • Provides data to Squids
• Squid
  • Communicates with the Frontier server over HTTP
  • Caches retrieved data locally for its clients
ATLAS: Frontier in operation since late 2009
• Frontier servers at the T1 sites participating in replication
• ~60 Squids all over the world: mostly T2, some T3 too
[Diagram: Tier-1 (Oracle plus Frontier server) serving Squids at Tier-2 sites.]
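As a hedged illustration of how a job learns where its Frontier server and Squid are: sites typically publish this via an environment variable that the database client picks up. The variable layout below follows the common frontier-client convention, and the host names are made up.

    import os

    # Illustrative only: fake hosts, conventional (serverurl=...)(proxyurl=...) layout.
    os.environ['FRONTIER_SERVER'] = (
        '(serverurl=http://frontier.example.org:8000/atlr)'
        '(proxyurl=http://squid.example.org:3128)'
    )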
DB Access in Athena
• Athena applications access the conditions and geometry DBs using the LCG software libraries POOL, COOL, and CORAL
• This allows for transparent usage of the various technologies (Oracle, SQLite, Frontier/Squid)
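"Transparent" here means the same data can be served through different technologies just by changing the connection string. The examples below are illustrative COOL-style connection strings; the server, alias, and schema names are assumptions.

    # Illustrative connection strings for the same conditions schema:
    oracle_replica   = 'oracle://ATLAS_COOLPROD;schema=ATLAS_COOLOFL_DCS;dbname=COMP200'
    frontier_replica = 'frontier://ATLF/();schema=ATLAS_COOLOFL_DCS;dbname=COMP200'
    sqlite_slice     = 'sqlite://;schema=myslice.db;dbname=COMP200'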
Tips for Users (1)
• What global Conditions and Geometry tags should you use?
  • Autoconfigure your job: have it read the global tags from its input file (ESD, AOD)
• In job options:

    from RecExConfig.RecFlags import rec
    rec.AutoConfiguration = ['everything']

• In job transforms: command-line parameter autoConfiguration=everything
• https://twiki.cern.ch/twiki/bin/view/Atlas/RecExCommonAutoConfiguration
(Slide: V. Tsulaia)
Tips for Users (2)
• How do I configure my environment to access:
  • Frontier/Squid?
  • Conditions payload POOL files?
  • The DB Release for geometry (and MC conditions, if needed)?
• All of that is done for you automatically... just sit back and enjoy the ride!
(Slide: V. Tsulaia)
Tips for Users (3)
If things go wrong and the problem seems related to database access, there is useful information on the TWiki:
• Athena DB Access: https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaDBAccess
• COOL Troubles: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/CoolTroubles
• ATLAS DB Release: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/AtlasDBRelease
These TWiki pages should help you narrow down the problem; then you will be in a position to:
• Either ask your site admin
• Or send email to Database Operations <hn-atlas-DBOps@cern.ch>
(Slide: V. Tsulaia)
Conclusions: Databases and DB Access from Athena
• Databases are used extensively in ATLAS, at every stage of data taking, processing, and analysis
  • Scratch the surface of almost any interactive user application and you will find a database!
• I have attempted to give an overview of the issues and considerations in DB access from Athena
  • The need to provide database information in a variety of access patterns, with potentially widely varying data volumes, to diverse clients makes Athena access to the ATLAS non-event-wise databases (Conditions and Geometry) complex
  • Supporting different technologies allows us to optimally meet the various needs
• A lot of effort has gone into making DB access for user analysis as transparent as possible
• More details can be found:
  • In V. Tsulaia's slides from the Software Workshop in Tbilisi, Oct 26, 2010
  • On the various TWiki pages