Introduction: Distributed POOL File Access
Elizabeth Gallas - Oxford – September 16, 2009
Offline Database Meeting
Overview
• ATLAS relies on the Grid for processing many types of jobs.
• Jobs need Conditions data from Oracle plus referenced POOL files.
• ATLAS has decided to deploy an array of Frontier/Squid servers to:
  • negotiate transactions between grid jobs and the Oracle DB,
  • reduce the load on Oracle,
  • reduce the latency observed when connecting to Oracle over the WAN.
• With Frontier:
  • Inline Conditions are read via the Squid cache -> Frontier server -> Oracle.
  • Referenced Conditions data is in POOL files (always < 2 GB), which are manageable on all systems.
• FOCUS TODAY: how GRID JOBS find the POOL files.
• All sites accepting jobs on the grid must have:
  • all the POOL files, and
  • a PFC (POOL File Catalog) – an XML file with the POOL file locations at the site (a sketch of such a catalog follows below).
• Job success on the GRID requires:
  • the GRID submission system to know how sites are configured,
  • GRID sites configured with a site-appropriate environment and Squid failover.
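For reference, a minimal sketch of what a site's PFC might look like, based on the usual POOL XML catalog layout (one File entry per POOL file, with physical and logical names). The GUID, file name and local path below are purely illustrative:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">
<POOLFILECATALOG>
  <!-- One entry per POOL file available at the site;
       the GUID, pfn and lfn shown here are made up. -->
  <File ID="00000000-0000-0000-0000-000000000000">
    <physical>
      <pfn filetype="ROOT_All" name="/some/local/hotdisk/path/condfile.pool.root"/>
    </physical>
    <logical>
      <lfn name="condfile.pool.root"/>
    </logical>
  </File>
</POOLFILECATALOG>
```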
DB Access Software Components
Where are the POOL files?
• DQ2 (DDM) distributes Event data files and Conditions POOL files.
  • TWiki: StorageSetUp for T0, T1s and T2s
• ADC/DDM maintains the ToA sites (Tiers of ATLAS):
  • ToA sites are subscribed to receive DQ2 POOL files.
  • ToA sites have "space tokens" (areas for file destinations) such as:
    • "DATADISK" for real event data
    • "MCDISK" for simulated event data
    • …
    • "HOTDISK" for holding POOL files needed by many jobs; it has more robust hardware for more intense access.
• Some sites also use Charles Waldman's "pcache" (see the sketch after this list):
  • It duplicates files to a scratch disk accessible to local jobs, avoiding network access to "hotdisk".
  • Magic in pcache tells the job to look in the scratch disk first.
• Are POOL files deployed to all ToA sites 'on the GRID'?
  • Tier-1? Tier-2? Bigger Tier-3s?
  • Any other sites that want to use them? Are those sites in ToA?
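A minimal sketch of the pcache idea (not pcache's actual code): before opening a file from hotdisk over the network, the job checks for, and if necessary creates, a local copy on the scratch disk. The cache location and function name are illustrative assumptions:

```python
import os
import shutil

SCRATCH_CACHE = "/scratch/pcache"   # illustrative local cache area

def cached_path(hotdisk_path):
    """Return a scratch-disk copy of a hotdisk file, copying it once if needed."""
    local = os.path.join(SCRATCH_CACHE, os.path.basename(hotdisk_path))
    if not os.path.exists(local):
        os.makedirs(SCRATCH_CACHE, exist_ok=True)
        # The first job pays for the copy; later jobs on the node hit the cache.
        shutil.copy(hotdisk_path, local)
    return local

# A job would then open cached_path("/hotdisk/cond/condfile.pool.root")
# instead of the hotdisk path itself.
```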
Email from Stephane Jezequel (Sept 15)
• Could you please forward this request to all ATLAS Grid sites which are included in DDM:
• As discussed during the ATLAS software week, sites are requested to implement the space token ATLASHOTDISK.
• More information:
  • https://twiki.cern.ch/twiki/bin/view/Atlas/StorageSetUp#The_ATLASHOTDISK_space_token
• Sites should assign at least 1 TB to this space token (and should foresee 5 TB). In case of a storage crisis at the site, the 1 TB can be reduced to 0.5 TB. Because of the special usage of these files, sites can decide whether or not to assign a specific pool.
• When it is done, please report to DDM Ops (a Savannah ticket is a good solution) so that the new DDM site can be created.
Where are the PFCs (POOL File Catalogs)?
• Mario Lassnig has modified the DQ2 client dq2-ls:
  • It can create the PFC 'on the fly' for the POOL files on a system.
  • It is written to work for "SRM systems" (generally Tier-1s).
• On non-SRM systems (generally Tier-2s and Tier-3s), this PFC must be modified: the SRM-specific descriptors are replaced.
• We need to collectively agree on the best method and designate who will follow it up:
  • A scriptable way to remove SRM descriptors from the PFC for use on non-SRM systems (see the sketch after this list).
  • A cron job?
    • Detect the arrival of new POOL files.
    • Generate an updated PFC.
    • Run the above script, preparing the file for local use.
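A rough sketch of how such a script could rewrite the dq2-ls output for non-SRM use, assuming the SRM-specific descriptor is a prefix on each pfn name; the prefix, file names and function name are illustrative:

```python
import xml.etree.ElementTree as ET

SRM_PREFIX = "srm://se.example.org:8443/srm/managerv2?SFN="   # illustrative SRM descriptor

def strip_srm(pfc_in, pfc_out):
    """Rewrite every pfn in a POOL File Catalog, dropping the SRM-specific prefix."""
    tree = ET.parse(pfc_in)
    for pfn in tree.getroot().iter("pfn"):
        name = pfn.get("name", "")
        if name.startswith(SRM_PREFIX):
            pfn.set("name", name[len(SRM_PREFIX):])   # leave a plain local path behind
    tree.write(pfc_out)

# e.g. run from cron after new POOL files (and a fresh dq2-ls catalog) arrive:
# strip_srm("PoolFileCatalog_srm.xml", "PoolFileCatalog.xml")
```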
Configuring jobs on the GRID
• Item 5 from Dario's TOB Action items: DB and ADC groups: discuss and implement a way to set the environment on each site so as to point to the nearest Squid and the local POOL file catalogue.
• The Grid submission system must know which sites have:
  • Squid access to Conditions data
    • Site specific? Failover?
    • Experience at Michigan with muon calibration: Frontier/Squid access to multiple Squid servers.
  • Subscriptions in place to ensure the POOL files are in place, and the PFC location (?)
    • Site specific – continuous updates to the local PFC.
• Setup is manual for now in Ganga/Panda; it will move to AGIS with a configuration file on each site (a sketch of such a per-site setup follows below). Link to the AGIS Technical Design Proposal:
  • http://indico.cern.ch/getFile.py/access?sessionId=4&resId=1&materialId=7&confId=50976
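In essence each site has to provide two pieces of information: its nearest Squid(s) (with failover) and the location of its local PFC. A hedged sketch of how a job wrapper might turn such a per-site entry into the job environment; the configuration layout, the variable names and all URLs/paths here are assumptions for illustration, not the actual Ganga/Panda or AGIS mechanism:

```python
import os

# Illustrative per-site configuration (in reality this would come from AGIS
# or a configuration file installed at the site).
SITE_CONFIG = {
    "frontier_server": "http://frontier.example.org:8000/atlasfrontier",
    "squids": [                       # nearest Squid first, then failover
        "http://squid1.example.org:3128",
        "http://squid2.example.org:3128",
    ],
    "pool_file_catalog": "/some/local/path/PoolFileCatalog.xml",
}

def export_site_environment(cfg):
    """Build a Frontier-style connect string with Squid failover and point the job at the local PFC."""
    frontier = "(serverurl=%s)" % cfg["frontier_server"]
    frontier += "".join("(proxyurl=%s)" % squid for squid in cfg["squids"])
    os.environ["FRONTIER_SERVER"] = frontier                              # variable name assumed
    os.environ["POOL_CATALOG"] = "xmlcatalog_file:" + cfg["pool_file_catalog"]  # variable name assumed

export_site_environment(SITE_CONFIG)
```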
BACKUP
Features of Athena
• Prior to Release 15.4:
  • Athena (RH) looks at the IP where the job is running,
  • and uses dblookup.xml in the release to decide the order of database connections to try when getting the Conditions data.
• Release 15.4:
  • Athena looks for a Frontier environment variable;
  • if found, it ignores dblookup.xml and uses that environment variable instead (a conceptual sketch follows below).
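A conceptual sketch (not Athena's actual code) of the change in behaviour described above; the environment variable name, the function and the example connection strings are simplified assumptions:

```python
import os

def choose_conditions_source(dblookup_order):
    """Pick where to read Conditions data from, mimicking the pre/post-15.4 behaviour described above."""
    frontier = os.environ.get("FRONTIER_SERVER")      # variable name assumed for illustration
    if frontier:
        # Release 15.4 behaviour: the Frontier setting wins, dblookup.xml is ignored.
        return "frontier", frontier
    # Pre-15.4 behaviour: fall back to the connection order from dblookup.xml,
    # which was chosen based on where (which IP/domain) the job runs.
    return "dblookup", dblookup_order[0]

source, connection = choose_conditions_source(["oracle://example-replica", "sqlite://local-copy"])
```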