SAM: Metadata and Middleware Components
Mòrag Burgon-Lyon
University of Glasgow
CDF Grid
Contents
• CDF Computing Goals
• SAM
• CAF
• DCAF
• JIM
• How it all fits together
• SAM TV
CDF Computing Goals
• The CDF experiment intends to have:
  • 25% of computing offsite by June 2004
  • 50% by June 2005
• To achieve these goals, several components are being developed and deployed:
  • SAM – data handling system
  • CAF & DCAF – batch systems
  • JIM – Grid extension to SAM
  • SAM TV – monitoring for SAM stations
SAM
• Sequential Access via Metadata
• Mature data handling system
• Users can start SAM projects, e.g. running AC++Dump.
• Large volumes of files (grouped into datasets) can be requested through SAM and are processed by the SAM projects. The files are transferred either from the main cache at Fermilab or from neighbouring SAM stations.
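The file delivery described above can be sketched as a simple source lookup. This is only an illustration, not SAM's actual logic: the slide does not say which source is preferred, so the sketch assumes a neighbouring station is tried first and the main cache at Fermilab is the fallback. The function name, the station names, and the `main_cache` label are all hypothetical.

```python
def plan_transfers(dataset_files, neighbour_caches, main_cache="fermilab-main-cache"):
    """Map each requested file to the source it would be fetched from.

    Assumption (not stated in the slides): a neighbouring station that
    already caches the file is preferred over the main cache.
    """
    plan = {}
    for file_name in dataset_files:
        # Take the first neighbouring station that holds the file, if any.
        source = next(
            (station for station, files in neighbour_caches.items() if file_name in files),
            main_cache,
        )
        plan[file_name] = source
    return plan


if __name__ == "__main__":
    # Hypothetical station names and file names, for illustration only.
    caches = {"station_1": {"file_a"}, "station_2": {"file_a", "file_c"}}
    for name, source in plan_transfers(["file_a", "file_b", "file_c"], caches).items():
        print(name, "<-", source)
```

A real station would also have to weigh cache space and transfer cost, which this sketch ignores.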
CAF
• The original CDF Analysis Farm
• The CAF is a 600-CPU farm of computers running Linux.
• It provides access to the CDF data handling system and databases, allowing CDF collaborators to run batch analysis jobs.
• Since standard Unix accounts are not created for users (i.e. you cannot "log into" the CAF), custom software provides remote job submission, control, monitoring, and an output interface for the user.
• Strongly authenticated via Kerberos.
• http://cdfcaf.fnal.gov/
CAF
• Users compile and link their analysis jobs on their desktop.
• The required files are archived into a temporary tar file and copied to the CAF head node.
• Jobs are executed using the distributed batch system FBSNG (Farm Batch System Next Generation).
• Output is tarred up and either sent back to the user's desktop or saved to scratch space on the CAF FTP server for later retrieval.
• A cdfsoft installation is required to submit jobs. Two 8-way Linux SMP systems are provided for users without cdfsoft on their local desktops, and as a general reference for users having problems with their local installations.
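The archiving step above can be sketched with Python's standard `tarfile` module. This is a minimal illustration of packing job files into a temporary tar file, not the CAF submission client itself. The function names, the `caf_job_` prefix, and the use of gzip compression are assumptions, and the copy to the head node is left out.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_job_files(paths, archive_dir=None):
    """Pack the given files into a temporary gzip-compressed tar archive.

    Returns the path of the archive. The caller is responsible for copying
    it to the head node and deleting it afterwards.
    """
    # Reserve a unique temporary file name, then reopen it as a tar archive.
    tmp = tempfile.NamedTemporaryFile(
        prefix="caf_job_", suffix=".tgz", dir=archive_dir, delete=False
    )
    tmp.close()
    with tarfile.open(tmp.name, "w:gz") as tar:
        for path in paths:
            path = Path(path)
            # Store each file under its base name so it unpacks into one directory.
            tar.add(path, arcname=path.name)
    return Path(tmp.name)


def list_archive(archive_path):
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Storing members under their base names keeps the archive flat, which is convenient when the job unpacks it into a single working directory on a worker node.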
CAF
CAF
• Initially configured to favour large reads and small writes (e.g. producing small skims, histograms, etc. from official secondary datasets).
• Extensions have been made to allow users to store their output files back into the SAM data handling system, so that jobs with larger writes can run easily.
• The CAF has also been used for large-scale Monte Carlo and tertiary dataset production.
• Users typically use the CAF GUI, though command-line job submission is also possible.
CAF Monitoring
DCAF
• Decentralised CDF Analysis Farm
• CAF implemented at several remote sites, from Taiwan to Canada
• Rollout began in January 2004
• A core set of 6 DCAF sites provides the backbone
• New sites are continually being added
• The user selects the site on which to run
DCAF Hardware Resources
DCAF
• Recent DCAF report (1st June):
  • The Taiwan DCAF has finished copying and pinning 3 large muon datasets with no major problems.
  • A request for ~600 GHz of MC production for June has been received.
  • Storing MC results in a timely way was a priority.
  • The MC producers have been educated in storing files through SAM (web pages, tutorials), which requires only the CDF dataset name or MC request ID.
JIM
• Job and Information Management
• Grid extension to SAM allowing users to submit jobs using a local thin client.
• A remote broker assigns each job to an execution site based on where the most data is present and the queue is shortest.
• Job progress can be monitored through a web page.
• Job output can be downloaded using a web browser.
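The broker's site choice can be sketched as a ranking over candidate sites. This is not JIM's actual algorithm: the slide names the two criteria (amount of data present, queue length) but not how they are combined, so the sketch assumes data locality is compared first and queue length only breaks ties. The `Site` class, its fields, and `choose_site` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    cached_files: frozenset  # dataset files already present at the site
    queued_jobs: int         # jobs currently waiting in the site's queue


def choose_site(sites, wanted_files):
    """Pick an execution site for a job that needs `wanted_files`.

    Assumption (not stated in the slides): the site holding the most wanted
    files wins, and the shorter queue breaks ties. The site name is a final
    tie-breaker so the result is deterministic.
    """
    if not sites:
        raise ValueError("no execution sites available")
    wanted = frozenset(wanted_files)

    def rank(site):
        local_files = len(wanted & site.cached_files)
        # min() picks the smallest tuple, so negate the count to prefer more local data.
        return (-local_files, site.queued_jobs, site.name)

    return min(sites, key=rank)
```

A weighted score mixing the two criteria would be an equally plausible reading of the slide. The lexicographic ordering is used here only because it is the simplest to reason about.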
JIM
JIM
• JIM can run on shared resources and can interface with most batch systems.
• The CDF environment can be tar-balled for running Monte Carlo on non-CDF equipment.
• D0 have successfully run large-scale Monte Carlo production.
• CDF Monte Carlo has been run interactively on a D0 cluster. The next step is JIM submission.
How it all fits together
SAM TV
Adam Lyon at Fermilab has created a set of web pages that can be used to monitor SAM stations and projects.
Demo:
• http://ncdf151.fnal.gov:8520/samTV/current/samTV.html
SAM TV
• Snapshot summaries – lists the stations with a pie chart showing the number of file transfers.
• SAM project snapshot – all the projects on the selected station, with a plot of file delivery against time.
• Project details – including the time and a plot of the last file delivery.
• Consumer and process – consumer and process IDs, application, node, user, etc.
• Files – list of the files desired by a project.
SAM TV
Challenges and Future Work
• Implementation and rollout of JIM for MC
• More DCAF installations
• Encourage user migration
• Solve the fragmented disks and caches problem (suggestions welcome!)