SAM: Tevatron Experiments Using the Grid. Rick St. Denis, University of Glasgow. CDF and D0 Need the Grid Requirements, the CAF and SAM Grid from the User Perspective Grid to Meet the Need How SAM works SAM usage by D0 and CDF Near Future: SAMGrid. Spokespersons’ Requirements for CDF.
SAM: Tevatron Experiments Using the Grid
Rick St. Denis, University of Glasgow
• CDF and D0 Need the Grid
• Requirements, the CAF and SAM
• Grid from the User Perspective
• Grid to Meet the Need
• How SAM works
• SAM usage by D0 and CDF
• Near Future: SAMGrid
Getting Ready for the Grid
Spokespersons’ Requirements for CDF
• Maximize physics output at low luminosity
• L3 output rate: 80 -> 360 Hz by 2006
CDF needs the Grid
Reviews: Director’s (technical), International Finance Committee (fiscal), FNAL PAC (physics merit)
50% of computing outside FNAL
Scale of CDF Requirements
6-7 sites with 100 duals each by 2006, plus 700 at FNAL
CDF Computing Model Exists Now
• Develop analysis on the desktop
• Access to all CDF data from anywhere
• Large-scale processing on batch clusters
• Submission from anywhere
• Interactive tools: ls, top, head/tail/cat
• Output to scratch space or desktop
Implemented now with CAF (not a Grid standard)
Central Analysis Facility
• The CAF is a pile of PCs with a pile of disks (1200 processors and 100 TB).
• It can be implemented anywhere as a dCAF: Decentralized CAF.
• Job output can go to the desktop or to a scratch area.
• This needs a password: authentication (Kerberos).
Sequential Access through Metadata
• Metadata: SAM groups files into datasets by their attributes (metadata), such as production pass version or top quark mass.
• File Retrieval: SAM moves files to users as they request them.
• File Storage: SAM allows output files to be stored with new metadata.
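The idea of defining a dataset by metadata attributes can be sketched as a simple filter over a file catalogue. This is an illustration only, not SAM's implementation or API: the `FileRecord` class, the `define_dataset` function, the attribute keys and two of the three file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical catalogue entry: a file name plus free-form metadata."""
    name: str
    metadata: dict = field(default_factory=dict)


def define_dataset(catalogue, **wanted):
    """Return the names of files whose metadata match every wanted attribute."""
    return [
        record.name
        for record in catalogue
        if all(record.metadata.get(key) == value for key, value in wanted.items())
    ]


# Toy catalogue; only the first file name appears in the slides.
catalogue = [
    FileRecord("Bs_conc_4o5_3.root",
               {"work_group": "cdf", "application_family": "generator"}),
    FileRecord("hypothetical_reco_1.root",
               {"work_group": "cdf", "application_family": "reconstruction"}),
    FileRecord("hypothetical_gen_d0.root",
               {"work_group": "dzero", "application_family": "generator"}),
]

generator_files = define_dataset(
    catalogue, work_group="cdf", application_family="generator"
)
print(generator_files)
```

The point of the sketch is that a dataset is a query over attributes rather than a fixed list of paths, so files stored later with matching metadata fall into the same dataset.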
Metadata
[sam@nglas08 ~]$ sam get metadata --file=Bs_conc_4o5_3.root
  File Type: SAMMC Data File
  File Name: Bs_conc_4o5_3.root
  File ID: 2494282
  File Size: 530926740 [B]
  File Start Time: 01/29/2004 16:00:00
  File End Time: 01/29/2004 17:00:00
  Application Family: generator
  Application Version: 1.00
  Description: BsDspi_phipi MONTE CARLO Dataset 4o5 part 3
  Run Number: 167634
  totalevents = 7290
  Work Group: cdf
  html = http://www.pd.infn.it/~lucchesi
  Node Name: cdfsam.cnaf.infn.it
  dataset = BsMC-lucchesi_test
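Output like the above is easy to turn into a dictionary for scripting. The sketch below assumes one `key: value` or `key = value` pair per line; the line breaks are a reconstruction, since the slide text arrived flattened, and `parse_metadata` is a hypothetical helper, not part of SAM.

```python
import re

# A few fields copied from the slide; the one-pair-per-line layout is assumed.
SAMPLE_OUTPUT = """\
File Name: Bs_conc_4o5_3.root
File ID: 2494282
File Size: 530926740 [B]
File Start Time: 01/29/2004 16:00:00
Application Family: generator
totalevents = 7290
html = http://www.pd.infn.it/~lucchesi
dataset = BsMC-lucchesi_test
"""

# Key, then ':' or '=', then at least one space, then the value.
# Requiring whitespace after the separator keeps '16:00:00' and 'http://' intact.
PAIR = re.compile(r"^([^:=]+?)\s*[:=]\s+(.*)$")


def parse_metadata(text):
    """Parse 'key: value' and 'key = value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        match = PAIR.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


fields = parse_metadata(SAMPLE_OUTPUT)
size_bytes = int(fields["File Size"].split()[0])  # drop the '[B]' unit suffix
print(fields["dataset"], size_bytes)
```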
Use Cases
• User Level MC Production
  • All users have access
  • No data on site -> write to tape at FNAL
• User Level Data Access
  • All users have access
  • Selected samples automatically copied on site
SAM provides this
Functionality
• User selects a place to run and says which dataset they will use
• System checks that they have the privileges to do this
• User has access to data at any place
• User output is stored on any disk or written back to tape at FNAL
• Results are made available for transfer to any site for others to analyse
User Perspective
[Diagram labels: Analysis program; CAF GUI/CLI; Uses SAM; Only Fermilab; Outside Lab; Grid; sites: FermiCAF, Italy, Toronto, Korea, Taiwan, UK]
Meeting the Needs
• SAM: How it works
• Progress in SAM
• CDFGridWorkshop: “Nerd’s Paradise”
• D0 and CDF Usage
[Diagram labels: FSS daemon (fss); fcdfdata016; station central-analysis daemon (smaster); several stager daemons (stagerng); several disks/caches]
A Farm: Station with Stagers and Caches
[Diagram: five nodes (Node1 to Node5), each with a cache and a stager (stagerng); one station (smaster)]
What can 20 duals and 6 TB do?
Need to transfer 0.6 GB/min, or about 1 TB/day
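The two transfer figures can be checked with a short unit conversion. The sketch assumes decimal units (1 TB = 1000 GB); the per-second rate and the cache turnover time are derived here for illustration and do not appear on the slide.

```python
GB_PER_MIN = 0.6            # transfer rate quoted on the slide
MINUTES_PER_DAY = 60 * 24
CACHE_TB = 6                # disk quoted on the slide

gb_per_day = GB_PER_MIN * MINUTES_PER_DAY     # about 864 GB/day
tb_per_day = gb_per_day / 1000                # about 0.86 TB/day, "1 TB/day" rounded
mb_per_second = GB_PER_MIN * 1000 / 60        # sustained rate in MB/s
days_to_turn_over_cache = CACHE_TB / tb_per_day

print(f"{gb_per_day:.0f} GB/day = {tb_per_day:.2f} TB/day")
print(f"sustained rate: {mb_per_second:.0f} MB/s")
print(f"time to move 6 TB at this rate: {days_to_turn_over_cache:.1f} days")
```

So "1 TB/day" is a round-up of roughly 0.86 TB/day, which corresponds to a sustained rate of about 10 MB/s.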
<fcdfdata016>
[Diagram, built up over three slides: fcdfdata016 with its disks/cache; then station central-analysis (smaster); then stager (stagerng)]
<fcdfdata016> sam submit --script=userscript --group=groupname --cpu-per-event= --defname=
[Diagram: fcdfdata016 disks/cache; stager (stagerng); station central-analysis (smaster)]
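A submission like the one above could be assembled from a script, for example to loop over several dataset definitions. The sketch below only builds and prints the command line; it does not run it. The four flags are the ones shown on the slide, but the slide leaves `--cpu-per-event` and `--defname` blank, so every value used here is a hypothetical placeholder, as is the `build_sam_submit` helper.

```python
import shlex


def build_sam_submit(script, group, cpu_per_event, defname):
    """Return the argument list for a 'sam submit' call with the slide's flags."""
    return [
        "sam", "submit",
        f"--script={script}",
        f"--group={group}",
        f"--cpu-per-event={cpu_per_event}",
        f"--defname={defname}",
    ]


# Placeholder values only: the slide does not say what was actually passed.
argv = build_sam_submit(
    script="userscript",
    group="groupname",
    cpu_per_event=1,                 # hypothetical value, units not stated
    defname="my_dataset_definition", # hypothetical dataset definition name
)
print(shlex.join(argv))
```

Building the arguments as a list, rather than as one string, means they can be handed to `subprocess.run` without shell quoting problems if the command is later executed.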
<fcdfdata016>
>>>>>> Starting project with the Station Master
Station Master contacted, result: Started project 49008 (49008_sam_) for group test
Waiting for the project to initialize...
Callback from server: 'OK|Project is ready'
>>>>>> Submitting the job to the batch system.
[Diagram adds: project (pmaster)]
<fcdfdata016> Job <52554> is submitted to queue <sam_lo>.
[Diagram, built up over nine slides: batch (LSF) job 52554 <user> in state PSUSP; optimizer (one slide only); four eworkers; four encp transfers; Enstore. The eworkers and encp transfers then drop to two each, and the job state changes from PSUSP to RUN.]
<fcdfdata016> Job <52554> is submitted to queue <sam_lo>.
[Diagram adds, over six slides: samscript.sh and userscript in the running batch job; then a consumer. Two eworkers, two encp transfers and Enstore remain.]
SAMManager:sam Getting next input file...
SAMManager:sam Project master will call back.
<fcdfdata016> Job <52554> is submitted to queue <sam_lo>.
[Diagram, over seven slides: unchanged at first; then Enstore, the eworkers and the encp transfers disappear; rm then appears on one slide, next to the disks/cache and the station]
<fcdfdata016> Job <52554> is submitted to queue <sam_lo>.
[Diagram, over six slides: optimizer (one slide only); then two eworkers; then two rcp transfers; then an other cache, which disappears again on the last slide]
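The sequence in these slides can be read as a file-delivery loop: the consumer asks for its next file, and the station serves it from the local cache, copies it from another cache (rcp), or fetches it from Enstore (encp), removing an old file (rm) when the cache is full. The toy model below is one interpretation of the diagram labels, not SAM's actual logic. The class and function names, the order in which sources are tried, and the least-recently-used eviction rule are all assumptions.

```python
from collections import OrderedDict


class StationCache:
    """Toy model of a station disk cache (eviction policy is an assumption)."""

    def __init__(self, capacity, other_cache, tape):
        self.capacity = capacity        # maximum number of files held locally
        self.files = OrderedDict()      # file name -> True, least recent first
        self.other_cache = other_cache  # files held by another station's cache
        self.tape = tape                # files archived in Enstore
        self.log = []                   # (file name, action) pairs

    def deliver(self, name):
        """Make one file available locally and record how it got there."""
        if name in self.files:
            self.files.move_to_end(name)
            self.log.append((name, "cache hit"))
            return
        if name in self.other_cache:
            source = "rcp from other cache"
        elif name in self.tape:
            source = "encp from Enstore"
        else:
            raise FileNotFoundError(name)
        if len(self.files) >= self.capacity:
            evicted, _ = self.files.popitem(last=False)  # least recently used
            self.log.append((evicted, "rm"))
        self.files[name] = True
        self.log.append((name, source))


def run_consumer(station, dataset):
    """Request each file of the dataset in turn, as the consumer would."""
    for name in dataset:
        station.deliver(name)
    return station.log


station = StationCache(
    capacity=2,
    other_cache={"c.root"},
    tape={"a.root", "b.root", "c.root"},
)
for entry in run_consumer(station, ["a.root", "b.root", "a.root", "c.root"]):
    print(entry)
```

In this run the first two files come from tape, the repeated request is a cache hit, and the last file is copied from the other cache after the least recently used file is removed to make room.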