LHC Computing Project (Grid)
June 24, 2005, Workshop on Physics at Hadron Collider
Kihyeon Cho, Center for High Energy Physics, Kyungpook National University
Contents
• LHC Computing
• LCG, OSG
• LCG Service @CHEP
• CMS Computing @CHEP
• CMS MC Production
• Road Map
• Summary
1. LHC Computing
Scale of the data:
• 1 Megabyte (1 MB): a digital photo
• 1 Gigabyte (1 GB) = 1000 MB: a DVD movie
• 1 Terabyte (1 TB) = 1000 GB: world annual book production
• 1 Petabyte (1 PB) = 1000 TB: 10% of the annual production by the LHC experiments
• 1 Exabyte (1 EB) = 1000 PB: world annual information production
LHC data per experiment:
• 40 million collisions per second
• After filtering, ~100 collisions of interest per second
• ~1 Megabyte of digitised information for each collision ⇒ a recording rate of 100 Megabytes/sec
• ~1 billion collisions recorded ⇒ ~1 Petabyte/year
With four experiments (ALICE, ATLAS, CMS, LHCb), we will accumulate ~15 Petabytes of new processed data each year. The Korean group is working on CMS. (Ref. David Foster)
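A quick back-of-the-envelope check of these figures (my own arithmetic, assuming the standard ~10^7 seconds of accelerator running per year, which the slide does not state):
  100 MB/s × 10^7 s/year ≈ 10^15 bytes = 1 PB of raw data per experiment per year
Four experiments thus give ~4 PB/year of raw data; the quoted ~15 PB/year also counts reconstructed and simulated copies of the data.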
Trigger and data acquisition chain:
• 40 MHz (40 TB/sec): Level 1 – special hardware
• 75 kHz (75 GB/sec): Level 2 – embedded processors
• 5 kHz (5 GB/sec): Level 3 – PCs
• 100 Hz (100 MB/sec): data recording & offline analysis
⇒ ~15 Petabytes of data each year. Analysis will need the computing power of ~100,000 of today's fastest PC processors!
For scale: a CD stack holding one year of LHC data would be ~20 km tall – above Concorde's cruising altitude (15 km) and Mt. Blanc (4.8 km), approaching a high-altitude balloon (30 km).
⇒ Hence (1) Tier-0/1/2 regional data centers and (2) the Grid concept. (Ref. David Foster)
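The rejection factor at each level, implied by simple division of the slide's rates: Level 1 takes 40 MHz → 75 kHz (~×530), Level 2 takes 75 kHz → 5 kHz (×15), and Level 3 takes 5 kHz → 100 Hz (×50) – an overall reduction of ~4×10^5 from collision rate to tape.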
LHC Regional Data Centers (2007+)
• 5000 physicists, 60 countries
• 10s of Petabytes/yr by 2008; 1000 Petabytes in < 10 yrs?
CMS experiment data flow:
• Online system → CERN Computer Center (Tier 0) at 200–1500 MB/s
• Tier 0 → Tier 1 centers (Korea, Russia, UK, USA) at 10–40 Gb/s
• Tier 1 → Tier 2 centers (U Florida, Caltech, UCSD, Maryland, FIU, Iowa) at >10 Gb/s
• Tier 2 → Tier 3 at 2.5–10 Gb/s; Tier 4: physics caches and PCs
(Ref. Paul Avery)
Grid (LHC Computing Grid)
LCG (May 2005)
• In LCG-2: 139 sites, 32 countries, ~14,000 CPUs, ~5 PB storage
• Includes non-EGEE sites: 9 countries, 18 sites
• (Map legend: countries providing resources; countries anticipating joining)
The number of sites is approaching the scale expected for LHC and demonstrates the full complexity of operations. (Ref. David Foster)
Grid3: A National Grid Infrastructure
• 32 sites, 4000 CPUs: universities + 4 national labs
• Part of the LHC Grid; running since October 2003
• Sites in the US, Korea, Brazil, Taiwan
• Applications in HEP, LIGO, SDSS, genomics, fMRI, CS
www.ivdgl.org/grid3 (Ref. Paul Avery)
LCG Goals
• The goal of the LCG project is to prototype and deploy the computing environment for the LHC experiments
• Two phases:
  • Phase 1 (2002–2005): build a service prototype based on existing grid middleware; gain experience in running a production grid service; produce the TDR for the final system
  • Phase 2 (2006–2008): build and commission the initial LHC computing environment
• LCG is not a development project – it relies on other grid projects for grid middleware development and support
(Ref. David Foster)
LCG Service Elements @CHEP
• Web portal
• Security
• Application layer: ARDA
• Grid service based on LCG & Grid3
• Data storage using SRM & SRB
2. Current Status of the LCG Service @CHEP
• CA (Certificate Authority)
  • Constructing the KR-CHEP CA: the CA itself, an RA (Registration Authority), and a web interface
  • The final goal is approval by DOE and APGrid; until then, the DOE CA is used for LHC Grid jobs
• Constructing the base service
• Constructing the web portal
• Installing CMS software & the Grid interface
3. CMS Computing Farms @CHEP
• Non-Grid farm
  • LSF batch (20 CPUs)
  • Will be attached to the LCG farm soon ⇒ will disappear soon
• Grid farms
  • Grid3 (3 CPUs) ⇒ Open Science Grid (May 17, 2005)
    • CMS software has not been tested yet
  • LCG farm (9 CPUs)
    • Upgraded from LCG 2.3.0 to 2.4.0
    • Will move from RH7.3 to SLC3 soon
CMS computing components @ CHEP
LCG Farm @CHEP
• VOs: atlas, alice, lhcb, cms, dteam, sixt
• VO_SW_DIR: /opt/exp_soft
• System administrator: lcg_knu@knu.ac.kr (Daehee Han)
(Figure: LCG map in Asia)
4. CMS MC Production
• CMKIN: MC generation for a physics channel
  • 125 events ~ 1 minute ~ 6 MB ntuple
• OSCAR: Object-oriented Simulation for CMS Analysis and Reconstruction (replaces the old CMSIM)
  • 125 events ~ 12 hours ~ 230 MB FZ file
• ORCA: Object-oriented Reconstruction for CMS Analysis
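From these numbers, detector simulation dominates the cost: CMKIN generates 125 events in about a minute, while OSCAR needs ~12 hours for the same 125 events – roughly 250 fully simulated events (and ~460 MB of FZ output) per CPU per day.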
MC Test Job IDs @CHEP
• Test ID 2210 (Oct. 2004 – Dec. 2004)
  • QCD jet events
  • Used to test the Grid3 farm
  • No longer used
• Test ID 9256 (March 2005 – present)
  • QCD jet events
  • Used to test the OSG and LCG farms
  • Current test job ID
CMKINView (test jobs)
• Test ID 2210 (QCD jet), farm: Grid3 (October 16, 2004)
• Test ID 9256 (QCD jet), farm: LCG (June 10, 2005)
MC Production Job IDs @CHEP
• MC Production ID 9587
  • qcd_l
  • Dr. Sangryul Ro
  • Batch, LCG
• MC Production ID 9588
  • Higgs → b bbar
  • Dr. Sangryul Ro
  • Batch, LCG
• New assignment ⇒ Scientific Linux CERN 3 needed
CMKINView (MC production)
• Job ID 9587 (qcd_l), farm: Batch (May 7, 2005)
• Job ID 9588 (Higgs → b bbar), farm: Batch (May 6, 2005)
5. Road Map
Step 1. Upgrade the LCG farm from RH7.3 to SLC3.0
• Use Konkuk University's CPUs to test SLC3
Step 2. Attach the LSF batch farm (20 CPUs) to the LCG farm
Step 3. Federation with other sites in Korea
• SNU (6 CPUs), SKKU (1+1 CPUs), and Konkuk U (8+1 CPUs)
• Yonsei U will contribute PHENIX's 6 CPUs for CMS
• Other universities (1 CPU each) – UI
• CERN (2 CPUs) – federation and file-transfer tests
Step 4. Tier-1 Regional Data Center
Farms (current)
(Figure: three separate farms – the LSF batch farm, the OSG farm (Torque, Condor), and the LCG farm – with worker nodes divided into non-dedicated OSG nodes, dedicated OSG nodes, and nodes for the CMS experiment.)
Farms (plan)
(Figure: the LSF batch farm absorbed into the LCG farm, leaving the OSG farm (Torque, Condor) and an enlarged LCG farm; worker nodes again divided into non-dedicated OSG nodes, dedicated OSG nodes, and nodes for the CMS experiment.)
CMS Grid Plan in Korea
(Figure: planned network topology. CHEP (205 CPUs, OSG & LCG) connects via 10 Gbps through the DaeguXP to KREONET/KORNET, KOREN/NOC, and KISTI (64 CPUs); exchange points at Daejeon, Seoul, Suwon, Gwangju, and Busan link the participating institutes: SNU (6 CPUs), Konkuk U (1+8 CPUs, LCG), SKKU (1+2 CPUs, LCG?), Ewha WU (1 CPU), Yonsei (6 CPUs), Chonnam (1 CPU), and Dongshin (1 CPU). International links: APII and Hyeonhae/Genkai to Japan (KEK, Korean Belle 12 CPUs), TEIN to CERN (Korean 2 CPUs, LCG), Fermilab in the US (Korean CDF 1 CPU, LCG), and GLORIAD at 10 Gbps. Link speeds range from 155 Mbps to 40 Gbps.)
Tier-1 Regional Data Center
• Service
• Network – GLORIAD
• Storage
• Computing
(Ref. CMS Note 2004-031)
LCG Service Hierarchy
• Tier-0 (the accelerator centre): data acquisition & initial processing; long-term data curation; distribution of data to the Tier-1 centres
• Tier-1 ("online" to the data acquisition process, high availability): managed mass storage – grid-enabled data service; data-heavy analysis; national and regional support
  • Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taipei – Academia Sinica; UK – CLRC (Oxford); US – Fermilab (Illinois) and Brookhaven (NY)
• Tier-2 (~100 centres in ~40 countries): simulation; end-user analysis – batch and interactive
(Ref. David Foster)
LCG Service Challenges
• Jun 05 – Technical Design Report
• Sep 05 – SC3 service phase
• May 06 – SC4 service phase
• Sep 06 – initial LHC service in stable operation
• Apr 07 – LHC service commissioned
(Timeline 2005–2008: SC2, SC3, SC4, then LHC service operation; cosmics, first beams, first physics, full physics run)
• SC2 – reliable data transfer
• SC3 – reliable base software service
• SC4 – full software service
(Ref. Greg Graham)
Network for the CMS Experiment
(Figure: Korea (CHEP Regional Data Center) connects to Europe/CERN (CMS) via APII (2×622 Mbps) and TEIN (155 Mbps), to the LCG test bed at CERN via DataTAG (10 Gbps), and to the Grid3 test bed via Hyeonhae–Genkai (2×1 Gbps) and TransPAC.)
(Ref. Dongchul Son)
GLORIAD Today Ref. Greg Cole
GLORIAD Tomorrow Ref. Greg Cole
Storage Resource Broker
(Figure: an SRB federation – SRB/MCAT servers at CHEP (Linux cluster, Postgres DB, RAID disk), KBSI (servers behind a firewall), KEK (SRB-enabled HPSS, 120 TB, behind a firewall), and KISTI (SRB-enabled SAM-FS, 1200 TB, RAID disks), plus SRB/MCAT servers in Poland, Taiwan, China, and Australia.)
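For illustration, SRB storage like this is driven from the command line with the SDSC "Scommands"; a minimal session might look as follows (a sketch – the file name is hypothetical, and the exact behaviour depends on the installed SRB client and ~/.srb configuration):
$ Sinit                    # start an SRB session using ~/.srb/.MdasEnv
$ Sls                      # list the contents of the current SRB collection
$ Sput mc_sample.fz        # store a local file into the collection
$ Sget mc_sample.fz        # retrieve it back to local disk
$ Sexit                    # close the session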
Korean Group Computing Grids
(Figure: the Korean group's grid activities – Grid3 ⇒ OSG and LCG for CMS/LHC at CERN (via iVDGL); SAMGrid, EUDG, and DCAF for CDF; plus a local Batch(LSF) testbed.)
6. Summary
• Need more CPUs and storage for CMS
• Need help from many institutes for federation
• S/W installation needed
• KOREN (1 Gbps) needed – Konkuk U.
• GLORIAD and TEIN/APII networks needed
• 10 Gbps needed between Busan–CHEP and Daejeon–CHEP
• Ready to study CMS physics
LCG Tutorial
For more information, ask Daehee Han (hanbi@knu.ac.kr, 053-950-6321)
Getting Started
1. Obtain a cryptographic X.509 certificate from an LCG-2 approved Certification Authority (CA).
2. Register with LCG-2.
3. Join one of the LCG-2 Virtual Organizations (a consequence of the registration process).
4. Obtain an account on a machine which has the LCG-2 User Interface software installed.
5. Create a proxy certificate.
Getting a Personal Certificate
• How to request a DOEGrids certificate: http://www.doegrids.org/pages/How-To.html
• Personal certificate request: http://www.doegrids.org/pages/cert-request.html
Request Your Certificate
Sponsor information:
• Name of sponsor (P.I., supervisor): Kihyeon Cho
• Sponsor's email: cho@knu.ac.kr
• Sponsor's phone number: +82-53-950-6320
※ After submitting the DOE CA request, please notify Prof. Kihyeon Cho (cho@knu.ac.kr) so that he can approve it.
※ http://www.doegrids.org/pages/cert-request.html
Obtain an Account on the UI
• Ask Mr. Daehee Han for a user account on the User Interface machine (cluster3.knu.ac.kr)
• Provide your name and desired user ID
Exporting your key pair for use by Globus grid-proxy-init
• Export or 'back up' your certificate. The interface for this varies from browser to browser: Internet Explorer starts with "Tools -> Internet Options -> Content"; Netscape Communicator has a "Security" button on the top menu bar; Mozilla starts with "Edit -> Preferences -> Privacy and Security -> Certificates". The exported file will probably have the extension .p12 or .pfx.
• Guard this file carefully. Store it off your computer, or remove it once you are finished with this process.
• Copy the PKCS#12 file to the computer where you will run grid-proxy-init.
• Extract your certificate (which contains the public key) and your encrypted private key:
  Certificate:
  openssl pkcs12 -in YourCert.p12 -clcerts -nokeys -out $HOME/.globus/usercert.pem
  Encrypted private key:
  openssl pkcs12 -in YourCert.p12 -nocerts -out $HOME/.globus/userkey.pem
• You must set the mode on your userkey.pem file to read/write only by the owner, otherwise grid-proxy-init will not use it:
  chmod go-rw $HOME/.globus/userkey.pem
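Put together, the extraction steps might look like this on the UI machine (a sketch; "YourCert.p12" stands for whatever file name your browser produced):
$ mkdir -p $HOME/.globus                  # create the Globus credential directory if absent
$ openssl pkcs12 -in YourCert.p12 -clcerts -nokeys -out $HOME/.globus/usercert.pem
$ openssl pkcs12 -in YourCert.p12 -nocerts -out $HOME/.globus/userkey.pem
$ chmod go-rw $HOME/.globus/userkey.pem   # the private key must be unreadable by others
$ rm YourCert.p12                         # remove the PKCS#12 bundle once extraction succeeds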
Registering in a Grid Virtual Organization
https://lcg-registrar.cern.ch/cgi-bin/register/account.pl
Checking the Certificate
$ grid-cert-info
Certificate:
  Data:
    Version: 3 (0x2)
    Serial Number: 3352 (0xd18)
    Signature Algorithm: sha1WithRSAEncryption
    Issuer: DC=org, DC=DOEGrids, OU=Certificate Authorities, CN=DOEGrids CA 1
    Validity
      Not Before: Dec 17 00:11:58 2004 GMT
      Not After : Dec 17 00:11:58 2005 GMT
    Subject: DC=org, DC=doegrids, OU=People, CN=DaeHee Han 768004
    Subject Public Key Info:
      Public Key Algorithm: rsaEncryption
      RSA Public Key: (1024 bit)
        Modulus (1024 bit):
          00:e0:29:bb:83:ae:be:10:f9:1a:29:89:76:7a:26:
Checking the Certificate
$ grid-cert-info -subject
/DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004
$ openssl verify -CApath /etc/grid-security/certificates ~/.globus/usercert.pem
/home/hanbi/.globus/usercert.pem: OK
Getting a Proxy Certificate
$ grid-proxy-init
Your identity: /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004
Enter GRID pass phrase for this identity:
Creating proxy ........................................................... Done
Your proxy is valid until: Thu Jun 23 00:22:14 2005
$ grid-proxy-info
subject  : /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004/CN=proxy
issuer   : /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004
identity : /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004
type     : full legacy globus proxy
strength : 512 bits
path     : /tmp/x509up_u19490
timeleft : 11:59:10
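The proxy above is valid for 12 hours by default (the "timeleft" field). A hedged aside, assuming the usual Globus client options of this era: a different lifetime can be requested, and the proxy removed when finished.
$ grid-proxy-init -valid 24:00    # request a 24-hour proxy (option name may vary with the Globus version)
$ grid-proxy-destroy              # delete the proxy file under /tmp when done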
myproxy
$ myproxy-init -s <host name> -d -n
Your identity: /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004
Enter GRID pass phrase for this identity:
Creating proxy ....................................................................... Done
Proxy Verify OK
Your proxy is valid until: Wed Jun 29 12:24:23 2005
A proxy valid for 168 hours (7.0 days) for user /DC=org/DC=doegrids/OU=People/CN=DaeHee Han 768004 now exists on cluster3.knu.ac.kr.
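The credential stored on the MyProxy server can later be turned back into a short-lived proxy on another machine. A sketch, assuming the MyProxy client commands of this period (myproxy-get-delegation was later renamed myproxy-logon):
$ myproxy-get-delegation -s cluster3.knu.ac.kr -d   # retrieve a fresh proxy from the MyProxy server
$ myproxy-destroy -s cluster3.knu.ac.kr -d          # remove the stored credential when no longer needed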
Checking Job Status
$ lcg-infosites --vo cms ce
****************************************************************
These are the related data for cms: (in terms of queues and CPUs)
****************************************************************
#CPU  Free  Total Jobs  Running  Waiting  ComputingElement
----------------------------------------------------------
168   141   1           1        0        ce01.pic.es:2119/jobmanager-lcgpbs-cms
64    63    1           1        0        ceitep.itep.ru:2119/jobmanager-lcgpbs-cms
28    28    0           0        0        ce01.lip.pt:2119/jobmanager-lcgpbs-cms
5     5     0           0        0        ce00.inta.es:2119/jobmanager-lcgpbs-cms
23    23    0           0        0        ingvar.nsc.liu.se:2119/jobmanager-lcgpbs-cms
20    20    0           0        0        ce.prd.hp.com:2119/jobmanager-lcgpbs-cms
46    45    0           0        0        grid-ce.desy.de:2119/jobmanager-lcgpbs-cms
1     0     0           0        0        lcgce02.ifae.es:2119/jobmanager-lcgpbs-cms
60    60    0           0        0        cluster.pnpi.nw.ru:2119/jobmanager-pbs-cms
96    91    1           1        0        gate.grid.kiae.ru:2119/jobmanager-lcgpbs-cms
… …
Checking Status of Grid
$ lcg-infosites --vo cms se
**************************************************************
These are the related data for cms: (in terms of SE)
**************************************************************
Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
823769472        1761092         disk  seitep.itep.ru
1                n.a             disk  se01.lip.pt
1000000000000    500000000000    mss   castorgrid.pic.es
7786256          138234180       disk  se00.inta.es
793234712        10307544        disk  se.grid.kiae.ru
58566112         794493472       disk  teras.sara.nl
1000000000000    500000000000    mss   lcgse05.ifae.es
26936956         8951308         disk  se.prd.hp.com
364963932        91136           disk  cms10.fuw.edu.pl
1448906356       11294768        disk  grid100.kfki.hu
62299220         7832876         disk  ingvar-se.nsc.liu.se
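With a proxy in place and a CE chosen from the listing above, a job can be submitted through the LCG-2 workload management system. A minimal sketch using the standard edg-job-* commands on the UI (the JDL file name and its contents are illustrative, not from the slides):
$ cat hostname.jdl
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
$ edg-job-submit --vo cms hostname.jdl    # returns a job identifier (an https:// URL)
$ edg-job-status <jobID>                  # poll until the status reaches Done
$ edg-job-get-output <jobID>              # retrieve the output sandbox (std.out, std.err)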