1 / 10

Using Gordon to Accelerate LHC Science

Using Gordon to Accelerate LHC Science. Rick Wagner San Diego Supercomputer Center. Brian Bockelman University of Nebraska-Lincoln. XSEDE 13 July 22-25, 2013 San Diego, CA. Coauthors. Mahidhar Tatineni Eva Hocks Kenneth Yoshimoto Scott Sakai Michael L. Norman.

vega
Download Presentation

Using Gordon to Accelerate LHC Science

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Using Gordon toAccelerate LHC Science Rick Wagner San Diego Supercomputer Center Brian Bockelman University of Nebraska-Lincoln XSEDE 13 July 22-25, 2013 San Diego, CA

  2. Coauthors MahidharTatineni Eva Hocks Kenneth Yoshimoto Scott Sakai Michael L. Norman Igor Sfiligoi (UCSD) MatevzTadel (UCSD) James Letts (UCSD) Frank Würthwein (UCSD) Lothar A. Bauerdick (FNAL)

  3. When Grids Collide

  4. Overview • 2012 LHC data collection rates higher than first planned (1000Hz vs. 150Hz) • Additional data was “parked” to be reduced during 2 year shutdown • Delays the science from data at the end

  5. Overview • Frank Würthwein (UCSD, CMS Tier II lead) approaches Mike Norman (Director of SDSC) regarding analysis delay • A rough plan emerges: • Ship data at the tail of the analysis chain to SDSC • Attach Gordon to CMS workflow • Ship results back to FNAL • From CMS perspective, Gordon becomes a compute resources • From SDSC perspective, CMS jobs run like a gateway

  6. Gordon Overview • 1,024 2S Xeon E5 (Sandy Bridge) nodes • 16 cores, 64 GB/node • Intel Jefferson Pass mobo • PCI Gen3 • 300 GB Intel 710 eMLC SSDs • 300 TB aggregate • 64, 2S Westmere I/O nodes • 12 core, 48 GB/node • 4 LSI controllers • 16 SSDs • Dual 10GbE • SuperMicro mobo • PCI Gen2 • 3D Torus • Dual rail QDR • Large Memory vSMP Supernodes • 2TB DRAM • 10 TB Flash “Data Oasis” Lustre PFS 100 GB/sec, 4 PB

  7. CMS Components • CMSSW: Base software components, NFS exported from IO node • OSG worker node client: CA certs, CRLs • Squid proxy: cache calibration data needed for each job, running on IO node • glideinWMS: worker node manager pulls down CMS jobs • BOSCO: GSI-SSH capable batch job submission tool • PhEDEx: data transfer management

  8. Results • Work completed in February to March 2013 • 400 million collision events • 125TB in, ~150 TB out • ~2 million SUs • Good experience regarding OSG-XSEDE compatibility

  9. Thoughts& Conclusions • OSG & XSEDE technologies very similar • GridFTP • GSI authentication • Batch systems, etc. • Staff at both ends speak the same language • Some things would make a repeat easier: • CVMFS (Fuse-based file system for CMS tools) • Common runtime profile for OSG & XSEDE • Common SU and data accounting

More Related