
Presentation Transcript


  1. Dynamically Creating Big Data Centers for the LHC — Frank Würthwein, Professor of Physics, University of California San Diego, September 25th, 2013

  2. Outline • The Science • Software & Computing Challenges • Present Solutions • Future Solutions Frank Wurthwein - ISC Big Data

  3. The Science

  4. The Universe is a strange place! ~67% of energy is “dark energy” — we have no clue what this is. ~29% of matter is “dark matter” — we have some ideas, but no proof of what this is! All of what we know makes up only about 4% of the universe.

  5. To study Dark Matter we need to create it in the laboratory. [Aerial view of the LHC ring between Lake Geneva and Mont Blanc, with the four detectors: ALICE, ATLAS, CMS, LHCb.]

  6. “Big bang” in the laboratory • We gain insight by colliding particles at the highest energies possible to measure: • Production rates • Masses & lifetimes • Decay rates • From this we derive the “spectroscopy” as well as the “dynamics” of elementary particles. • Progress is made by going to higher energies and brighter beams.

  7. Explore Nature over 15 orders of magnitude — perfect agreement between theory & experiment. Dark Matter expected somewhere below this line.

  8. And for the Sci-Fi Buffs … Imagine our 3D world to be confined to a 3D surface in a 4D universe. Imagine this surface to be curved such that the 4th D distance is short for locations light years away in 3D. Imagine space travel by tunneling through the 4th D. The LHC is searching for evidence of a 4th dimension of space.

  9. Recap so far … • The beams cross in the ATLAS and CMS detectors at a rate of 20 MHz • Each crossing contains ~10 collisions • We are looking for rare events that are expected to occur in roughly 1 in 10,000,000,000,000 (10¹³) collisions, or less.

  10. Software & Computing Challenges

  11. The CMS Experiment

  12. The CMS Experiment • 80 million electronic channels × 4 bytes × 40 MHz ≈ 10 Petabytes/sec of information → × 1/1000 zero-suppression → × 1/100,000 online event filtering → ~100-1000 Megabytes/sec raw data to tape • 1 to 10 Petabytes of raw data per year written to tape, not counting simulations • 2000 Scientists (1200 Ph.D. in physics) • ~180 Institutions • ~40 countries • 12,500 tons, 21m long, 16m diameter
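The reduction chain above can be verified with simple arithmetic (a sketch; note the slide rounds the ~12.8 PB/s raw figure down to ~10 PB/s):

```python
# Reproduce the slide's data-reduction arithmetic (illustrative only).
channels = 80e6            # electronic channels
bytes_per_channel = 4
crossing_rate_hz = 40e6    # 40 MHz

raw_bytes_per_sec = channels * bytes_per_channel * crossing_rate_hz
after_zero_suppression = raw_bytes_per_sec / 1_000      # x 1/1000
after_online_filter = after_zero_suppression / 100_000  # x 1/100,000

print(f"raw:  {raw_bytes_per_sec / 1e15:.1f} PB/s")   # 12.8 PB/s
print(f"tape: {after_online_filter / 1e6:.0f} MB/s")  # 128 MB/s
```

128 MB/s sits inside the slide's 100-1000 MB/s band; the eight-orders-of-magnitude reduction happens before any file is ever written.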

  13. Active Scientists in CMS ~1/4 of the collaboration, scientists and engineers, contributed to the common source code of ~3.6M C++ SLOC. 5-40% of the scientific members are actively doing large scale data analysis in any given week.

  14. Evolution of LHC Science Program [Chart: event rate written to tape across successive LHC runs — 150Hz, 1000Hz, 10000Hz.]

  15. The Challenge How do we organize the processing of 10s to 1000s of Petabytes of data by a globally distributed community of scientists, and do so with manageable “change costs” for the next 20 years? Guiding Principles for Solutions: Choose technical solutions that allow computing resources to be as distributed as human resources. Support distributed ownership and control, within a global single sign-on security context. Design for heterogeneity and adaptability.

  16. Present Solutions

  17. Federation of National Infrastructures. In the U.S.A.: Open Science Grid

  18. Among the top 500 supercomputers there are only two that are bigger when measured by power consumption.

  19. Tier-3 Centers • Locally controlled resources not pledged to any of the 4 collaborations. • Large clusters at major research Universities that are time shared. • Small clusters inside departments and individual research groups. • Requires global sign-on system to be open for dynamically adding resources. • Easy to support APIs • Easy to work around unsupported APIs

  20. Me -- My friends -- The grid/cloud • “Me”: thin client, O(10⁴) users — domain-science specific • “My friends”: thick VO middleware & support, O(10¹⁻²) VOs • The anonymous Grid or Cloud: thin “Grid API”, O(10²⁻³) sites — common to all sciences and industry

  21. “My Friends” Services • Dynamic Resource provisioning • Workload management • schedule resource, establish runtime environment, execute workload, handle results, clean up • Data distribution and access • Input, output, and relevant metadata • File catalogue
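The workload-management steps listed above can be sketched as a single job lifecycle. This is a minimal illustration; the function and directory names are ours, not any real middleware API:

```python
# Sketch of one workload's lifecycle: set up a runtime environment,
# execute, collect results, clean up. Illustrative only.
import shutil
import subprocess
import tempfile

def run_workload(command):
    workdir = tempfile.mkdtemp(prefix="job_")   # establish runtime environment
    try:
        result = subprocess.run(command, cwd=workdir,
                                capture_output=True, text=True)  # execute workload
        return result.returncode, result.stdout                  # handle results
    finally:
        shutil.rmtree(workdir)                  # clean up

rc, out = run_workload(["echo", "payload done"])
print(rc, out.strip())
```

A real pilot additionally reports results back to the VO's queue and fetches the next workload; provisioning the resource in the first place is the job of the layer above.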

  22. Optimize Data Structure for Partial Reads

  23. [Histogram: fraction of each file that is read, per file. Average 20-35%, median 3-7%, depending on the type of file; overflow bin at 100%.] For the vast majority of files, less than 20% of the file is read.
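One way to read these numbers: analyses touch only a few attributes per event, so a column-wise layout lets a job read a small slice of the file. A toy calculation — the column names and per-event sizes below are invented for illustration, not the actual CMS data format:

```python
# Toy model: per-event bytes for each stored attribute ("column").
columns = {"pt": 8, "eta": 8, "phi": 8, "raw_hits": 400}  # bytes, invented
n_events = 1_000

file_size = n_events * sum(columns.values())
needed = ["pt", "eta"]                      # analysis only needs kinematics
bytes_read = n_events * sum(columns[c] for c in needed)

print(f"fraction of file read: {bytes_read / file_size:.1%}")  # 3.8%
```

With invented sizes like these, reading just the kinematic columns lands near the slide's 3-7% median, which is the point of optimizing the data structure for partial reads.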

  24. Future Solutions

  25. From present to future • Initially, we operated a largely static system: • Data was placed quasi-statically before it could be analyzed. • Analysis centers have contractual agreements with the collaboration. • All reconstruction is done at centers with custodial archives. • Increasingly, we have too much data to afford this. • Dynamic data placement: data is placed at T2s based on job backlog in global queues. • WAN access: “Any Data, Anytime, Anywhere” — jobs are started on the same continent as the data instead of on the same cluster attached to the data. • Dynamic creation of data processing centers: Tier-1 hardware bought to satisfy steady-state needs instead of peak needs. Primary processing as data comes off the detector => steady state; annual reprocessing of accumulated data => peak needs.
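A hedged sketch of the dynamic-placement idea above: replicate a dataset to whichever Tier-2 has the deepest backlog of jobs waiting for it. The one-line policy and the numbers are illustrative, not the actual CMS placement algorithm:

```python
# Illustrative policy: place a dataset replica at the site whose
# global-queue backlog for that dataset is largest.
def choose_placement_site(backlog_by_site):
    return max(backlog_by_site, key=backlog_by_site.get)

backlog = {"T2_US_UCSD": 1200, "T2_DE_DESY": 300, "T2_UK_London": 50}
print(choose_placement_site(backlog))  # T2_US_UCSD
```

The essential shift is that placement reacts to demand observed in the global queues rather than being negotiated up front.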

  26. Any Data, Anytime, Anywhere Global redirection system to unify all CMS data into one globally accessible namespace. This is made possible by paying careful attention to the IO layer, to avoid inefficiencies due to IO-related latencies.
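The single global namespace can be pictured as a catalogue that redirects a logical file name to a site hosting a copy. This toy class is our illustration only; it compresses what is in reality a hierarchical federation of redirectors:

```python
# Toy global redirector: one namespace, many hosting sites (illustrative).
class Redirector:
    def __init__(self):
        self.catalog = {}   # logical file name -> list of hosting sites

    def register(self, site, logical_name):
        self.catalog.setdefault(logical_name, []).append(site)

    def locate(self, logical_name):
        sites = self.catalog.get(logical_name)
        if not sites:
            raise FileNotFoundError(logical_name)
        return sites[0]     # a real system chooses by proximity/load

r = Redirector()
r.register("T2_US_UCSD", "/store/data/run1/events.root")
print(r.locate("/store/data/run1/events.root"))  # T2_US_UCSD
```

A job can then open the same logical name from anywhere; the careful IO-layer work mentioned above is what keeps the resulting wide-area reads efficient.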

  27. Vision going forward Implemented this vision for the first time in Spring 2013, using the Gordon Supercomputer at SDSC.


  29. CMS “My Friends” Stack • Job environment: • CMSSW release environment — NFS exported from Gordon IO nodes; future: CernVM-FS via Squid caches (J. Blomer et al.; 2012 J. Phys.: Conf. Ser. 396 052013) • Security Context (CA certs, CRLs) via OSG worker node client • CMS calibration data access via FroNTier (B. Blumenfeld et al.; 2008 J. Phys.: Conf. Ser. 119 072007) — Squid caches installed on Gordon IO nodes • Data and Job handling: • glideinWMS (I. Sfiligoi et al.; doi:10.1109/CSIE.2009.950) — implements “late binding” provisioning of CPU and job scheduling; submits pilots to Gordon via BOSCO (GSI-SSH) • WMAgent to manage CMS workloads • PhEDEx data transfer management — uses SRM and gridftp

  30. CMS “My Friends” Stack This is clearly mighty complex !!! So let’s focus only on the parts that are specific to incorporating Gordon as a dynamic data processing center.

  31. Items in red were deployed/modified to incorporate Gordon: minor mod of PhEDEx config file; BOSCO; deploy Squid; export CMSSW & WN client.

  32. Gordon Results • Work completed in February/March 2013 as a result of a “lunch conversation” between SDSC & US-CMS management • Dynamically responding to an opportunity • 400 Million RAW events processed • 125 TB in and ~150 TB out • ~2 Million core hours of processing • Extremely useful for both science results as well as proof of principle in software & computing.

  33. Summary & Conclusions • Guided by the principles: • Support distributed ownership and control in a global single sign-on security context. • Design for heterogeneity and adaptability. • The LHC experiments very successfully developed and implemented a set of new concepts to deal with BigData.

  34. Outlook • The LHC experiments had to largely invent an island of BigData technologies with limited interactions with industry and other domain sciences. • Is it worth building bridges to other islands? • IO stack and HDF5? • MapReduce? • What else? • Is there a mainland emerging that is not just another island?
