
Research Computing at Harvard



Presentation Transcript


  1. Research Computing at Harvard John Huth

  2. Topics • Support for scientific computing (as opposed to desktop computing) is an increasingly pressing issue at research universities. • Crimson Grid • Initiative in Innovative Computing • The EGG project • Another kind of LHC computing challenge: the inverse mapping problem.

  3. The Crimson Grid Initiative • Started in April 2004 • A project to engineer a technology fabric in support of interdisciplinary & collaborative computing • Joy Sircar, Division of Engineering and Applied Science

  4. The Crimson Grid: • A Scalable collaborative computing environment for research at the interface of science and engineering • A Gateway to community/national/global computing infrastructures for interdisciplinary research • A Test bed for faculty & IT-industry affiliates within the framework of a production environment for integrating HPC solutions for higher education & research • A Campus Resource for skills & knowledge sharing for advanced systems administration & management of switched architectures

  5. The Campus Grid Vision: Grid of Grids from Local to Global [Diagram labels: National, Community, Campus; OSG, ATLAS, CrimsonGrid-GLOW]

  6. [Diagram labels: OSG/ATLAS Users; CrimsonGrid Users; Campus Grid “agreed” Users; CG-GLOW; DEAS Condor pool; NNIN Condor pool; CRC-I Condor pool; WKSTNs Condor pool; CrimsonGrid Gateway; GT-GK (five instances)]

  7. Power of Campus Grids • GLOW: ~1000 processors • CG: ~750 processors • In just 2 campuses!

  8. Grid use in the first 12 months • First-use research areas in the Crimson Grid • Nanoscience • Mesoscopic Physics • Quantum Chemistry and Quantum Chaos • Condensed Matter Physics • Chemistry at HARvard Molecular Mechanics (CHARMM) • Harvard Biorobotics Lab • Atmospheric Chemistry • Earth and Planetary Sciences (Ocean Modeling) • Solid and Structural Mechanics • Earth Sciences and Geophysics (earthquake engineering) • Complex Biosystems Modeling • Quantitative Social Science

  9. Initiative in Innovative Computing • Alyssa Goodman (Director) • Tim Clark (Executive Director)

  10. Filling the “Gap” between Science and Computer Science • Scientific disciplines: Increasingly, core problems in science require computational solutions. Typically hire/“home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need. • Computer Science departments: Focused on finding elegant solutions to basic computer science challenges. Often see specific, “applied” problems as outside their interests.

  11. Continuum • “Pure” Discipline Science (e.g. Galileo) • “Computational Science” (missing at most universities) • “Pure” Computer Science (e.g. Turing)

  12. Filling the “computational science” gap: IIC • Problem-driven approach: focusing effort on solving problems that will have the greatest impact & educational value • Collaborative projects: combining disciplinary knowledge with computer science expertise • Interdisciplinary effort: to ensure that best practices are shared across fields and that new tools and methodologies will be broadly applicable • Links with industry: to draw on and learn from experience in applied computation • Institutional funding: to ensure effort is directed towards key needs and not driven solely by narrow priorities of funding agencies

  13. Science Departments and CS Departments • What is the right shape for that boundary? • Where are the optimal “IIC” problems? [Chart: axes are Domain Science Payoff (Low to High) and Computer Science Payoff (Low to High); region labels include “Never Mind” and Computer Science Department]

  14. IIC Research Branches (and Projects Draw upon >1) • V • AS • I • DC • DB/P • Plus… Educational programs that bring IIC science to Harvard students and to the public at large.

  15. Data-Intensive Projects • ATLAS/LHC computing: Tier 2 • Mileura Wide Field Array (MWA): microwave examination of ultra-redshifted era (time of recombination) • Pan-STARRS: optical telescope (Panoramic Survey Telescope And Rapid Response System)

  16. EGG Project • S. Youssef, J. Huth, D. Parkes, M. Seltzer, J. Shank • Extension of the Pacman concept to resource allocation and cache management

  17. In the beginning… • BU: Software environment computing, i.e. creating and manipulating software environments • Harvard: Economic mechanism design; bidding systems, provenance & file systems, resource prediction • But what do these have to do with each other? • …And how do they fit into the (over-)complicated world of grid computing? [Word cloud: Netlogger, VDT, Alien, Ganglia, Panda, dCache, Condor, Resource Brokers, PBS, Chimera, Pacman, GLOBUS, SRM, Web services, Gums, iVDGL, ADA, LSF, VDS, EGEE, Capone, RLS, Dirac, VOMS, OSG, Glue, Eowyn, Dial, gLite, Clarens, PPDG, MonaLisa, EDG, Virtual Machines, LCG, GridCat, DISUN, DRM, ACDC, Classads] • But then, something very unusual happened…

  18. “Pacman” • “eggshell”: setenv(Foo,Bar); download(foo.tar.gz); shell(make install) • get(E) → an installation • “caches” ~ various URLs with eggshell source code • [Pacman is used by ATLAS, OSG, VDT, LCG, Globus, TeraGrid,… >350,000 downloads, ~500-1000 new installations per day in 50 countries around the world, supported on 14 OS.]

  19. We can let all computations be “installations.” • setenv(Foo,Bar); download(foo.tar.gz); shell(myjob < infile > outfile) • put(E) • But which path should E follow?

  20. Resolving the put ambiguity == Resource allocation • put(job needing ATLAS 10.5.0) • Destination labels: “ATLAS v.10.5.0 already installed”, “Fast WAN” • F(eggshell [setenv(Foo,Bar); download(foo.tar.gz)], cache history, cache contents) => ~opportunity cost
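
The idea on this slide, that choosing where put(E) sends a job is a resource allocation decision, can be illustrated with a small Python sketch. This is not the EGG implementation: the `Cache` and `Job` classes, the `placement_cost` function, and all cost weights are hypothetical, and the queue length is only a crude stand-in for the "opportunity cost" the slide mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """A hypothetical destination for put(E)."""
    name: str
    installed: set = field(default_factory=set)  # software already installed here
    fast_wan: bool = False                       # whether the site has a fast WAN link
    queued_jobs: int = 0                         # crude proxy for opportunity cost


@dataclass
class Job:
    name: str
    requires: str  # e.g. "ATLAS 10.5.0"


def placement_cost(job, cache, install_cost=10.0, slow_wan_penalty=5.0, queue_cost=1.0):
    """Estimate the cost of running `job` at `cache` (all weights are assumptions)."""
    cost = 0.0
    if job.requires not in cache.installed:
        cost += install_cost       # the software would have to be installed first
    if not cache.fast_wan:
        cost += slow_wan_penalty   # downloads and uploads would be slow
    cost += queue_cost * cache.queued_jobs
    return cost


def resolve_put(job, caches):
    """Pick the destination with the lowest estimated cost."""
    return min(caches, key=lambda c: placement_cost(job, c))


job = Job("myjob", "ATLAS 10.5.0")
caches = [
    Cache("A", installed={"ATLAS 10.5.0"}, fast_wan=False, queued_jobs=2),
    Cache("B", installed=set(), fast_wan=True, queued_jobs=0),
    Cache("C", installed={"ATLAS 10.5.0"}, fast_wan=True, queued_jobs=4),
]
print(resolve_put(job, caches).name)  # C
```

With these made-up weights, cache C wins (cost 4) over A (cost 7) and B (cost 10): having the release already installed and a fast WAN outweighs a longer queue.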

  21. A cache can be a marketplace • Participants: Eggshells and Computers • Example constraints: “time>= 14 Nov.”, “bidding closes in 7 days” • On save()… bidding process -> (C,E); C.put(E); ...repeat... • Eggshells go where they get the best prices • Computers go where there are the most buyers
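
One way to picture the bidding loop is the following Python sketch. The slide does not specify the auction rules, so everything here is an assumption: computers post an asking price and a number of free slots, each eggshell states the maximum price it will pay, eggshells are served in order of that maximum, and each one takes the cheapest computer it can afford. The function name `run_market` and the sample data are hypothetical.

```python
def run_market(eggshells, computers):
    """Match eggshells to computers in a toy posted-price market.

    eggshells: list of (name, max_price) pairs.
    computers: dict mapping computer name -> (ask_price, free_slots).
    Returns a list of (computer, eggshell, price) placements, i.e. the
    (C, E) pairs for which C.put(E) would then be called.
    """
    free = {c: slots for c, (_, slots) in computers.items()}
    placements = []
    # Assumption: eggshells willing to pay the most are served first.
    for name, max_price in sorted(eggshells, key=lambda e: -e[1]):
        offers = [
            (ask, c)
            for c, (ask, _) in computers.items()
            if free[c] > 0 and ask <= max_price
        ]
        if not offers:
            continue  # no affordable computer; the eggshell stays unplaced
        ask, c = min(offers)  # the eggshell goes where it gets the best price
        free[c] -= 1
        placements.append((c, name, ask))
    return placements


computers = {"c1": (3.0, 1), "c2": (5.0, 2)}
eggshells = [("e1", 5.0), ("e2", 6.0), ("e3", 2.0)]
print(run_market(eggshells, computers))
# [('c1', 'e2', 3.0), ('c2', 'e1', 5.0)]
```

Here e2 takes the single cheap slot on c1, e1 falls back to c2, and e3 cannot afford any computer. A fuller model would also let computers move between marketplaces, as the slide suggests.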

  22. The LHC Inverse Mapping Problem • A CPU intensive problem • N. Arkani-Hamed, G. Kane
