
Condor and (the) Grid (one of the CS X in PPDG)

Presentation Transcript


  1. Condor and (the) Grid (one of the CS X in PPDG)

  2. “… Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer ‘communities’. …” Miron Livny, “Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems,” Ph.D. thesis, July 1983.

  3. Condor as a ... • … Grid • … window to the Grid • … manager of Grid resources • … a source of Grid technology

  4. Main Condor capabilities • Management of large collections of distributively owned heterogeneous resources (CPU, storage, network, software) • Management of large (10K) collections of jobs. • Remote Execution • Remote I/O • Checkpointing • Matchmaking • System administration

  5. Condor Deployment (that we know of) • More than 4000 CPUs world-wide • More than 1200 CPUs at UW • More than 200 CPUs at INFN • More than 800 CPUs in industry

  6. A Simple Scenario Study the behavior of F(x,y,z) for 20 values of x, 10 values of y and 3 values of z (20*10*3 = 600 combinations) • F takes on average 3 hours to compute on a “typical” workstation (total = 600 * 3 = 1800 hours) • F requires a “moderate” (128 MB) amount of memory • F performs “little” I/O: the input (x,y,z) is 15 MB and the output F(x,y,z) is 40 MB
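The arithmetic behind this scenario can be checked with a short sketch. The counts, the 3-hour run time and the file sizes come from the slide; the concrete parameter values (`range(...)`) are placeholders, since the slide does not say which values of x, y and z are studied.

```python
from itertools import product

# Counts are from the slide; the actual values are hypothetical placeholders.
X_VALUES = range(20)  # 20 values of x
Y_VALUES = range(10)  # 10 values of y
Z_VALUES = range(3)   # 3 values of z

HOURS_PER_JOB = 3     # average run time of F on a "typical" workstation
INPUT_MB = 15         # size of one (x, y, z) input
OUTPUT_MB = 40        # size of one F(x, y, z) output

# One job per (x, y, z) combination.
combinations = list(product(X_VALUES, Y_VALUES, Z_VALUES))
num_jobs = len(combinations)

total_cpu_hours = num_jobs * HOURS_PER_JOB
total_input_mb = num_jobs * INPUT_MB
total_output_mb = num_jobs * OUTPUT_MB

# Running everything back to back on a single workstation, 24 hours a day.
days_on_one_workstation = total_cpu_hours / 24

print(f"jobs: {num_jobs}")
print(f"CPU hours: {total_cpu_hours}")
print(f"input: {total_input_mb} MB, output: {total_output_mb} MB")
print(f"days on one workstation: {days_on_one_workstation}")
```

Under these assumptions a single workstation needs 75 days of uninterrupted computing, which is where the 2.5-month vacation in Step I comes from.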

  7. How Can Condor Help?

  8. Step I - get organized! • Turn your workstation into a “Personal Condor” • Write a script that creates 600 input files, one for each of the (x,y,z) combinations • Submit a cluster of 600 jobs to your personal Condor • Write a script that collects the data from the 600 output files • Go on a long vacation … (2.5 months)
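The two scripts and the cluster submission from Step I might look roughly like the sketch below. It is an illustration, not the presenter's code: the file names (`in.N`, `out.N`, `F.submit`), the executable name `compute_F`, the input file layout and the choice of the vanilla universe are all assumptions. The submit description uses only standard submit-file commands and the `$(Process)` macro, which takes the values 0 through 599 for `queue 600`.

```python
import tempfile
from itertools import product
from pathlib import Path

# Hypothetical parameter values; the slide only gives the counts (20, 10, 3).
X_VALUES = range(20)
Y_VALUES = range(10)
Z_VALUES = range(3)


def write_inputs(workdir):
    """Write one input file per (x, y, z) combination and return the job count."""
    count = 0
    for job_id, (x, y, z) in enumerate(product(X_VALUES, Y_VALUES, Z_VALUES)):
        # Assumed layout: a single line holding the three parameters.
        (workdir / f"in.{job_id}").write_text(f"{x} {y} {z}\n")
        count += 1
    return count


def submit_description(num_jobs, executable="compute_F"):
    """Return a submit description for one cluster of num_jobs jobs."""
    lines = [
        "universe   = vanilla",          # assumption: no relinking for checkpointing
        f"executable = {executable}",    # hypothetical program that computes F
        "input      = in.$(Process)",    # stdin of job N is in.N
        "output     = out.$(Process)",   # stdout of job N goes to out.N
        "error      = err.$(Process)",
        "log        = F.log",
        f"queue {num_jobs}",
    ]
    return "\n".join(lines) + "\n"


def collect_outputs(workdir, num_jobs):
    """Gather the output files that exist so far, keyed by job number."""
    results = {}
    for job_id in range(num_jobs):
        out_file = workdir / f"out.{job_id}"
        if out_file.exists():
            results[job_id] = out_file.read_text()
    return results


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    n = write_inputs(workdir)
    (workdir / "F.submit").write_text(submit_description(n))
    print(n, "input files written")
    print(len(collect_outputs(workdir, n)), "outputs collected so far")
```

In a real working directory the generated file would be handed to `condor_submit F.submit`, and `collect_outputs` could be rerun at any time to pick up whatever jobs have finished.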

  9. Your Personal Condor will ... • … keep an eye on your jobs and will keep you posted on their progress • … implement your policy on when the jobs can run on your workstation • … implement your policy on the execution order of the jobs • … add fault tolerance to your jobs • … keep a log of your job activities

  10. personal Condor your workstation 600 Condor jobs

  11. Step II - build a Grid • Install Condor on the machine next door. • Install Condor on the machines in the classroom. • Install Condor on the O2K in the basement. • Configure these machines to be part of your Condor pool/grid. • Go on a shorter vacation ...

  12. personal Condor Group Condor your workstation 600 Condor jobs

  13. Step III - Take advantage of your friends • Get permission from “friendly” Condor pools/Grids to access their resources • Configure your personal Condor to “flock” to these pools/grids • Reconsider your vacation plans ...

  14. personal Condor Group Condor your workstation 600 Condor jobs friendly Condor

  15. Step IV - Think big! • Get access (account(s) + certificate(s)) to Globus-managed Grid resources • Submit 599 “To Globus” Condor glide-in jobs to your personal Condor • When all your jobs are done, remove any pending glide-in jobs • Take the rest of the afternoon off ...
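Why 599 glide-ins? A rough estimate, assuming an idealized pool in which every CPU is as fast as the “typical” workstation, is always available, and runs one job at a time: the workstation plus 599 glide-ins gives 600 slots, so all 600 jobs can run at once. The 50-slot pool below is a hypothetical size for the intermediate steps; the slides do not give one.

```python
import math

NUM_JOBS = 600       # from the scenario
HOURS_PER_JOB = 3    # average run time of F


def ideal_wall_clock_hours(num_slots):
    """Best-case elapsed time: jobs run in waves of num_slots, 3 hours per wave."""
    if num_slots < 1:
        raise ValueError("need at least one slot")
    waves = math.ceil(NUM_JOBS / num_slots)
    return waves * HOURS_PER_JOB


scenarios = [
    ("Step I: workstation only", 1),
    ("Steps II-III: hypothetical 50-slot pool", 50),
    ("Step IV: workstation + 599 glide-ins", 1 + 599),
]

for label, slots in scenarios:
    hours = ideal_wall_clock_hours(slots)
    print(f"{label}: {hours} h (about {hours / 24:.1f} days)")
```

Real pools are shared and heterogeneous, so these figures are lower bounds rather than predictions.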

  16. A “To-Globus” glide-in job will ... • … transform itself into a Globus job, • … submit itself to a Globus-managed Grid resource, • … be monitored by your personal Condor, • … once the Globus job is allocated a resource, use a GSIFTP server to fetch Condor agents, start them, and add the resource to your personal Condor, • … vacate the resource before it is revoked by the remote scheduler
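The life of a glide-in described above can be read as a simple linear state machine. The sketch below is only a model of the slide's wording; the state names and functions are invented for illustration and do not correspond to any Condor or Globus API.

```python
from enum import Enum, auto


class GlideInState(Enum):
    """Hypothetical states, one per stage listed on the slide."""
    QUEUED_IN_PERSONAL_CONDOR = auto()  # submitted as a "To-Globus" Condor job
    SUBMITTED_TO_GLOBUS = auto()        # transformed into a Globus job
    RESOURCE_ALLOCATED = auto()         # remote scheduler granted a resource
    AGENTS_FETCHED = auto()             # Condor agents fetched via GSIFTP
    JOINED_POOL = auto()                # resource added to the personal Condor
    VACATED = auto()                    # left before the allocation is revoked


# Each state has exactly one successor; VACATED is terminal.
NEXT_STATE = {
    GlideInState.QUEUED_IN_PERSONAL_CONDOR: GlideInState.SUBMITTED_TO_GLOBUS,
    GlideInState.SUBMITTED_TO_GLOBUS: GlideInState.RESOURCE_ALLOCATED,
    GlideInState.RESOURCE_ALLOCATED: GlideInState.AGENTS_FETCHED,
    GlideInState.AGENTS_FETCHED: GlideInState.JOINED_POOL,
    GlideInState.JOINED_POOL: GlideInState.VACATED,
}


def advance(state):
    """Return the state that follows the given one."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.name} is a terminal state")
    return NEXT_STATE[state]


def lifecycle():
    """Return the full sequence of states a glide-in passes through."""
    state = GlideInState.QUEUED_IN_PERSONAL_CONDOR
    path = [state]
    while state in NEXT_STATE:
        state = advance(state)
        path.append(state)
    return path


for step in lifecycle():
    print(step.name)
```

Throughout these stages the personal Condor keeps monitoring the glide-in, which is why monitoring is not modeled as a separate state here.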

  17. personal Condor Globus Grid Group Condor your workstation 600 Condor jobs LSF PBS 599 glide-ins friendly Condor Condor

  18. VizBench - send us your data and we will send you back a movie (an SC99 demo by NCSA)

  19. Frame Rendering Managed and Powered by a Personal Condor A locally installed Personal Condor is used by the VizBench server to • manage, monitor and control the execution of frame rendering tasks, • manage local rendering resources and • locate remote and Grid resources that are capable and willing to render frames

  20. UW Condor UNM Supercluster Condor jobs VizBench Web Server VizBench Globus Gatekeeper Globus Gatekeeper Personal Condor BU O2K NCSA Condor

  21. Grid Obstacles • Ownership Distribution (Sociology) • Customer Awareness (Education) • Size and Uncertainties (Robustness) • Technology Evolution (Portability) • Physical Distribution (Technology)

  22. Condor High Throughput Computing Visit us at http://www.cs.wisc.edu/condor
