STORK: A Scheduler for Data Placement Activities in Grid

Tevfik Kosar, University of Wisconsin-Madison (kosart@cs.wisc.edu)

Presentation Transcript


  1. STORK: A Scheduler for Data Placement Activities in Grid • Tevfik Kosar • University of Wisconsin-Madison • kosart@cs.wisc.edu

  2. Some Remarkable Numbers • Characteristics of four physics experiments targeted by GriPhyN (table not preserved in this transcript). Source: GriPhyN Proposal, 2000

  3. Even More Remarkable… • “...the data volume of CMS is expected to subsequently increase rapidly, so that the accumulated data volume will reach 1 Exabyte (1 million Terabytes) by around 2015.” Source: PPDG Deliverables to CMS

  4. Other Data Intensive Applications • Genomic information processing applications • Biomedical Informatics Research Network (BIRN) applications • Cosmology applications (MADCAP) • Methods for modeling large molecular systems • Coupled climate modeling applications • Real-time observatories, applications, and data-management (ROADNet)

  5. Need to Deal with Data Placement • Data needs to be moved, staged, replicated, cached, and removed; storage space for data must be allocated and de-allocated. • We call all of these data-related activities in the Grid Data Placement (DaP) activities.

  6. State of the Art • Data placement activities in the Grid are performed either manually or by simple scripts. • Data placement activities are regarded as “second class citizens” of the computation-dominated Grid world.

  7. Our Goal • Our goal is to make data placement activities “first class citizens” in the Grid, just like computational jobs. • They need to be queued, scheduled, monitored, managed, and even checkpointed.

  8. Outline • Introduction • Grid Challenges • Stork Solutions • Case Study: SRB-UniTree Data Pipeline • Conclusions & Future Work

  9. Grid Challenges • Heterogeneous Resources • Limited Resources • Network/Server/Software Failures • Different Job Requirements • Scheduling of Data & CPU together

  10. Stork • Intelligently and reliably schedules, runs, monitors, and manages Data Placement (DaP) jobs in a heterogeneous Grid environment, and ensures that they complete. • Stork is to DaP jobs what Condor is to computational jobs. • Just submit a bunch of DaP jobs and then relax.

  11. Stork Solutions to Grid Challenges • Specialized in Data Management • Modularity & Extendibility • Failure Recovery • Global & Job Level Policies • Interaction with Higher Level Planners/Schedulers

  12. Already Supported URLs • file:/ -> Local File • ftp:// -> FTP • gsiftp:// -> GridFTP • nest:// -> NeST (chirp) protocol • srb:// -> SRB (Storage Resource Broker) • srm:// -> SRM (Storage Resource Manager) • unitree:// -> UniTree server • diskrouter:// -> UW DiskRouter
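
The scheme list on this slide can be read as a dispatch table from URL scheme to transfer protocol. The following is a minimal Python sketch of that idea, not Stork's actual implementation: the table entries come from the slide, while the function name `protocol_for` and the error handling are hypothetical.

```python
from urllib.parse import urlsplit

# Scheme-to-protocol table copied from the slide.
SUPPORTED_SCHEMES = {
    "file": "Local File",
    "ftp": "FTP",
    "gsiftp": "GridFTP",
    "nest": "NeST (chirp) protocol",
    "srb": "SRB (Storage Resource Broker)",
    "srm": "SRM (Storage Resource Manager)",
    "unitree": "UniTree server",
    "diskrouter": "UW DiskRouter",
}


def protocol_for(url: str) -> str:
    """Return the protocol name for a URL (hypothetical helper, for illustration)."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    return SUPPORTED_SCHEMES[scheme]


if __name__ == "__main__":
    # URLs taken from the sample submit file in this deck.
    print(protocol_for("srb://ghidorac.sdsc.edu/kosart.condor/x.dat"))
    print(protocol_for("nest://turkey.cs.wisc.edu/kosart/x.dat"))
```

A table like this keeps protocol support modular: supporting another storage system would only require a new entry plus the code that performs the transfer.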

  13. [Architecture diagram] Components shown: Higher Level Planners, DAGMan, Condor-G (compute), Stork (DaP), Gate Keeper, StartD, RFT, GridFTP, SRM, SRB, NeST

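  14. Interaction with DAGMan • DAG file excerpt: Job A A.submit; DaP X X.submit; Job C C.submit; Parent A child C, X; Parent X child B; ….. • [Diagram] DAGMan feeds a Condor job queue (shown holding A) and a Stork job queue (shown holding X); other node labels in the diagram: C, B, Y, D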
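
The DAG excerpt on this slide mixes computational jobs (`Job`) and data placement jobs (`DaP`) in one dependency graph. The Python sketch below illustrates the idea under stated assumptions: it uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to produce one valid submission order and a hypothetical `route` helper to pick a queue. It is not DAGMan's algorithm; in particular, a real workflow manager releases a child only after its parents have actually finished, which a static order does not model.

```python
from graphlib import TopologicalSorter

# Node kinds and edges taken from the DAG excerpt on the slide.
# Node B is declared in the elided part of the file, so its kind is unknown here.
KINDS = {"A": "Job", "X": "DaP", "C": "Job"}
PARENTS = {"C": {"A"}, "X": {"A"}, "B": {"X"}}  # child -> set of parents


def route(node: str) -> str:
    """Pick a queue for a node (hypothetical helper, for illustration)."""
    kind = KINDS.get(node)
    if kind == "Job":
        return "Condor job queue"
    if kind == "DaP":
        return "Stork job queue"
    return "unknown (declared in the elided part of the DAG file)"


def submission_order() -> list:
    """Return one order in which every parent precedes its children."""
    return list(TopologicalSorter(PARENTS).static_order())


if __name__ == "__main__":
    for node in submission_order():
        print(node, "->", route(node))
```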

  15. Sample Stork submit file
      [
        Type = "Transfer";
        Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
        Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
        ……
        ……
        Max_Retry = 10;
        Restart_in = "2 hours";
      ]
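
The `Max_Retry` attribute in the sample submit file suggests a bounded retry policy. The Python sketch below shows one way such a policy could work. It rests on assumptions not stated in the slides: that `Max_Retry` counts retries after the first attempt, and that failures surface as `OSError`. The function name `run_with_retries` is hypothetical, and `Restart_in` is deliberately not modeled because the slides do not define its semantics.

```python
def run_with_retries(transfer, max_retry: int) -> int:
    """Call transfer() until it succeeds or the retry budget is used up.

    Returns the number of attempts made. Re-raises the last error once
    max_retry retries (after the first attempt) have all failed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            transfer()
            return attempts
        except OSError:
            # Assumption: transient transfer failures are reported as OSError.
            if attempts > max_retry:
                raise


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_transfer():
        # Simulated transfer that fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise OSError("simulated transient failure")

    print("attempts:", run_with_retries(flaky_transfer, max_retry=10))
```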

  16. Case Study: SRB-UniTree Data Pipeline • We have transferred ~3 TB of DPOSS data (2611 x 1.1 GB files) from SRB to UniTree using 3 different pipeline configurations. • The pipelines are built using Condor and Stork scheduling technologies. The whole process is managed by DAGMan.
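
As a quick check of the "~3 TB" figure, the file count and per-file size from this slide can be multiplied out. The Python sketch below does only that arithmetic; whether the slide means decimal (1000 GB) or binary (1024 GB) terabytes is not stated, so both conversions are shown as assumptions.

```python
FILE_COUNT = 2611      # from the slide
FILE_SIZE_GB = 1.1     # from the slide

total_gb = FILE_COUNT * FILE_SIZE_GB
total_tb_decimal = total_gb / 1000  # assuming 1 TB = 1000 GB
total_tb_binary = total_gb / 1024   # assuming 1 TB = 1024 GB

if __name__ == "__main__":
    print(f"total: {total_gb:.1f} GB")
    print(f"decimal TB: {total_tb_decimal:.2f}")
    print(f"binary TB:  {total_tb_binary:.2f}")
```

Either way the total is a little under 3 TB, which matches the slide's approximation.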

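  17. [Diagram] Pipeline configuration 1, driven from the Submit Site: SRB Server -> (SRB get) -> NCSA Cache -> (UniTree put) -> UniTree Server

  18. [Diagram] Pipeline configuration 2, driven from the Submit Site: SRB Server -> (SRB get) -> SDSC Cache -> (GridFTP) -> NCSA Cache -> (UniTree put) -> UniTree Server

  19. [Diagram] Pipeline configuration 3, driven from the Submit Site: SRB Server -> (SRB get) -> SDSC Cache -> (DiskRouter) -> NCSA Cache -> (UniTree put) -> UniTree Server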

  20. Outcomes of the Study 1. Stork interacted easily and successfully with different underlying systems: SRB, UniTree, GridFTP, and DiskRouter.

  21. Outcomes of the Study (2) 2. We had the chance to compare different pipeline topologies and configurations (comparison table not preserved in this transcript).

  22. Outcomes of the Study (3) 3. Almost all possible network, server, and software failures were recovered automatically.

  23. Failure Recovery [plot annotations] • DiskRouter reconfigured and restarted • UniTree not responding • SDSC cache reboot & UW CS network outage • SRB server maintenance

  24. For more information on the results of this study, please check: http://www.cs.wisc.edu/condor/stork/

  25. Conclusions • Stork makes data placement a “first class citizen”. • Stork is the Condor of the data placement world. • Stork is fault tolerant, easy to use, modular, extendible, and very flexible.

  26. Future Work • More intelligent scheduling • Data level management instead of file level management • Checkpointing for transfers • Security

  27. You don’t have to FedEx your data anymore. Stork delivers it for you! • For more information: • Drop by my office anytime • Room: 3361, Computer Science & Stats. Bldg. • Email: kosart@cs.wisc.edu
