Managing and Scheduling Data Placement (DaP) Requests
Outline • Motivation • DaP Scheduler • Case Study: DAGMan • Conclusions
Demand for Storage • Applications require access to larger and larger amounts of data • Database systems • Multimedia applications • Scientific applications • e.g., High Energy Physics & Computational Genomics • Currently terabytes, soon petabytes, of data
Is Remote Access Good Enough? • Huge amounts of data (mostly on tape) • Large numbers of users • Distance / low bandwidth • Different platforms • Scalability and efficiency concerns => Middleware is required
Two Approaches • Move the job/application to the data • Less common • Insufficient computational power at the storage site • Inefficient • Does not scale • Move the data to the job/application
Move Data to the Job [Diagram: a huge tape library (terabytes) feeds a remote staging area; data crosses the WAN into a local storage area (e.g., local disk or a NeST server) and travels over the LAN to the compute cluster]
Main Issues • 1. Insufficient local storage area • 2. CPU should not wait much for I/O • 3. Crash Recovery • 4. Different Platforms & Protocols • 5. Make it simple
Data Placement Scheduler (DaPS) • Intelligently manages and schedules data placement (DaP) activities/jobs • What Condor is for computational jobs, DaPS is for DaP jobs • Just submit a bunch of DaP jobs and then relax...
DaPS Architecture [Diagram: local and remote DaPS clients submit requests to the DaPS server, whose accept, schedule, and execute components queue requests and drive get/put and third-party transfers among a local disk buffer and NeST, GridFTP, SRB, and SRM servers]
DaPS Client Interface • Command line: • dap_submit <submit file> • API: • dapclient_lib.a • dapclient_interface.h
DaP jobs • Defined as ClassAds • Currently four types: • Reserve • Release • Transfer • Stage
DaP Job ClassAds [ Type = Reserve; Server = nest://turkey.cs.wisc.edu; Size = 100MB; reservation_no = 1; ... ] [ Type = Transfer; Src_url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat; Dst_url = nest://turkey.cs.wisc.edu/kosart/x.dat; reservation_no = 1; ... ]
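Putting the pieces together, a complete submit file would chain a reservation, a transfer, and a release under one reservation_no. This is a minimal sketch assuming the fields shown above; the file name x.dap and the Release ad's fields are illustrative, since the slides only show Reserve and Transfer ads.

    // x.dap -- hypothetical submit file; run with: dap_submit x.dap
    [ Type = Reserve;  Server = nest://turkey.cs.wisc.edu; Size = 100MB; reservation_no = 1; ]
    [ Type = Transfer; Src_url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat;
      Dst_url = nest://turkey.cs.wisc.edu/kosart/x.dat; reservation_no = 1; ]
    [ Type = Release;  Server = nest://turkey.cs.wisc.edu; reservation_no = 1; ]  // assumed fields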
Supported Protocols • Currently supported: • FTP • GridFTP • NeST (chirp) • SRB (Storage Resource Broker) • Very soon: • SRM (Storage Resource Manager) • GDMP (Grid Data Management Pilot)
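For reference, each protocol corresponds to a URL scheme in the Src_url/Dst_url fields. The nest:// and srb:// forms appear on the previous slide; the ftp:// and gsiftp:// forms follow those protocols' standard conventions, and the example.edu hostnames are hypothetical.

    ftp://host.example.edu/data/x.dat            (FTP)
    gsiftp://host.example.edu/data/x.dat         (GridFTP)
    nest://turkey.cs.wisc.edu/kosart/x.dat       (NeST / chirp)
    srb://ghidorac.sdsc.edu/kosart.condor/x.dat  (SRB)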
Case Study: DAGMan [Diagram: DAGMan reads a .dag file and submits jobs A, B, C, D to the Condor job queue]
Current DAG Structure • All jobs are assumed to be computational jobs [Diagram: a DAG of Job A, Job B, Job C, Job D]
Current DAG Structure • If data transfer to/from remote sites is required, it is performed via pre- and post-scripts attached to each job [Diagram: Job B wrapped with PRE and POST scripts among Jobs A, C, D]
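In .dag form this is roughly the sketch below. The JOB, SCRIPT PRE/POST, and PARENT...CHILD keywords are standard DAGMan syntax; the submit-file and script names, and the exact diamond-shaped dependency structure, are assumptions based on the four-job diagrams above.

    # current approach: data movement hidden in pre-/post-scripts
    JOB A a.submit
    JOB B b.submit
    JOB C c.submit
    JOB D d.submit
    SCRIPT PRE  B stage_in.sh     # hypothetical script names
    SCRIPT POST B stage_out.sh
    PARENT A CHILD B C
    PARENT B C CHILD D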
New DAG Structure • Add DaP jobs to the DAG structure as first-class nodes, as sketched below [Diagram: Job B's PRE and POST scripts are replaced by DaP nodes: reserve in & out space → transfer in → Job B → release in → transfer out → release out]
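A sketch of the same fragment with the DaP nodes made explicit. The DATA keyword marking a DaPS-managed node is an assumption (the slides do not name the syntax), as are all file names; the node ordering follows the diagram above.

    # new approach: DaP jobs are first-class DAG nodes around Job B
    DATA reserve_io   reserve.dap      # hypothetical DATA keyword and file names
    DATA xfer_in      transfer_in.dap
    JOB  B            b.submit
    DATA release_in   release_in.dap
    DATA xfer_out     transfer_out.dap
    DATA release_out  release_out.dap
    PARENT reserve_io CHILD xfer_in
    PARENT xfer_in CHILD B
    PARENT B CHILD release_in
    PARENT release_in CHILD xfer_out
    PARENT xfer_out CHILD release_out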
New DAGMan Architecture [Diagram: DAGMan reads the .dag file and submits computational jobs (A, B, C, D) to the Condor job queue and DaP jobs (X, Y) to the DaPS job queue]
Conclusions • More intelligent management of remote data transfer & staging • Increase local storage utilization • Maximize CPU throughput
Future Work • Enhanced interaction with DAGMan • Data-level management instead of file-level management • Possible integration with Kangaroo to keep the network pipeline full
Thank You for Listening & Questions • For more information: • Drop by my office anytime • Room: 3361, Computer Science & Stats. Bldg. • Email: condor-admin@cs.wisc.edu