Storage Management
Douglas Thain
University of Wisconsin
thain@cs.wisc.edu
GriPhyN NSF Project Review
29-30 January 2003
University of Chicago
GriPhyN Storage Management
• A key component of the Virtual Data Grid
• Data transfer IS a job – need to apply similar mechanisms
• A set of standalone tools that Play Well with Others
• Must be acceptable to many communities:
• Condor, GriPhyN, traditional system admins.
• Target is deployment in VDT/NMI.
• Must facilitate broader technology transfer
• A set of overlapping research problems
• Other Grid R&D: PPDG, US-CMS
• Other CS Research: SOSP, ISCA, FAST
Douglas Thain, University of Wisconsin thain@cs.wisc.edu
[Diagram: NeST, SRB, Planner, Master Worker, DAGMan, Condor-G (compute), Stork (DaP), Gate Keeper, RFT, StartD, Grid Shell]
The Problem of Remote I/O
• Survive disconnections.
• Hide high latencies.
• Hide bursty throughput.
• Audit progressive results.
• Ensure consistency between job and storage.
• Arbitrate between users.
• Make it easy.
[Diagram: Condor Queue, Job, Storage, Unreliable Internet, Remote CPUs]
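The first requirement, surviving disconnections, can be illustrated with a minimal sketch. This is not how Condor or NeST implements it; `TransientIOError`, `retry_io`, and `make_flaky_read` are hypothetical names introduced only for illustration, and the retry policy (doubling delay, fixed attempt limit) is an assumption.

```python
import time


class TransientIOError(Exception):
    """Hypothetical stand-in for a dropped connection to remote storage."""


def retry_io(operation, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TransientIOError, wait and retry.

    The delay doubles after each failure. The last failure is re-raised
    so the caller can decide what "wrong" means for this job.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientIOError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2


def make_flaky_read(failures, payload):
    """Build a fake remote read that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def read():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientIOError("connection lost")
        return payload

    return read


if __name__ == "__main__":
    # Two simulated disconnections are hidden from the caller.
    print(retry_io(make_flaky_read(2, b"event data")))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production system would also need to bound total time and log each retry for auditing.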
NeST Turns Raw Storage into an Appliance
Allocable, Auditable, Authentic, Accessible
[Diagram: Appl, Web Browser, Admins and Owners, End-User Tools, User-Level Adapter, OS Kernel, Cmd Tool, HTTP, NFS, Chirp, FTP, GridFTP, NeST, Stork, SRM, ClassAds, Storage, POSIX, Match Maker, Brokers and Mgrs]
Stork Makes Data Transfer a Managed Task
Authenticated, Allocated, Logged, Supervised
Submit, Query, Remove
[Diagram: Chimera, DAGMan, Stork, NeST, SRB, GridFTP, Allocation, Activation, NeST, FTPd, Transfer, Data Mvmt Queue, Activity Log, Storage, Storage]
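The slide's idea of a transfer that can be submitted, queried, removed, and logged can be sketched as a toy in-memory queue. This is an illustration only and not Stork's actual interface: the class `TransferQueue`, its method names, the status strings, and the `mover` callback are all assumptions made for this example.

```python
import itertools


class TransferQueue:
    """Toy data-movement queue with an activity log (hypothetical design)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}   # job id -> {"src", "dest", "status"}
        self.log = []     # (job id, event) pairs, in order of occurrence

    def submit(self, src, dest):
        """Queue a transfer and return its job id."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"src": src, "dest": dest, "status": "queued"}
        self.log.append((job_id, "submitted"))
        return job_id

    def query(self, job_id):
        """Return the current status string of a job."""
        return self._jobs[job_id]["status"]

    def remove(self, job_id):
        """Cancel a job that has not started; return True on success."""
        job = self._jobs[job_id]
        if job["status"] != "queued":
            return False
        job["status"] = "removed"
        self.log.append((job_id, "removed"))
        return True

    def run_next(self, mover):
        """Run the oldest queued job with mover(src, dest); return its id."""
        for job_id, job in self._jobs.items():
            if job["status"] == "queued":
                try:
                    mover(job["src"], job["dest"])
                    job["status"] = "done"
                except OSError:
                    job["status"] = "failed"
                self.log.append((job_id, job["status"]))
                return job_id
        return None


if __name__ == "__main__":
    queue = TransferQueue()
    first = queue.submit("site-a:/data/in", "site-b:/data/in")
    queue.run_next(lambda src, dest: None)  # placeholder mover, does nothing
    print(queue.query(first), queue.log)
```

Treating the mover as a pluggable callback mirrors the slide's point that the same queue can sit in front of different transfer protocols; authentication and space allocation are left out of this sketch.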
Integrating CPU and I/O
[Diagram: Chimera, DAGMan, Condor Queue, Job, Adapter, Remote CPUs, Stork, NeST, Storage, Status and Supervision, "What should I use?", Reservation, Transfer, Chirp Input, Chirp Output, Kangaroo Output, GridFTP]
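One way to read this slide is that storage reservation, data transfer, and computation become nodes in a single dependency graph. The following is a minimal sketch of that ordering using Python's standard `graphlib`; it does not use DAGMan or Stork, and the step names in `workflow` are hypothetical.

```python
from graphlib import TopologicalSorter


def plan(dependencies):
    """Return one valid execution order for a {step: prerequisites} mapping."""
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "reserve_space": set(),
    "stage_in": {"reserve_space"},
    "compute": {"stage_in"},
    "stage_out": {"compute"},
    "release_space": {"stage_out"},
}

if __name__ == "__main__":
    for step in plan(workflow):
        print(step)
```

Because data placement steps and the compute step share one graph, a scheduler can apply the same ordering, retry, and logging rules to both.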
Ph.D. Research Enabled by GriPhyN
• NeST: Network Storage Technologies
  • John Bent and Joseph Stanley
  • http://www.cs.wisc.edu/condor/nest
• Stork: Data Placement Manager
  • Tevfik Kosar
  • http://www.cs.wisc.edu/condor/stork
• Distributed I/O in a Faulty System
  • Douglas Thain
  • http://www.cs.wisc.edu/~thain
• Grid Security
  • Ian Alderman
• Many MS and BS students.
Future Work:
• I/O - CPU Specialization in Workloads
  • Automatically provision a cluster with the correct number of storage/worker nodes.
• DaP / DAG integration
  • Convergence of technologies for reliable data scheduling and reliable job scheduling.
• Error Management
  • What happens when something goes wrong?
  • How does the system/user define wrong?
• Security
  • An online CA to issue task-specific certificates just-in-time for work to be done.