The Virtual Data Toolkit Todd Tannenbaum (Alain Roy) VDT
What is the VDT? • A packaging of software • Grid software (Globus, Condor-G…) • Virtual data software (Chimera) • Utilities • An easy installation mechanism • Testing and hardening • Support
Who makes the VDT? • Grid Physics Network (GriPhyN) • Constructs the VDT • International Virtual Data Grid Laboratory (IVDGL) • Testing and hardening • Very tight collaboration between GriPhyN and IVDGL
Who makes the VDT? (2) • Core VDT Team: • Miron Livny: The boss • Alain Roy • Carey Kireyev • VDT Testing • Xin Zhao • Brian Moe • Pacman • Saul Youssef
Who uses the VDT? • GriPhyN collaborators • USCMS: In use today • USAtlas: In use today • LIGO: Will use soon • SDSS: Will use soon • European Data Grid • Uses subset of software • Uses just RPMs • LCG
What exactly is in VDT? • VDT 1.1.8: • Globus 2.2.4 + advisories + patches • Condor & Condor-G 6.5.1 • Chimera/Pegasus • RLS • GLUE Schema • CA Certificates • Fault Tolerant Shell • EDG’s Make Gridmap • EDG’s CRL Update • ClassAds • NetLogger
Grid Software Installation Typical Grid Software Installation Experience… VDT Installation Experience!
VDT Installation • 2 Methods • Pacman • RPM
Pacman Installation • Goal: • Type a single command • Everything downloads • Everything installs • Everything is configured • No questions asked • We’re close: • A few questions if you’re root • Basic configuration, may need changing
Pacman Installation (2)
• Download Pacman
    • http://physics.bu.edu/~youssef/pacman/
• Install VDT
    cd <install-directory>
    pacman -get VDT-Server
    pacman -get VDT-Client
    ls
    condor/  globus/  post-install/  setup.sh
    edg/     gpt/     replica/       vdt/
    ftsh/    perl/    setup.csh      vdt-install.log
• Use VDT
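A quick sanity check after the two `pacman -get` commands could look like the sketch below. It only tests for some of the entries shown in the `ls` listing above. The function name `check_vdt_install` is made up for illustration and is not part of Pacman or the VDT, and a client-only or server-only install may legitimately lack some of these entries.

```sh
#!/bin/sh
# check_vdt_install DIR
# Hypothetical helper (not shipped with Pacman or the VDT): report which of
# the entries seen in the slide's `ls` output are missing from DIR.
# Returns 0 if all are present, 1 otherwise.
check_vdt_install() {
    dir=$1
    missing=0
    for entry in condor globus post-install vdt setup.sh setup.csh vdt-install.log
    do
        if [ ! -e "$dir/$entry" ]
        then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run from the install directory):
#   check_vdt_install . && echo "install directory looks complete"
```

The list of entries is an assumption taken from the listing on the slide; adjust it to the packages actually installed.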
Pacman post-installation • Post-install directory: • Notes on configuration choices made • Instructions for editing configuration • Configuration scripts: • Globus configuration • Condor configuration
RPM Installation • Subset of whole VDT • Globus • Condor-G • Nice RPMs: • We repackage Globus • A dozen Globus RPMs, not hundreds • No configuration • No post-installation help
Testing • VDT team is building a test suite • Interaction with LCG testing group • Working with NMI* to leverage: • NMI test suite • Stress testing • Application testing (CMS pipeline) • NMI test infrastructure • * NMI = NSF Middleware Initiative: http://www.nsf-middleware.org
Support • Send us questions or problems • We will solve them if we can • We will interact with the developers, if necessary
Interaction with EDG • EDG gets Globus and Condor-G RPMs from VDT • We do what we can to solve problems and get changes to Globus and Condor • We want to make a great package for you
Chimera Virtual Data System • Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. • The Chimera catalog can be used by application environments to describe a set of application programs ("transformations"), and then track all the data files produced by executing those applications ("derivations"). • Chimera contains the mechanism to locate the "recipe" to produce a given logical file, in the form of an abstract program execution graph. These abstract graphs are then turned into an executable DAG for the Condor-G DAGMan meta-scheduler by the bundled Pegasus planner. • Enables on-demand execution of computation schedules constructed from database queries.
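To make the "recipe" idea concrete, here is a toy sketch. It is not Chimera's catalog format or its Virtual Data Language. It assumes a hypothetical plain-text catalog with one derivation per line (`output transformation input...`) and prints the transformations needed to produce a target file in execution order, which is roughly the information a planner would turn into a DAG. The function name `plan` and the file names in the example are illustrative only.

```sh
#!/bin/sh
# plan CATALOG TARGET
# Toy recipe lookup over a hypothetical catalog file whose lines look like:
#     <output-file> <transformation> <input-file>...
# Prints "<transformation> -> <output-file>" for every derivation needed to
# build TARGET, inputs first. Files with no catalog entry are treated as
# raw (measured) data and produce no output.
plan() {
    awk -v target="$2" '
        # Load the catalog: remember the transformation and inputs per output.
        NF >= 2 {
            tr[$1] = $2
            nin[$1] = NF - 2
            for (i = 3; i <= NF; i++) dep[$1, i - 2] = $i
        }
        # Depth-first walk; "seen" guards against repeats and cycles.
        function visit(f,    k) {
            if (!(f in tr) || (f in seen)) return
            seen[f] = 1
            for (k = 1; k <= nin[f]; k++) visit(dep[f, k])
            print tr[f] " -> " f
        }
        END { visit(target) }
    ' "$1"
}

# Example catalog (hypothetical):
#     events.dat reconstruct raw.dat
#     hist.dat   histogram   events.dat
#     plot.png   render      hist.dat
# Usage:
#     plan catalog.txt plot.png
```

Asking for a file that has no derivation, such as the raw input, prints nothing, which mirrors the distinction between measured and derived data.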
NetLogger • “Networked Application Logger” • API with calls you add to existing source code to generate time-stamped monitoring events (sent to a file, network server, syslogd, or RAM) • Visualization Tools • Storage and Retrieval Tools • Store all events into a database
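The idea of instrumenting existing code with time-stamped events can be sketched in a few lines of shell. This is not the NetLogger API or its event format; `log_event`, the `EVENT_LOG` variable, and the `key=value` layout are assumptions chosen for illustration, and only the "send to a file" destination is shown.

```sh
#!/bin/sh
# log_event NAME [key=value]...
# Hypothetical stand-in for an instrumentation call: append one
# time-stamped record to the file named by $EVENT_LOG
# (default: events.log in the current directory).
log_event() {
    event=$1
    shift
    # UTC timestamp plus the node name, then the event name.
    line="ts=$(date -u +%Y-%m-%dT%H:%M:%SZ) host=$(uname -n) event=$event"
    # Any extra arguments are appended as-is, e.g. file=data.tar.
    for field in "$@"
    do
        line="$line $field"
    done
    printf '%s\n' "$line" >> "${EVENT_LOG:-events.log}"
}

# Example: bracket an existing step with start/end events.
#   log_event transfer.start file=data.tar
#   cp -r /fresh/data .
#   log_event transfer.end file=data.tar status=$?
```

Bracketing a step with start and end events is what makes the later visualization and retrieval tools useful, since durations fall out of the timestamps.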
Fault Tolerant Shell (FTSH) • The Grid is a hard environment. • FTSH • The ease of scripting with very precise error semantics. • Exception-like structure allows scripts to be both succinct and safe. • A focus on timed repetition simplifies the most common form of recovery in a distributed system. • A carefully-vetted set of language features limits the "surprises" that haunt system programmers.
Simple Bourne script…
    #!/bin/sh
    cd /work/foo
    rm -rf data
    cp -r /fresh/data .
What if ‘/work/foo’ is unavailable?
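Getting Grid Ready…
    #!/bin/sh
    # Try the cd up to three times, pausing 5 seconds after each failure.
    ok=no
    for attempt in 1 2 3
    do
        cd /work/foo
        if [ $? -ne 0 ]
        then
            echo "cd failed, trying again..."
            sleep 5
        else
            ok=yes
            break
        fi
    done
    # $? after the loop does not reflect the cd, so test the flag instead.
    if [ "$ok" != yes ]
    then
        echo "couldn't cd, giving up..."
        exit 1
    fi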
Or with FTSH
    #!/usr/bin/ftsh
    try 5 times
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
Or with FTSH
    #!/usr/bin/ftsh
    try for 3 days or 100 times
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
Or with FTSH
    #!/usr/bin/ftsh
    try for 3 days every 1 hour
        cd /work/foo
        rm -rf bar
        cp -r /fresh/data .
    end
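For comparison, the combination of a time limit, an attempt limit, and a fixed interval can be approximated in plain sh. The helper below is a sketch only: `try_for` is a made-up name, it retries a single command rather than a block, and it does not reproduce ftsh features such as timeouts on a running command or process cancellation. It assumes `date +%s` is available, which holds on common Linux systems.

```sh
#!/bin/sh
# try_for LIMIT_SECONDS MAX_TIMES INTERVAL_SECONDS command [args...]
# Hypothetical helper (not part of ftsh): run the command until it succeeds,
# giving up after MAX_TIMES attempts or when the next attempt would start
# after LIMIT_SECONDS have elapsed. Returns 0 on success, 1 on giving up.
try_for() {
    limit=$1
    max=$2
    interval=$3
    shift 3
    start=$(date +%s)
    attempt=0
    while [ "$attempt" -lt "$max" ]
    do
        attempt=$((attempt + 1))
        if "$@"
        then
            return 0
        fi
        now=$(date +%s)
        # Stop if sleeping for the interval would exceed the time limit.
        if [ $((now - start + interval)) -gt "$limit" ]
        then
            break
        fi
        # Only sleep when another attempt is still allowed.
        if [ "$attempt" -lt "$max" ]
        then
            sleep "$interval"
        fi
    done
    return 1
}

# Example, roughly "try for 3 days or 100 times, every 1 hour":
#   refresh() { cd /work/foo && rm -rf bar && cp -r /fresh/data . ; }
#   try_for 259200 100 3600 refresh || echo "giving up"
```

Wrapping the three commands in a function chained with `&&` gives the same all-or-nothing behavior as the body of the `try` block: the first failure ends the attempt.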
Or with FTSH
    hosts="mirror1.wisc.edu mirror2.wisc.edu mirror3.wisc.edu"
    forany h in ${hosts}
        echo "Attempting host ${h}"
        wget http://${h}/some-file
    end
    echo "Got file from ${h}"
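The same "first alternative that works" pattern can be written in plain sh, at the cost of more bookkeeping. In this sketch `first_ok` and `fetch` are hypothetical names, and the winning item is reported on standard output instead of being left in a loop variable as `forany` does.

```sh
#!/bin/sh
# first_ok COMMAND item...
# Hypothetical helper (not part of ftsh): run COMMAND with each item in turn
# and stop at the first one that succeeds. The winning item is printed on
# stdout; the command's own output is sent to stderr so it does not mix in.
# Returns 1 if every item fails.
first_ok() {
    cmd=$1
    shift
    for item in "$@"
    do
        if "$cmd" "$item" >&2
        then
            echo "$item"
            return 0
        fi
    done
    return 1
}

# Example with the mirrors from the slide:
#   fetch() { wget -q "http://$1/some-file"; }
#   if h=$(first_ok fetch mirror1.wisc.edu mirror2.wisc.edu mirror3.wisc.edu)
#   then
#       echo "Got file from $h"
#   fi
```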
FTSH • All the usual constructs • Redirection, loops, conditionals, functions, expressions, nesting, … • And more • Logging • Timeouts • Process Cancellation • Complete parsing at startup • File cleanup • Used on Linux, Solaris, Irix, Cygwin, … • Simplify your life!
VDT’s Future • Additional Software • MyProxy, Java ClassAds • Access to new versions • Globus 3.0 • Extra VDT to help early adopters • Condor-G will submit to GT2 or GT3 • Helping You • What can we do to make life easier for you?
Where do you learn more? • http://www.griphyn.org/vdt • Support: • vdt-support@ivdgl.org • Alain Roy: roy@cs.wisc.edu • Miron Livny: miron@cs.wisc.edu