VDT and Interoperable Testbeds
Rob Gardner, University of Chicago
Outline
• VDT – Status and Plans
  • VDT Middleware
  • VDT Team
  • VDT release description
  • VDT and the LCG project
• Interoperable Grids – Status and Plans
  • GLUE working group
  • Grid operations
  • Distributed facilities and resource monitoring
  • ATLAS-kit deployment
  • WorldGrid: iVDGL-DataTAG grid interoperability project, ATLAS SC2002
Rob Gardner, University of Chicago VDT and Interoperable Testbeds
VDT Middleware
• Joint GriPhyN and iVDGL deliverable
• Basic middleware for US LHC program
• VDT 1.1.5 in use by US CMS testbed
• US ATLAS testbed is installing VDT with WorldGrid components for EU interoperability
• Release structure
  • introduce new middleware components from GriPhyN, iVDGL, EDG, and other grid developers and working groups
  • framework for interoperability software and schema development (e.g. GLUE)
Team
• VDT Group
  • GriPhyN (development) and iVDGL (packaging, configuration, testing, deployment)
  • Led by Miron Livny of University of Wisconsin Madison
  • Alain Roy (CS staff, Condor team)
  • Scott Koranda (LIGO, University of Wisconsin, Milwaukee)
  • Saul Youssef (ATLAS, Boston University)
  • Scott Gose (iVDGL/Globus, Argonne Lab)
  • New iVDGL hire (CS) from Madison starting December 1
• Plus a community of participants
  • Dantong Yu, Jason Smith (Brookhaven): mkgridmap, post-install configuration
  • Patrick McGuigan, UT Arlington: valuable testing/install feedback, PIPPY
  • Alan Tackett, Bobby Brown (Vanderbilt): installation feedback, documentation
VDT
• Basic Globus and Condor, plus EDG software (e.g. GDMP)
• Plus lots of extras …
• Pacman
• VO management
• Test harness
• Glue schema
• Virtual Data Libraries
• Virtual Data Catalog
• Language and interpreter
• Server and Client
VDT Releases
• Current version: VDT 1.1.5, released October 29, 2002
• Major recent upgrades (since 1.1.2)
  • Patches to Globus 2.0, including the OpenSSL 0.9.6g security update
  • A new and improved Globus job manager created by the Condor team that is more scalable and robust than the one in Globus 2.0. This job manager has been integrated into the Globus 2.2 release.
  • Condor 6.4.3 and Condor-G 6.4.3
  • New software packages, including:
    • FTSH (the fault tolerant shell) version 0.9.9
    • EDG mkgridmap (including the Perl modules that it depends on)
    • EDG CRL update
    • DOE and EDG CA signing policy files, so you can interact with Globus installations and users that use CAs from the DOE and EDG
    • vdt-version, a program to tell you what version of the VDT has been installed
    • Test programs so that you can verify that your installation works
  • The VDT can be installed anywhere; it no longer needs to be installed into the /vdt directory
• VDT configuration
  • Sets up Globus and GDMP daemons to run automatically
  • Configures Condor to work as a personal Condor installation or a central manager
  • Configures Condor-G to work
  • Enables the Condor job manager for Globus, and performs a few other basic Globus configuration steps
  • Does some of the configuration for GDMP
• VDT installation logs created, better README files
• globus-gram-job-manager-tools.sh is now set up properly and is not overwritten when more Globus bundles are installed
• Fixed mkgridmap so that it can find the Perl modules correctly
• Most up-to-date CA signing policy files from the EDG and DOE have been included; new signing policy for the new INFN CA
Deployment with Pacman
• Packaging and post-install configuration: Pacman
• Key piece required to do anything: not only for middleware but also for applications and higher level toolkits
• Tools to easily manage installation and environment
  • fetch, install, configure, add to login environment, update
• Sits on top of many software packaging approaches (rpm, tar.gz, etc.)
• Uses a dependency hierarchy, so one command can drive the installation of a complete environment of many packages
• Packages organized into caches hosted at various sites
  • Distributes responsibility for support
• Has greatly helped in testing/installation of VDT: many new features
• Made it possible to quickly set up grids and application packages
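The "one command drives a complete environment" idea can be illustrated with a small dependency-ordering sketch. This is not Pacman's code or its cache format; the package names, the dependency table, and the `install_order` helper are all hypothetical, and the ordering uses Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical package names and dependencies, for illustration only.
DEPENDENCIES = {
    "VDT-Client": {"Globus", "Condor-G"},
    "Condor-G": {"Condor", "Globus"},
    "Condor": set(),
    "Globus": set(),
}


def install_order(target, deps):
    """Return every package needed for `target`, dependencies first."""
    needed = {}
    stack = [target]
    # Collect the target and everything it depends on, directly or indirectly.
    while stack:
        pkg = stack.pop()
        if pkg not in needed:
            needed[pkg] = set(deps.get(pkg, set()))
            stack.extend(needed[pkg])
    # static_order() yields each package only after all of its dependencies.
    return list(TopologicalSorter(needed).static_order())


order = install_order("VDT-Client", DEPENDENCIES)
print(order)
```

Asking for the single top-level package is enough to obtain a full installation sequence, which is the behaviour the slide attributes to the dependency hierarchy.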
VDT and LCG Project
• VDT and EDG being evaluated for LCG-1 testbed
• EDG
  • Collected, tagged, and supported set of middleware and application software packages and procedures from the European DataGrid project, available as RPMs from a master location
  • Includes application software for deployment and testing on EDG sites
  • Most deployment expects most/all packages to be installed with a small set of uniform configurations
• Base layer of software and protocols is common
  • Globus: X509 certificates, GSI authentication, GridFTP, MDS LDAP monitoring and discovery framework, GRAM job submission
  • Authorization extensions: LDAP VO service
  • Condor: Condor-G job scheduling, matchmaking (ClassAds), Directed Acyclic Graph (job task dependency manager – DAGMAN)
  • File movement and storage management: GDMP, GridFTP
• Possible solution: VDT + EDG WP1, WP2
• If adopted, the PIs of GriPhyN and iVDGL, and the US CMS and US ATLAS computing managers, will need to define a response on next steps for support
Grid Operations
• Operations areas
  • registration authority
  • VO management
  • Information infrastructure
  • Monitoring
  • Trouble tracking
  • Coordinated Operations
  • Policy
• Full time effort
  • Leigh Grundhoefer (IU)
  • New hire (USC)
• Part time effort
  • Ewa Deelman (USC)
  • Scott Koranda (UW)
  • Nosa Olmo (USC)
  • Dantong Yu (BNL)
Distributed Facilities Monitoring
• VO Centric Nagios – Sergio Fantinel, Gennaro Tortonne (DataTAG)
• VO Centric Ganglia – Catalin Dumitrescu (U of Chicago)
• Ganglia
  • Cluster resource monitoring package from UC Berkeley
  • Local cluster and meta-clustering capabilities
  • Meta-daemon storage of machine sensors for CPU load, memory, I/O
  • Organize sensor data hierarchically
  • Collect information about job usage
  • Assign users to VOs by queries to Globus job manager
• Tool for policy development
  • express, monitor and enforce policies for usage according to VO agreements
• Deployment
  • Both packages deployed on US and DataTAG ATLAS and CMS sites
  • Nagios plugin work by Shawn McKee – sensors for disk, I/O usage
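A rough sketch of VO-centric accounting: usage is summed per VO after mapping users to VOs, then compared with an agreed share. Nothing here is Ganglia or Nagios code; the user-to-VO table, the share values, and both function names are hypothetical stand-ins for what the slide describes as job-manager queries and VO agreements.

```python
from collections import defaultdict

# Hypothetical user-to-VO mapping (the slide obtains this from the Globus job manager).
USER_TO_VO = {"alice": "atlas", "bob": "cms", "carol": "atlas"}

# Hypothetical agreed fraction of the resource for each VO.
VO_SHARE = {"atlas": 0.6, "cms": 0.4}


def usage_by_vo(job_records, user_to_vo):
    """Sum CPU hours per VO from (user, cpu_hours) records."""
    totals = defaultdict(float)
    for user, cpu_hours in job_records:
        totals[user_to_vo.get(user, "unknown")] += cpu_hours
    return dict(totals)


def over_share(totals, shares):
    """List the VOs whose fraction of total usage exceeds their agreed share."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return sorted(
        vo for vo, used in totals.items()
        if used / grand_total > shares.get(vo, 0.0)
    )


jobs = [("alice", 30.0), ("bob", 50.0), ("carol", 20.0)]
totals = usage_by_vo(jobs, USER_TO_VO)
flagged = over_share(totals, VO_SHARE)
print(totals, flagged)
```

The same per-VO totals could feed either monitoring displays or a policy check; enforcement itself is outside the scope of this sketch.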
Screen Shots
Glue Project report from Ruth Pordes
• Background
  • Joint effort of iVDGL Interoperability Group and DataTAG WP4
  • Led by Ruth Pordes (Fermilab, also PPDG Coordinator)
  • Initiated by HICB (HEP Intergrid Coordination Board)
• Goals
  • Technical
    • Demonstrate that each Grid Service or Layer can interoperate
    • Provide basis for interoperation of applications
    • Identify differences in protocols and implementations that prevent interoperability
    • Identify gaps in architecture and design that prevent interoperability
  • Sociological
    • Learn to work across projects and boundaries without explicit mandate or authority but for the “longer term good of the whole”
    • Intent to expand from 2 continents to global: inclusive, not exclusive
  • Strategic
    • Any Glue code, configuration and documents developed will be deployed and supported through the EDG and VDT release structure
    • Once interoperability is demonstrated and is part of the ongoing culture of global grid middleware projects, Glue should no longer be needed
    • Provide short term experience as input to GGF standards and protocols
    • Prepare the way for movement to new protocols: web and grid services (OGSA)
Glue Security
• Authentication: X509 Certificate Authorities, Policies and Trust
  • DOE Science Grid SciDAC project CA is trusted by the European Data Grid and vice versa
  • Experiment testbeds (ATLAS, CMS, ALICE, D0 and BaBar) use cross-trusted certificates
  • Users are also starting to understand the need for multiple certificates
  • Agreed upon mechanisms for communicating new CAs and CRLs have been shown to work, but more automation for revocation is clearly needed
• Authorization
  • Initial authorization mechanisms are in place everywhere using the Globus gridmapfiles
  • Various supporting procedures are used to create gridmapfiles from LDAP databases of certificates or other means
  • Identified requirement for more control over access to resources at time of request for use, but no accepted or interoperable solutions are in place today
  • Virtual Organization Management or Community Authorization Service is under active discussion; see the PPDG SiteAA mail archives. And this is just one mail list of several. http://www.ppdg.net/pipermail/ppdg-siteaa/2002
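As a minimal sketch of the "create gridmapfiles from a database of certificates" step: a gridmapfile line pairs a quoted certificate subject with a local account name. The records below are hypothetical and stand in for the result of an LDAP query, which is not shown; `gridmap_lines` is an illustrative helper, not the EDG mkgridmap tool.

```python
def gridmap_lines(entries):
    """Format (certificate subject, local account) pairs as gridmapfile lines."""
    lines = []
    # set() drops exact duplicates; sorted() keeps the output stable.
    for subject, account in sorted(set(entries)):
        lines.append(f'"{subject}" {account}')
    return lines


# Hypothetical records; a real procedure would read these from a VO's LDAP server.
members = [
    ("/O=Grid/OU=Example/CN=Alice Example", "atlas001"),
    ("/O=Grid/OU=Example/CN=Bob Example", "atlas002"),
    ("/O=Grid/OU=Example/CN=Alice Example", "atlas001"),  # duplicate entry
]

lines = gridmap_lines(members)
print("\n".join(lines))
```

A static file like this is what limits control "at time of request": access changes only when the file is regenerated.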
Other Glue Work Areas
• GLUE Schema (for resource discovery, job submission)
  • EDG and the MDS LDAP schema and information were initially very different
  • Commitment made up front to move to common resource descriptions
  • Effort has been under way since February: weekly phone meetings and much email
  • GLUE Schema Compute and Storage Information are released in V1 in MDS 2.2 and will be in EDG V2.0; defined with UML and LDIF
  • Led to better understanding by all participants of CIM goals and to collaboration with CIM schema group through the GGF
• File transfer and storage
  • Interoperability tests using GridFTP and SRM V1.0 within the US have started with some success
• Joint demonstrations
  • Common submission to testbeds based on VDT and EDG in a variety of ways
  • ATLAS Grappa to iVDGL sites (US ATLAS, CMS, LIGO, SDSS) and EDG (JDL on UI)
  • CMS-MOP
ATLAS-kit
• ATLAS-kit
  • RPMs based on Alessandro DeSalvo’s work; distributed to DataTAG and EDG sites; release 3.2.1
  • Luca Vacarossa packaged a version for VDT sites with Pacman
  • Distributed as part of ScienceGrid cache
  • ATLAS-kit-verify: invokes ATLSIM, does one event
• New release
  • 4.0.1 in preparation by AD
  • Distribution to US sites to be done by Yuri Smirnov (new ATLAS iVDGL hire, started November 1)
  • Continued work with Flavia Donno and others
WorldGrid http://www.ivdgl.org/worldgrid
• Collaboration between US and EU grid projects
• Shared use of global resources – across experiment and Grid domains
• Common submission portals: ATLAS-Grappa, EDG-Genius, CMS MOP master
• VO centric “grid” monitoring: Ganglia and Nagios-based
• Infrastructure development project
• Common information index server with Globus, EDG and GLUE schema
• 18 sites, 6 countries
• ~130 CPUs
• Interoperability components (EDG schema and IP providers for VDT servers, UI and JDL for EDT sites)
• First steps towards policy instrumentation and monitoring
• Packaging with Pacman (VDT sites) and RPMs/lcfg (DataTAG sites)
• ScienceGrid: ATLAS, CMS, SDSS and LIGO application suites
WorldGrid Site
• VDT installation and startup
• Packaging and installation, configuration
• Gridmap file generation
• GIIS(s) registration
• Site configuration, WorldGrid-WorkerNode testing
• EDG information providers and testing
• Ganglia sensors, instrumentation
• Nagios plugins, display clients
• EDG packaged ATLAS and CMS code
• Sloan applications
• Testing with Grappa-ATLAS submission (Joint EDT and US sites)
WorldGrid at IST2002 and SC2002
Grappa
• Grappa
  • Web-based interface for Athena job submission to Grid resources
  • First one for ATLAS
  • Based on XCAT Science Portal technology developed at Indiana
• Components
  • Built from Jython scripts using Java-based grid tools
  • Framework for web-based job submission
  • Flexible: user-definable scripts saved as 'notebooks'
• Interest from GANGA team to work collaboratively on grid interfaces
• IST2002 and SC2002 demos
XCAT Science Portal
• Portal framework for creating “science portals” (application-specific web portals that provide an easy and intuitive interface for job execution on the Grid)
• Users compose notebooks to customize portal (e.g. the Athena Notebook)
• Jython scripts (the user notebooks)
  • flexible Python scripting interface
  • easy incorporation of Java-based toolkits
• Java toolkit
  • IU XCAT Technology (component technology)
  • CoG (Java implementation of Globus)
  • Chimera (Java Virtual Data toolkit-GriPhyN)
• HTML form interface
  • runs over Jakarta Tomcat server using https (secure web protocols)
• Java integration increases system-independence/robustness
Athena Notebook
• Jython implementation of Java-based grid toolkits (CoG, etc.)
• Job parameter input forms (Grid resources, Athena JobOptions, etc.)
• Web-based framework for interactive job submission, integrated with a script-based framework for interactive or automatic (e.g. cron job) job submission
• Remote job monitoring (for both interactive and cron-based jobs)
• Atlfast and Atlsim job submission
• Visual access to Grid resources
  • Compute resources
  • MAGDA catalogue
  • System health monitors: Ganglia, Nagios, Hawkeye, etc.
• Chimera Virtual Data toolkit: tracking of job parameters
Grappa communications flow
[Diagram; only its labels survive: web browsing machine (Netscape/Mozilla/Internet Explorer/PalmScape, JavaScript) over https; script-based submission (interactive or cron job) over http; Cactus framework; Grappa portal machine (XCAT Tomcat server); CoG for submission and monitoring; compute resources (Resource A, Resource Z); input files and data copy; data storage (data disk, HPSS, ...); MAGDA (registers file/location, registers file metadata; spider); catalogue browsing over http.]
Grappa Portal
• Grappa is not
  • Just a GUI front end to external grid tools
  • Just a GUI linking to external web services
• But rather
  • An integrated Java implementation of grid tools
  • The portal itself does many tasks
    • Job scheduling
    • Data transfer
    • Parameter tracking
Grappa Milestones
• Design Proposals: Spring 2001
• 1st Atlsim prototype: Fall 2001
• 1st Atlas Software Week Demo: March 2002
• Selected (May 2002) for US Atlas SC2002 Demo
• 1st submission across entire US Atlas testbed (Feb 2002)
• 1st Large Scale Job submission: 50M events (April 2002)
• Integration with MAGDA (May 2002)
• Registration of metadata with MAGDA (June 2002)
• Resource Weighted Scheduling (July 2002)
• Script based production cron system (July 2002)
• GriPhyN-Chimera VDL Integration (Fall 2002)
• EDG Compatibility plus DataTAG integration (Fall 2002)
Spare Slides about Grappa Follow
Various Grappa/Athena modes
• Has been run using locally installed libraries
• Has been run using AFS libraries
• Has been run with static “boxed” versions of Atlfast and Atlsim (where we bring along everything to the remote compute node)
• This can translate to input data as well
  • Do we bring input data to the executable?
  • Bring the executable to the data?
  • Bring everything along?
• Many possibilities
A Quick Grappa Tour
• Demo of running atlfast demo-production
  • Interactive mode
  • Automatic mode
• GRAM contact monitoring
• Links to resources external to the portal
  • Magda metadata queries
  • Ganglia cluster monitoring
Typical User Session
• Start up portal on selected machine
• Start up web browser
• Configure/select testbed resources
• Configure input files
• Submit job
User Session
• Open monitoring window
• Auto refresh (user configurable)
• Monitor/cancel jobs
User Session
• Monitor cluster health
• The new Ganglia version creates an additional level combining different clusters into a “metacluster”
User Session
• Browse MAGDA catalogue
• Search for personal files
• Desired: search for physics collections (based on metadata)
Production Running
• Configure cron job
  • Automatic job submission
  • Writing to MAGDA cache
  • Automatic subnotebook naming
• Set up cron script location and timing
• Script interacts with same portal as web-based submission
• Use interactive mode to monitor production
Production Monitoring
• Configure and test scripts using command line tools
• Command line submission
• Automatic cron submission
• View text log files
Production Monitoring
• Access portal via web
• Cron submissions appear as new subfolders
• Click on subfolder to check what was submitted
• Monitor job status
Production Monitoring
• Select jobs/groups of jobs to monitor
• Auto refresh (user configurable)
• Cancel job button
Production Monitoring
• Browse MAGDA catalogue
• Auto registration of files
• Check metadata (currently available as command line tool)
• Search for files
• Desired: search for physics collections
Metadata in MAGDA
• Metadata published along with data files
• MAGDA registers metadata
• Metadata browsing available as a command line tool: check individual files
Chimera: Virtual Data Toolkit
• Provides tracking of all job parameters
• Browseable parameter catalogue
• Simplified methods for data recreation
  • Processes that crash
  • Data that is lost
  • Data retrieval slower than re-creation
• Condor-G job scheduling
• And much more
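The recreation cases listed above amount to a cost comparison once job parameters are tracked. The sketch below is not Chimera's API: the `choose_action` function, its parameters, and the numbers in the examples are hypothetical, and real transfer and compute times would have to be measured or estimated.

```python
def choose_action(size_mb, bandwidth_mb_per_s, recreate_seconds, available=True):
    """Decide whether to retrieve a stored file or re-create it from recorded parameters."""
    if not available:
        # Lost data, or a crashed producer: re-creation is the only option.
        return "recreate"
    transfer_seconds = size_mb / bandwidth_mb_per_s
    # Retrieve unless copying the file would take longer than recomputing it.
    return "retrieve" if transfer_seconds <= recreate_seconds else "recreate"


# Hypothetical cases: a large file on a slow link, a small file on a fast link,
# and a file whose stored copy has been lost.
slow_link = choose_action(size_mb=1000, bandwidth_mb_per_s=1.0, recreate_seconds=600)
fast_link = choose_action(size_mb=100, bandwidth_mb_per_s=10.0, recreate_seconds=600)
lost_file = choose_action(size_mb=100, bandwidth_mb_per_s=10.0, recreate_seconds=600,
                          available=False)
print(slow_link, fast_link, lost_file)
```

Re-creation is only possible because the parameters of the original job were recorded, which is the role the slide gives to the parameter catalogue.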
Grappa & US Atlas Testbed
• Currently about 60 CPUs available
  • 6 Condor pools: IU, OU, UTA, BNL, BU, LBL
  • 2 standalone machines: ANL, UMICH
• Grappa/testbed production rate, cron-based
  • Achieved 15M atlfast events/day
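For scale, the two figures on this slide can be turned into a per-CPU rate. The calculation assumes that all of the roughly 60 CPUs contributed and ran for the full day; the slide does not say so, and the result is only an order-of-magnitude estimate.

```python
# Figures quoted on the slide.
EVENTS_PER_DAY = 15_000_000   # achieved atlfast production rate
CPUS = 60                     # "about 60 CPUs"; assumed to be all in use

SECONDS_PER_DAY = 24 * 60 * 60

# Average events handled by one CPU in a day, under the assumption above.
per_cpu_per_day = EVENTS_PER_DAY / CPUS

# The same rate expressed per second.
per_cpu_per_second = per_cpu_per_day / SECONDS_PER_DAY

print(f"{per_cpu_per_day:.0f} events/CPU/day")
print(f"{per_cpu_per_second:.2f} events/CPU/s")
```

Under these assumptions each CPU averages 250,000 events per day, a little under 3 events per second.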
Grappa: DC1 -- phase 1
• Atlfast-(demo)-production testing successful
  • Simple atlfast demo with several pythia options
  • Large scale submission across the grid
  • Interactive and automatic submission demonstrated
  • Files and metadata incorporated into MAGDA
• Atlsim production testing initially successful
  • Atlsim run in both “boxed” and “local” modes
  • Only one atlsim mode tested
Grappa: DC1 -- phase 2
• Atlsim notebook upgrade:
  • Could make notebook tailor-made for phase 2
  • Launching mechanism for atlsim
• Atlsim does a lot that grappa could do
  • vdc queries
  • data transfer (input or output)
  • Options: leave it in atlsim for now, or incorporate some pieces into grappa
• Would require additional manpower to define/incorporate/test atlsim production notebook