Connecting OurGrid & GridSAM: A Short Overview
Content • Goals • OurGrid: architecture overview • OurGrid: short overview • GridSAM: short overview • GridSAM: example deployment with Condor • Different paradigms: OurGrid • Different paradigms: GridSAM • Issues: File staging • Issues: many related job submissions • OurGrid<>GridSAM connector
Goals • To maintain two grid environments in parallel: OurGrid & Condor • To handle job submission process through common interface: JSDL, using GridSAM • To build connector for GridSAM to talk to OurGrid • GridSAM can already talk to Condor through a connector, no problems here
OurGrid: short overview • Workers are typically desktop computers that can run jobs directly in their OS or through virtualization (Xen, VMware, VirtualBox etc.) • „Clouds of Workers” are controlled by Peers • Jobs are submitted through Brokers • Two possibilities here: • Broker can be a dedicated web site interfacing with specific Peers • Broker can be any machine with the MyGrid tool installed that communicates with specified Peers
GridSAM: short overview • Web Service-type middleware sitting between the job submitter and the core grid machinery • Modular architecture: can talk to many grid infrastructures through specific connectors • Collects job submissions sent as XML JSDL files • Manages multiple submissions thanks to persistence and monitors the lifecycle of each submission • After accepting JSDLs, re-submits jobs directly to the underlying grid machinery as defined in specific connectors
GridSAM: example deployment with Condor • Machine (B) runs GridSAM instance in secured OMII container • Machine (B) has capability of directly re-submitting jobs to Condor Pool (C) • Authorized job submitter (A) can submit jobs over the internet to the GridSAM instance running on (B)
Different paradigms: OurGrid • Designed for labs that have access to a pool of desktop machines whose free CPU cycles can be utilized • Bag-of-Tasks: jobs are usually disjoint units with independent input and output • Data sets are often small enough to be transferred many times across many machines • As end-user friendly as possible: asks the job submitter only for a JDL job submission specification, input files and output files • All details of job scheduling and file transfer are hidden from the job submitter
Different paradigms: GridSAM • Designed primarily for labs utilizing high performance computing (HPC) techniques on a few powerful machines • HPC is typically used for CPU-demanding computations that use extensive data sets • Every millisecond is important: job specification, input and output files must be handled with minimum human and OS intervention • Jobs are often dependent on very large data sets, so file transfer should be minimized • Data must be accessed in a fast and secure way, preferably through URIs, which require minimum external intervention • The URIs must be specified directly in the JSDL file
Issues: File staging • In OurGrid, the MyGrid tool takes care of the transfer of input files, distributing them according to the BoT paradigm, and of the transfer of output files back to the job submitter • Also, when submitting through a web site, feedback is sent when output files are available for download • The job submitter can just point to files on its own machine, or upload them to some storage server accessible to MyGrid • No dedicated storage is needed for MyGrid to work
Issues: File staging • GridSAM does not handle input and output files by itself; it delegates this subtask to yet another middleware, Apache VFS • VFS was designed to access resources identified by URIs based on fully qualified hostnames and a few recognized protocols (FTP/SFTP, HTTP, GridFTP, WebDAV etc.) • When submitting JSDL using the GridSAM client on a particular machine, one cannot just point to local files; they must be uploaded to some dedicated storage space that the VFS machinery can identify through a URI • Only when correctly specified in JSDL (reliable URIs!) and uploaded to dedicated storage may files be further processed by GridSAM
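To make the URI requirement concrete, the following is a minimal sketch of building a JSDL document whose staging entries point at remote storage. It uses the JSDL 1.0 and JSDL-POSIX namespaces; the function name `build_jsdl`, the host `storage.example.org` and the file names are assumptions for illustration only, and the output has not been checked against a GridSAM installation.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, arguments, stage_in, stage_out):
    """Return a single-job JSDL document as a string.

    stage_in and stage_out map a job-local file name to a remote URI.
    (Hypothetical helper, not part of GridSAM.)
    """
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")

    # What to run: a POSIX executable with its arguments.
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg

    def add_staging(file_name, uri, direction):
        # direction is "Source" (stage in) or "Target" (stage out).
        staging = ET.SubElement(desc, f"{{{JSDL_NS}}}DataStaging")
        ET.SubElement(staging, f"{{{JSDL_NS}}}FileName").text = file_name
        ET.SubElement(staging, f"{{{JSDL_NS}}}CreationFlag").text = "overwrite"
        endpoint = ET.SubElement(staging, f"{{{JSDL_NS}}}{direction}")
        ET.SubElement(endpoint, f"{{{JSDL_NS}}}URI").text = uri

    for name, uri in stage_in.items():
        add_staging(name, uri, "Source")
    for name, uri in stage_out.items():
        add_staging(name, uri, "Target")

    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_jsdl(
        "/bin/cat",
        ["input.txt"],
        {"input.txt": "sftp://storage.example.org/data/input.txt"},
        {"output.txt": "sftp://storage.example.org/data/output.txt"},
    ))
```

Every file the job touches appears as a `DataStaging` element with a full URI; there is no way to say "the file in my current directory", which is exactly the difference from the MyGrid workflow.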
Issues: File staging • Possible solution 1: define dedicated storage in the form of an SFTP/GridFTP file server, accessible to both OurGrid and GridSAM, and write all URIs in JSDL files according to this dedicated storage • Possible solution 2: let the job submitter decide on its own storage mechanisms; accept a URI if it is accessible (readable/writable), process the job as usual, and let VFS do the rest
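For solution 2, a connector would need some way to decide whether a submitted URI is acceptable before handing it on. The sketch below shows only a syntactic pre-check (known scheme, fully qualified host name); it does not verify that the resource is actually readable or writable, which would require connecting to the storage. The scheme list and the name `is_stageable_uri` are assumptions, loosely based on the protocols named above.

```python
from urllib.parse import urlparse

# Assumed whitelist; the exact scheme names accepted by a given
# VFS/GridSAM setup may differ.
SUPPORTED_SCHEMES = {"ftp", "sftp", "http", "gsiftp", "webdav"}


def is_stageable_uri(uri):
    """Return True if the URI looks usable for remote file staging.

    Checks only the form of the URI: a supported scheme and a
    host name containing at least one dot (a rough stand-in for
    "fully qualified"). Accessibility is not tested here.
    """
    parts = urlparse(uri)
    if parts.scheme not in SUPPORTED_SCHEMES:
        return False
    host = parts.hostname
    return bool(host) and "." in host


if __name__ == "__main__":
    for candidate in (
        "sftp://storage.example.org/data/input.txt",
        "file:///home/user/input.txt",
        "input.txt",
    ):
        print(candidate, is_stageable_uri(candidate))
```

A local path or a `file://` URI is rejected, which mirrors the point that files on the submitter's own machine cannot be referenced directly.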
Issues: File staging • In both cases, security is an important feature to consider • JSDL processing is secure enough in GridSAM but secure access to external storage must be maintained separately
Issues: many related job submissions • In OurGrid, the job submitter can submit a JDL job specification with many jobs defined • Also, specific environment variables set by OurGrid can be utilized to differentiate between multiple jobs and multiple input/output files • No specific support for the parameter sweep concept is provided, but the job submitter can simulate it by using a properly written JDL job specification
Issues: many related job submissions • With GridSAM, the job submitter submits a JSDL that contains details for a single job only • In theory, it is possible to submit multiple JSDLs in a short time; they should be internally scheduled by GridSAM using its persistence mechanisms, then gradually re-submitted to the grid machinery through the specified queuing strategy • The parameter sweep JSDL extension is currently not supported in GridSAM; in theory, the job submitter can submit a bunch of JSDLs that simulate it
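Simulating a parameter sweep with a bunch of single-job JSDLs can be sketched as follows: one template is expanded once per parameter value, and each resulting document would then be submitted separately. The template content, the function name `expand_sweep` and the executable path are assumptions for illustration; the actual submission step is deliberately left out.

```python
from string import Template
from xml.sax.saxutils import escape

# Minimal single-job JSDL template; $executable and $value are
# filled in per sweep point.
JSDL_TEMPLATE = Template(
    '<jsdl:JobDefinition'
    ' xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"'
    ' xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">\n'
    '  <jsdl:JobDescription>\n'
    '    <jsdl:Application>\n'
    '      <posix:POSIXApplication>\n'
    '        <posix:Executable>$executable</posix:Executable>\n'
    '        <posix:Argument>$value</posix:Argument>\n'
    '      </posix:POSIXApplication>\n'
    '    </jsdl:Application>\n'
    '  </jsdl:JobDescription>\n'
    '</jsdl:JobDefinition>\n'
)


def expand_sweep(executable, values):
    """Return one JSDL document (string) per parameter value."""
    return [
        JSDL_TEMPLATE.substitute(
            executable=escape(executable),
            value=escape(str(value)),
        )
        for value in values
    ]


if __name__ == "__main__":
    documents = expand_sweep("/usr/bin/simulate", [0.1, 0.2, 0.3])
    # Each document would be handed to the GridSAM client in turn.
    print(len(documents), "JSDL documents generated")
    print(documents[0])
```

The sweep logic lives entirely on the submitter's side here, so GridSAM sees only independent single-job submissions and has to cope with many of them arriving in a short time.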
Issues: many related job submissions • Possible solution 1: rely on GridSAM scheduling mechanisms; accept multiple submissions in a very short time and let GridSAM re-submit them according to its own strategies • Possible solution 2: implement the parameter sweep JSDL extension in the OurGrid connector or even in the GridSAM core module itself • Solution 1 is very straightforward; however, the behaviour of GridSAM under those conditions needs to be examined closely • Solution 2 is feasible, but requires much time and resources
OurGrid<>GridSAM connector • For OurGrid, MyGrid tool instance (either installed on local machine or as component of job submission web-site) is a single „contact point” for job submitter, hiding all the underlying grid-specific mechanisms • The connector should be a wrapper over MyGrid instance
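The wrapper idea can be sketched as a small adapter: it reads a JSDL document, extracts what a MyGrid submission would need, and passes that to a MyGrid-facing function. Everything here is hypothetical: the class name `OurGridConnector`, the task dictionary layout and the injected `mygrid_submit` callable are assumptions standing in for the real MyGrid invocation, and an actual GridSAM connector would be written against GridSAM's own connector interface rather than in Python.

```python
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}


class OurGridConnector:
    """Hypothetical adapter from a JSDL document to a MyGrid submission."""

    def __init__(self, mygrid_submit):
        # mygrid_submit: callable taking a task dict and returning a job id.
        # It stands in for whatever mechanism actually drives MyGrid.
        self._submit = mygrid_submit

    def submit_jsdl(self, jsdl_text):
        root = ET.fromstring(jsdl_text)
        executable = root.findtext(".//posix:Executable", namespaces=NS)
        if executable is None:
            raise ValueError("JSDL document has no POSIX executable")

        task = {
            "executable": executable,
            "arguments": [a.text or "" for a in root.findall(".//posix:Argument", NS)],
            "stage_in": {},   # job-local file name -> source URI
            "stage_out": {},  # job-local file name -> target URI
        }
        for staging in root.findall(".//jsdl:DataStaging", NS):
            name = staging.findtext("jsdl:FileName", namespaces=NS)
            source = staging.findtext("jsdl:Source/jsdl:URI", namespaces=NS)
            target = staging.findtext("jsdl:Target/jsdl:URI", namespaces=NS)
            if source:
                task["stage_in"][name] = source
            if target:
                task["stage_out"][name] = target

        # All grid-specific work happens behind this single call.
        return self._submit(task)
```

Keeping the MyGrid call behind one injected function reflects the single-contact-point role described above: the connector translates, and MyGrid continues to hide scheduling and file transfer.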