Recent Developments in the ICAT Job Portal: a generic job submission system built on a scientific data catalog
NOBUGS, October 2016
Brian Ritchie, Rebecca Fair, Steve Fisher, Kevin Phipps, Tom Griffin, Dan Rolfe, Jianguo Rao
STFC Rutherford Appleton Laboratory
Overview
• Search for datasets / datafiles using ICAT
• Configure and submit jobs to process selected datasets / datafiles on one or more batch servers
  • Submit a single job for all selected datasets / datafiles, or separate jobs for each
• Jobs use IDS to retrieve data (or ICAT for metadata)
• Monitor progress of jobs and inspect output
Architecture
[Diagram; labels recovered from the slide: User's desktop with GUI (browser) and CLI; ICAT & IDS ("search", "retrieve"); IJP Server hosting the IJP web app, reached over REST; Batch Systems 1 to N, each with an IJP batch connector (reached over REST) that submits batch jobs to its batch server; each batch server passes jobs on ("batch server magic") to Worker Nodes 1 to n.]
Finding data
• Use TopCAT to find data in ICAT
• Configure a job for a single dataset or datafile
• Or build a cart with multiple datasets / datafiles
  • Configure a job for the cart
Job Types
• Part of the IJP configuration
• Each Job Type specifies:
  • Program (job script) to run
  • Dataset types for which the job can be run
  • Whether the job is batch or interactive
  • Whether the job accepts datasets, datafiles or both
  • Whether a single job can take multiple datasets / datafiles
  • Other job parameters / options
• GUI filters job types depending on the selected data
  • or filters data when a job type is selected first
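The filtering described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `JobType` fields mirror the bullets above, but their names, and the helpers `applicable_job_types` and `plan_jobs`, are hypothetical and are not taken from the IJP code or its configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobType:
    """Hypothetical in-memory form of one job type from the IJP configuration."""
    name: str
    executable: str               # program (job script) to run
    dataset_types: frozenset      # dataset types the job can be run for
    interactive: bool = False     # batch (False) or interactive (True)
    accepts_datasets: bool = True
    accepts_datafiles: bool = False
    multiple: bool = False        # can one job take several datasets/datafiles?


def applicable_job_types(job_types, dataset_types, has_datasets, has_datafiles):
    """Return the job types that can run on the current selection."""
    selected = set(dataset_types)
    matches = []
    for jt in job_types:
        if has_datasets and not jt.accepts_datasets:
            continue
        if has_datafiles and not jt.accepts_datafiles:
            continue
        if not selected <= jt.dataset_types:
            continue
        matches.append(jt)
    return matches


def plan_jobs(job_type, item_ids, single_job=True):
    """Group the selected IDs into one job, or into one job per item."""
    if job_type.multiple and single_job:
        return [list(item_ids)]
    return [[item_id] for item_id in item_ids]
```

`plan_jobs` reflects the choice mentioned in the overview: one job for the whole selection when the job type allows it, otherwise a separate job for each dataset or datafile.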
Job configuration
• Job options
• Submit options
Job submission
• IJP server gets estimates from each batch connector
  • Chooses one of the connectors with the best estimate
• Batch connector submits the job to its batch system
  • Jobscript executable is defined in the job type
  • Job is passed dataset / datafile IDs, ICAT / IDS session tokens and job options
• Batch connector monitors submitted jobs
  • Queue status, standard / error output
• IJP server monitors batch connectors
• IJP server holds job status and output
  • Until the user deletes the job
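The brokering step ("gets estimates, chooses one of the best") might look roughly like the sketch below. The slides do not say what an estimate measures or how ties are broken, so this assumes a lower number is better (for example an expected wait in seconds), that `None` means the connector cannot take the job, and that ties are broken at random. The function name `choose_connector` is hypothetical.

```python
import random


def choose_connector(estimates, rng=random):
    """Pick a batch connector from {connector_name: estimate}.

    Assumptions (not from the IJP source): lower estimates are better,
    None marks a connector that cannot run the job, ties are broken randomly.
    """
    usable = {name: est for name, est in estimates.items() if est is not None}
    if not usable:
        raise RuntimeError("no batch connector can accept the job")
    best = min(usable.values())
    candidates = sorted(name for name, est in usable.items() if est == best)
    return rng.choice(candidates)
```

Picking randomly among equally good connectors spreads load when several batch systems report the same estimate.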
Monitoring batch jobs
• Job history, status, management
• Output of running job
Interactive jobs
• Batch connector selects a worker node
  • Node is removed from the pool of available workers
• Sets up an RDP session to run the interactive executable
• RDP connection details are passed back to the IJP server
• GUI launches Remote Desktop (Windows) or gives a pasteable command line (Linux)
• Batch connector releases the worker once the session is closed
  • Tries hard not to leave dangling interactive sessions
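The reserve-and-release bookkeeping for interactive jobs can be illustrated with a small sketch. It models only the worker pool, not RDP or any real batch system, and the names `WorkerPool` and `interactive_session` are hypothetical. The `try`/`finally` is one simple way to avoid leaving a node reserved when a session ends abnormally; the actual connector may do this differently.

```python
from contextlib import contextmanager


class WorkerPool:
    """Tracks which worker nodes are free and which hold interactive sessions."""

    def __init__(self, nodes):
        self._available = list(nodes)
        self._sessions = {}  # session id -> reserved node

    def reserve(self, session_id):
        """Remove a node from the pool of available workers."""
        if not self._available:
            raise RuntimeError("no free worker node")
        node = self._available.pop(0)
        self._sessions[session_id] = node
        return node

    def release(self, session_id):
        """Return the node to the pool; unknown sessions are ignored."""
        node = self._sessions.pop(session_id, None)
        if node is not None:
            self._available.append(node)
        return node

    def available(self):
        return tuple(self._available)


@contextmanager
def interactive_session(pool, session_id):
    """Reserve a node for the session and always release it afterwards."""
    node = pool.reserve(session_id)
    try:
        yield node
    finally:
        pool.release(session_id)
```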
Jobscripts
• Executable that runs on batch system workers
• Receives dataset / datafile IDs, options and session tokens on the command line
• Uses IDS to retrieve datasets / datafiles (or ICAT for metadata)
• Should add provenance records to ICAT
• Does not communicate with the IJP
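As a rough illustration of the command-line contract, a jobscript could parse its arguments as below. The slides do not give the actual argument names or format, so every flag here (`--session-id`, `--dataset-ids`, `--datafile-ids`, `--option`) is an assumption, and this sketch uses the standard `argparse` module rather than the IJP's own Python utility library.

```python
import argparse


def parse_id_list(text):
    """Turn '12,15,19' into [12, 15, 19]; an empty string gives []."""
    return [int(part) for part in text.split(",") if part.strip()]


def parse_option(text):
    """Split a KEY=VALUE job option into a (key, value) pair."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected KEY=VALUE, got {text!r}")
    return key, value


def parse_job_args(argv):
    """Parse the (hypothetical) arguments handed to a jobscript."""
    parser = argparse.ArgumentParser(description="Example jobscript (sketch)")
    parser.add_argument("--session-id", required=True,
                        help="ICAT/IDS session token")
    parser.add_argument("--dataset-ids", type=parse_id_list, default=[])
    parser.add_argument("--datafile-ids", type=parse_id_list, default=[])
    parser.add_argument("--option", type=parse_option, action="append",
                        default=None, metavar="KEY=VALUE")
    args = parser.parse_args(argv)
    return {
        "session_id": args.session_id,
        "dataset_ids": args.dataset_ids,
        "datafile_ids": args.datafile_ids,
        "options": dict(args.option or []),
    }
```

A real jobscript would then use the session token with IDS and ICAT to fetch the listed datasets or datafiles; that part depends on the client library in use and is left out here.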
Developing jobs
• Create jobscript
  • Python utility library for argument processing
  • Python-icat or similar to work with ICAT / IDS
• Deploy jobscript on batch system
• Define jobtype XML
  • Add to IJP server configuration (dynamic)
Recent developments
• New GUI
  • TopCAT plugin (AngularJS)
  • RESTful interface to IJP server
  • Original GWT GUI is still part of the server, but won’t be developed further
Current status
• One active customer, Octopus (CLF)
  • Test system in place (ingest, jobscript development)
  • Not yet in production
• Batch connectors
  • Torque
  • Unix batch (for demos / tests)
  • Platform LSF (incomplete)
Future development
• Improve batch system brokering
  • Add batch requirements to job types (e.g. requires GPU)
• Support versioning of datasets
  • Specific requirement from Octopus: post-ingestion modifications
  • IJP GUI should only show the latest version of each dataset (custom results filtering)
  • Version management is separate from the IJP, but may be developed as jobs
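Since batch requirements are listed as future work, no design exists in the slides. Purely as a sketch of the idea, requirements could be matched against capabilities advertised by each batch connector before any estimates are compared. The capability names and the function `eligible_connectors` are hypothetical.

```python
def eligible_connectors(job_requirements, connector_capabilities):
    """Return connectors whose capabilities cover all job requirements.

    job_requirements: iterable of requirement names, e.g. {"gpu"}
    connector_capabilities: {connector_name: iterable of capability names}
    Both shapes are assumptions made for this sketch.
    """
    required = set(job_requirements)
    return [
        name
        for name, capabilities in connector_capabilities.items()
        if required <= set(capabilities)
    ]
```

A job type with no requirements matches every connector, so existing job types would keep working unchanged under this scheme.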