AJDL: Abstract Job Description Language

AJDL: Abstract Job Description Language. PPDG Collaboration Meeting, Williams Bay. David Adams, BNL. June 29, 2004.


Presentation Transcript


  1. AJDL: Abstract Job Description Language. PPDG Collaboration Meeting, Williams Bay. David Adams, BNL. June 29, 2004.

  2. Contents
     • Model
     • Components
     • Implementation

  3. Model
     • Job-based model
       • User selects an input dataset
       • User selects or constructs a transformation (xform) to apply to this dataset
       • Distributed analysis system constructs a job to apply the xform to the dataset
       • Result is a new dataset
         • Partial results may be available during processing
       • User examines the result
     • From this, identify the components of AJDL
       • Dataset
       • Transformation (e.g. application and task)
       • Job (xform, dataset, job preferences)

  4. Model (cont)
     • Abstract means
       • User job definition should be suitable for invocation at any site using any WMS
       • Specify what to do, not how to do it
     • Analysis service
       • Receives the abstract job request
       • Splits it into sub-jobs
         • Typically by splitting the input dataset
       • Maps the transformation to a local executable and runtime environment
       • Runs the executable on each sub-dataset
       • Gathers and merges the results from each sub-job
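
The analysis-service steps on this slide (split, run, merge) can be sketched roughly as below. This is only an illustration, not DIAL or AJDL code: the dataset is reduced to a plain list of LFN strings, and `split_dataset`, `run_abstract_job`, `executable` and `merge` are hypothetical names chosen for the example. Mapping the transformation to a local executable is assumed to have happened already, so `executable` is just a Python callable.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def split_dataset(lfns: Sequence[str], files_per_subjob: int) -> List[List[str]]:
    """Split the input dataset (here just a list of LFNs) into sub-datasets."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    return [list(lfns[i:i + files_per_subjob])
            for i in range(0, len(lfns), files_per_subjob)]


def run_abstract_job(lfns: Sequence[str],
                     executable: Callable[[List[str]], R],
                     merge: Callable[[List[R]], R],
                     files_per_subjob: int = 2) -> R:
    """Run one sub-job per sub-dataset, then merge the partial results."""
    sub_datasets = split_dataset(lfns, files_per_subjob)
    # A real service would dispatch these to a WMS; here they run in sequence.
    partial_results = [executable(sub) for sub in sub_datasets]
    return merge(partial_results)


if __name__ == "__main__":
    # Toy transformation: count the files in each sub-dataset, then add up.
    print(run_abstract_job(["lfn:f1", "lfn:f2", "lfn:f3"], len, sum))
```

The caller only says what to run and on which dataset; how the work is split and where it runs stays inside `run_abstract_job`, which is the sense of "abstract" on the slide.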

  5. Components
     • Dataset
       • Identity
         • Dataset is immutable
       • Location
         • Typically a list of LFNs
         • May be absent (virtual dataset)
           • DRC then provides the location
       • Content
         • Which events
         • Type of data in each event (raw, tracks, jets, aod, …)
       • Compound structure
         • List of sub-datasets
         • Can be a tree structure
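
One possible way to model the dataset properties listed on this slide is a frozen dataclass. The field and method names (`dataset_id`, `content`, `lfns`, `children`, `is_virtual`, `all_lfns`) are assumptions made for this sketch and are not taken from the AJDL schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a dataset is immutable once created
class Dataset:
    dataset_id: str                           # identity
    content: Tuple[str, ...] = ()             # e.g. ("raw", "aod")
    lfns: Optional[Tuple[str, ...]] = None    # location; None means absent
    children: Tuple["Dataset", ...] = ()      # compound structure (tree)

    def is_virtual(self) -> bool:
        """True if no location is recorded anywhere in this tree."""
        return self.lfns is None and all(c.is_virtual() for c in self.children)

    def all_lfns(self) -> List[str]:
        """Collect LFNs from this dataset and all of its sub-datasets."""
        found = list(self.lfns or ())
        for child in self.children:
            found.extend(child.all_lfns())
        return found


if __name__ == "__main__":
    located = Dataset("a", ("aod",), ("lfn:a1", "lfn:a2"))
    virtual = Dataset("b", ("aod",))
    top = Dataset("top", ("aod",), children=(located, virtual))
    print(top.all_lfns(), virtual.is_virtual())
```

For a virtual dataset, a replica catalog lookup would be needed before processing; that lookup is outside this sketch.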

  6. Components (cont)
     • Application
       • Script to process a dataset
         • Output is another dataset
       • List of software packages
         • Assume package management service to provide location of a specified package
         • May have automatic installation
       • Application advertises the required content
         • Compare with content of input dataset to verify compatibility
       • Second script to build task before processing
         • E.g. compile provided sources
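
The compatibility check mentioned on this slide (required content of the application versus content of the input dataset) amounts to a set comparison. A minimal sketch follows; `Application`, `required_content` and `missing_content` are hypothetical names, and the content labels are the examples from the dataset slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Application:
    name: str
    packages: Tuple[str, ...]          # software packages the scripts need
    required_content: FrozenSet[str]   # content types the application reads


def missing_content(app: Application, dataset_content: Iterable[str]) -> Set[str]:
    """Return the content types the application needs but the dataset lacks."""
    return set(app.required_content) - set(dataset_content)


def is_compatible(app: Application, dataset_content: Iterable[str]) -> bool:
    """A dataset is compatible if nothing the application requires is missing."""
    return not missing_content(app, dataset_content)


if __name__ == "__main__":
    app = Application("jetfinder", ("analysis-pkg",), frozenset({"tracks", "jets"}))
    print(missing_content(app, ["raw", "tracks"]))
```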

  7. Components (cont)
     • Task
       • Carries the data used to configure the application
       • At present the task carries embedded text files
         • E.g. myalg.cxx
       • May add named parameters

  8. Components (cont)
     • Job preferences
       • Allow user to provide hints for processing
         • Location for output data
         • User role
         • Desired response time
       • System may ignore or freely interpret these

  9. Components (cont)
     • Job
       • ID
       • Current state (initializing, running, done, failed, …)
       • Start and stop times
       • List of sub-job IDs
       • Input application, task and dataset
       • Output dataset
         • Partial result if job is not complete
       • Access to control job
         • Suspend/resume
         • Kill
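
The job record and its control operations could be sketched as a small state machine. The slide names only the states initializing, running, done and failed; the `SUSPENDED` and `KILLED` states below, and all class and method names, are assumptions added so that suspend/resume and kill have something to act on.

```python
from enum import Enum


class JobState(Enum):
    INITIALIZING = "initializing"
    RUNNING = "running"
    SUSPENDED = "suspended"   # assumed state, not named on the slide
    DONE = "done"
    FAILED = "failed"
    KILLED = "killed"         # assumed state, not named on the slide


class Job:
    def __init__(self, job_id, application, task, dataset):
        self.job_id = job_id
        self.application = application
        self.task = task
        self.dataset = dataset
        self.state = JobState.INITIALIZING
        self.sub_job_ids = []
        self.result = None  # output dataset, or a partial result while running

    def _move(self, expected, new):
        """Change state only if the job is currently in the expected state."""
        if self.state is not expected:
            raise RuntimeError(f"cannot go to {new.value} from {self.state.value}")
        self.state = new

    def start(self):
        self._move(JobState.INITIALIZING, JobState.RUNNING)

    def suspend(self):
        self._move(JobState.RUNNING, JobState.SUSPENDED)

    def resume(self):
        self._move(JobState.SUSPENDED, JobState.RUNNING)

    def kill(self):
        """Kill is allowed from any state that is not already final."""
        if self.state in (JobState.DONE, JobState.FAILED, JobState.KILLED):
            raise RuntimeError(f"job already finished: {self.state.value}")
        self.state = JobState.KILLED
```

Start and stop times are left out to keep the sketch short; they would be two more attributes set in `start` and on reaching a final state.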

  10. Implementation
      • Extensibility
        • Must be extensible to support different types of datasets and jobs
          • AtlasPoolEventDataset, RootHistogramDataset, …
          • ProcessJob, LsfJob, CondorJob, EgeeJob, …
        • Can we use the same schema for all types?
          • So far yes for jobs
          • Probably for applications and tasks
          • Not clear for datasets
      • Data representation
        • XML description for each type
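
To illustrate what "same schema for all job types" might look like, the sketch below writes a job description to XML with the Python standard library. The element and attribute names (`job`, `type`, `id`, `state`, `subjob`, and so on) are invented for this example and are not the actual AJDL schema; the only point is that the job type travels as data, so one layout can serve `ProcessJob`, `CondorJob` and the rest.

```python
import xml.etree.ElementTree as ET


def job_to_xml(job_type, job_id, state, application, task, dataset, sub_job_ids=()):
    """Serialize a job description; the layout is the same for every job type."""
    root = ET.Element("job", {"type": job_type, "id": job_id, "state": state})
    ET.SubElement(root, "application").text = application
    ET.SubElement(root, "task").text = task
    ET.SubElement(root, "dataset").text = dataset
    for sub_id in sub_job_ids:
        ET.SubElement(root, "subjob", {"id": sub_id})
    return ET.tostring(root, encoding="unicode")


def job_type_from_xml(text):
    """Read back the job type so a client can pick the matching class."""
    return ET.fromstring(text).get("type")


if __name__ == "__main__":
    print(job_to_xml("CondorJob", "j1", "running", "app-1", "task-1", "ds-1",
                     sub_job_ids=("j1.1", "j1.2")))
```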

  11. Implementation (cont)
      • Classes
        • Provide class interfaces for each type
        • C++, Python and maybe Java
          • C++ from DIAL
          • Python binding to C++ using lcgdict (GANGA)
        • Convenience for implementing clients and services
        • Add operations to take action
          • E.g. fetch local replicas of files in a dataset
          • Update status or kill a job
        • May add functionality for subtypes
          • Extract histograms for a RootHistogramDataset
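
The idea of a common class interface plus extra functionality on subtypes can be sketched with an abstract base class. This is not the DIAL C++ interface or its Python binding: the method names, the `copy` callback and the stored histogram names are all hypothetical, and only the type name `RootHistogramDataset` comes from the slides.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Sequence


class Dataset(ABC):
    """Common interface shared by every dataset type."""

    def __init__(self, dataset_id: str, lfns: Sequence[str]):
        self.dataset_id = dataset_id
        self.lfns = tuple(lfns)

    @abstractmethod
    def type_name(self) -> str:
        """Name of the concrete dataset type."""

    def fetch_local_replicas(self, copy: Callable[[str, str], str],
                             target_dir: str) -> List[str]:
        """Copy each file locally; `copy(lfn, target_dir)` returns the local path."""
        return [copy(lfn, target_dir) for lfn in self.lfns]


class RootHistogramDataset(Dataset):
    """Subtype that adds histogram access on top of the common interface."""

    def __init__(self, dataset_id, lfns, histogram_names):
        super().__init__(dataset_id, lfns)
        self._histogram_names = tuple(histogram_names)

    def type_name(self) -> str:
        return "RootHistogramDataset"

    def histogram_names(self) -> List[str]:
        return list(self._histogram_names)
```

A client that only needs replicas can work against `Dataset`; one that knows it has a `RootHistogramDataset` can also call `histogram_names`.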
