The GANGA Interface for ATLAS/LHCb
Roger W L Jones (Lancaster University), for the GANGA Team
• Project overview
• Design details
• Component descriptions
• Interfaces
• Refactorisation plans
• ARDA
The Project
• Ganga is being developed as a joint project between the ATLAS and LHCb experiments
• Begun in the UK with support from GridPP; important collaborations with US colleagues
• Current main contributors:
• Developers: K.Harrison, A.Soroko, C.L.Tan (GridPP funded)
• Technical input and consultation: W.T.L.P.Lavrijsen, J.Martyniak, P.Mato, C.E.Tull
• GridPP coordination: N.Brook, R.W.L.Jones, G.N.Patrick
Ganga-related information is regularly updated on the web site http://ganga.web.cern.ch/ganga
GridPP8 meeting, Bristol
Motivation and Background
• ATLAS and LHCb develop applications within a (complex but powerful) common framework: Gaudi/Athena
• Both collaborations aim to exploit the potential of the Grid for large-scale, data-intensive distributed computing
• Simplify management of analysis and production jobs for end-user physicists by developing tools for accessing Grid services with built-in knowledge of how Gaudi/Athena works: Gaudi/Athena and Grid Alliance (GANGA)
• Also aids job creation, submission, management and archival in non-Grid contexts
• Generic components, especially those for interfacing to the Grid, can be used in other experiments
[Diagram: the GANGA GUI mediates between the GAUDI/Athena application (job options, algorithms, histograms) and collective & resource Grid services, returning monitoring information and results to the user]
Milestones
• Summer 2001: First ideas for Ganga
• Spring 2002: Work on Ganga started, with strong support from GridPP
• Spring 2003: GANGA1 released for user evaluation
• Autumn 2003: Refactorisation for GANGA2 – 3-week workshop, BNL
• Summer 2004: Use of GANGA2 in Data Challenges and Computing Model tests
Ganga Deployment
• Ganga has been installed and used at a number of sites:
• Birmingham, BNL, Cambridge, CERN, Imperial, LBNL, Oxford
• Ganga interfaces to several Grid implementations:
• EDG; Trillium/US-ATLAS; NorduGrid under test
• Ganga interfaces to local batch systems:
• LSF, PBS
• Ganga has been used to run a variety of LHCb and ATLAS applications:
• LHCb analysis (DaVinci)
• ATLAS reconstruction
• ATLAS full and fast simulation
• Ganga has been used to run BaBar applications (see talk by J.Martyniak)
• A Ganga tutorial was given at BNL (US-ATLAS Computing and Physics meeting), with some 50 participants, all of whom successfully used Ganga to submit jobs (at the same time)
Related Activities
• AthAsk:
• Creating a Gaudi/Athena job is in itself complicated; AthAsk wraps the complexity, incorporating knowledge of the component applications
• DIAL:
• DIAL focuses on the requirements for interactive Grid analysis
• DIAL complements Ganga (which focuses on non-interactive processing), and work has started on a Ganga-DIAL interface to combine the best of both
• Chimera:
• Providing a Chimera-driven production and analysis workflow system for ATLAS
• Automated installation and packaging with CMT & pacman
• Needed for many-site code maintenance and to distribute user code and the run-time environment
• Looking at ways to make use of pacman from Ganga
• AtCom:
• Interim production tool 02/03
• GridPP development effort
• GANGA test bed
General Design Characteristics
• The user interacts with a single application covering all stages of a job's life-time
• The design is modular
• Although ATLAS and LHCb use the same framework and the same software-management tool, there are significant differences in what is expected from the software, and Ganga must have the flexibility to cope
• Ganga provides a set of tools to manipulate jobs and data; the tools are accessible from the CLI (or scripts) or from the GUI
• Ganga allows access both to local resources (e.g., the LSF batch system) and to the Grid
• Should follow, and contribute to, developments in the LHC Computing Grid
• Implementation is in Python
Value Added
• Single point of entry for configuring and running different types of ATLAS and LHCb jobs: a uniform approach
• Helps with job definition and configuration
• Common-task and user-defined templates
• Application-specific job options, or a user-supplied job-options file
• Editing of job-option values, guided towards meaningful values for some applications
• Simple, flexible procedure for splitting and cloning jobs
• Accepts user-provided scripts for splitting/cloning
• Bookkeeping and stored job settings
• Persistent job representation for archive & exchange
• Automatic monitoring, job-status queries on local and distributed systems
• Extensible framework for job-related operations
• Integration of new services, and assimilation of contributions from users
[Diagram: GUI and CLI layered over a Software Bus; core components (Job Definition, Job Registry, Job Handling, File Transfer) to the right; specialised components (Gaudi/Athena Job Definition, Gaudi/Athena Job Options Editor, BaBar Job Definition and Splitting) to the left; external components (native Python, PyMagda, PyROOT, GaudiPython, PyAMI, PyCMT) at the bottom]
Software Bus Design
• The user has access to the functionality of Ganga components through the GUI and CLI, layered one over the other above a Software Bus
• The Software Bus itself is implemented as a Python module
• Components used by Ganga fall into 3 categories:
• Ganga components of general applicability, or core components (to the right in the diagram)
• Ganga components providing specialised functionality (to the left in the diagram)
• External components (at the bottom in the diagram)
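The Software Bus idea can be sketched as a plain Python module holding a registry of named components that both the GUI and CLI resolve through. This is a minimal illustrative sketch, not the actual Ganga API: the function names and the stand-in component are assumptions.

```python
# Hypothetical sketch of a "software bus" as a plain Python module.
# register()/lookup() and the component names are illustrative assumptions.

_components = {}

def register(name, component):
    """Publish a component on the bus under a well-known name."""
    _components[name] = component

def lookup(name):
    """Resolve a component by name; GUI and CLI layers share this path."""
    return _components[name]

# A core component (here just a stand-in object) registered on the bus:
register("job_registry", {"jobs": []})
print(lookup("job_registry"))  # the same object, from either interface
```

Layering the GUI on the CLI, and both on the bus, means any component reachable from one interface is automatically reachable from the other.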
Generic Components (1)
• Components may have uses outside ATLAS and LHCb
• The core component provides classes for job definition, where a job is characterised in terms of: name, workflow, required resources, status
• A workflow is represented as a sequence of elements (executables, parameters, input/output files, etc.) for which associated actions are implicitly defined
• Required resources are specified using a generic syntax
• In future, the workflow will merge with DIAL and Chimera
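The job characterisation above can be sketched in a few lines of Python. All class and attribute names here are assumptions for illustration; only the four characteristics (name, workflow, required resources, status) come from the slide.

```python
# Hedged sketch of the core job-definition classes: a job characterised by
# name, workflow, required resources and status, with the workflow as an
# ordered sequence of elements. Names are illustrative assumptions.

class WorkflowElement:
    def __init__(self, kind, value):
        self.kind = kind    # e.g. "executable", "parameter", "inputfile"
        self.value = value

class Job:
    def __init__(self, name, workflow, resources):
        self.name = name
        self.workflow = workflow    # sequence of WorkflowElement objects
        self.resources = resources  # resource requests in a generic syntax
        self.status = "new"

job = Job("atlfastTest",
          [WorkflowElement("executable", "athena.exe"),
           WorkflowElement("outputfile", "atlfast.ntup")],
          ["MaxCPUTime > 1000"])
```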
Generic Components (2)
• The job-registry component allows for storage and recovery of job information, and allows job objects to be serialised
• Multi-threaded environment based on the Python threading module
• Serialisation of objects (user jobs) is implemented with the Python pickle module
• The script-generation component translates a job's workflow into the set of instructions to be executed when the job is run
• The job-submission component submits the workflow script to the target batch system, creates the JDL file and translates resource requests
• EDG, Trillium/US-ATLAS, LSF, PBS
• Can submit, monitor, and get output from Grid jobs
• The file-transfer component handles transfer of input & output files between sites, adding commands to the workflow script on submission
• The job-monitoring component performs queries of job status
• Should move to R-GMA
• Local/batch job monitoring is problematic; move to jobs pushing info to a specified location, and integrate with NetLogger for the Grid
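The pickle-based persistence of the job registry can be sketched as follows. The use of the pickle module is from the slide; the file layout and record format are illustrative assumptions.

```python
import os
import pickle
import tempfile

# Sketch of job-registry persistence: job records are serialised with the
# Python pickle module so they survive between Ganga sessions.
# The record format and file path are illustrative assumptions.

registry = {1: {"name": "atlfastTest", "status": "submitted"}}

path = os.path.join(tempfile.mkdtemp(), "jobs.pkl")
with open(path, "wb") as f:
    pickle.dump(registry, f)          # save on session exit

with open(path, "rb") as f:
    recovered = pickle.load(f)        # recover at next start-up

print(recovered[1]["status"])  # → submitted
```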
Experiment-Specific Components
• GaudiApplicationHandler
• Can access the Configuration DB for some Gaudi applications, using the xmlrpclib module
• Ganga can create user-customised job-options files using this DB
• An intelligent job-options editor exists for some applications
• Specialised application handlers exist for ATLAS fast simulation and for LHCb analysis
• Components incorporate knowledge of the experiments' Gaudi/Athena framework
• The Gaudi job-definition component adds to the workflow elements in the general-purpose job-definition component, e.g. dealing with configuration management; it also provides workflow templates covering common tasks
• Other components provide for job splitting and output collection
• Job splitting may have generic aspects; this will be investigated in collaboration with DIAL
• More work is needed on job merging
Job handling: splitting a job (lots of potential reuse)
[Diagram: the job-handling module selects or creates a splitting script, which is applied to a template job to produce subjobs 1, 2, 3, 4, …, stored in the repository]
External Components
Additional functionality is obtained using components developed outside of Ganga:
• Modules of the Python standard library
• Non-Python components for which an appropriate interface has been written:
• The Gaudi framework itself (GaudiPython)
• Analysis package, ROOT (PyROOT)
• Configuration-management tool (CMT)
• ATLAS metadata interface, AMI (PyAMI)
• ATLAS manager for Grid data, Magda (PyMagda)
Implementation of Components (1)
[Class diagram: the job-registry component (JobsCatalog, JobsRegistry) and job-handling component (JobHandler) relate to the job-definition component, where a Job aggregates Requirements, Application, Credentials, JobAttributes, Parameter and Executable; the specialised Gaudi/Athena job-definition component provides GaudiApplicationHandler]
Implementation of Components (2)
[Class diagram: application-specific components derive from ApplicationHandler (BaBarApplicationHandler and GaudiApplicationHandler, with AtlfastApplicationHandler and DaVinciApplicationHandler specialising the latter); the job-handling component derives from JobHandler (GridJobHandler, LocalJobHandler, LSFJobHandler, PBSJobHandler, and further job handlers)]
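The two hierarchies in the diagram can be sketched as follows: application handlers know how to configure a particular application, job handlers know how to submit to a particular backend. Only the class names come from the slide; the method names and bodies are illustrative assumptions.

```python
# Sketch of the handler hierarchies. Class names follow the diagram;
# configure()/submit() and their return values are assumptions.

class ApplicationHandler:
    def configure(self, job_name):
        raise NotImplementedError

class GaudiApplicationHandler(ApplicationHandler):
    def configure(self, job_name):
        # would assemble job options for a Gaudi/Athena application
        return f"job options for {job_name}"

class JobHandler:
    def submit(self, script):
        raise NotImplementedError

class LSFJobHandler(JobHandler):
    def submit(self, script):
        # would hand the generated workflow script to the LSF batch system
        return f"bsub {script}"

options = GaudiApplicationHandler().configure("atlfastTest")
command = LSFJobHandler().submit("run.sh")
print(command)  # → bsub run.sh
```

Keeping the application and backend hierarchies independent is what lets any application handler be paired with any job handler, e.g. the same Gaudi job submitted to LSF, PBS or the Grid.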
Interfacing to the Grid
[Diagram: the JobHandler, Job and JobsRegistry classes map onto EDG UI services — job submission (dg-job-list-match, dg-job-submit, dg-job-cancel), job monitoring (dg-job-status, dg-job-get-logging-info, R-GMA), data management (edg-replica-manager, dg-job-get-output, globus-url-copy) and the security service (grid-proxy-init, MyProxy)]
Interfaces: CLI
atlasSetup = GangaCommand("source /afs/cern.ch/user/h/harrison/public/atlasSetup.sh")
atlfast = GangaCMTApplication("TestRelease", "TestRelease-00-00-15", "athena.exe", "run/AtlasfastOptions.txt")
atlfastOutput = GangaOutputFile("atlfast.ntup")
workStep1 = GangaWorkStep([atlasSetup, atlfast, atlfastOutput])
workFlow = GangaWorkFlow([workStep1])
lsfJob = GangaLSFJob("atlfastTest", workFlow)
lsfJob.build()
lsfJob.run()
At the moment the CLI is based on low-level tools; a higher-level set of commands is under development
Interfaces: GUI
• The GUI has been implemented using the wxPython extension module
• Layered on the CLI
• All job-configuration data are represented in a hierarchical structure accessible via a tree control; the most important job parameters are brought to the top of the tree
• A "User view" provides easy access to the top-level parameters
• All job parameters defined by the user can be edited via GUI dialogs
• A help system has been implemented, using HTML browser classes from wxPython
• All implemented tools are available through the GUI, but some require a more elaborate interface, e.g., the job-options browser/editor
• A Python shell is embedded into the GUI and allows the user to configure the interface from the command line
Basic GUI
[Screenshot: toolbar, job tree, main panel and embedded Python interpreter]
Job creation
[Screenshot]
Job-parameters panel
[Screenshot]
Job-options editor: sequences
[Screenshot]
Job-options editor: options
[Screenshot]
Job submission
[Screenshot: a job's position in the tree depends on monitoring info]
Examination of job output
[Screenshot]
Job splitting
• A user or "third-party" splitting script is required
• The GUI displays script descriptions, where these exist, to guide user choices
• If the script's "split" function accepts a parameter, it is interpreted as the number of subjobs and can be entered in the job-splitting dialogue
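A splitting script of the kind described above might look like this. The name "split" and the number-of-subjobs parameter follow the slide; the event-partitioning scheme and the subjob representation are illustrative assumptions.

```python
# Hypothetical user splitting script: because split() accepts a parameter,
# Ganga would interpret it as the number of subjobs entered in the dialogue.
# The event partitioning and subjob dicts are illustrative assumptions.

def split(n_subjobs, total_events=1000):
    """Clone the template job into n_subjobs, partitioning the events."""
    per_job = total_events // n_subjobs
    return [{"first_event": i * per_job, "n_events": per_job}
            for i in range(n_subjobs)]

subjobs = split(4)
print(len(subjobs), subjobs[1]["first_event"])  # → 4 250
```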
Ganga Help
[Screenshot]
Ganga Refactorisation
The scheme can be broken down as follows:
• Definition of job options
• The user retrieves a set of job options from a database (or other standard location), and is then able to make modifications using an intelligent job-options editor; the result is a job-options template
• Definition of dataset
• The user selects the input dataset, obtaining information on the available datasets from a catalogue
• Definition of execution strategy
• The strategy might be selected from a database, or the user might provide a new strategy definition
• Creation of job-collection description
• An XML description of the job, or collection of jobs, to be submitted is created on the basis of the previously defined job-options template, input dataset and execution strategy
• Definition of job requirements
• Some standard requirements may be defined for each experiment
• The user may also specify requirements, for example imposing that jobs be submitted to a particular cluster
• Additional requirements may be derived from the job-collection description
• Job submission
• A dispatcher determines where to submit jobs, on the basis of the job-collection description and the job requirements, and invokes a service that returns the appropriate submission procedure
• Installation of software and components
• Experiment-specific software required to run the user job is installed as necessary on the remote client
• Any (Ganga) components required to interpret the job-collection description are also installed
• Job execution
• Agents supervise the execution and validation of jobs on the batch nodes
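The dispatcher step above can be sketched as merging the requirements derived from the job-collection description with the user's requirements, then choosing a submission backend. The backend names follow the slides; the selection rules themselves are an illustrative assumption, not the actual Ganga logic.

```python
# Sketch of the dispatcher: combine derived and user requirements, then
# select a submission backend. The precedence rules are assumptions.

def dispatch(derived_requirements, user_requirements):
    requirements = {**derived_requirements, **user_requirements}
    if "cluster" in requirements:
        return "LSF"          # user pinned the jobs to a particular cluster
    if requirements.get("grid"):
        return "EDG"          # default Grid submission route
    return "Local"

backend = dispatch({"grid": True}, {"cluster": "lxbatch"})
print(backend)  # → LSF
```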
Future plans
Refactorisation of Ganga, with submission on the remote client
[Diagram: the job-options editor (backed by a job-options knowledge base and a database of standard job options), dataset selection (from a dataset catalogue) and strategy selection (from a strategy database of splitter algorithms) feed a job factory that generates an XML job-collection description; a dispatcher combines derived requirements, user requirements and a database of job requirements, consults a scheduler service and scheduler proxies (JDL/ClassAds for EDG, USG, NorduGrid, DIAL, DIRAC, LSF, PBS, local), and hands jobs to remote clients, where agents run and validate them on execution nodes, drawing on software and component caches from a software/component server]
Motivation
• Ease integration of external components
• Facilitate multi-person, distributed development
• Increase customisability/flexibility
• Allow GANGA components to be used externally more easily
Use of Components Outside of Ganga
• Ganga complies with recent requirements for Grid-services domain decomposition as described in the Architectural Roadmap towards Distributed Analysis (ARDA) document
• Some Ganga components provide native services (API, UI)
• The majority of components simply present a uniform interface to the existing Grid middleware services (e.g., Data Management, Job Monitoring)
Future Plans
• The GANGA prototype has had enthusiastic and demanding early adopters
• A new release will provide an interim production version
• A refactorisation is now underway
• Stricter adherence to the component model
• Compliance with the draft LCG distributed-analysis services model
• Installation tools need to be interfaced, at least for user analysis code
• Add new job handlers
• A web-based variant of the GUI (or a thin remote client) should be considered; security issues need to be addressed in this case
• Exploit the Grid Monitoring Architecture
• Components should be capable of wide reuse
• GANGA can work within the ARDA framework
• Software installation for analysis jobs is a priority
• Metadata query/selection/browsing, and its design, are a high priority