
ARDA Reports to the LCG SC2

Presentation Transcript


  1. ARDA Reports to the LCG SC2
     L.A.T. Bauerdick for the RTAG-11/ARDA* group
     *Architectural Roadmap towards a Distributed Analysis

  2. ARDA Mandate

  3. ARDA Schedule and Makeup
     • Alice: Fons Rademakers and Predrag Buncic
     • Atlas: Roger Jones and Rob Gardner
     • CMS: Lothar Bauerdick and Lucia Silvestris
     • LHCb: Philippe Charpentier and Andrei Tsaregorodtsev
     • LCG GTA: David Foster, stand-in Massimo Lamanna
     • LCG AA: Torre Wenaus
     • GAG: Federico Carminati

  4. ARDA mode of operation
     • Thank you for an excellent committee -- large expertise, agility and responsiveness, very constructive and open-minded, and sacrificing quite a bit of the summer
     • Series of weekly meetings in July and August, mini-workshop in September
     • Invited talks from existing experiments' projects:
       • Summary of Caltech GAE workshop (Torre)
       • PROOF (Fons)
       • AliEn (Predrag)
       • DIAL (David Adams)
       • GAE and Clarens (Conrad Steenberg)
       • Ganga (Pere Mato)
       • Dirac (Andrei)
     • Cross-check of the emerging ARDA decomposition of services with other projects:
       • Magda, DIAL -- Torre, Rob
       • EDG, NorduGrid -- Andrei, Massimo
       • SAM, MCRunjob -- Roger, Lothar
       • BOSS, MCRunjob -- Lucia, Lothar
       • Clarens, GAE -- Lucia, Lothar
       • Ganga -- Rob, Torre
       • PROOF -- Fons
       • AliEn -- Predrag
       • DIRAC -- Andrei
       • VOX -- Lothar

  5. Initial Picture of Distributed Analysis (Torre, Caltech workshop)

  6. Hepcal-II Analysis Use Cases
     • Scenarios based on the GAG HEPCAL-II report
       • Determine data sets and eventually event components
       • Input data are selected via a query to a metadata catalogue
       • Perform iterative analysis activity
       • Selection and algorithm are passed to a workload management system, together with a specification of the execution environment
       • Algorithms are executed on one or many nodes
       • User monitors progress of job execution
       • Results are gathered together and passed back to the job owner
       • Resulting datasets can be published to be accessible to other users
     • Specific requirements from HEPCAL-II
       • Job traceability, provenance, logbooks
       • Also discussed: support for finer-grained access control and enabling data sharing within physics groups

  7. Analysis Scenario
     • This scenario represents the analysis activity from the user's perspective. However, some other actions happen behind the scenes of the user interface:
       • To carry out the analysis tasks, users access shared computing resources. To do so, they must be registered with their Virtual Organization (VO), authenticated, and their actions must be authorized according to their roles within the VO
       • The user specifies the necessary execution environment (software packages, databases, system requirements, etc.) and the system ensures it on the execution node. In particular, the necessary environment can be installed according to the needs of a particular job
       • The execution of the user job may trigger transfers of various datasets between a user interface computer, execution nodes and storage elements. These transfers are transparent to the user

  8. Example: Asynchronous Analysis
     Running Grid-based analysis from inside ROOT (adapted from an AliEn example)
     • ROOT calling the ARDA API from the command prompt:

       // connect and authenticate to the Grid service "arda" as user "lucia"
       TGrid *arda = TGrid::Connect("arda", "lucia", "", "");

       // create a new analysis object (<unique ID>, <title>, #subjobs)
       TArdaAnalysis *analysis = new TArdaAnalysis("pass001", "MyAnalysis", 10);

       // set the program which executes the analysis macro/script
       analysis->Exec("ArdaRoot.sh", "file:/home/vincenzo/test.C"); // script to execute

       // set up the event metadata query
       analysis->Query("2003-09/V6.08.Rev.04/00110/%gjetmet.root?pt>0.2");

       // specify job splitting and run
       analysis->OutputFileAutoMerge(true);  // merge all produced .root files
       analysis->Split();                    // split the task into subjobs
       analysis->Run();                      // submit all subjobs to the ARDA queue

       // asynchronously, at any time, get the (partial or complete) results
       analysis->GetResults();               // download partial/final results and merge them
       analysis->Info();                     // display job information

  9. Asynchronous Analysis Model
     • Extract a subset of the datasets from the virtual file catalogue using metadata conditions provided by the user
     • Split the tasks according to the location of the data sets
       • A trade-off has to be found between best use of available resources and minimal data movement. Ideally, jobs should be executed where the data are stored. Since one cannot expect a uniform storage location distribution for every subset of data, the analysis framework has to negotiate with dedicated Grid services the balance between local data access and data replication
     • Spawn sub-jobs and submit them to Workload Management with precise job descriptions
     • The user can control the results while and after the data are processed
     • Collect and merge available results from all terminated sub-jobs on request
     • The analysis objects associated with the analysis task remain persistent in the Grid environment, so the user can go offline and reload an analysis task at a later date, check the status, merge current results or resubmit the same task with modified analysis code (see the sketch below)
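
     The following is a minimal, illustrative sketch of the "go offline and reload later" step, written against the hypothetical TArdaAnalysis API used in the example of slide 8 (itself adapted from AliEn); the Open() call and the task ID are assumptions, not a defined interface.

       // Hypothetical sketch: reconnect later and pick up a persistent analysis
       // task by its unique ID. TArdaAnalysis and its Open() method are assumed,
       // following the illustrative API of slide 8; they are not a defined interface.
       void reload_task()
       {
          // re-authenticate to the Grid service after having gone offline
          TGrid *arda = TGrid::Connect("arda", "lucia", "", "");

          // re-open the persistent analysis object created earlier as "pass001"
          TArdaAnalysis *analysis = TArdaAnalysis::Open("pass001");   // assumed call

          analysis->Info();         // check the status of the sub-jobs
          analysis->GetResults();   // download and merge whatever has finished so far
          // with modified analysis code, the same task could be resubmitted
       }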

  10. Synchronous Analysis
     • Scenario: using PROOF in the Grid environment
       • Parallel ROOT Facility, main developer Maarten Ballintijn/MIT
     • PROOF already provides a ROOT-based framework to use (local) cluster computing resources
       • balances the workload dynamically, with the goal of optimizing CPU exploitation and minimizing data transfers
       • makes use of the inherent parallelism in event data
       • works in heterogeneous clusters with distributed storage
     • Extend this to the Grid using interactive analysis services that could be based on the ARDA services (see the sketch below)
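
     As an illustration of the synchronous case, a minimal PROOF session on a local cluster could look like the sketch below; it uses the standard ROOT PROOF calls, while the master host name, file location and selector name are placeholders (in an ARDA setting the connection endpoint would come from the interactive analysis service).

       // Minimal PROOF sketch (standard ROOT API); host, file and selector names
       // are placeholders, not part of any ARDA interface.
       void run_proof_analysis()
       {
          // connect to a PROOF master; in an ARDA setting this endpoint would be
          // provided by the interactive-analysis service rather than hard-coded
          TProof::Open("proofmaster.example.org");

          // build a chain of event files and attach it to the PROOF session
          TChain chain("Events");
          chain.Add("root://se.example.org//data/run00110/gjetmet.root"); // placeholder
          chain.SetProof();

          // the selector is shipped to the workers and executed in parallel;
          // partial results are merged and returned to the client synchronously
          chain.Process("MySelector.C+");
       }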

  11. ARDA Roadmap “Informed” By DA Implementations
     • Following SC2 advice, reviewed major existing DA projects
     • Clearly AliEn today provides the most complete implementation of a distributed analysis service that is “fully functional” -- it also interfaces to PROOF
       • Implements the major HEPCAL-II use cases
       • Presents a clean API to experiment applications, Web portals, ...
       • Should address most requirements for upcoming experiments' physics studies
       • Existing and fully functional interface to a complete analysis package -- ROOT
       • Interface to the PROOF cluster-based interactive analysis system
       • Interfaces to any other system well defined and certainly feasible
       • Based on Web services, with a global (federated) database to give state and persistency to the system
     • ARDA approach:
       • Re-factor AliEn, using the experience of the other projects, to generalize it into an architecture; consider OGSI as a natural foundation for that
       • Confront ARDA services with existing projects (notably EDG, SAM, Dirac, etc.)
       • Synthesize service definitions, defining their contracts and behavior
       • Blueprint for an initial distributed analysis service infrastructure

  12. ARDA Distributed Analysis Services
     • Distributed Analysis in a Grid Services based architecture
       • ARDA Services should be OGSI compliant -- built upon OGSI middleware
       • Frameworks and applications use the ARDA API with bindings to C++, Java, Python, PERL, ...
       • interface through a UI/API factory -- authentication, persistent “session”
       • Fabric interface to resources through CE and SE services
       • job description language based on Condor ClassAds and matchmaking
       • Database(s) accessed through a Dbase Proxy provide statefulness and persistence
     • We arrived at a decomposition into the following key services:
       • API and User Interface
       • Authentication, Authorization, Accounting and Auditing services
       • Workload Management and Data Management services
       • File and (event) Metadata Catalogues
       • Information service
       • Grid and Job Monitoring services
       • Storage Element and Computing Element services
       • Package Manager and Job Provenance services

  13. AliEn (re-factored)

  14. ARDA Key Services for Distributed Analysis

  15. ARDA HEPCAL matching: an example
     • HEPCAL-II Use Case: Group Level Analysis (GLA)
     • User specifies job information, including:
       • Selection criteria
       • Metadata Dataset (input)
       • Information about s/w (library) and configuration versions
       • Output AOD and/or TAG Dataset (typical)
       • Program to be run
     • User submits the job
     • Program is run:
       • Selection criteria are used for a query on the Metadata Dataset
       • Event IDs satisfying the selection criteria and the Logical Dataset Names of the corresponding Datasets are retrieved
       • Input Datasets are accessed
       • Events are read
       • Algorithm (program) is applied to the events
       • Output Datasets are uploaded
       • Experiment Metadata is updated
     • A report summarizing the output of the jobs is prepared for the group (e.g. how many events to which stream, ...), extracting the information from the application and the Grid middleware
     • ARDA services invoked along these steps (several of them more than once): Authentication, Authorization, Metadata catalog, Workload management, Package manager, Compute element, File catalog, Data management, Storage Element, Job provenance, Auditing

  16. API to Grid services
     • ARDA services present an API, called by applications such as the experiment frameworks, interactive analysis packages, Grid portals, Grid shells, etc.
     • In particular, the importance of the UI/API:
       • Interfaces services to higher-level software
         • Experiment frameworks
         • Analysis shells, e.g. ROOT
         • Grid portals and other forms of user interaction with the environment
         • Advanced services, e.g. virtual data, analysis logbooks, etc.
       • Provides experiment-specific services
         • Data and Metadata management systems
       • Provides an API that others can program against (see the sketch below)
     • Benefits of a common API to the frameworks
       • Goes beyond “traditional” UIs à la GANGA, Grid portals, etc.
       • Benefits in interfacing to analysis applications like ROOT et al.
       • Process to get a common API between experiments --> prototype
     • The UI/API can use Condor ClassAds as a Job Description Language
       • This maintains compatibility with existing job execution services, in particular LCG-1
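
     To make the UI/API factory idea concrete, one possible, purely illustrative C++ rendering is sketched below; the class and method names are assumptions made for the example and do not correspond to any agreed ARDA interface.

       // Illustrative only: one way the UI/API factory and a persistent,
       // authenticated session could look in C++. All names are assumptions.
       #include <memory>
       #include <string>

       class ArdaSession {                                   // stateful user session
       public:
          virtual ~ArdaSession() = default;
          virtual std::string Whoami() const = 0;            // VO identity and role
          virtual std::string Submit(const std::string& classAd) = 0;  // ClassAd JDL in, job id out
          virtual std::string Query(const std::string& metadata) = 0;  // metadata query -> dataset
       };

       class ArdaFactory {                                   // the UI/API "factory"
       public:
          // authenticate against the VO and return a persistent session object;
          // an experiment framework, ROOT, or a Grid portal would all enter here
          static std::unique_ptr<ArdaSession> Connect(const std::string& vo,
                                                      const std::string& user);
       };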

  17. API and User Interface

  18. File Catalogue and Data Management
     • Input and output associated with any job can be registered in the VO's File Catalogue, a virtual file system in which a logical name is assigned to a file.
     • Unlike real file systems, the File Catalogue does not own the files; it only keeps an association between the Logical File Name (LFN) and (possibly more than one) Physical File Names (PFNs) on a real file or mass storage system. PFNs describe the physical location of the files and include the name of the Storage Element and the path to the local file (see the sketch below).
     • The system should support file replication and caching, and will use file location information when scheduling jobs for execution.
     • The directories and files in the File Catalogue have privileges for owner, group and world. This means that every user can have exclusive read and write privileges for his portion of the logical file namespace (home directory).
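
     A minimal sketch of the LFN-to-PFN association described above is given below; the catalogue structure, names and paths are illustrative only, chosen to show that one logical name can map to several physical replicas.

       // Illustrative sketch: one Logical File Name maps to any number of
       // physical replicas; the catalogue stores only the association.
       #include <iostream>
       #include <map>
       #include <string>

       struct PhysicalFile {
          std::string storageElement;   // e.g. "se1.example.org"     (placeholder)
          std::string path;             // e.g. "/data/.../file.root" (placeholder)
       };

       int main()
       {
          std::multimap<std::string, PhysicalFile> catalogue;

          // two replicas registered under the same logical name
          catalogue.insert({"/vo/prod/run00110/gjetmet.root",
                            {"se1.example.org", "/data/run00110/gjetmet.root"}});
          catalogue.insert({"/vo/prod/run00110/gjetmet.root",
                            {"se2.example.org", "/tape/a1/b2/gjetmet.root"}});

          // a scheduler would use the replica locations to place jobs near the data
          auto range = catalogue.equal_range("/vo/prod/run00110/gjetmet.root");
          for (auto it = range.first; it != range.second; ++it)
             std::cout << it->second.storageElement << ":" << it->second.path << "\n";
          return 0;
       }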

  19. Job Provenance service
     • The File Catalogue is extended to include information about running processes in the system (in analogy with the /proc directory on Linux systems) and to support virtual data services
     • Each job sent for execution gets a unique id and a corresponding /proc/id directory where it can register temporary files, standard input and output, as well as all job products. In a typical production scenario, the job products are renamed and registered at their final destination in the File Catalogue only after a separate process has verified the output. The entries (LFNs) in the File Catalogue have an immutable unique file id attribute that is required to support long references (for instance in ROOT) and symbolic links.

  20. Package Manager Service
     • Allows dynamic installation of application software released by the VO (e.g. the experiment or a physics group).
     • Each VO can provide the Packages and Commands that can subsequently be executed. Once the corresponding files with bundled executables and libraries are published in the File Catalogue and registered, the Package Manager will install them automatically as soon as a job becomes eligible to run on a site whose policy accepts these jobs.
     • While installing a package in the shared package repository, the Package Manager will resolve its dependencies on other packages and, taking package versions into account, install them as well (see the sketch below). This means that old versions of packages can be safely removed from the shared repository; if they are needed again at some later point, they will be re-installed automatically by the system. This provides a convenient and automated way to distribute the experiment-specific software across the Grid and assures accountability in the long term.
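
     The dependency-resolution step can be illustrated with the short sketch below; the package structure, the names and the notion of an "installed" set are assumptions made for the example, not part of any specified Package Manager interface.

       // Illustrative sketch of recursive dependency resolution; all names are
       // assumptions, and the actual fetch/unpack of bundles is left as a comment.
       #include <set>
       #include <string>
       #include <vector>

       struct Package {
          std::string name;
          std::string version;
          std::vector<Package> dependencies;   // as registered in the File Catalogue
       };

       // install a package and, recursively, everything it depends on;
       // 'installed' stands in for the shared package repository
       void InstallWithDependencies(const Package& pkg, std::set<std::string>& installed)
       {
          const std::string key = pkg.name + "-" + pkg.version;
          if (installed.count(key)) return;              // already in the repository

          for (const Package& dep : pkg.dependencies)
             InstallWithDependencies(dep, installed);    // versions resolved per dependency

          // a real service would now fetch the bundle published in the
          // File Catalogue and unpack it into the shared repository
          installed.insert(key);
       }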

  21. Computing Element
     • The Computing Element is a service representing a computing resource. Its interface should allow submission of a job to be executed on the underlying computing facility, access to the job status information, as well as high-level job manipulation commands. The interface should also provide access to the dynamic status of the computing resource, such as its available capacity, load, and number of waiting and running jobs (one possible rendering of this interface is sketched below).
     • This service should be available on a per-VO basis.
     • And so forth for the remaining services.
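
     One possible, purely illustrative C++ rendering of the Computing Element interface described on this slide is sketched below; the type and method names are assumptions, not an agreed definition.

       // Illustrative sketch: the operations a Computing Element service could
       // expose, following the slide's description. Names are assumptions.
       #include <string>

       struct ResourceStatus {
          int    availableSlots;   // free capacity
          int    runningJobs;
          int    waitingJobs;
          double load;
       };

       class ComputingElement {        // one instance per VO, per computing resource
       public:
          virtual ~ComputingElement() = default;

          // submit a job description (e.g. a Condor ClassAd) and get back a job id
          virtual std::string Submit(const std::string& jobDescription) = 0;

          // job status and high-level job manipulation
          virtual std::string Status(const std::string& jobId) = 0;
          virtual void        Cancel(const std::string& jobId) = 0;

          // dynamic status of the underlying resource (capacity, load, queues)
          virtual ResourceStatus Resources() const = 0;
       };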

  22. Talking Points
     • Horizontally structured system of services with a well-defined API and a database backend
       • Can easily be extended with additional services; new implementations can be moved in, alternative approaches tested and commissioned
     • Interface to the LCG-1 infrastructure
       • VDT/EDG interface through CE, SE and the use of JDL, compatible with the existing infrastructure
       • ARDA VO services can build on the emerging VO management infrastructure
     • ARDA initially looked at file-based datasets, not object collections
       • talk with POOL about how to extend the file concept to a more generic collection concept
       • investigate experiments' metadata/file catalog interaction
     • VO system and site security
       • Jobs are executed on behalf of the VO, but users remain fully traceable
       • How do policies get implemented, e.g. analysis priorities, MoU contributions, etc.?
       • Auditing and accounting system, priorities through special “optimizers”
       • accounting of site “contributions”, which depend on what resources sites “expose”
     • Database backend for the prototype
       • Address latency, stability and scalability issues up-front; good experience exists
       • In a sense, the system is the database (possibly federated and distributed) that contains all there is to know about all jobs, files, metadata and algorithms of all users within a VO
       • a set of OGSI grid services provides “windows”/“views” into the database, while the API provides the user access
       • allows structuring into federated grids and “dynamic workspaces”

  23. General ARDA Roadmap
     • Emerging picture of “waypoints” on the ARDA “roadmap”
       • ARDA RTAG report
         • review of existing projects, common architecture, component decomposition & re-factoring
         • recommendations for a prototypical architecture, a definition of prototypical functionality, and a development strategy
       • Development of a prototype and first release
       • Integration with and deployment on LCG-1 resources and services
       • Re-engineering of prototypical ARDA services, as required
     • OGSI gives a framework in which to run ARDA services
       • Addresses architecture
       • Provides a framework for advanced interactions with the Grid
     • Need to address issues of OGSI performance and scalability up-front
       • Importance of modeling, a plan for scaling up, and engineering of the underlying services infrastructure

  24. Roadmap to a GS Architecture for the LHC
     • Transition to grid services explicitly addressed in several existing projects
       • Clarens and Caltech GAE, MonALISA
         • Based on web services for communication, Jini-based agent architecture
       • Dirac
         • Based on “intelligent agents” working within batch environments
       • AliEn
         • Based on web services and communication to a distributed database backend
       • DIAL
         • OGSA interfaces
     • Initial work on OGSA within LCG-GTA
       • GT3 prototyping
     • Leverage the experience gained in Grid M/W R&D projects

  25. ARDA Roadmap for Prototype
     • No “evolutionary” path from GT2-based grids
     • Recommendation: build an early prototype based on re-factoring existing implementations
     • The prototype provides the initial blueprint
       • Do not aim for a full specification of all the interfaces
     • 4-pronged approach:
       • Re-factoring of AliEn, Dirac and possibly other services into ARDA
       • Initial release with an OGSI::Lite/GT3 proxy, consolidation of the API, release
       • Implementation of the agreed interfaces, testing, release
       • GT3 modeling and testing, eventual quality assurance
     • Interfacing to LCG-AA software like POOL and analysis shells like ROOT
       • Also an opportunity for “early” interfacing to complementary projects
     • Interfacing to experiment frameworks
       • metadata handlers, experiment-specific services
     • Provide interaction points with the community

  26. Experiments and LCG Involved in Prototyping
     • The ARDA prototype would define the initial set of services and their interfaces. Timescale: spring 2004
     • Important to involve the experiments and LCG at the right level
       • Initial modeling of GT3-based services
       • Interface to major cross-experiment packages: POOL, ROOT, PROOF, others
       • Program experiment frameworks against the ARDA API, integrate with experiment environments
       • Expose services and UI/API to other LHC projects to allow synergies
       • Spend appropriate effort to document, package, release, deploy
     • After the prototype is delivered, improve on it:
       • Scale up and re-engineer as needed: OGSI, databases, information services
       • Deployment and interfaces to site and grid operations, VO management, etc.
       • Build higher-level services and experiment-specific functionality
       • Work on interactive analysis interfaces and new functionalities

  27. Major Role for Middleware Engineering
     • The ARDA roadmap is based on a well-factored prototype implementation that allows evolutionary development into a complete system evolving to the full LHC scale
     • The ARDA prototype would be pretty lightweight
       • Stability comes from building on a global database to which services talk through a database proxy
       • “people know how to do large databases” -- a well-founded principle (see e.g. SAM for Run II), with many possible migration paths
       • HEP-specific services, but based on generic OGSI-compliant services
     • Expect the LCG/EGEE middleware effort to play a major role in evolving this foundation, its concepts and implementation
       • re-casting the (HEP-specific, event-data-analysis oriented) services into more general services, from which the ARDA services would be derived
       • addressing major issues like a solid OGSI foundation, robustness, resilience, fault recovery, operation and debugging
     • Expect US middleware projects to be involved in this!

  28. Conclusions
     • ARDA is identifying a service-oriented architecture and an initial decomposition of the services required for distributed analysis
     • Recognize a central role for a Grid API which provides a factory of user interfaces for experiment frameworks, applications, portals, etc.
     • The ARDA Prototype would provide a distributed physics analysis environment for distributed experimental data
       • for experiment-framework based analysis
         • Cobra, Athena, Gaudi, AliRoot
       • for ROOT-based analysis
       • interfacing to other analysis packages like JAS, event displays like Iguana, grid portals, etc. can be implemented easily
