Distributed Analysis at the LCG. Torre Wenaus, BNL/CERN LCG Applications Area Manager http://cern.ch/lcg/peb/applications Caltech Grid Enabled Analysis Workshop June 24, 2003. Distributed Analysis Related Activity at the LCG.
Distributed Analysis Related Activity at the LCG
• Middleware requirements and use cases arising from distributed analysis (GAG, HEPCAL)
  • See Ruth’s talk
• Analysis modelling (Grid Technology Area)
  • See Kathrin Paschen’s talk
• Distributed analysis application layer (Applications Area)
• ‘ARDA’ RTAG
• Hopes for this meeting
Applications Area Activity
[Figure: diagram of Applications Area activity. Blue: common activity. Grey: experiment specific. Products mentioned are examples, not a comprehensive list.]
Distributed Analysis in the Applications Area
• Anticipated activity:
  • Grid interfaces to the experiments – interfaces to physicist end users, and grid-enabled services serving higher level applications and frameworks
  • Integration/adaptation of physics applications software in the grid environment
• Prerequisite: A mandate coming from agreement among experiments on common work
  • Via an ‘RTAG’: Requirements and Technical Assessment Group
  • Distributed Analysis RTAG just established a week ago
• But even in the absence of a mandate, we have started limited, focused work because we have two people hired explicitly to work on distributed analysis
  • Development of a remote launch service
  • Task agreed upon a week ago, and now starting
Remote Launch Service
• A ‘grid service’ in the LCG architecture
• Remotely launch the clients and/or masters making up a distributed parallel interactive analysis task
  • Using grid middleware
  • Providing immediate launch and responsiveness
• A generic service usable in different analysis tool contexts
• The service will be integrated and used in both PROOF and Ganga
  • i.e. integrated with ROOT/CINT and as a Ganga Python module
• What middleware can/should we use? Looking first at Condor ‘Computing On Demand’ (COD) – appears to have the specs we need
  • Very interesting talk by Derek Wright at http://www.cs.wisc.edu/condor/CondorWeek2003/presentations/
  • Maarten Ballintijn may already have Condor/PROOF COD working?
• Looking forward to PROOF demo
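As an illustration only, the "generic service usable in different analysis tool contexts" idea could be sketched as a small tool-independent front end with a pluggable backend. Every name below (LaunchedProcess, LaunchBackend, DryRunBackend, RemoteLaunchService) is hypothetical and does not come from the talk, PROOF, Ganga, or Condor; the dry-run backend only records requests where a real one would hand them to grid middleware.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass
class LaunchedProcess:
    """Handle for one launched master or worker (hypothetical type)."""
    role: str   # "master" or "worker"
    host: str   # host the backend was asked to start the process on
    job_id: int  # identifier returned by the backend


class LaunchBackend(Protocol):
    """Anything able to start a command on a named host (assumed interface)."""

    def start(self, host: str, command: List[str]) -> int:
        ...


class DryRunBackend:
    """Stand-in backend: records each request and starts nothing.

    A real backend would pass the command to grid middleware instead.
    """

    def __init__(self) -> None:
        self.requests: List[Tuple[str, Tuple[str, ...]]] = []

    def start(self, host: str, command: List[str]) -> int:
        self.requests.append((host, tuple(command)))
        return len(self.requests)  # fake job id: 1, 2, 3, ...


class RemoteLaunchService:
    """Tool-independent front end: starts one master, then the workers."""

    def __init__(self, backend: LaunchBackend) -> None:
        self.backend = backend

    def launch(
        self,
        master_host: str,
        worker_hosts: List[str],
        master_cmd: List[str],
        worker_cmd: List[str],
    ) -> List[LaunchedProcess]:
        launched = [
            LaunchedProcess(
                "master", master_host, self.backend.start(master_host, master_cmd)
            )
        ]
        for host in worker_hosts:
            launched.append(
                LaunchedProcess("worker", host, self.backend.start(host, worker_cmd))
            )
        return launched
```

Under these assumptions, an analysis tool would depend only on RemoteLaunchService.launch, and the choice of middleware would be confined to the backend class.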
Other Distributed Analysis Tasks
• Before ‘remote launch service’ was chosen as an initial distributed analysis task, others were proposed and considered
  • An indication of (some of) what is seen to be missing for interactive analysis
• Proposed tasks were
  • Grid-based control/communication service used between interactive masters/clients
    • Development of an OGSA(-like?) service making use of GSI
    • Is no middleware project going to provide us with this essential service?
  • Interface to datasets/file catalogs including querying on tags, LFN, etc. – i.e., a dataset service
  • Interface to resource broker to find the best location(s), based on the data set and interactive availability, where to run the query
    • Do today’s resource brokers understand distributed interactive analysis? Will tomorrow’s?
• Comments on these and on how best to use 1-1.5 FTEs on distributed analysis are welcome
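To make the resource-broker item concrete, here is a minimal sketch of one way to rank candidate sites from the two inputs the slide names: where the dataset's files are, and interactive availability. The function name, the catalog layout (LFN mapped to a set of site names), the slot counts, and the ranking rule are all assumptions made for illustration; they do not describe any existing broker.

```python
from typing import Dict, Iterable, List, Set, Tuple


def rank_sites(
    dataset_lfns: Iterable[str],
    replica_catalog: Dict[str, Set[str]],
    free_slots: Dict[str, int],
    min_slots: int = 1,
) -> List[Tuple[str, float, int]]:
    """Rank sites for an interactive query (hypothetical rule).

    Sites with fewer than `min_slots` free interactive slots are dropped.
    The rest are ordered by the fraction of the dataset's files held
    locally (highest first), then by free slots (highest first), then by
    name so the ordering is deterministic.
    """
    lfns = list(dataset_lfns)
    ranked: List[Tuple[str, float, int]] = []
    for site, slots in free_slots.items():
        if slots < min_slots:
            continue
        # Count the dataset files that have a replica at this site.
        local = sum(1 for lfn in lfns if site in replica_catalog.get(lfn, set()))
        fraction = local / len(lfns) if lfns else 0.0
        ranked.append((site, fraction, slots))
    ranked.sort(key=lambda entry: (-entry[1], -entry[2], entry[0]))
    return ranked


if __name__ == "__main__":
    # Made-up example data: three files, three sites.
    catalog = {"lfn:a": {"S1", "S2"}, "lfn:b": {"S1"}, "lfn:c": {"S3"}}
    slots = {"S1": 2, "S2": 5, "S3": 0}
    for site, fraction, free in rank_sites(catalog.keys(), catalog, slots):
        print(site, round(fraction, 2), free)
```

A rule this simple ignores network cost and queue dynamics, which is part of what the slide's question about brokers "understanding" interactive analysis is pointing at.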
RTAG on An Architectural Roadmap towards Distributed Analysis (ARDA) 1)
Observation:
• Different LHC experiments have developed packages (AliEn, Ganga, Dirac, Impala, Boss, Grappa, Magda…) that sit on top of, complement, expand, or parallel the functionality of the Grid middleware (VDT, EDG…)
• At this time the LCG is coming to grips with the middleware development requirements
• There is an expectation that an OGSA Services Architecture will be the basis for future development
• The experiments need to specify, in their TDRs, baselines, fallback and development strategies
Motivation:
• To agree on requirements as laid out in a first step by recent work within the GAG and identify commonalities within the current projects which might allow the LCG (both in the AA and GTA areas) to provide a focus of effort
• To provide guidance to the LCG on future middleware development directions and interfacing work to match the experiment requirements
• To build on the richness of the current technical solutions to avoid duplication of efforts
• To clearly identify the roles and responsibilities of the components/layers/services in the experiment DA planning
• To give guidance to the community on the expected division of work between the experiments, the LCG and the external projects
1) Arda was the name given by the Elves to their World and all it contained, see www.glyphweb.com/arda/
Mandate for the ARDA RTAG
• To review the current DA activities and to capture their architectures in a consistent way
• To confront these existing projects with the HEPCAL II use cases and the user's potential work environments in order to explore potential shortcomings
• To consider the interfaces between Grid, LCG and experiment-specific services
  • Review the functionality of experiment-specific packages, state of advancement and role in the experiment
  • Identify similar functionalities in the different packages
  • Identify functionalities and components that could be integrated in the generic GRID middleware
• To confront the current projects with critical GRID areas
• To develop a roadmap specifying wherever possible the architecture, the components and potential sources of deliverables to guide the medium term (2 year) work of the LCG and the DA planning in the experiments
Schedule and Makeup of ARDA RTAG
The RTAG shall provide a draft report to the SC2 by September 03.
• It should contain initial guidance to the LCG and the experiments to inform the September LHCC manpower review, in particular on the expected responsibilities of
  • The experiment projects
  • The LCG (Development and interfacing work rather than coordination work)
  • The external projects
The final RTAG report is expected for October 03.
The RTAG shall be composed of
• Two members from each experiment
• Representatives of the LCG GTA and AA
• If not included above, the RTAG shall co-opt or invite representatives from the major Distributed Analysis projects and non-LHC running experiments with DA experience
This Meeting
• I hope this meeting can give a kick start to the RTAG…
• Informed by a survey of what exists (code, use cases) now,
  • What are the components/layers/services required specifically for distributed analysis?
  • What software is currently existing or in the works to cover these?
  • Can an architecture that is realizable in the near term be blocked out? Can it be agreed on?
    • On the principle that we have to start with realizable architectures and tools and build upwards incrementally over time
• With due consideration for the R&D nature of present work, can we work in a coherent and complementary way?
• Can we identify elements which should be pursued as common solutions?
• When we confront current middleware with our needs, what is missing? How will the holes be filled?