US CMS + US ATLAS ITR (pre-)proposal “Globally Enabled Analysis Communities”
Sound Bites:
• “Dynamic workspaces”
• “Private Grids to support scientific analysis communities”
• “Build autonomous communities operating within global collaborations”
• “Empower small groups of scientists to profit from and contribute to international big science”
• “Democratization of Science via new technologies”
What is it? Well, we will have to define it in the proposal, due in March 2003.
Berkeley Workshop Nov 2002 -- The Global Picture
Development of a Science Grid Infrastructure (L. Robertson)
[Diagram labels:]
• science applications: LHC applications, bio applications, other science applications, …
• end-to-end middleware: end-to-end HEP middleware, end-to-end bio middleware, …
• application/grid interface
• middleware engineering
• middleware R&D (CS)
• Advanced middleware: requirements defined by current projects
• Hardening/reworking of basic middleware prototyped by current projects
• operation of base grid infrastructure
• ~ 4-5 core centres (including the LCG Tier 1s)
• information service, catalogues, ..
• coordination, operations centre, ..
• call centre, user support, training, ..
other grid nodes for physics, biology, medicine, ….
… and the “missing pieces”
• Transition to Production Level Grids (middleware support, error recovery, robustness, 24x7, monitoring and system usage optimization, strategy and policy for resource allocation, authentication and authorization, simulation of grid operations, tools for optimizing distributed systems)
• Globally Enabled Analysis Communities (WG2)
• Enabling Global Collaboration (a medium ITR?)
The goal:
• Provide individual physicists and groups of scientists with capabilities from the desktop that allow them:
  • To participate as an equal in one or more “Analysis Communities”
  • To have full representation in the Global Experiment Enterprise
  • To receive on demand whatever resources and information they need to explore their science interests, while respecting the collaboration-wide priorities and needs.
Assumptions of the ITR pre-proposal
• The Project will provide some of the missing capabilities for ATLAS and CMS data analysis systems.
• The Project will be managed as part of the US CMS and US ATLAS S&C projects.
• There is an existing robust, fully functional core Grid Infrastructure for work flow management, data management (data, meta-data, provenance), and security management.
• The Project will deploy incremental capabilities for experiment use throughout its lifetime.
• Up-front experiment-wide management and oversight will ensure appropriateness and buy-in.
Constraints of the ITR pre-proposal
• It may actually not get accepted...
• Five-year development and deployment program; can’t start before end of 2003
• The total possible funding is $15M over 5 years (including all overheads, that is “only” some ten FTE!)
• Many (>12?) institutions/universities want to participate in the project
• Development teams include Computer Scientists (Information Technologists?) and Physicists
• Funding comes from the Computer Science department in the NSF, and reviewers are mainly Computer Scientists.
Physics Analysis in CMS
The Experiment controls and maintains the global enterprise:
• Hardware: Computers, Storage (permanent and temporary)
• Software Packages: physics, framework, data management, build and distribution mechanisms; base infrastructure (operating systems, compilers, network, grid)
• Event and Physics Data and Datasets
• Schema which define: meta-data, provenance, ancillary information (run, luminosity, trigger, Monte-Carlo parameters, calibration etc.)
• Organization, Policy and Practice
Analysis Groups - Communities - are of 1 to many individuals
Each community is part of the Enterprise:
• Is assigned or shares the total Computation and Storage
• Can access and modify software, data, schema (meta-data)
• Is subject to the overall organization and management
Each community has local (private) control of:
• Use of outside resources, e.g. local institution computing centers
• Special versions of software, datasets, schema, compilers
• Organization, policy and practice
We must be able to reliably and consistently move resources & information in both directions between the Global Collaboration and the Analysis Communities.
Communities can share among themselves.
Environment for CMS (LHC) Distributed Analysis on the Grid
• Dynamic Workspaces - provide capability for individual and community to request and receive expanded, contracted or otherwise modified resources, while maintaining the integrity and policies of the Global Enterprise.
• Private Grids - provide capability for individual and community to request, control and use a heterogeneous mix of Enterprise-wide and community-specific software, data, meta-data, resources.
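The "dynamic workspace" idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not part of the proposal: the `Enterprise` class, the `resize` method, and the reduction of "resources" to a single pool of CPU slots. It only shows a community expanding or contracting its allocation while an enterprise-wide invariant (no over-commitment) is preserved.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Enterprise:
    """Hypothetical model: one global pool of CPU slots shared by all communities."""
    total_cpus: int
    allocations: Dict[str, int] = field(default_factory=dict)

    def free_cpus(self) -> int:
        """CPU slots not currently assigned to any community."""
        return self.total_cpus - sum(self.allocations.values())

    def resize(self, community: str, delta: int) -> bool:
        """Expand (delta > 0) or contract (delta < 0) a community's workspace.

        The request is granted only if the enterprise-wide invariant holds:
        no allocation goes negative and the pool is never over-committed.
        """
        new_size = self.allocations.get(community, 0) + delta
        if new_size < 0 or delta > self.free_cpus():
            return False  # request rejected, nothing changes
        self.allocations[community] = new_size
        return True


enterprise = Enterprise(total_cpus=100)
enterprise.resize("higgs", 60)    # granted
enterprise.resize("susy", 50)     # rejected: only 40 slots are free
enterprise.resize("higgs", -30)   # contraction frees 30 slots
enterprise.resize("susy", 50)     # now granted
```

A real system would replace the single integer with heterogeneous resources and a richer policy, but the shape stays the same: a request, a check against the Global Enterprise's rules, and only then a change.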
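The Global Community
[Diagram: physicists and software, compute, storage and schema resources, grouped into private grids/analysis communities, with information flow between them]
Legend: X - physicist; So - software; C - compute; St - storage; Sc - schema/information; further symbols mark the desktop, a private grid/analysis community, and information flow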
Technologies to be Developed
The CS and IT part of the proposal!
Infrastructure to support Private “Community Grids”
• Meta-data to describe and manage private grids.
• Tools for information and data transfer and communication between communities.
• Synchronization and validation tools between the community grids and the global enterprise.
• Application/user interfaces for management, administration and operation of a set of private grids within an enterprise.
Infrastructure to support dynamic workspace capabilities
• Rapid response reconfiguration and administration tools.
• Enterprise-wide integrity and validation tools across all private grids.
• Application/user interfaces for the distributed request and control of extensions and contractions of private grids.
De-centralized, multi-tiered schema evolution and synchronization
• Mechanisms to support parallel evolution and resynchronization of decentralized heterogeneous local and enterprise schema.
• Application interfaces for definition, modification of and access to local and enterprise schema.
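One way to picture "parallel evolution and resynchronization" is a three-way merge: a local and an enterprise schema both start from a common base, change independently, and are later reconciled. The sketch below is only an illustration under strong assumptions: a schema is reduced to a flat mapping from field name to type name, and the function `resynchronize` and its conflict rule (enterprise definition kept provisionally, field flagged for review) are hypothetical choices, not something the proposal specifies.

```python
from typing import Dict, Set, Tuple

# Hypothetical, simplified schema model: field name -> type name.
Schema = Dict[str, str]


def resynchronize(base: Schema, enterprise: Schema, local: Schema) -> Tuple[Schema, Set[str]]:
    """Three-way merge of two schemas that evolved in parallel from `base`.

    Returns the merged schema and the names of fields that both sides
    changed in different ways and that therefore need a human decision.
    """
    merged: Schema = {}
    conflicts: Set[str] = set()
    for name in sorted(set(base) | set(enterprise) | set(local)):
        b, ent, loc = base.get(name), enterprise.get(name), local.get(name)
        if ent == loc:      # both sides agree (including "both removed it")
            value = ent
        elif ent == b:      # only the local community changed this field
            value = loc
        elif loc == b:      # only the enterprise changed this field
            value = ent
        else:               # both changed it differently: flag a conflict
            conflicts.add(name)
            value = ent     # assumption: enterprise definition kept provisionally
        if value is not None:
            merged[name] = value
    return merged, conflicts


base = {"run": "int", "lumi": "float"}
enterprise = {"run": "int", "lumi": "double", "trigger": "str"}
local = {"run": "long", "lumi": "float", "jet_cut": "float"}
merged, conflicts = resynchronize(base, enterprise, local)
```

In this example each field was changed on at most one side, so the merge picks up all changes from both sides and reports no conflicts. Real physics schemas (meta-data, provenance, calibration) are nested and versioned, so this shows the principle only.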
HEP-specific developments:
• User Interfaces:
  • Describe, modify, control and access all physics analysis data and meta-data
  • Control and manage analysis processes
  • Request and use private community grids
• Integration with physics codes:
  • Definition of architecture into which ITR deliverables will fit
  • Integration activities to include, test and validate ITR deliverables interactively.
Architecture and Work Plan
• Architecture is very important for developing the work plan --> Proposal!!
• Blueprint RTAG continuation w.r.t. Grid interfaces?
• Computing model, work loads, data flows?
• We need to discuss and agree on a system architecture and components, from the end-to-end physicist user to the underlying grid infrastructure assumed to be in place;
• We need to have a project work plan (a concrete set of tasks) in place, with deliverables, to make people realize this is a project to deliver working production systems for the experiments to use as the project proceeds;
• We need to negotiate and agree on some deliverables/components from other projects, e.g. from the LCG Applications Area, and to understand the time line in terms of what capabilities would be available when;
• We need to explicitly identify up front who is going to do test cases of running analyses with these tools to provide feedback to the development effort --> link to data challenges
LCG Application Domain
• LCG Applications Area Blueprint:
• Blueprint Scope 4) Physics data management:
• … Provide data management services meeting the scalability requirements of the LHC experiments, including integration with large-scale storage management and the Grid… Schema evolution, propagation, merging
• Relevance to grid interfaces and capabilities? It might, for example, be useful to look at extending it
This ITR should be of direct benefit for CMS
• Each analysis group/physicist will be able to perform local analyses which can be reliably and quickly validated and trusted by the collaboration on request. The experiment will be able to demonstrate and compare methods and results reliably and improve the turnaround time to physics publications.
• The experiment will be able to quickly respond to and decide upon new requests from analysis groups/physicists for resources, with minimal perturbation to the rest of the collaboration.
• The experiment will have an established infrastructure for evolution and extension for its long life.
• There will be a lowering of the intellectual cost barrier for new physicists and researchers to contribute. We will enable small groups to perform reliable exploratory analyses on their own. There will be an increased potential for individual or small community analyses and discovery.
• Individuals and groups will be assured they are using a well-defined set of software and data.
This may all “seem easy” but ask any physicist doing analysis on a large experiment today… And the LHC is 10 times larger in all dimensions.