Grid and VOs. Grid from 10 000 feet. Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data. Scientific instruments, libraries and experiments provide huge amounts of data.
Grid from 10 000 feet. Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data. Scientific instruments, libraries and experiments provide huge amounts of data. The GRID: networked data processing centres and “middleware” software as the “glue” of resources. Based on material from Federico.Carminati@cern.ch and the 3D RTM map by Gidon Moont, IC and GridPP.
What is Grid? The word ‘grid’ has been used in many ways: • cluster computing • cycle scavenging • cross-domain resource, data and information sharing. A definition of what we mean by grid: a grid • coordinates resources not subject to centralised control • uses standard, open and generic protocols & interfaces • provides non-trivial qualities of collective service. Definition source: Ian Foster in Grid Today, July 22, 2002; Vol. 1 No. 6, see http://www-fp.mcs.anl.gov/~foster/Articles/WhatIstheGrid.pdf
Grid Computing: “More Than One” • More than one machine • More than one user • More than one research community • More than one administrative domain • More than one geographical location General case: more than one of each!!!
Consequences of Plurality • More than one user / research community: partitioning of resources, authentication, authorization, accounting • More than one machine: software engineering, distributions • More than one administrative domain / research community: authentication / authorization, non-invasive installations, genericity • More than one admin domain, geographical location: worldwide operations coordination
Grid characteristics Things in e-Science grids that may contrast with other distributed efforts • collaboration of individuals from different organisations • most of the scientific grid communities today consist of people ‘scattered’ over many home organisations … in many cases internationally • ‘Virtual organisations’ – but that’s what we are used to as scientific collaborations! • delegation – services acting on your behalf – are an integral part of the architecture • for service and data brokering • integrating compute, data access, and databases in the same task • unattended work flows
Virtual Organisations: A set of individuals or organisations, not under single hierarchical control, (temporarily) joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions. • A user is usually a member of more than one VO • Any “large” VO will have an internal structure, with groups, subgroups, and various roles
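A minimal sketch of how such membership could be modelled, assuming a path-like notation for groups and subgroups. The class names, VO names and the user are hypothetical illustrations, not part of any grid middleware API:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Membership:
    """One user's place inside a single VO (hypothetical model)."""
    vo: str                          # name of the virtual organisation
    group: str = "/"                 # path-like group, e.g. "/analysis/higgs"
    roles: frozenset = frozenset()   # roles held within this VO only


@dataclass
class User:
    name: str
    memberships: list = field(default_factory=list)

    def vos(self):
        """Sorted names of all VOs this user belongs to."""
        return sorted({m.vo for m in self.memberships})

    def roles_in(self, vo):
        """Union of the roles the user holds in one VO."""
        roles = set()
        for m in self.memberships:
            if m.vo == vo:
                roles |= m.roles
        return roles


# One person, two VOs, a different group and role in each.
alice = User("alice", [
    Membership("vo-physics", "/analysis/higgs", frozenset({"production"})),
    Membership("vo-medical", "/imaging", frozenset({"reader"})),
])
```

Roles are stored per membership rather than per user, so a role held in one VO never carries over to another.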
Virtual organisation structure Lots of overlapping groups and communities graphic: OGSA Architecture 1.0, OGF GFD-I.030
Virtual vs. Organic structure • Virtual communities (“virtual organisations”) are many • An individual will typically be part of many communities • has different roles in different VOs (distinct from organisational role) • all at the same time, at the same set of resources • but will need single sign-on across all these communities graphic: OGSA Architecture 1.0, OGF GFD-I.030
Expressing collaboration • provide the means to express collaboration • membership • groups and roles • organisation management tools • support access control as function of VOs • access control as a function of VO, group, and role • both at the service and at the content level • maintain autonomy • sharing defined by access controls at the source • no need to hand off the actual data to a third party
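Access control as a function of VO, group, and role can be sketched as a default-deny rule check evaluated at the source. The rule format, VO names, groups and roles below are assumptions made for illustration; real middleware defines its own policy formats:

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """One access rule kept by the resource owner (hypothetical format)."""
    vo: str
    group_prefix: str       # rule applies to this group and its subgroups
    role: Optional[str]     # None means any role is accepted
    actions: frozenset      # e.g. {"read", "write"}


def in_group(group, prefix):
    """True if `group` equals `prefix` or is a subgroup of it."""
    prefix = prefix.rstrip("/")
    return group == prefix or group.startswith(prefix + "/")


def is_allowed(rules, vo, group, role, action):
    """Default deny: allow only if some rule matches the request."""
    return any(
        r.vo == vo
        and in_group(group, r.group_prefix)
        and (r.role is None or r.role == role)
        and action in r.actions
        for r in rules
    )


# Rules are held at the data source, so the owner keeps control of sharing.
RULES = [
    Rule("vo-medical", "/imaging", None, frozenset({"read"})),
    Rule("vo-medical", "/imaging", "curator", frozenset({"read", "write"})),
]
```

Because the rules live with the resource, the owner can change or withdraw sharing without involving a third party.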
PoC Position in the VL-e structure. [Figure: layered VL-e architecture. Labels, top to bottom: Application specific service (App 1, App 2, App 3); Application; Potential Generic service & Virtual Lab. services; Virtual Lab. rapid prototyping (interactive simulation); Virtual Laboratory; Additional Grid Services (OGSA services); Grid Middleware; Grid & Network Services; Network Service (lambda networking); Networking. Environments shown: VL-e Experimental Environment; VL-e Proof of concept Environment.]
The VL-e PoC: Proof-of-Concept. What is the PoC Environment? • A shared, common environment, where different tools and services are both used and provided by the VL-e community • The basis for subsequent application development
Elements in the PoC The PoC refers to three distinct elements PoC Software Distribution • set of software that is the basis for the applications • both grid middleware and virtual lab generic software PoC Environment • the ensemble of systems running the Distribution • including user desktops or local clusters and storage PoC Central Facilities • those systems with the PoC Distribution centrally managed for the entire collaboration • large-scale computing, storage and hosting resources
PoC Distribution. The PoC distribution contains components to • enable service-oriented development • enable application development • provide access to data, computing, and storage, distributed geographically. It is driven by specific VL-e application scenarios. Work flow is to be the integrative layer of VL-e • functionality should be invocable as a service • work flow (graphical) systems help in composition but are not the only way to interact with services
The PoC software distribution. The following elements of the PoC software suite can be distinguished: • Grid foundation middleware: the basic software that is based on interfaces and concepts that are internationally adopted. This includes elements such as the security model, resource allocation interface, … based on the EGEE middleware suite • Generic Virtual Laboratory software: the software developed within the project for the PoC • Services imported from outside: given that not all services are necessarily developed within VL-e, components have been imported • Associated installation and deployment tools: the PoC suite is installed on the central facilities and (where applicable) also available for distributed installation.
The PoC software distribution • Software environment • geared towards application software developers • enables cross-leveraging VL-e developments between applications • predictable lifecycle management • Primary metric is the effectiveness in addressing real cross-application needs • PoC is liberal in including software • as long as it is useful for multiple domains • does not compromise integrity • can be supported and safely deployed
Defining content of the PoC Distribution. [Figure: the three VL-e environments and the release flow between them.] Environments: VL-e Rapid Prototyping Environment; VL-e Certification Environment; VL-e Proof of Concept Environment. ‘Keywords’: stable, reliable, supported releases of the Grid MW & VL-software; flexible test environment; flexible, ‘unstable’. Typical usage: Virtual Lab. rapid prototyping; Test & Cert. of Grid MW & VL-software compatibility; application development. Initial compute platform: Matrix clusters; Central Storage (SRB, dCache/SRM); Distributed Clusters, SURFnet; NL-Grid Fabric Research Cluster; DAS-2, local resources. Release flow labels: Tagged Release Candidates; Download Repository; PoC Installer; Common repository; Integration tests; stable, tested releases; External software; VLeIT Recommendation Point.
Working with the application developers. Each generic component has an ‘expert’ on VLeIT • to work on its optimal use or deployment and • coordinate enhancement requests. Latest developments from within the VL-e project • are available via a fast-lane ‘contrib’ trajectory • use the same installation mechanism • but are supported directly by the developers, addressing the chicken-and-egg deadlock
The VL-e PoC Distribution What is the VL-e PoC Distribution? The PoC distribution is • meant to be installed on a RedHat Enterprise Linux 3 compatible system • a stable base environment, with managed releases The PoC distribution contains components to • enable service-oriented development • enable application development • provide access to computing, storage, and information systems
The VL-e PoC Distribution VL-e PoC Release 1.0 Contents: gLite 3.0 Sun Java2SDK 1.4.2_12 Plus JavaGAT-1.5 MatlabMPI-1.2 Mesa3D-6.4.1 R-2.2.0 Rmpi-0.5 SRB-client-3.4.0 SRB-devel-3.4.0 fsl-bin-3.2 fsl-devel-3.2 gat-adaptors-1.8.2 gat-cpp-wrapper-1.8.2 gat-engine-1.8.2 gat-python-wrapper-1.8.2 globus-toolkit-4.0.1 graphviz-2.8 ibis-1.2.1 itk-2.4.1 kepler-1.0.0alpha7 lam-devel-7.1.2 lam-docs-7.1.2 lam-extras-7.1.2 lam-runtime-7.1.2 libRmath-2.2.0 libRmath-devel-2.2.0 medline-1.0 mpitb-2.1.72 mricro-1.3.9-4 nimrod-3.0.1 octave-2.1.72 ogsadai-wsrf-2.1 paraview-2.4.2 pl-5.6.4-200 sesame-client-1.2.3 taverna-workbench-1.3.1 triana-3.2 vtk-4.4
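Most entries in the list above follow a `name-version` pattern, which makes it easy to turn the contents into a small inventory. The sketch below assumes the version starts at the first hyphen followed by a digit; the function and variable names are illustrative and not part of the PoC installer:

```python
import re

# Split at the first hyphen that is directly followed by a digit.
_NAME_VERSION = re.compile(r"^(?P<name>.+?)-(?P<version>\d.*)$")


def split_package(entry):
    """Return (name, version) for an entry such as 'gat-engine-1.8.2'."""
    match = _NAME_VERSION.match(entry)
    if match is None:
        raise ValueError(f"no version found in {entry!r}")
    return match.group("name"), match.group("version")


# A few entries copied from the release contents.
entries = (
    "JavaGAT-1.5 gat-python-wrapper-1.8.2 libRmath-devel-2.2.0 "
    "pl-5.6.4-200 kepler-1.0.0alpha7"
).split()

inventory = dict(split_package(e) for e in entries)
```

Entries written with a space instead of a hyphen, such as `gLite 3.0`, do not match this pattern and would need separate handling.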
The VL-e PoC Distribution Distribution formats: • Network-based installation • http access • http proxy access • DVD-based installation: the PoC DVD • Pre-installed VMware image (present on PoC DVD) • CentOS 3 with GNOME GUI • gLite UI • VL-e Release 1.0 UI packages • Works with free VMware Player on both Linux and Windows
PoC Environment • All systems can be used to perform the application scenarios, using the PoC distribution • Installed both • at specific central facilities • on desktops, remote clusters, data servers
PoC Central Facilities • For applications in the Netherlands • both applications within VL-e and others • shared common infrastructure • accessible via grid middleware • has the PoC distribution installed • Location and capacity • SARA – tape (~1.2 TB), disk storage (~100 TB), clusters (~1400 cores Debian, 60 RHEL3), database services, user interface gateway, catch-all • NIKHEF – disk storage (~25 TB), clusters (550 cores RHEL3)
PoC Central Facility Usage Today. PoC (NDPF) shared between various applications. [Figures: SARA LISA occupancy. Grey: VLEIBU, VLEMED; green: ATLAS; blue: LHCb. PHICOS production jobs on the PoC (NDPF) at NIKHEF.]