DEISA requirements for federations and AA
Jules Wolfrat, SARA
www.deisa.org
Outline
• Introduction to DEISA
• AA and user administration
• Federation issues
DEISA objectives
• To enable Europe's terascale science by integrating Europe's most powerful supercomputing systems.
• Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.
• DEISA is a European Supercomputing Service built on top of existing national services, based on the deployment and operation of a persistent, production-quality, distributed supercomputing environment with continental scope.
• The integration of national facilities and services, together with innovative operational models, is expected to add substantial value to existing infrastructures.
• Main focus is High Performance Computing (HPC).
Participating Sites
• BSC – Barcelona Supercomputing Centre, Spain
• CINECA – Consorzio Interuniversitario per il Calcolo Automatico, Italy
• CSC – Finnish Information Technology Centre for Science, Finland
• EPCC/HPCx – University of Edinburgh and CCLRC, UK
• ECMWF – European Centre for Medium-Range Weather Forecasts, UK (international)
• FZJ – Research Centre Juelich, Germany
• HLRS – High Performance Computing Centre Stuttgart, Germany
• IDRIS – Institut du Développement et des Ressources en Informatique Scientifique (CNRS), France
• LRZ – Leibniz Rechenzentrum Munich, Germany
• RZG – Rechenzentrum Garching of the Max Planck Society, Germany
• SARA – Dutch National High Performance Computing and Networking Centre, The Netherlands
The DEISA supercomputing environment (21,900 processors and 145 Tf in 2006; more than 190 Tf in 2007)
• IBM AIX super-cluster:
  • FZJ Juelich, 1312 processors, 8.9 teraflops peak
  • RZG Garching, 748 processors, 3.8 teraflops peak
  • IDRIS, 1024 processors, 6.7 teraflops peak
  • CINECA, 512 processors, 2.6 teraflops peak
  • CSC, 512 processors, 2.6 teraflops peak
  • ECMWF, 2 systems of 2276 processors each, 33 teraflops peak
  • HPCx, 1600 processors, 12 teraflops peak
• BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak
• SARA, SGI Altix Linux system, 416 processors, 2.2 teraflops peak
• LRZ, Linux cluster (2.7 teraflops) moving to an SGI Altix system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
• HLRS, NEC SX8 vector system, 646 processors, 12.7 teraflops peak
• Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and the NRENs
How is DEISA enhancing HPC services in Europe?
• Running larger parallel applications at individual sites, by a cooperative reorganization of the global computational workload across the whole infrastructure, or by operation of the job migration service inside the AIX super-cluster.
• Enabling workflow applications with UNICORE (complex applications that are pipelined over several computing platforms).
• Enabling coupled multi-physics Grid applications (when it makes sense).
• Providing a global data management service whose primary objectives are:
  • Integrating distributed data with distributed computing platforms.
  • Enabling efficient, high-performance access to remote datasets (with global file systems and striped GridFTP). We believe this service is critical for the operation of possible future European petascale systems.
  • Integrating hierarchical storage management and databases in the supercomputing Grid.
• Deploying portals as a way to hide complex environments from new user communities, and to interoperate with other existing grid infrastructures.
The most basic DEISA services
• UNIfied access to COmputing REsources (UNICORE): global access to all the computing resources for batch processing, including workflow applications (in production).
• Co-scheduling service: needed to support grid applications with synchronous access to resources, as well as high-performance data movement.
• Global data management: integrating distributed data with distributed computing platforms, including hierarchical storage management and databases. Major highlights are:
  • High-performance remote I/O and data sharing with global file systems, using the full network bandwidth (in production).
  • High-performance transfers of large data sets, using the full network bandwidth (end 2006): GridFTP with co-scheduled, parallel data mover tasks. An illustrative transfer is sketched below.
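As a rough illustration of a parallel GridFTP transfer, a third-party copy could be started with globus-url-copy; the host names and paths below are invented for the example:

    # Third-party GridFTP transfer between two hypothetical DEISA sites,
    # using 8 parallel TCP streams and a 4 MB TCP buffer for the long path.
    globus-url-copy -p 8 -tcp-bs 4194304 \
        gsiftp://gridftp.site-a.example.org/scratch/run42/output.h5 \
        gsiftp://gridftp.site-b.example.org/data/run42/output.h5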
Basic services: workflow simulations using UNICORE
• UNICORE supports complex simulations that are pipelined over several heterogeneous platforms (workflows).
• UNICORE handles a workflow as a single job and transparently moves the output/input data along the pipeline.
• UNICORE clients that monitor the application can run on laptops.
• UNICORE has a user-friendly graphical interface; DEISA has developed a command line interface for UNICORE.
• The UNICORE infrastructure, including all sites, has full production status.
DEISA Global File System integration in 2006 (based on IBM's GPFS)
[Diagram: two common global file system domains. A "High Performance Common Global File System" spans various architectures/operating systems with high bandwidth (up to 10 Gbit/s); an "HPC Common Global File System" spans similar architectures/operating systems (the AIX IBM domain) with high bandwidth (10 Gbit/s). Sites shown: SARA (NL), IDRIS (FR), ECMWF (UK), RZG (DE), LRZ (DE), CSC (FI), BSC (ES, Linux PowerPC), CINECA (IT), FZJ (DE), plus Linux SGI systems.]
Enabling science
• The DEISA Extreme Computing Initiative: identification, deployment and operation of a number of "flagship" applications in selected areas of science and technology.
• Applications are selected on the basis of scientific excellence, innovation potential and relevance criteria (the application must require the extended infrastructure services).
• European call for proposals: May-June every year (the first took place in 2005).
• Evaluation: June to September.
• 56 Extreme Computing proposals were received in 2005 and 40 in 2006.
• 29 projects were retained for operation in 2005-2006; for the 2006 call, 23 projects have been retained. Full information on the DEISA web server (www.deisa.org).
Extreme Computing proposals
• Bioinformatics: 4
• Biophysics: 3
• Astrophysics: 11
• Fluid Dynamics: 6
• Materials Sciences: 11
• Cosmology: 3
• Climate, Environment: 5
• Quantum Chemistry: 5
• Plasma Physics: 2
• QCD, Quantum computing: 3
Profiles of applications in operation in 2005-2006:
• Huge parallel applications running on single remote nodes (dominant)
• Data-intensive applications of different kinds
• Workflows (about 10%)
AA and User Administration
• Users authenticate with login/password at their home organization or through UNICORE.
• For GPFS and LL-MC (LoadLeveler Multi-Cluster), authorization is based on POSIX uids and gids.
• Uids/gids for DEISA users therefore have to be synchronized across all sites.
• Each site has its own local administration, e.g. LDAP, NIS, or passwd replication; it was not feasible to couple these systems directly.
• A separate DEISA administration system has therefore been built, based on LDAP.
[Diagram: DEISA LDAP-based administration connecting SARA, HLRS, IDRIS, LRZ, RZG, BSC, CINECA, CSC, ECMWF, EPCC and FZJ.]
User Administration (1)
• Each partner is responsible for the registration of the users affiliated to that partner (the users' home organization).
• The partners update their local user administration with the data from the other sites on a daily basis. This is based on trust between the partners! (A minimal sketch of this synchronization step follows below.)
[Diagram: a DEISA user is added to the LDAP server at site A; an administrator at site B creates a local account based on an LDAP query of site A's server, giving the user access to the HPC system at site B.]
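A minimal sketch of such a synchronization step, assuming the ldap3 Python library, invented host and base-DN names, and standard posixAccount attributes (the actual DEISA schema and tooling are not shown here):

    # sync_deisa_users.py - illustrative only; the host name, base DN and
    # the use of plain `useradd` are assumptions, not the DEISA tooling.
    import subprocess
    from ldap3 import Server, Connection, ALL

    SITE_A_LDAP = "ldap.site-a.example.org"   # hypothetical remote DEISA LDAP server
    BASE_DN = "ou=People,o=deisa"             # hypothetical base DN

    server = Server(SITE_A_LDAP, use_ssl=True, get_info=ALL)
    conn = Connection(server, auto_bind=True)  # anonymous bind for the example

    # Fetch all DEISA user entries registered at site A.
    conn.search(BASE_DN, "(objectClass=posixAccount)",
                attributes=["uid", "uidNumber", "gidNumber", "cn",
                            "homeDirectory", "loginShell"])

    for entry in conn.entries:
        # Create the matching local account so that uid/gid numbers match
        # at every site, as GPFS and LL-MC authorization requires.
        subprocess.run(["useradd",
                        "-u", str(entry.uidNumber),
                        "-g", str(entry.gidNumber),
                        "-c", str(entry.cn),
                        "-d", str(entry.homeDirectory),
                        "-s", str(entry.loginShell),
                        str(entry.uid)],
                       check=False)  # real tooling would handle existing users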
User Administration (2)
• Around 20 attributes are used for the registration of users, using existing object classes and a DEISA-defined schema. (An illustrative entry is sketched below.)
• The information in LDAP is not only used for the creation and maintenance of user accounts; it also contains additional information, e.g. phone number, email address, science field, nationality, status, project.
• The additional information is needed to comply with the partners' requirements; nationality, for example, because of export regulations for some of the systems in use.
• To avoid overlap between DEISA uid numbers and local numbers, each site uses reserved ranges.
• Policies for the administrators have been formulated, e.g. for when a user is to be deactivated.
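The actual DEISA schema is not reproduced here; a plausible LDIF entry combining standard object classes with hypothetical DEISA-defined attributes (deisaUser, deisaProject, deisaNationality and deisaStatus are invented names) might look like:

    dn: uid=jdoe,ou=People,o=deisa
    objectClass: inetOrgPerson
    objectClass: posixAccount
    # "deisaUser" is a hypothetical DEISA-defined object class
    objectClass: deisaUser
    uid: jdoe
    cn: Jane Doe
    mail: j.doe@example.org
    telephoneNumber: +31 20 000 0000
    # uidNumber taken from this site's reserved DEISA range
    uidNumber: 510042
    gidNumber: 51000
    homeDirectory: /deisa/home/jdoe
    loginShell: /bin/bash
    # hypothetical DEISA attributes for project, nationality and status
    deisaProject: astro-flagship-01
    deisaNationality: NL
    deisaStatus: active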
X.509 certificates
• UNICORE authentication and authorization are based on X.509 certificates.
• Authorization is based on mapping the certificate's subject name to uids in the UUDB (comparable to the gridmapfile; see the comparison below).
• The UUDB is maintained at each site, so sites can decide whether a user gets access through UNICORE, e.g. based on the project the user is working on. Subject names are distributed using the LDAP system.
• A subject name can be mapped to more than one uid; the user can specify in UNICORE which uid to use.
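For comparison, a classic Globus grid-mapfile expresses the same idea as the UUDB mapping: each line maps a certificate subject DN to one or more local accounts (a comma-separated list, first entry being the default). The DNs and account names below are made up:

    "/C=NL/O=example/OU=SARA/CN=Jane Doe" jdoe
    "/C=DE/O=example/OU=FZJ/CN=John Smith" jsmith,jsmith_proj2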
UNICORE AA
[Diagram: typical UNICORE AA setup. The user's client signs the AJO with the user certificate; the Gateway authenticates the connection; the NJS uses the UUDB (and the IDB) to map the user certificate to a user login on the target system, where the TSI executes the job. In the example, certificates 1-5 are mapped to logins A-E.]
DEISA UNICORE configuration
[Diagram: each site (CSC, SARA, BSC, LRZ, FZJ, CNE, RZG, IDR) runs a DEISA gateway in its DMZ, with the site's NJS on the intranet behind it; users from all sites can connect to every gateway.]
Federation issues
• Internally:
  • X.509-based AA alone is not enough for the sites; access to additional user attributes is needed, e.g. uid, nationality.
  • Discussion on the deployment of portal software: sites don't accept access to their systems based on a shared account.
  • The concept of a VO is currently not deployed; users are managed at the individual or project level.
  • How to make this more dynamic? User attributes are replicated to the local systems, which is error prone.
• Interoperability with other (grid) infrastructures:
  • Public-key authentication based on X.509 certificates issued by IGTF-accredited CAs will work with any other relying party.
  • Authorization will be difficult; deploying VOMS may help here (see the sketch below), but support from UNICORE is needed internally.
  • Work towards a common attribute schema?!
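If VOMS were deployed, as suggested above, a user would carry VO membership and role attributes inside a proxy certificate instead of relying on replicated per-site attributes. The VO and group names below are invented for illustration:

    # Obtain a proxy carrying (hypothetical) DEISA VO attributes...
    voms-proxy-init -voms deisa:/deisa/astro-flagship-01/Role=user

    # ...and inspect the attributes embedded in the proxy.
    voms-proxy-info -all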