Technology and Social Impact of GRIDs. Federico Ruggieri – INFN Roma “La Sapienza”, 18 October 2006
Outline
• What is Grid?
• From High Energy Physics to the other sciences
• The European GRID Infrastructure (EGEE)
• Applications and scientific communities
• Geographical areas
• Social impact and the Digital Divide
• Lights and shadows
• Conclusions
What GRID is supposed to be
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” I. Foster & C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1998.
• A dependable infrastructure that eases the use of distributed resources by many groups of distributed people, or Virtual Organizations.
• The GRID paradigm extends the WEB paradigm, which was originally limited to distributed access to distributed information and documents.
• The classic example is the power grid: you plug in and receive power; you don’t know (and don’t care) where it comes from.
A Grid Checklist
Ian Foster more recently suggested that a GRID is a system that:
• 1) coordinates resources that are not subject to centralized control… (A Grid integrates and coordinates resources and users that live within different control domains—for example, the user’s desktop vs. central computing; different administrative units of the same company; or different companies; and addresses the issues of security, policy, payment, membership, and so forth that arise in these settings. Otherwise, we are dealing with a local management system.)
• 2) … using standard, open, general-purpose protocols and interfaces… (A Grid is built from multi-purpose protocols and interfaces that address such fundamental issues as authentication, authorization, resource discovery, and resource access. As I discuss further below, it is important that these protocols and interfaces be standard and open. Otherwise, we are dealing with an application-specific system.)
• 3) … to deliver nontrivial qualities of service. (A Grid allows its constituent resources to be used in a coordinated fashion to deliver various qualities of service, relating for example to response time, throughput, availability, and security, and/or co-allocation of multiple resource types to meet complex user demands, so that the utility of the combined system is significantly greater than that of the sum of its parts.)
A bit of (my) short history in Grids
• In the ’80s and early ’90s the accent was on client-server and meta-computing.
• 1998: I. Foster & C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, and the Globus project (www.globus.org).
• First GRID presentation at CHEP’98 in Chicago.
• 1999–2000: the INFN-GRID project started, based on Globus; GridPP in the UK; CHEP2000 in Padova.
• 2000–2003: the first EU project, DataGrid; PPDG & GriPhyN in the US.
• 2003–2006: the EGEE project in the EU and OSG in the US.
• Many other projects in many countries (Japan, China, etc.).
A GRID for LHC and HEP
• We got involved in Grids to solve the huge LHC computational problem, which at that time was starting to be investigated (after an initial under-estimation).
• In the late ’90s client-server and meta-computing were the frontier, and computer farms were just getting started (Beowulf).
• The largest problem, however, was the huge amount of data expected to be produced and analyzed (PetaBytes).
• The “social” challenge was to allow thousands of physicists to access those data easily from tens of countries on different continents.
LHC Computational Problem: several PetaBytes (10^15 bytes) of data every year
[Figure: estimate of computing needs at CERN for LHC (2000) – the LHC demand, of order ~10K SI95 processors, compared with the non-LHC technology-price curve (40% annual price improvement)]
Extension of the Web Paradigm
• Web: uniform access to information and documents (http://…)
• Grid: flexible and high-performance access to any kind of resources: computers, data stores, software catalogs, sensor nets, colleagues
• On-demand creation of powerful virtual computing and data systems
[Figure: A Power GRID for Computing – a central computing resource (CR, Tier1) surrounded by many Tier2 sites]
[Figure: DataGRID layered structure (2000)]
• Applications: High Energy Physics, Biology, Chemistry, Cosmology, Environment; application toolkits for distributed computing, data-intensive applications, remote visualization, remote problem solving, remote instrumentation, collaborative applications
• Grid Services (Middleware): resource-independent and application-independent services – authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
• Grid Fabric (Resources): resource-specific implementations of basic services – e.g. transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
[Figure: Natural HEP Farming / High Throughput Computing – a dispatcher distributes independent events (Event #1 … #6) across CPUs (CPU 1 … CPU 4)]
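Since events are mutually independent, the dispatcher pattern needs no communication between workers. Purely as an illustrative sketch (the per-event program ./reconstruct and the event_*.dat file names are hypothetical, not part of any grid toolkit), the same idea can be mimicked on one machine with GNU xargs, which keeps four worker processes busy with one event file each:
% ls event_*.dat | xargs -n 1 -P 4 ./reconstruct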
[Figure: a GRID Computing Farm – a Computing Element (CE) and a Storage Element (SE), connected to the Wide Area Network, front a set of Worker Nodes (WN)]
Basic GRID strategy
• Use as much as possible what already exists:
• Network: Internet and TCP/IP
• Protocols: http, TCP, UDP, …
• Operating systems: Linux, Solaris, …
• Batch systems: PBS, LSF, Condor, …
• Storage: disks, HPSS, HSM, CASTOR, …
• Directory services: LDAP, …
• Certificates: X.509
• Create a software layer (middleware) to interface these services.
Middleware structure
• Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware.
• Higher-Level Grid Services (Workload Management, Replica Management, Visualization, Workflow, Grid Economies, …) are meant to help users build their computing infrastructure, but should not be mandatory.
• Foundation Grid Middleware (security model and infrastructure, Computing Elements (CE) and Storage Elements (SE), Accounting, Information and Monitoring) will be deployed on the EGEE infrastructure. It must be complete and robust, should allow interoperation with other major grid infrastructures, and should not assume the use of Higher-Level Grid Services.
Overview paper: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf
Some history
• 1999 – MONARC project: early discussions on how to organise distributed computing for LHC.
• 2000 – growing interest in grid technology; the HEP community was the driver in launching the DataGrid project.
• 2001–2004 – EU DataGrid project: middleware & testbed for an operational grid.
• 2002–2005 – LHC Computing Grid (LCG): deploying the results of DataGrid to provide a production facility for the LHC experiments.
• 2004–2006 – EU EGEE project phase 1: starts from the LCG grid; shared production infrastructure; expanding to other communities and sciences.
• 2006–2008 – EU EGEE-II: building on phase 1; expanding applications and communities …
• … and in the future: a worldwide grid infrastructure? Interoperating and co-operating infrastructures?
The release of gLite 3.0
[Figure: timeline – LCG-2 and gLite prototyping through 2004–2005, converging into the gLite 3.0 product in 2006]
• Convergence of LCG 2.7.0 and gLite 1.5.0 in spring 2006.
• Continuity on the production infrastructure ensured usability by applications.
• Initial focus on the new Job Management, with thorough testing and optimization together with the applications.
• Migration to the ETICS build system (the ETICS project started in January).
• Reorganization of the work according to the new process: EGEE Technical Coordination Group and Task Forces; start of the EGEE SA3 activity for integration and certification.
• “Continuous release process” – no big-bang releases!
Production service
• Size of the infrastructure today:
• 192 sites in 40 countries
• ~25,000 CPUs
• ~3 PB disk, plus tape MSS
(Some) GRID Services
• The Workload Management System (Resource Broker) chooses the best resources matching the user's requirements (see the sketch after this list).
• The Virtual Organization Management System maps user certificates to VOs, describing the rights and roles of the users.
• Data-oriented services: data & metadata catalogs, data mover, replica manager, etc.
• Information & Monitoring services, which report which resources and services are available, and where.
• Accounting services, to extract resource-usage levels per user, group of users, or VO.
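To make the matchmaking concrete, here is a minimal, hypothetical JDL fragment (written to a file from the shell) showing how Requirements and Rank expressions steer the Resource Broker using the GLUE attributes that sites publish; the attribute names follow the GLUE 1.x schema:
% cat > matchmaking-example.jdl <<'EOF'
Executable = "/bin/hostname";
Requirements = other.GlueCEStateFreeCPUs > 0;
Rank = -other.GlueCEStateEstimatedResponseTime;
EOF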
[Figure: Job Submission with Brokering – the user prepares a JDL file and an input sandbox on the User Interface (UI); after authentication and authorization, the Resource Broker matches the job against the Information Service and the Replica Catalogue (brokerinfo); the Job Submission Service sends the job to a Compute Element, near a Storage Element when possible; Logging & Book-keeping tracks the job status, which the user can query, and the output sandbox is returned to the UI]
Security & VOMS
• For user authentication, GRID uses X.509 digital certificates with the Globus/GSI service, plus a system that maps them to the local UIDs and GIDs of the systems.
• GRID aims to ease the use of distributed resources by virtual communities tied to a type of activity and/or to experiments/projects.
• A system is therefore needed that not only guarantees AAA (Authentication, Authorization, Accounting) in an essentially uniform way, but can also manage complex access policies.
• The Virtual Organization Management System addresses these needs, easing the management of virtual communities and the definition of user roles and groups, which are then used to manage resource-access policies.
• VOMS also lets resource owners define which communities (Virtual Organizations) may access their resources, and with which limits.
Public Key Infrastructure
• Based on asymmetric algorithms.
• Two keys: a private key and a public key.
• It is “almost impossible” to derive the private key from the public one.
• Data encrypted with one key can only be decrypted with the other (sketch below).
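The asymmetric property is easy to try out with the OpenSSL command line. This is a generic illustration of PKI, not the GRID tooling itself, and the file names are arbitrary:
% openssl genrsa -out private.pem 2048
% openssl rsa -in private.pem -pubout -out public.pem
% openssl rsautl -encrypt -pubin -inkey public.pem -in message.txt -out message.enc
% openssl rsautl -decrypt -inkey private.pem -in message.enc -out message.dec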
GRID Security Infrastructure
• Based on a Public Key Infrastructure (PKI): asymmetric encryption.
• Certification of personal identity: one key <=> one user / physical person.
• X.509 certificates.
• Identification of computers and services with PKI certificates as well.
X.509 Certificate
• ITU-T standard for PKI
• X.509 == IETF PKI certificate + CRL profile of the X.509v3 standard
Certificate fields:
▪ Version
▪ Serial Number
▪ Algorithm ID
▪ Issuer
▪ Validity
▪ Subject
▪ Subject Public Key Info (Public Key Algorithm, Subject Public Key)
▪ Issuer Unique Identifier (optional)
▪ Subject Unique Identifier (optional)
▪ Extensions (optional)
▪ Certificate Signature Algorithm
▪ Certificate Signature
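The main fields of a certificate can be listed with OpenSSL; here usercert.pem stands for any PEM-encoded certificate, for example the user certificate conventionally kept under ~/.globus:
% openssl x509 -in usercert.pem -noout -subject -issuer -serial -dates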
CA & Proxy
• National/Regional Certification Authorities issue certificates.
• Short-lived proxy certificates are used so that the real certificate never has to be archived together with the applications (delegation).
% grid-proxy-init
Your identity: /C=IT/O=INFN/OU=Personal/L=
Enter GRID pass phrase for this identity:
Creating proxy ............................................. Done
Your proxy is valid until: Thu Aug 31 21:56:18 2006
VOMS architecture
[Figure: the user client authenticates via GSI to one or more VOMS servers (vomsd); each server returns signed attribute information that is embedded as an extension of the proxy certificate (holder, issuer, validity; e.g. C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy); an admin client manages the authorization database (AuthDB) through vadmind over SOAP]
• Attributes:
• Group (hierarchically organized)
• Role (admin, staff, student, …)
• Capability (free-form string)
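In practice the VOMS attributes are obtained by creating the proxy with voms-proxy-init instead of grid-proxy-init; a sketch assuming membership of a VO named atlas (the VO name and role are illustrative):
% voms-proxy-init --voms atlas:/atlas/Role=production
% voms-proxy-info --all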
IS Components
• Abbreviations: BDII – Berkeley Database Information Index; GIIS – Grid Index Information Server; GRIS – Grid Resource Information Server.
• Local GRISes run on the CEs and SEs at each site and report dynamic and static information.
• At each site, a site GIIS collects the information given by the GRISes.
• Each site can run a BDII, which collects the information coming from the GIISes.
• From LCG 2.3.0 the site GIIS has been replaced by a “local” BDII.
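Since a BDII is an ordinary LDAP server (conventionally listening on port 2170 with base o=grid), it can be queried with standard tools. A sketch with a placeholder host name, asking each Computing Element for the number of free CPUs it advertises in the GLUE schema:
% ldapsearch -x -H ldap://bdii.example.org:2170 -b o=grid '(objectClass=GlueCE)' GlueCEUniqueID GlueCEStateFreeCPUs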
WMS & JDL
• The Workload Management System (WMS) is the gLite 3.0 component that allows users to submit jobs and performs all tasks required to execute them, without exposing the user to the complexity of the Grid.
• The JDL attributes described here are the ones supported when submission to the WMS is performed through the legacy Network Server interface, i.e. using the Python command-line interface or the C++/Java APIs of the gLite WMS-UI subsystem. They are basically a subset of the attributes supported by the WMS when accessed via the new web-services-based interface (WMProxy).
JDL example
• Type = "Job";
• JobType = "Normal";
• Executable = "ls";
• StdError = "stderr.log";
• StdOutput = "stdout.log";
• Arguments = "-lrt";
• InputSandbox = "start_hostname.sh";
• OutputSandbox = {"stderr.log", "stdout.log"};
• RetryCount = 7;
• VirtualOrganisation = "ATLAS";
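A JDL file like the one above is typically submitted and followed with the gLite command-line tools; a minimal sketch of the workflow (myjob.jdl is a placeholder, and exact options depend on the UI installation):
% glite-wms-job-submit -a -o jobid.txt myjob.jdl
% glite-wms-job-status -i jobid.txt
% glite-wms-job-output -i jobid.txt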
Job Types
• Normal – a simple batch job
• Interactive – a job whose standard streams are forwarded to the submitting client
• MPICH – a parallel application using the MPICH-P4 implementation of MPI (see the sketch below)
• Partitionable – a set of independent sub-jobs, each one taking care of a step or of a subset of steps, which can be executed in parallel
• Checkpointable – a job able to save its state, so that execution can be suspended and resumed later, starting from the point where it stopped
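For the non-Normal types the JDL gains a few extra attributes; for instance an MPICH job also declares how many nodes it needs. A hedged sketch, with attribute names as in the gLite JDL and a hypothetical executable:
% cat > mpi-example.jdl <<'EOF'
Type = "Job";
JobType = "MPICH";
NodeNumber = 8;
Executable = "my_mpi_app";
EOF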
User Interface (UI), Computing Element (CE), Worker Node (WN)
• User Interface (UI): the user's access point to the Grid – proxy credentials, job submission, job monitoring, commands for information about sites.
• Computing Element (CE): the entry point of a queue in a batch system; the CE acts like a master.
• Worker Node (WN): the real execution node; the WN acts like a slave.
Applications
• Many applications from a growing number of domains:
• Astrophysics
• Computational Chemistry
• Earth Sciences
• Financial Simulation
• Fusion
• Geophysics
• High Energy Physics
• Life Sciences
• Multimedia
• Material Sciences
• …
• Applications have moved from testing to routine, daily usage, with ~80–90% efficiency.
[Figure 5 – ARGO data transfer schema: each site runs a User Interface, SRM (DPM) storage, an LFC catalogue and a local DB, with FTS handling the transfers; the YBJ site runs a User Interface, SRM (DPM) and a local DB; ARGO data archives and data-catalog synchronization]
Biology applications: the “never born” proteins
• Natural proteins are only a tiny fraction of the possible ones: approximately 10^13 natural proteins vs. 20^100 (about 10^130) possible proteins with a chain length of 100 amino acids!
• Does the subset of natural proteins have particular properties?
• Do protein scaffolds with novel structure and/or activity, not yet exploited by Nature, exist in principle?
• GRID technology allows the problem to be tackled through high-throughput prediction of the protein structure of a large library of “never born proteins”.
Where grids can help in addressing neglected diseases
• Contribute to the development and deployment of new drugs and vaccines:
• Improve the collection of epidemiological data for research (modeling, molecular biology)
• Improve the deployment of clinical trials in affected areas
• Speed up the drug discovery process (in silico virtual screening)
• Improve disease monitoring:
• Monitor the impact of policies and programs
• Monitor drug delivery and vector control
• Improve epidemic warning and monitoring systems
• Improve the ability of developing countries to undertake health innovation:
• Strengthen the integration of life-science research laboratories in the world community
• Provide access to resources
• Provide access to bioinformatics services
In silico Drug Discovery
• Scientific objectives: provide docking information to help in the search for new drugs. Biological goal: propose new inhibitors (drug candidates) for neglected diseases. Bioinformatics goal: in silico virtual screening of drug-candidate databases. Grid goal: demonstrate to the research communities active in the area of drug discovery the relevance of grid infrastructures, through the deployment of a compute-intensive application.
• Method: large-scale molecular docking on malaria, to evaluate millions of potential drugs with different software and parameter settings. Docking means computing the binding energy of a protein target against a library of potential drugs, using a scoring algorithm.
[Figure: the virtual screening pipeline – grid service customers (biology and chemist/biologist teams) provide targets; the grid infrastructure offers docking, annotation and MD services, operated by grid service providers (chemoinformatics and bioinformatics teams); selected hits pass through check points back to the biology teams; data challenges DC1 & DC2 on neglected diseases, and a DC on avian flu]
Collaborating e-infrastructures Potential for linking ~80 countries by 2008
Social Impact
• A large part of the globe does not yet have advanced digital infrastructures.
• The European Research Area programme aims to make Europe the most advanced region in eInfrastructures and to help bring less advanced countries up to speed, alleviating as much as possible the so-called Digital Divide.
• eInfrastructures support widely geographically distributed communities that share problems and resources to work towards common goals -> enhanced international collaboration of scientists -> collaboration promoted in other fields as well.
• Problems too big to be handled with conventional local computer clusters and time-sharing computing centres can be attacked with GRIDs.
• eInfrastructures leverage international network interconnectivity -> high-bandwidth connections will improve the exchange of knowledge and be the basis for GRID infrastructures.
• Based on a safe AAA (Authentication, Authorization and Accounting) architecture -> secure and dependable infrastructures.
• Need for persistent software & middleware -> software is an integral part of the infrastructure.
Digital Divide http://maps.maplecroft.com/
Docking on Malaria
• Of large interest to many developing countries.
• Based on a Grid-enabled drug discovery process.
• A data challenge never before attempted on a large-scale production infrastructure, and for a neglected disease.
• 5 different structures of the most promising target.
• Output data: 16.5 million results, ~10 TB.
H5N1: first initiative on in silico drug discovery against emerging diseases
• Spring 2006: drug design against the H5N1 neuraminidase involved in virus propagation
• Impact of selected point mutations on the efficiency of existing drugs
• Identification of new potential drugs acting on mutated N1
• Partners: LPC, Fraunhofer SCAI, Academia Sinica of Taiwan, ITB, Unimo University, CMBA, CERN-ARDA, HealthGrid
• Grid infrastructures: EGEE, AuverGrid, TWGrid
• European projects: EGEE-II, Embrace, BioinfoGrid, Share, Simdat
INCO: International Scientific Cooperation Projects
SUSTAINABLE WATER MANAGEMENT IN MEDITERRANEAN COASTAL AQUIFERS: Recharge Assessment and Modelling Issues (SWIMED), ICA3-CT2002-10004, http://www.crs4.it/EIS/SWIMED/menu/index.html
• Partnership:
• UGR, Spain (Coordinator)
• IMFT, France
• UNINE, Switzerland
• CRS4, Italy
• EMI, Morocco
• UG, Palestinian Authority
• INAT, Tunisia
• UB, Spain
[Figure: hydrodynamic parameters – aquifer transmissivities and elevation of the bedrock; note the remarkable lack of transmissivity data!]
ArchaeoGRID (1/2): WHERE
• Archaeological data are geospatial data -> an archaeological GIS.
• How many layers? The number of layers is large, depending on the complexity of the archaeological problem: archaeological sites, infrastructures, land use, aerial photos or satellite images.
• Best resolution: X,Y < 1 m; Z < 0.1 m (SAR); DEM best resolution by differential GPS: X,Y,Z ≈ a few cm.
ArchaeoGRID (2/2): WHEN
• Archaeological data must also be ordered by time -> archaeological GIS + time.
• Time scale from 1000 B.C. through 500 B.C. to 500 A.D.; ~20 years is the most precise resolution in time.