310 likes | 431 Views
What do we mean by the Grid and e-research? An overview of some key aspects and technologies in 30 minutes. Jennifer M. Schopf UK National eScience Centre Argonne National Lab. Talk Outline. Definition of Grid and eResearch Globus Toolkit Provider of basic infrastructure
E N D
What do we mean by the Grid and e-research?An overview of somekey aspects and technologiesin 30 minutes Jennifer M. Schopf UK National eScience Centre Argonne National Lab
Talk Outline • Definition of Grid and eResearch • Globus Toolkit • Provider of basic infrastructure • Focus on data tools • OMII – Open Middleware Infrastructure • UK repository and distribution of eResearch tools
What is a Grid? • Many definitions – many differences especially between academics and industry • Both use the buzzword to get funding • My definition • Resource sharing • Coordinated problem solving • Dynamic, multi-institutional virtual orgs
Resource Sharing • Resources can be anything- • Computers • Storage/repositories • Sensors and Networks • People and software • Local Control of the resources, and local policies for their use • Sharing is always conditional • Issues of trust, policy • Negotiation and payment
Coordinated Problem Solving • Beyond client-server • Client Server defines a small set of well-understood interactions as the only ones that can take place • Actions in this space can include • Distributed data analysis • Computation and visualization of results • Collaboration
Dynamic, Multi-institutionalVirtual Organizations • Crossing administrative domains • No one has full control over the resources • Local policy not global • Different local policy on different sites • Community overlays on classic organizational structures • Large or small, static or dynamic
What is eScience or eResearch? • Use of distributed resources, in a coordinated way, across multiple administrative domains to do science or further your research • “Classic” eScience • Use compute and data resources at many sites to run large scale simulations for a physics or biology application • Today’s Use Cases • Replicate data across multiple sites to increase reliability, redundancy and performance • Use one common interface to access a variety of data resources at multiple sites • Look at a number of available resources to select the one that best suits the application needs at this time
Why is this hard/different? • Lack of central control • Where things run • When they run • Shared resources • Contention, variability • Off-label use • Resources or software developed for one purpose (or community) is now being used in a way that wasn’t originally planned for • Communication • Different sites implies different sys admins, users, institutional goals, and often “strong personalities”
So why do it? • Work that needs to be done with a time limit • Data that can’t fit on one site • Data owned by multiple sites • Applications that need to be run bigger, faster, more
What functionality isneeded to use a Grid? • Basics: • Run a job • Transfer a file • Find out what’s going on (service and job monitoring • All done securely • Higher-level • Replication • Higher level data movement • Workflow-scheduling
Grid2003: An Operational Grid • 28 sites (2100-2800 CPUs) & growing • 10 substantial applications + CS experiments • Running since October 2003, still up today Korea http://www.ivdgl.org/grid2003
Globus ToolkitWas Created To Help Applications • The Globus Toolkit consists of collections of solutions to problems that frequently come up when trying to build collaborative distributed applications • Heterogeneity • Focus on simplifying heterogeneity for application developers • Working towards more “vertical solutions” • Standards • Capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF) • Reference implementations of new/proposed standards in these organizations
Globus is Service-Oriented Infrastructure Technology • Software for service-oriented infrastructure • Service enable new & existing resources • E.g., GRAM on computer, GridFTP on storage system, custom application service • Uniform abstractions & mechanisms • Tools to build applications that exploit service-oriented infrastructure • Registries, security, data management, … • Open source & open standards • Each empowers the other • eg – monitoring across different protocols is hard • Enabler of a rich tool & service ecosystem
Our Goals for Globus Toolkit v4 • Usability, reliability, scalability, … • Web service components have quality equal or superior to pre-WS components • Documentation at acceptable quality level • Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform • WS-I Basic (Security) Profile compliant • New components, platforms, languages • And links to larger Globus ecosystem
4 2 1 3 A Model Architecture for Data Grids Attribute Specification Replica Loc. Svc Metadata Catalog Application Multiple Locations Logical Collection and Logical File Name MDS Selected Replica Replica Selection Performance Information & Predictions NWS GridFTP Control Channel Disk Cache GridFTPDataChannel TapeLibrary Disk Array Disk Cache Replica Location 1 Replica Location 2 Replica Location 3
Find your data: Replica Location Service Managing ~40M files in production settings Move/access your data: GridFTP, Reliable File Transfer (RFT) High-performance striped data movement Couple data & execution management GRAM uses GridFTP and RFT for staging Access databases through standard Grid interfaces: OGSA-DAI GT4 Data Functions
GridFTP in GT4 • Basic file transfer support, and memory-to-memory copies • Underlying protocol of access can vary • Functions as a hourglass offering one interface to different resources • Allows partial file transfer support • Can have parallel streams and stripping • Greatly improve performance over most FTP implementations • On TeraGrid network achieved 27 Gbs on a 30 Gbs link (90% utilization) with 32 nodes
IPCReceiver DataChannel DataChannel MasterDSI SlaveDSI Protocol Interpreter SlaveDSI Protocol Interpreter Data Channel MasterDSI IPCReceiver Data Channel IPC Link IPC Link Reliable File Transfer:Third Party Transfer • Fire-and-forget transfer • Web services interface • Many files & directories • Integrated failure recovery RFT Client SOAP Messages Notifications(Optional) RFT Service GridFTP Server GridFTP Server
OGSA-DAI • Data access • Relational & XML Databases, semi-structured files • Data integration • Multiple data delivery mechanisms, data translation • Extensible & Efficient framework • Request documents contain multiple tasks • A task = execution of an activity • Group work to enable efficient operation • Extensible set of activities • > 30 predefined, framework for writing your own • Moves computation to data • Pipelined and streaming evaluation • Concurrent task evaluation
The ResourceManagement Challenge • Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation • Authentication and authorization • Resource discovery & characterization • Reservation and allocation • Computation monitoring and control • Addressed by a set of protocols & services • GRAM protocol as a basic building block • Resource brokering & co-allocation services • GSI for security, MDS for discovery
Execution Management (GRAM) • Common WS interface to schedulers • Unix, Condor, LSF, PBS, SGE, … • More generally: interface for process execution management • Lay down execution environment • Stage data • Monitor & manage lifecycle • Kill it, clean up • A basis for application-driven provisioning
Monitoring and Discovery Challenges • Grid Information Service • Requirements and characteristics • Uniform, flexible access to information • Scalable, efficient access to dynamic data • Access to multiple information sources • Decentralized maintenance • Secure information provision • Basic monitoring for resource selection and notification of errors
The Globus Ecosystem • Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc. • GT4 being the latest version • A larger Globus ecosystem of open source and proprietary components provide complementary components • A growing list of components • These components can be combined to produce solutions to Grid problems • We’re building a list of such solutions
Condor-G, DAGman MPICH-G2 GRMS Nimrod-G Ninf-G Open Grid Computing Env. Commodity Grid Toolkit GriPhyN Virtual Data System Virtual Data Toolkit GridXpert Synergy Platform Globus Toolkit VOMS PERMIS GT4IDE Sun Grid Engine PBS scheduler LSF scheduler GridBus TeraGrid CTSS NEES IBM Grid Toolbox … Many Tools Build on, or Can Contribute to, GT4-Based Grids
Open MiddlewareInfrastructure Institute To be a leading provider of reliable interoperable and open-source Grid middleware components services and tools to support advanced Grid enabled solutions in academia and industry. • Formed University of Southampton (2004) • Focus on an easy to install e-Infrastructure solution • Utilise existing software & standards • Expanding with new partners in 2006 • OGSA-DAI team at Edinburgh • myGrid team at Manchester Slides compliments of Steven Newhouse
Activity • By providing a software repository of Grid components and tools from e-science projects • By re-engineering software, hardening it and providing support for components sourced from the community • By a managed programme to contract the development of “missing” software components necessary in grid middleware • By providing an integrated grid middleware release of the sourced software components Slides compliments of Steven Newhouse
The Managed Programme: Distribution and Repository • OGSA-DAI (Data Access service) • GridSAM (Job Submission & Monitoring service) • Grimoires (Registry service based on UDDI) • GeodiseLab (Matlab & Jython environments) • FINS (Notification services using WS-Eventing) • BPEL (Workflow service) • MANGO (Managing workflows with BPEL) • FIRMS (Reliable messaging) Slides compliments of Steven Newhouse
So… • eResearch is expanding in scope • Globus Toolkit provides many basic tools, and is incorporated in many projects, esp those focused on data movement • In the UK, OMII is another useful source of eInfrastructure software 2nd Edition www.mkp.com/grid2
Additional Information • Contact: • Jennifer M. Schopf • jms@mcs.anl.gov • http://www.mcs.anl.gov/~jms • Globus Alliance: • http://www.globus.org • Information about OMII: • http//www.omii.ac.uk • s.newhouse@omii.ac.uk