Gabriel Mardale - The Evolution of the Grid - The Anatomy of the Grid

The Evolution of the Grid • In the last decade a large number of research groups have implemented libraries and tools that allow the cooperative use of geographically distributed resources unified to act as a single powerful platform. • This approach has been known by several names: • Metacomputing; • Scalable computing; • Global computing; • Internet computing; • And lately, Grid Computing.

The Evolution of the Grid • In the evolution of the grid three generations have been identified: • First Generation: the forerunners of the grid as we recognize it today. • Second Generation: with a focus on middleware to support large scale data and computation. • Third Generation: where the emphasis shifts to distributed global collaboration, a service oriented approach; the third generation grid systems are now under development.

The Evolution of the Grid: The First Generation • The objective of the early metacomputing projects was to provide computational resources to a range of high performance applications. • Two representative projects for this type of technology were FAFNER and I-WAY. FAFNER (Factoring via Network-Enabled Recursion) was set up to factor RSA-130, a 130-digit challenge number from the RSA Factoring Challenge, by distributing the work so that even workstations with small amounts of memory could contribute. By contrast, I-WAY (Information Wide Area Year) was an experimental high performance ATM network linking many high performance computers. This network provided a wide area backbone for various experimental activities, supporting both TCP/IP over ATM and direct ATM-oriented protocols.

The Evolution of the Grid: The Second Generation • The Grid is seen as a global scale, distributed infrastructure, supporting diverse applications requiring large-scale computation and data. • Interoperability has a key role in achieving large scale computations. In order to have interoperability, three main issues had to be confronted: • heterogeneity – a Grid involves multiple resources that are heterogeneous in nature; • scalability – a Grid might grow from a few resources to a few million resources; • adaptability – the maximum performance is needed from the available resources.

The Evolution of the Grid: The Second Generation • Two of the most important second generation core technologies are the Globus and Legion projects. • Globus provides a software infrastructure that enables applications to handle distributed heterogeneous computing resources as a single virtual machine. The computational grid in this case is a hardware and software infrastructure that provides access to high end computational capabilities. • Legion is an object-based “metasystem” that provided users with a single integrated infrastructure, regardless of scale, physical location, language and underlying operating system. • Legion differed from Globus in its approach to a Grid environment: it encapsulated all of its components as objects.

The Evolution of the Grid: The Third Generation • In the Third Generation, the focus shifts towards a higher degree of automation: humans delegate the handling of large-scale computation and data to processes, which leads to autonomy within systems. • An autonomic system has the following eight properties: • It needs detailed knowledge of its components and status • It must configure and reconfigure itself dynamically • It seeks to optimize its behavior to achieve its goal • It is able to recover from malfunction • It protects itself against attack • It is aware of its environment • It implements open standards • It makes optimized use of resources • Terms like “distributed collaboration” and “virtual organizations” are used to describe the Grid concept, as we will see in the next set of slides.

The Anatomy of the Grid • The Grid concept is defined as: coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. • Resource sharing is highly controlled by a set of rules that binds resource providers and resource consumers into what is called a virtual organization (VO). • The sharing is not primarily file exchange but rather direct access to computers, data and other resources; the sharing relationships • VOs vary in their purpose, size, scope, duration and structure. • Because of the specific nature of the Grid problem, Grid technologies do not compete with current distributed computing technologies but rather complement them.
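The idea that sharing in a VO is governed by explicit rules can be illustrated with a minimal sketch. It assumes a rule is simply a (resource, members, actions) triple; the class names, member names and the "cluster-a" resource are hypothetical and do not come from any Grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharingRule:
    """One provider-defined rule: who may do what on which resource."""
    resource: str
    allowed_members: frozenset
    allowed_actions: frozenset


@dataclass
class VirtualOrganization:
    """A VO modeled as nothing more than a named set of sharing rules."""
    name: str
    rules: list = field(default_factory=list)

    def may(self, member, action, resource):
        """Return True only if some rule explicitly grants the access."""
        return any(
            rule.resource == resource
            and member in rule.allowed_members
            and action in rule.allowed_actions
            for rule in self.rules
        )


# Hypothetical VO: two members may submit jobs to one cluster.
vo = VirtualOrganization("climate-study")
vo.rules.append(
    SharingRule("cluster-a", frozenset({"alice", "bob"}), frozenset({"submit_job"}))
)

print(vo.may("alice", "submit_job", "cluster-a"))
print(vo.may("carol", "submit_job", "cluster-a"))
```

Access is denied unless a rule grants it, which mirrors the "highly controlled" sharing described on the slide.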

The Anatomy of the Grid The layered Grid architecture and its relationship to the Internet protocol architecture.

The Anatomy of the Grid – The Fabric Layer • The Grid Fabric layer provides the resources to which shared access is mediated by Grid protocols. • Examples of “resources”: computational resources, storage systems, catalogs, network resources and sensors; a resource may also be a logical entity such as a distributed file system, computer cluster or distributed computer pool • Fabric components implement the local, resource-specific operations that occur at higher levels as a result of sharing operations.

The Anatomy of the Grid – The Connectivity Layer • The Connectivity Layer defines core communication protocols and authentication protocols required for Grid-specific network transactions; they enable the exchange of data between Fabric layer resources. • Communication requirements include: transport, routing and naming. • Authentication protocols build on communication services to provide security mechanisms for verifying the identity of users and resources. • Grid security solutions should also provide flexible support for communication protection.

The Anatomy of the Grid – The Resource Layer • The Resource layer builds on the Connectivity layer to define protocols for the secure negotiation, initiation, monitoring, control, accounting and payment of sharing operations on individual resources. • Two classes of Resource layer protocols: • Information Protocols – used to obtain information about the structure and state of a resource • Management Protocols – used to negotiate access, monitor the status of an operation and control the execution of an operation.
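The two protocol classes can be pictured as two groups of methods on a single resource. The sketch below is an illustration under that assumption only: `GridResource`, `ToyComputeNode` and all method names are hypothetical, and no real Grid protocol is implemented.

```python
from abc import ABC, abstractmethod


class GridResource(ABC):
    """Hypothetical interface for one resource at the Resource layer."""

    # Information protocol: query structure and state.
    @abstractmethod
    def describe(self) -> dict: ...

    # Management protocol: start, monitor and control an operation.
    @abstractmethod
    def start(self, task: str) -> int: ...

    @abstractmethod
    def status(self, op_id: int) -> str: ...

    @abstractmethod
    def cancel(self, op_id: int) -> None: ...


class ToyComputeNode(GridResource):
    """In-memory stand-in for a compute resource; nothing is really executed."""

    def __init__(self, cpus):
        self.cpus = cpus
        self._ops = {}  # operation id -> state string

    def describe(self):
        running = sum(1 for state in self._ops.values() if state == "running")
        return {"cpus": self.cpus, "running": running}

    def start(self, task):
        op_id = len(self._ops) + 1
        self._ops[op_id] = "running"
        return op_id

    def status(self, op_id):
        return self._ops[op_id]

    def cancel(self, op_id):
        self._ops[op_id] = "cancelled"


node = ToyComputeNode(cpus=4)
op = node.start("simulate")
print(node.describe(), node.status(op))
```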

The Anatomy of the Grid – The Collective Layer • The Collective layer contains protocols and services that are not associated with one specific resource but rather are global in nature and capture interactions across collections of resources. • It implements a wide variety of sharing behaviors without placing new requirements on resources being shared: • Directory services: allow VO participants to discover resources. • Co-allocation, scheduling and brokering services: allow VO participants to request allocation of resources and scheduling. • Monitoring and diagnostics services: monitoring of VO for failure. • Data replication services: support the management of VO storage. • Software discovery services: discover and select the best software implementation for the problem being solved. • Community accounting and payment services: gather resource usage information for the purpose of accounting, payment and/or limiting resource usage.

The Anatomy of the Grid – The Application Layer • The Application layer comprises the user applications that operate within a VO environment • Applications are constructed in terms of, and by calling upon, services defined at any layer.

The Goals of GRID Architecture • Solve large-scale problems by bringing together globally distributed heterogeneous machines to increase computing power. • Allow the user to interact with a resource broker that hides the underlying complexities.

Generic View of GRID System

Responsibilities of the Resource Broker • Discover available resources • Negotiate with resources or their agents • Handle scheduling • Stage the application and data for processing • Gather results
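The five responsibilities can be read as one pipeline. The following is a toy sketch, assuming resources are plain dictionaries with a price and a number of free slots and that "negotiation" is only a price check; the function name, field names and the cheapest-first policy are all hypothetical.

```python
def run_broker(jobs, resources, budget_per_job):
    """Toy broker: discover, negotiate, schedule, stage and gather."""
    # Discover: keep resources that currently have free slots.
    available = [r for r in resources if r["free_slots"] > 0]

    # Negotiate: here simply reject resources priced above the budget.
    affordable = [r for r in available if r["price"] <= budget_per_job]
    if not affordable:
        raise RuntimeError("no resource accepts the offered price")

    # Schedule: cheapest resources first, one entry per free slot.
    affordable.sort(key=lambda r: r["price"])
    slots = [r for r in affordable for _ in range(r["free_slots"])]

    results = {}
    for i, job in enumerate(jobs):
        target = slots[i % len(slots)]
        # Stage and run: a real broker would ship code and data to the
        # remote site; this sketch just calls the function locally.
        output = job["run"](job["data"])
        # Gather: record where the job was placed and what it returned.
        results[job["name"]] = (target["name"], output)
    return results


resources = [
    {"name": "a", "free_slots": 1, "price": 5},
    {"name": "b", "free_slots": 0, "price": 1},
    {"name": "c", "free_slots": 2, "price": 2},
    {"name": "d", "free_slots": 1, "price": 50},
]
jobs = [{"name": f"j{n}", "data": n, "run": lambda x: x * x} for n in (1, 2, 3)]
print(run_broker(jobs, resources, budget_per_job=10))
```

The user only supplies jobs and a budget; which machine runs which job stays hidden behind the broker.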

Three Architectural Models

Hierarchical Model

Abstract Owner Model • Purchase available resources using automated agents to negotiate.

Abstract Owner Model

Economy Market Model

Implementations of GRID Computing Computational grids are being utilized today in a variety of large scale development areas including Operations Research, Network Simulation, Ecological Modeling and Business Process Simulation. Case studies: (Performance/Security) • High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid. • A Security Architecture for Computational Grids.

Case Study 1: High Performance Modeling With Nimrod and Nimrod/G Parametric modeling is becoming an important part of exploring the behavior of complex systems, and specialized parametric modeling systems have been developed: Nimrod (Gen 1) and Nimrod/G (Gen 2). Nimrod: Automates formulating, running, monitoring, and gathering results from multiple individual experiments. It also uses distributed scheduling to manage idle computers in a LAN. It suffers from a few limitations in the context of a “Global Grid”. Nimrod/G: Treats a large number of computers linked globally as a seamless supercomputer. Uses a dynamic set of resources, as opposed to the static set used by Nimrod. Can handle user deadline control as well as underlying individual performance specifications. Uses a more elaborate security mechanism.
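A parametric experiment of the kind described above amounts to running the same model once per combination of parameter values. The sketch below shows only that expansion step, assuming a full cross product; it is not Nimrod's own plan-file syntax, and the parameter names are hypothetical.

```python
from itertools import product


def expand_experiment(parameters):
    """Return one job description per combination of parameter values."""
    names = list(parameters)
    value_lists = (parameters[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical sweep: 3 voltages x 2 pressures = 6 independent jobs.
sweep = expand_experiment({"voltage": [100, 200, 300], "pressure": [1.0, 1.5]})
for job in sweep:
    print(job)
```

Because every job is independent, the resulting list can be handed to any scheduler, local or global.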

Nimrod/G Architecture • The Globus Grid toolkit is a collection of software components designed to support development of high-performance distributed applications. • Nimrod/G uses the Globus toolkit to set up and manage “services” to control the grid architecture. • One service organizes the mapping of individual computations to appropriate remote sites according to the scheduler. • The origin process operates as master. Scheduling and monitoring are encapsulated in this origin process. • With this, the user can stop and start remote clients without affecting the origin process. This increases performance.

Sample Monitor GUI

Cost in a Global Grid Cost: the concept that the price of a resource allocation can vary with the type of machine, time of day, resource demand, communication cost and class of user. For example: user 1 favors resource 1 because resource 1 is cheaper for that user than resource 4; likewise user 4 favors resource 4. Pricing of this kind favors local machines first. A good costing scheme therefore has to be devised.
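To make the factors above concrete, here is a minimal sketch of one possible pricing function. The formula, the peak-hour window, the user classes and every multiplier are assumptions chosen for illustration, not values from Nimrod/G.

```python
def price(base_rate, hour, demand, comm_cost, user_class):
    """Hypothetical price of one allocation on a resource.

    base_rate  -- depends on the type of machine
    hour       -- time of day, 0..23
    demand     -- current load as a fraction, 0.0..1.0
    comm_cost  -- cost of moving data to the resource
    user_class -- "local", "partner" or "external"
    """
    peak = 1.5 if 9 <= hour < 17 else 1.0  # assumed peak-hour surcharge
    discount = {"local": 0.5, "partner": 0.8, "external": 1.0}[user_class]
    return base_rate * peak * (1 + demand) * discount + comm_cost


# Same machine type, same hour and load: a local user with no transfer
# cost pays less than an external user who must also move data.
local_price = price(10, hour=3, demand=0.25, comm_cost=0, user_class="local")
remote_price = price(10, hour=3, demand=0.25, comm_cost=4, user_class="external")
print(local_price, remote_price)
```

With these assumed numbers the local user is quoted the lower price, which is the "local machines first" effect described on the slide.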

GUSTO Usage for Ionization Chamber Study

Case Study 2: Understanding Distributed Security Parallel computations that acquire multiple resources need to establish security relationships between potentially hundreds of processes that span many dynamic domains. A good distributed security policy must be developed. The work focuses on four points. 1: Defining the Grid security problem 2: Formulating a detailed Grid security policy 3: Creating a Grid security architecture 4: Using this security architecture in a large scale deployment to demonstrate that it is workable

Large Scale Distributed Computation

Defining the Grid Security Problem • The user population is large and dynamic: members come from many institutions and change frequently. • The resource pool is large and dynamic: individual institutions and users decide whether and when to contribute resources, and quantities can change rapidly. • A computation changes dynamically: it may acquire resources, start processes on them, and release resources during its lifetime.

Formulating a Grid Security Policy Define a set of rules about the subjects (users), objects (resources) and relationships between them. Together, subjects and objects make up trust domains. • The grid environment consists of multiple trust domains. • Operations that are confined to a single trust domain are subject to the local security policy only. • Both global and local subjects exist. For each domain there is a partial mapping from global to local subjects. • Operations between entities in different trust domains require mutual authentication. • Processes running for the same subject within the same trust domain may share a single set of credentials.
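Two of these rules, the partial global-to-local mapping and the requirement of mutual authentication across domains, can be sketched in a few lines. This is a toy model only: the class, the function, the subject names and the boolean `mutually_authenticated` flag are hypothetical, and no real authentication takes place.

```python
class TrustDomain:
    """A trust domain with a partial mapping from global to local subjects."""

    def __init__(self, name, subject_map):
        self.name = name
        self.subject_map = dict(subject_map)  # global subject -> local subject

    def local_subject(self, global_subject):
        # The mapping is partial: unknown global subjects map to None.
        return self.subject_map.get(global_subject)


def authorize(global_subject, source, target, mutually_authenticated):
    """Return the local subject to act as in `target`, or None if refused."""
    crosses_domains = source is not target
    if crosses_domains and not mutually_authenticated:
        return None  # inter-domain operations require mutual authentication
    return target.local_subject(global_subject)


site_a = TrustDomain("site-a", {"/O=Grid/CN=Alice": "alice"})
site_b = TrustDomain("site-b", {"/O=Grid/CN=Alice": "u1042"})

print(authorize("/O=Grid/CN=Alice", site_a, site_b, mutually_authenticated=True))
print(authorize("/O=Grid/CN=Alice", site_a, site_b, mutually_authenticated=False))
```

Once mapped to a local subject, the request is handled under the target domain's own local policy.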

A Computational Grid Security Architecture

Use of the GSS-API in Globus The Globus Security Infrastructure (GSI) was developed as part of the Globus project. It is built on the Generic Security Services application programming interface (GSS-API), which provides security services to callers in a generic fashion. GSS-API can be layered over plaintext, Secure Sockets Layer (SSL, public key cryptography), Kerberos (secret key cryptography), and other specific security implementations. The Globus testbed GUSTO, discussed earlier, successfully deployed this GSI security architecture. With this, all proxies at various sites ran from their own root directory. Each remote site’s security was scrutinized at a high level because of this.