Legion Ryan Bartlett, Timothy Virgillo
What is Legion? • The Legion project was born with the determination to build, test, deploy, and ultimately transfer to industry a robust, scalable Grid computing software infrastructure • Object-based metasystem software created at the University of Virginia • A single, coherent virtual machine that addresses key grid issues such as scalability, programming ease, fault tolerance, security, and site autonomy • The user sits at a terminal and manipulates objects on several processors, but has the illusion of working on a single powerful computer
10 Key Design Objectives 1. Site Autonomy - resources are owned and controlled by an array of organizations 2. Extensible Core - the components that make up Legion’s core are designed to be replaceable and extensible 3. Scalable Architecture - a completely distributed system, to allow for the scalability needed to handle millions of hosts 4. Easy to Use - hide the complexity of the system to create the illusion of working with a single powerful computer 5. High Performance - a large degree of parallelism, requiring parallelization of tasks, data, and their arbitrary combinations 6. Single Persistent Name Space - one name space for file and data access 7. Security - provide mechanisms that let users manage the security of their own objects 8. Heterogeneous Resource Management - cross-platform, to support different types of hardware and software 9. Multiple Language Support - integrate different source languages 10. Fault Tolerance - deal with host, communication-link, and disk failures through dynamic reconfiguration
Site Autonomy • The Legion Grid is composed of resources owned by an array of organizations • For organizations to be willing to participate in Legion and contribute their resources, each organization must be assured control of its own resources
Design Constraints • Cannot replace host operating systems • Encourages organizations to contribute resources to the grid by not requiring that resources be dedicated solely to Legion • Cannot make changes to the interconnection network • Legion cannot assume that all resource networks will be within the user's control • Cannot insist that it be run as "root" • For security reasons, Legion does not require root access to its resources to function
Legion Object • Everything is an object • Defined as an active process that responds to member function invocations from other objects in the system • Objects are independent and abstracted from the logical address space • Objects communicate with each other through non-blocking method calls • Legion handles the message format and high-level protocol for object interaction, but not the programming language or communication protocol • Each object maintains a local binding cache • Objects may be in one of two states: Active or Inert
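The non-blocking call style described on this slide can be illustrated with a small sketch. This is not Legion's API: `ActiveObject`, `invoke`, and `Adder` are hypothetical names, and a single worker thread stands in for the object's own process.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class ActiveObject:
    """Hypothetical stand-in for an active object (not Legion's real API)."""

    def __init__(self, name):
        self.name = name
        # One worker thread plays the role of the object's own process.
        self._worker = ThreadPoolExecutor(max_workers=1)

    def invoke(self, method, *args) -> Future:
        # Returns immediately with a Future; the caller is not blocked.
        return self._worker.submit(getattr(self, method), *args)

    def shutdown(self):
        self._worker.shutdown(wait=True)


class Adder(ActiveObject):
    def add(self, a, b):
        return a + b


adder = Adder("adder")
pending = adder.invoke("add", 2, 3)  # non-blocking member function invocation
result = pending.result()            # blocks only when the value is needed
adder.shutdown()
```

The caller is free to do other work between `invoke` and `result`, which is the point of the non-blocking style.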
Object States: Inert or Active • Active Objects • Run as a process that is ready to accept member function invocations • Object state is maintained in the address space of the process • Inert Objects • Represented by an Object Persistent Representation (OPR) • The OPR is a set of associated bytes residing in stable storage in the Legion System • The OPR contains the information that enables the object to move from Inert to Active
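A minimal sketch of the Inert/Active transition, assuming the OPR is simply the serialized object state. The `Vault` dictionary, the `activate`/`deactivate` helpers, and the use of `pickle` are illustrative assumptions, not Legion's actual mechanism.

```python
import pickle


class Vault:
    """Hypothetical stable storage: maps an object's LOID to its OPR bytes."""

    def __init__(self):
        self._store = {}

    def write(self, loid, opr: bytes):
        self._store[loid] = opr

    def read(self, loid) -> bytes:
        return self._store[loid]


class Counter:
    def __init__(self, value=0):
        self.value = value

    def increment(self):
        self.value += 1
        return self.value


def deactivate(loid, obj, vault):
    # Active -> Inert: save the state as a set of bytes (the OPR).
    vault.write(loid, pickle.dumps(obj.__dict__))


def activate(loid, cls, vault):
    # Inert -> Active: rebuild a live object from the stored OPR.
    obj = cls.__new__(cls)
    obj.__dict__.update(pickle.loads(vault.read(loid)))
    return obj


vault = Vault()
counter = Counter()
counter.increment()
counter.increment()
deactivate("loid:counter", counter, vault)
restored = activate("loid:counter", Counter, vault)
```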
Object Replication • Replicating an object • Create an Object Address with multiple physical addresses in its list • Assign address semantics • Bind the LOID of the object to this Object Address
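The three replication steps can be sketched as follows. The two semantics shown (`SEND_TO_ALL`, `PICK_ONE`), the address strings, and the `bindings` dictionary are assumptions chosen for illustration; the slides do not list the semantics Legion actually defines.

```python
import random
from dataclasses import dataclass
from enum import Enum


class Semantics(Enum):
    # Hypothetical address semantics, for illustration only.
    SEND_TO_ALL = "all"
    PICK_ONE = "one"


@dataclass(frozen=True)
class ObjectAddress:
    elements: tuple        # physical addresses of the replicas
    semantics: Semantics   # how a sender should use the list

    def targets(self, rng=random):
        if self.semantics is Semantics.SEND_TO_ALL:
            return list(self.elements)
        return [rng.choice(self.elements)]


# 1. Create an Object Address with several physical addresses.
# 2. Assign address semantics.
replicas = ObjectAddress(
    elements=("10.0.0.1:7000", "10.0.0.2:7000", "10.0.0.3:7000"),
    semantics=Semantics.SEND_TO_ALL,
)
# 3. Bind the object's LOID to this Object Address.
bindings = {"loid:counter": replicas}
```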
Legion’s 3 Level Naming System • Legion Object Address (LOA) • physical address or set of addresses in the case of replicated objects • Legion Object Identifiers (LOIDs) • Context Names • Mapped by a directory service called Context Space • human readable strings • each Context Name is mapped to a LOID
Contexts • A typical context space has a well-known root context which in turn “points” to other contexts, forming a directed graph • Contexts support operations that look up a single string, return all mappings, add a mapping, and delete a mapping
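The four context operations and the directed-graph structure can be sketched like this. As a simplification, a context here points directly at another `Context` object or at a LOID string; the class, the method names, and the slash-separated path syntax in `resolve` are illustrative assumptions.

```python
class Context:
    """Hypothetical context: maps human-readable names to targets."""

    def __init__(self):
        self._entries = {}

    def add(self, name, target):
        self._entries[name] = target

    def delete(self, name):
        del self._entries[name]

    def lookup(self, name):
        return self._entries[name]

    def mappings(self):
        return dict(self._entries)


def resolve(root, path):
    # Walk the graph from the root context, one name per step.
    node = root
    for part in (p for p in path.split("/") if p):
        node = node.lookup(part)
    return node


root, home, alice = Context(), Context(), Context()
root.add("home", home)
home.add("alice", alice)
alice.add("data", "loid:42")

loid = resolve(root, "/home/alice/data")
```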
Object Addresses • An Object Address consists of • a list of Object Address Elements • semantic information that describes how to use the list • Each element represents an arbitrary communication endpoint, such as a TCP socket • An Object Address Element contains 2 basic parts • a 32-bit address type field • 256 bits of address-specific information
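The element layout (a 32-bit type field followed by 256 bits of address-specific data) fits in 36 bytes. The sketch below packs a TCP endpoint into that shape; the type code `1`, the network byte order, and the choice to store the IPv4 address and port in the first six bytes are assumptions, not Legion's wire format.

```python
import socket
import struct

# 32-bit unsigned type field + 32 bytes (256 bits) of address data.
ELEMENT_FORMAT = "!I32s"
ADDR_TYPE_IPV4_TCP = 1  # hypothetical type code


def pack_element(addr_type, info: bytes) -> bytes:
    if len(info) > 32:
        raise ValueError("address-specific information exceeds 256 bits")
    # struct pads the 32s field with zero bytes when info is shorter.
    return struct.pack(ELEMENT_FORMAT, addr_type, info)


def unpack_element(raw: bytes):
    return struct.unpack(ELEMENT_FORMAT, raw)


def tcp_element(ip: str, port: int) -> bytes:
    info = socket.inet_aton(ip) + struct.pack("!H", port)
    return pack_element(ADDR_TYPE_IPV4_TCP, info)


raw = tcp_element("192.0.2.10", 7000)
addr_type, info = unpack_element(raw)
ip = socket.inet_ntoa(info[:4])
port = struct.unpack("!H", info[4:6])[0]
```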
LOIDs • Every Legion object is named by a Legion Object Identifier (LOID) • LOIDs are location-independent identifiers • Each LOID includes an RSA public key (public-key cryptosystem) • Each LOID is mapped to an LOA • LegionClass is responsible for handing out unique Class Identifiers to each new class
Bindings • Bindings are first-class entities that can be passed around the system and cached within objects • A binding consists of: • LOID • Object Address • Time at which the binding becomes invalid • Binding Agents are responsible for returning a binding to an Object Address for the object that the LOID names • The persistent state of each Legion Object contains the Object Address of its Binding Agent • Bindings from LOIDs to Object Addresses are implemented as triples
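A binding triple and the cache-then-agent lookup can be sketched as below. The class and method names, the 60-second lifetime, and passing the current time explicitly are assumptions made to keep the example small and deterministic.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    loid: str
    object_address: str
    expires_at: Optional[float]  # None means "never expires" (assumption)

    def valid(self, now: float) -> bool:
        return self.expires_at is None or now < self.expires_at


class BindingAgent:
    """Hypothetical agent that answers LOID -> Object Address requests."""

    def __init__(self, table, lifetime=60.0):
        self._table = table
        self._lifetime = lifetime
        self.requests = 0  # counts traffic reaching the agent

    def get_binding(self, loid, now):
        self.requests += 1
        return Binding(loid, self._table[loid], now + self._lifetime)


class BindingCache:
    """Local binding cache kept inside each object."""

    def __init__(self, agent):
        self._agent = agent
        self._cache = {}

    def resolve(self, loid, now):
        cached = self._cache.get(loid)
        if cached is None or not cached.valid(now):
            # Miss or stale entry: fall back to the Binding Agent.
            cached = self._agent.get_binding(loid, now)
            self._cache[loid] = cached
        return cached.object_address


agent = BindingAgent({"loid:42": "192.0.2.10:7000"})
cache = BindingCache(agent)
first = cache.resolve("loid:42", now=0.0)    # miss, asks the agent
second = cache.resolve("loid:42", now=30.0)  # served from the cache
third = cache.resolve("loid:42", now=61.0)   # expired, asks again
```

Only the first and third lookups reach the agent, which is why caching bindings reduces Binding Agent traffic.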
Legion Core Objects • Binding Agents • Objects that map LOIDs to LOAs • Context Objects • Objects that map Context Names to LOIDs • Host Objects • Represent processors (the machines on which object processes run) • Implementation Objects • Executable that handles creation or activation of an object • Transferred from a class object to a host object to enable the host to create processes with the correct characteristics • Vault • Represents persistent storage, for the purpose of maintaining the state, in OPRs, of the inert Legion objects supported by the vault
Classes and Metaclasses • Every Legion object is defined and managed by its Class object • Class objects are given System-Level responsibility • Create new instances • Schedule them for execution • Activate/Deactivate objects • Provide information about their current location to client objects that wish to communicate with them
Scalable Architecture • Designed for millions of hosts and trillions of objects • Legion is designed to be decentralized and fully distributed • Applications at the client • Legion Objects shared across the Legion System
Extensible Core • Every object publishes an interface • Inheritable • Extendable • Specializable • As technology changes and improves, resources in the Legion Grid can be changed or replaced without hindering the system
High Performance • High Performance via Resource Selection • Choosing hosts with the lowest load or greatest processing power • User-level scheduling agents • High Performance via Parallelism • Support for libraries such as MPI • Support for parallel languages such as MPL • Wrapping of parallel components • Exporting the run-time library interface to library, toolkit, and compiler writers
The Legion Scheduling Model • Scheduling policies are chosen by the user • Users can create their own schedulers for specific applications
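One way to picture user-chosen policies is a scheduler that takes the policy as a plain function. The `Host` fields, the two policies (lowest load, greatest processing power), and the "each job adds one unit of load" model are illustrative assumptions, not Legion's scheduler interface.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    load: float   # current load, arbitrary units
    speed: float  # relative processing power, arbitrary units


def least_loaded(hosts):
    return min(hosts, key=lambda h: h.load)


def fastest(hosts):
    return max(hosts, key=lambda h: h.speed)


def schedule(jobs, hosts, policy):
    """Place each job on the host picked by the user-supplied policy."""
    placement = {}
    for job in jobs:
        host = policy(hosts)
        placement[job] = host.name
        host.load += 1.0  # crude model: every job adds one unit of load
    return placement


def make_hosts():
    return [Host("a", load=0.5, speed=1.0), Host("b", load=2.0, speed=4.0)]


by_load = schedule(["j1", "j2", "j3"], make_hosts(), least_loaded)
by_speed = schedule(["j1", "j2", "j3"], make_hosts(), fastest)
```

Swapping the policy changes the placement without touching the scheduler, which is the design idea behind application-specific schedulers.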
Security Solutions: • Legion does not require any special privileges or "root" access • Legion allows users to choose what types and levels of security they want for their own objects • In addition, every Legion object contains a function called "MayI" Problems: • Installing Legion without causing significant risk to the system it is installed on • How to protect and control resources
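The idea of a per-object "MayI" check can be sketched as a guard that runs before every method invocation. The access-control-list policy, the `invoke` wrapper, and all names below are assumptions; the slides only say that each object has a MayI function and that owners choose their own policy.

```python
class AccessDenied(Exception):
    pass


class GuardedObject:
    """Hypothetical object whose every invocation passes through may_i."""

    def __init__(self, acl):
        # acl maps a method name to the set of callers allowed to use it.
        self._acl = acl

    def may_i(self, caller, method):
        # The owner can override this to implement any policy.
        return caller in self._acl.get(method, set())

    def invoke(self, caller, method, *args):
        if not self.may_i(caller, method):
            raise AccessDenied(f"{caller} may not call {method}")
        return getattr(self, method)(*args)


class Document(GuardedObject):
    def __init__(self, text, acl):
        super().__init__(acl)
        self._text = text

    def read(self):
        return self._text


doc = Document("hello", acl={"read": {"alice"}})
allowed = doc.invoke("alice", "read")
```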
Security • Public-key cryptography based on RSAREF 2.0. • Three message-layer security modes: private (encrypted communication), protected (fast digested communication with unforgeable secrets to ensure authentic replies to message calls), and no security. • Caching of secret keys for faster encryption of multiple messages between communicating parties. • Auto-encrypted bearer credentials with free-form rights. • Propagation of security modes and certificates through calling trees (e.g., if a caller demands encryption, all downstream calls will use it automatically).
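Mode propagation through a calling tree can be sketched as taking the stricter of the inherited mode and the callee's own request. The ordering (none < protected < private), the nested-dictionary tree, and the `propagate` helper are assumptions for illustration.

```python
from enum import IntEnum


class Mode(IntEnum):
    # Assumed ordering: a higher value is a stricter mode.
    NONE = 0
    PROTECTED = 1
    PRIVATE = 2


def propagate(call, inherited=Mode.NONE):
    """Return the effective mode of every call in the tree."""
    mode = max(inherited, call["mode"])
    result = {call["name"]: mode}
    for child in call.get("calls", []):
        # Downstream calls inherit the caller's effective mode.
        result.update(propagate(child, mode))
    return result


tree = {
    "name": "client",
    "mode": Mode.PRIVATE,
    "calls": [
        {
            "name": "service",
            "mode": Mode.NONE,
            "calls": [{"name": "storage", "mode": Mode.PROTECTED}],
        }
    ],
}
modes = propagate(tree)
```

Because the client demands private mode, the service and storage calls end up private as well, even though they asked for less.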
Security • Drop-in addition of MayI functionality to existing objects. • Persistent authentication objects that serve as the representation for users in a trust domain. • Secure Legion shell to allow users to log in to their authentication objects and obtain associated credentials and environment information. • Isolation and protection of objects using local OS accounts. • Easily checked Process Control Daemon for granting limited OS privileges to Legion Host Objects. • Context space configured with access control for multiple users.
Fault Tolerance • Automatic failure detection and recovery • Hosts, jobs, and queues automatically back up their current state to prevent loss of information • Dynamic configuration allows processes to change resources without interrupting operations • If a host is lost or unavailable, the job is automatically migrated to another host
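The backup-and-migrate behaviour can be sketched with a toy cluster. The `Cluster` class, the checkpoint dictionary, and the rule for picking the replacement host (the alphabetically first surviving host) are assumptions chosen to keep the example deterministic.

```python
class Cluster:
    """Hypothetical model of job placement with checkpoint-based recovery."""

    def __init__(self, hosts):
        self.alive = set(hosts)
        self.placement = {}    # job -> host currently running it
        self.checkpoints = {}  # job -> last backed-up state

    def submit(self, job, host):
        self.placement[job] = host

    def checkpoint(self, job, state):
        self.checkpoints[job] = state

    def fail(self, host):
        """Mark a host as lost and migrate its jobs to a surviving host."""
        self.alive.discard(host)
        if not self.alive:
            raise RuntimeError("no surviving hosts")
        restarted = {}
        for job, where in self.placement.items():
            if where == host:
                self.placement[job] = min(self.alive)
                # The job resumes from its last checkpoint, if any.
                restarted[job] = self.checkpoints.get(job)
        return restarted


cluster = Cluster(["h1", "h2", "h3"])
cluster.submit("j1", "h1")
cluster.submit("j2", "h3")
cluster.checkpoint("j1", {"step": 5})
restarted = cluster.fail("h1")
```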
Contribution • Of the early grid-computing solutions, Legion is unique in that it took an object-oriented approach • It metamorphosed from an academic project into a commercial product under Avaki • Avaki pushed the LOID naming conventions as an industry secure naming protocol in 2002, which Compaq, Hewlett-Packard, IBM, Platform Computing, and Sun Microsystems all welcomed • IBM adopted the system for its life sciences research • Though the platform in its commercial state is proprietary, it can be assumed that the Legion->Avaki->Sybase->SAP ownership chain has continued the growth and expansion of the system
Drawbacks? • Scalability claims refer to the communication traffic required as part of the implementation model • LOID binding lookups from objects to Binding Agents • Binding Agent traffic required to satisfy object binding requests • Assumes that most accesses will be local • Same organization • Within a department or university campus • Inter-object communication at the user level inside an application may or may not contain a bottleneck • A user implementation may have a centralized object that acts as shared memory for a large number of workers • Resource starvation results in increasingly poor performance
Performance • Comparison of Globus against Legion with Matrix Multiplication using the MPI libraries
Personal Opinions for Improvements • Fault tolerance is not explicitly covered in the available documentation. Avaki continued to develop the code base and likely addressed these issues, but that work is commercial and proprietary • Performance measurements may be volatile, since it cannot be predicted how Legion scales across hosts • Bottlenecks may occur at the application level • Too many requirements and decisions are placed on the shoulders of resource owners and users, which contradicts Legion's overall goal of being easy to use • It aspires to be multi-organizational, but lacks easy scalability across organizations • Performance