Globus Trends & Future Directions Dr. Dan Fraser Director, Community Driven Improvement of Globus Software
Proposed Globus BOF Outline • What is Globus? • Relationship to other programs • UNICORE, gLite, OMII,… • Where is Globus going? • Community building via Incubator Projects • Emphasis on benefits to HEP computing • GRAM (Job management) • GridFTP (Fast File Transfers) – NorduGrid, FNAL • Data Placement Service • Workspaces – STAR, Alice • Service creation (Service Oriented Science) • How can we work together?
The Grid • Access to shared resources: virtualization, allocation, management • With predictable behaviors: provisioning, quality of service • In dynamic, heterogeneous environments: standards-based interfaces and protocols • Enable “coordinated resource sharing & problem solving in dynamic, multi-institutional virtual organizations.” (Source: “The Anatomy of the Grid”)
Underlying Problem: The Application-Infrastructure Gap [Diagram: Dynamic and/or Distributed Applications on one side, Shared Distributed Infrastructure on the other]
More Specifically, I May Want To … • Create a service for use by my colleagues • Manage who is allowed to access my service (or my experimental data or …) • Ensure reliable & secure distribution of data from my lab to my partners • Run 10,000 jobs on whatever computers I can get hold of • Monitor the status of the different resources to which I have access
Grid Infrastructure • Distributed management • Of physical resources • Of software services • Of communities and their policies • Unified treatment • Build on Web services framework • Use WS-RF, WS-Notification (or WS-Transfer/Man) to represent/access state • Common management abstractions & interfaces [Diagram labels: Higher-Level Services and Users; Local Heterogeneity]
Globus Software: dev.globus.org • Globus Projects: OGSA-DAI, GT4, MPICH-G2, Java Runtime, MyProxy, Data Rep, Replica Location, Delegation, GridWay, C Runtime, CAS, GSI-OpenSSH, GridFTP, MDS4, Incubator Mgmt, Python Runtime, C Sec, GRAM, Reliable File Transfer, GT4 Docs • Incubator Projects • Diagram legend: Common Runtime, Security, Execution Mgmt, Data Mgmt, Info Services, Other
Globus Software: dev.globus.org • Globus Projects: OGSA-DAI, GT4, MPICH-G2, Java Runtime, MyProxy, Data Rep, Replica Location, Delegation, GridWay, C Runtime, CAS, GSI-OpenSSH, GridFTP, MDS4, Incubator Mgmt, Python Runtime, C Sec, GRAM, Reliable File Transfer, GT4 Docs • Incubator Projects: GEMLCA, GAARDS, MEDICUS, CoG WF, Virt WkSp, GDTE, GridShib, OGRO, UGP, Dyn Acct, Gavia JSC, DDM, Metrics, Introduce, PURSE, HOC-SA, LRMA, WEEP, Gavia MS, SGGC, ServMark • Diagram legend: Common Runtime, Security, Execution Mgmt, Data Mgmt, Info Services, Other
Web Services • Standards for defining & accessing services • WSDL: Web Services Description Language • SOAP: Simple Object Access Protocol • Also other standards for security, state access, etc., etc. • Technology for hosting services, e.g.: • Apache Axis (Java) • Microsoft (C#) • Others in other languages (C, Python, etc.)
WSDL: Web Services Description Language • Define expected messages for a service, and their (input or output) parameters • An interface groups together a number of messages (operations) • Bind an interface via a definition to a specific transport (e.g. HTTP) and messaging (e.g. SOAP) protocol • The network location where the service is implemented, e.g. http://localhost:8080
Web Services: E.g., File Transfer Service • User: “Move F from A to B” • Interface: WSDL defining “Move” operation, etc. • Implementation • Hosting environment/runtime (“C”, Axis, .NET, …)
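As a rough illustration of the “Move F from A to B” request on this slide, the sketch below builds and parses a SOAP 1.1 message with the Python standard library. The service namespace, the element names (Move, source, destination) and the example URLs are assumptions made for illustration; the slides do not define the actual message schema.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# SOAP 1.1 envelope namespace (standard).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for the example file transfer service.
SVC_NS = "http://example.org/filetransfer"


def build_move_request(source: str, destination: str) -> bytes:
    """Serialize a SOAP request for a hypothetical 'Move' operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    move = ET.SubElement(body, f"{{{SVC_NS}}}Move")
    ET.SubElement(move, f"{{{SVC_NS}}}source").text = source
    ET.SubElement(move, f"{{{SVC_NS}}}destination").text = destination
    return ET.tostring(envelope, encoding="utf-8")


def parse_move_request(payload: bytes) -> Tuple[str, str]:
    """Extract (source, destination) the way a hosting environment might."""
    root = ET.fromstring(payload)
    move = root.find(f"{{{SOAP_NS}}}Body/{{{SVC_NS}}}Move")
    if move is None:
        raise ValueError("payload does not contain a Move operation")
    return (
        move.findtext(f"{{{SVC_NS}}}source"),
        move.findtext(f"{{{SVC_NS}}}destination"),
    )


if __name__ == "__main__":
    request = build_move_request("gsiftp://a.example.org/F", "gsiftp://b.example.org/F")
    print(parse_move_request(request))
```

In a real deployment the message layout would come from the service's WSDL and be produced by generated stubs rather than written by hand.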
“Stateless” vs. “Stateful” Services [Diagram: Client calls move (A to B) on FileTransferService] • Without state, how does client: • Determine what happened (success/failure)? • Find out how many files completed? • Receive updates when interesting events arise? • Terminate a request? • Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state
FileTransferService (without WSRF) [Diagram: Client calls move (A to B), which returns a transferID; the service holds the state and also exposes whatHappen, tellMeWhen, cancel] • Developer reinvents the wheel for each new service • Custom management and identification of state: transferID • Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen) • Custom lifetime operation (cancel)
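The ad hoc approach on this slide can be sketched as a toy in-memory service. The class and method names below mirror the slide's operations (move, whatHappen, tellMeWhen, cancel) but are otherwise hypothetical, and no data is actually transferred; `simulate_completion` exists only to drive the example.

```python
import itertools
from typing import Callable, Dict, List

Listener = Callable[[int, str], None]


class AdHocFileTransferService:
    """Toy service where every piece of state handling is custom-built."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._status: Dict[int, str] = {}
        self._listeners: Dict[int, List[Listener]] = {}

    def move(self, source: str, destination: str) -> int:
        """Start a transfer and return a service-specific transferID."""
        transfer_id = next(self._next_id)
        self._status[transfer_id] = "active"
        self._listeners[transfer_id] = []
        return transfer_id

    def what_happen(self, transfer_id: int) -> str:
        """Custom synchronous state inspection."""
        return self._status[transfer_id]

    def tell_me_when(self, transfer_id: int, callback: Listener) -> None:
        """Custom asynchronous notification registration."""
        self._listeners[transfer_id].append(callback)

    def cancel(self, transfer_id: int) -> None:
        """Custom lifetime operation."""
        self._finish(transfer_id, "cancelled")

    def simulate_completion(self, transfer_id: int) -> None:
        """Stand-in for the real transfer finishing (illustration only)."""
        self._finish(transfer_id, "done")

    def _finish(self, transfer_id: int, status: str) -> None:
        if self._status[transfer_id] != "active":
            return  # already finished or cancelled
        self._status[transfer_id] = status
        for callback in self._listeners[transfer_id]:
            callback(transfer_id, status)
```

Every new service built this way needs its own identifier scheme, status call, callback registry and cancel operation, which is the duplication the slide points at.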
WSRF in a Nutshell • Service • State representation: Resource, Resource Property • State identification: Endpoint Reference • State interfaces: GetRP, QueryRPs, GetMultipleRPs, SetRP • Lifetime interfaces: SetTerminationTime, ImmediateDestruction • Notification interfaces: Subscribe, Notify • ServiceGroups [Diagram: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, Destroy over Resources with RPs, each addressed by an EPR]
FileTransferService (w/ WSRF) [Diagram: Client calls createResource (A to B), which returns an EPR; getRP, queryRPs, destroy act on the Transfer RPs] • Developer specifies custom method to createResource and leaves the rest to WSRF standards: • State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR) • State inspected by standard interfaces (GetRP, QueryRPs) • Lifetime management by standard interfaces (Destroy)
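The division of labor on this slide can be sketched in plain Python: generic state operations are written once in a base class, and the service author supplies only the create call. This mimics the pattern only; it is not the GT4 or WSRF API, and the class names, property names (status, filesCompleted) and service address are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class EndpointReference:
    """Identifies one stateful resource: service address plus resource key."""
    address: str
    resource_key: str


class ResourceHome:
    """Generic state operations shared by every service (written once)."""

    def __init__(self, address: str) -> None:
        self.address = address
        self._resources: Dict[str, Dict[str, Any]] = {}

    def _create(self, properties: Dict[str, Any]) -> EndpointReference:
        key = uuid.uuid4().hex
        self._resources[key] = dict(properties)
        return EndpointReference(self.address, key)

    def get_rp(self, epr: EndpointReference, name: str) -> Any:
        """Analogue of GetRP: read one resource property."""
        return self._resources[epr.resource_key][name]

    def query_rps(
        self, epr: EndpointReference, predicate: Callable[[str, Any], bool]
    ) -> Dict[str, Any]:
        """Analogue of QueryRPs: select properties matching a predicate."""
        props = self._resources[epr.resource_key]
        return {n: v for n, v in props.items() if predicate(n, v)}

    def destroy(self, epr: EndpointReference) -> None:
        """Analogue of Destroy: end the resource's lifetime."""
        del self._resources[epr.resource_key]


class FileTransferService(ResourceHome):
    """The only service-specific code is the create operation."""

    def create_resource(self, source: str, destination: str) -> EndpointReference:
        return self._create(
            {"source": source, "destination": destination,
             "status": "active", "filesCompleted": 0}
        )
```

A client would hold on to the returned EPR and use the same get, query and destroy calls it uses with any other service built on the base class.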
Bridging the Application-Resource Gap [Diagram: users, applications, tools and workflows at the top; database, specialized resource, computers and storage at the bottom; between them Registry, Credential, DAIS, GRAM, GridFTP and user services in hosting environments, sharing uniform interfaces, security mechanisms, Web service transport, monitoring]
Cancer Biomedical Informatics Grid • Spans 60 NIH cancer centers across the U.S. • Functions: Management, Schema Management, Metadata Management, ID Resolution, Workflow, Security, Resource Management, Service Registry, Service, Service Description, Grid Communication Protocol, Transport • Technologies shown: Mobius, Globus, BPEL, GRAM, myProxy, OGSA-DAI, Globus Toolkit, GSI, CAS, caCORE • Slide credit: Peter Covitz, National Institutes of Health
Relationship to Other Programs • UNICORE • OMII • gLite • Supporting common standards • Additional ways of working together?
Where is Globus today? • http://incubator.globus.org/metrics • > 75,000 GT4 downloads • > 95% are production downloads • Maintaining production quality code • Supporting the most important OGF standards • Innovating with new features • Incorporating Community Involvement
The Globus Toolkit: “Standard Plumbing” for the Grid • Not turnkey solutions, but building blocks & tools for application developers & system integrators • Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance • Easier to reuse than to reinvent • Compatibility with other Grid systems comes for free • Today the majority of the GT public interfaces are usable by application developers and system integrators • Relatively few end-user interfaces • In general, not intended for direct use by end users (scientists, engineers, marketing specialists)
Where is Globus Going? • Service Oriented Science (SOS) • Community Involvement • dev.globus.org • Participate in the discussions • Incubators • Add your innovative contributions
The Nature of SOS • End-to-end Solution Focused (by definition) • Grid technology is the plumbing • Not beautiful (to users anyway) • Yet extremely important • In the SOS world, Grid transitions into the background.
How are we getting there? • Our community is helping us! • Also through our ongoing internal development, of course.
dev.globus • Globus software is organized as several dozen “Globus Projects” • Projects release products • Each project has its own “Committers” • Committers are responsible for governance on matters relating to their products • A “Globus Management Committee” • provides overall guidance and conflict resolution • approves the creation of new Globus projects
Initial Globus Projects • Runtime: C Core Utilities, C WS Core, CoG jglobus, Core WS Schema, Java WS Core, Python Core, XIO • Execution: GRAM, MPICH-G • Security: C Security, CAS/SAML Utilities, Delegation, GSI-OpenSSH, MyProxy • Information: MDS4 • Data: GridFTP, OGSA-DAI, Reliable Transfer, Replica Location, Replication • Distribution: Globus Toolkit • Documentation: Build a Service Tutorial, GT Release Manuals, GT Programmer's Tutorial
Globus Incubator Projects (Partial List) • CoG Workflow: Fine-grained workflow system • GEMLCA: Deploy Legacy Apps as Grid Svcs • GridShib: Integration with Shibboleth • GridWay: Meta-scheduler • gt-hs: Integration of Handle System • MEDICUS: Medical image management • Metrics: Infrastructure for usage reporting • OGCE: Portal toolkit • PURSe: Portal-based user registration service • ServMark: Grid service performance tester • Virtual Workspaces: Virtual machine mgmt
http://dev.globus.org/wiki/Incubator/Introduce Shannon Hastings hastings@bmi.osu.edu Multiscale Computing Laboratory Department of Biomedical Informatics The Ohio State University
Introduce Goals • A framework which enables fast and easy creation of Globus-based grid services • Provide an easy-to-use graphical service authoring tool • Hide all “grid-ness” from the developer • Utilize best-practice layered grid service architecture • Handle all core service architecture requirements for strongly typed and highly interoperable grid services • Toolkit for creating and manipulating strongly typed grid services • Command-line and GUI tools for service skeleton generation and automatic service/client code generation • Utilizes other core grid services and architecture components • caDSR and GME for schemas of registered data types • Security service architecture (Dorian, GridGrouper, CSM) • Advertisement and Registration configuration and Index Service for discovery
Introduce Graphical Development Environment (GDE) • GUI for creating and manipulating a grid service • Provides a simple means of creating a service skeleton that a developer can then implement, build, and deploy • Automatic code generation of a complete WSRF-compliant grid service, configured to provide: • Security • Advertisement • Discovery • Complete unboxed client API • Provides a set of tools that enable the developer to add/remove/modify/import methods of the service as well as create sub-services/resources • Automatic generation of all the required code: Globus grid service code/configuration, service configuration, implementation of the client, and stubbed implementation of the service
Invitation to add your Incubator • http://dev.globus.org • Leverage the Globus Infrastructure for your project.
RAVE • Remote Application Virtualization Environment • Builds on Introduce • Define service • Create skeleton • Discover types • Add operations • Configure security • Wrap arbitrary executables [Diagram steps: Create, Store, Advertise, Discover, Transfer GAR, Deploy, Invoke and get results; components: Appln Service, Introduce, Repository Service, Index service, Container] • Ravi Madduri et al., Argonne/U.Chicago & Ohio State University
RAVE Collaboration • We are interested in collaborating… • If you have an application you want to expose as a Grid service, let us know • Questions ?
What are Virtual Workspaces? • A dynamically provisioned software environment • Environment definition (quality of life): we get exactly the (software) environment we need on demand. • Resource allocation (quality of service): provision and guarantee all the resources the workspace needs to function correctly (CPU, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions. • Implementation • Traditional means: automated configuration and coarse-grained enforcement • Virtual Machines
Overall Architecture Provisioning and Configuration (back-end) Application Services (front-end)
Challenges • How can we automate providing applications as services? • Integrate code into a services framework • Sharing across community members • Composition of analysis capability in workflows • How can we provision platforms for application execution in response to time-varying demand? • Isolating users from details concerning resource availability • Configuring and maintaining application environments • On-demand provisioning of application platforms
GT4 Workspace Service • The GT4 Virtual Workspace Service (VWS) allows an authorized client to deploy and manage workspaces on-demand. • GT4 WSRF front-end (one per site) • Leverages GT core and services, notifications, security, etc. • Currently implements workspaces as Xen VMs • Other implementations could also be used • Implements multiple deployment modes • Best-effort, leasing, etc. • Current release 1.2.3 (April ‘07) • http://workspace.globus.org
Interacting With Workspaces [Diagram: a VWS Service in front of a pool of nodes] • (1) The workspace service allows users to deploy and manage workspaces on a pool of nodes through a WSRF interface • (2) Each pool node requires a VMM and a lightweight management script • (3) Information on each workspace is published as WSRF Resource Properties so that users can find out information about their workspace (e.g. what IP the workspace was bound to) or subscribe to notifications on changes
Dynamic Provisioning of STAR Nodes [Diagram: GRAM and VWS in front of a pool of nodes, some labeled STAR and some labeled no STAR]
Configuring STAR Environments • Configuring STAR Image • rPath collaboration • STAR VM Image layout: • Base Configuration (OS, OSG stack): ~1 GB • STAR Application: ~3 GB • Data Partition: < 1 GB • Blank Space: ~3 GB
Workspaces Challenges • Providing images for STAR and other communities • Collaborations: the Alice project, ATLAS community • Image configuration and maintenance • Creating a base of scientific images • Managing trust in images • Providing resources • Understanding and installing the technology • Integrating virtualization with current provisioning models • Understanding the Big Picture • Communication and education • Middleware development
Layered Approach to Data Management • Layers, top to bottom: • Data Policies, Workflow, Scheduling • Dynamic Data Management (Data Placement Service) • Data Durability (Replica Location Service) • Reliable File Transfer (RFT) • Storage Reservation (MOPS) • Robust Data Mover (GridFTP) • Properties shown alongside the layers: Reliability, Predictability, Security, Scalability
What is GridFTP? • Widely used, open source, robust, production-quality data mover • Separate control and data channels • Parallel streams (~3-5x faster than TCP/IP) • Parallel stripes (multiple servers) • Partial file transfer • Multiple security options (GSI, SSH) • Third party control • Extensible for both file system & protocols
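Parallel streams, stripes and partial file transfer all rest on the same idea: a file is moved as independent byte ranges. The sketch below shows one way to split a file into contiguous (offset, length) ranges, one per stream. The static even split and the function name are assumptions for illustration; the slides do not describe how GridFTP itself assigns blocks to streams.

```python
from typing import List, Tuple


def partition_ranges(file_size: int, streams: int) -> List[Tuple[int, int]]:
    """Split [0, file_size) into contiguous (offset, length) ranges.

    Ranges differ in length by at most one byte; empty ranges are
    omitted when there are more streams than bytes.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if streams < 1:
        raise ValueError("streams must be at least 1")

    base, extra = divmod(file_size, streams)
    ranges: List[Tuple[int, int]] = []
    offset = 0
    for index in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if index < extra else 0)
        if length > 0:
            ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in partition_ranges(10, 4):
        print(f"stream reads bytes {offset}..{offset + length - 1}")
```

Because each range carries its own offset, the receiver can write ranges as they arrive in any order, and a single range on its own amounts to a partial file transfer.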
GridFTP Usage Stats • The number of GridFTP servers has been steadily increasing (from ~400/mo in '05) and is now more than 1000 per month. • Server installations have been set up in 51 countries around the world • Over 90 million known transfers last year • Many sites, especially in Europe, do not report usage stats.
GridFTP • Clients: client interfaces: Globus-URL-Copy, C Library, RFT (3rd party) • GridFTP Server: separate control and data; striping, fault tolerance; metrics collection; access control • Network I/O: XIO drivers: TCP, UDT (UDP), parallel streams, GSI, SSH • File systems: Data Storage Interfaces (DSI): POSIX, SRB, HPSS, NEST • www.gridftp.org
Memory to Memory over 30 Gigabit/s Network (San Diego to Urbana) [Plot: throughput with striping; axis marks at 10, 20 and 30 Gb/s]
Disk to Disk over 30 Gigabit/s Network (San Diego to Urbana) [Plot: throughput with striping; axis marks at 10 and 20 Gb/s]
Small Files Transfer Improvements • Pipelining • Many transfer requests outstanding at once • Client sends second request before the first completes • Latency of request is hidden in data transfer time • Cached Data channel connections • Reuse established data channels (Mode E) • No additional TCP or GSI connect overhead Now in 4.1.2 !!
LOSF (Lots of Small Files) Optimization • 1 GB of data partitioned into equal-sized files • With pipelining, performance doesn't degrade until file size drops below 100 KB
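A back-of-the-envelope latency model suggests why pipelining matters more as files get smaller. It assumes each un-pipelined request costs one round trip before its data flows, while pipelined requests overlap with data transfer so only the first round trip is exposed. The model, the function names and the example parameters (50 ms round trip, 1 Gb/s link) are illustrative assumptions, not measurements from the slides.

```python
def sequential_time(n_files: int, file_bytes: int, rtt_s: float, bandwidth_bps: float) -> float:
    """Each request waits a full round trip before its data is sent."""
    per_file_transfer = file_bytes * 8 / bandwidth_bps
    return n_files * (rtt_s + per_file_transfer)


def pipelined_time(n_files: int, file_bytes: int, rtt_s: float, bandwidth_bps: float) -> float:
    """Requests are issued ahead of time, so only one round trip is exposed."""
    per_file_transfer = file_bytes * 8 / bandwidth_bps
    return rtt_s + n_files * per_file_transfer


if __name__ == "__main__":
    total_bytes = 10**9        # 1 GB, as on the slide
    rtt_s = 0.05               # assumed 50 ms round trip
    bandwidth_bps = 1e9        # assumed 1 Gb/s link
    for file_bytes in (10**8, 10**6, 10**5):
        n_files = total_bytes // file_bytes
        seq = sequential_time(n_files, file_bytes, rtt_s, bandwidth_bps)
        pipe = pipelined_time(n_files, file_bytes, rtt_s, bandwidth_bps)
        print(f"{file_bytes:>10} B files: sequential {seq:8.2f} s, pipelined {pipe:6.2f} s")
```

Under these assumptions the sequential cost grows with the number of files while the pipelined cost stays close to the raw transfer time. The model ignores per-file overheads on the servers, so it says nothing about where real performance starts to degrade.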
What Else? • Dynamic backends (using GFork) – now in 4.1.2 • Stability in the event of backend failure • Growing resource pools for peak demands • Dynamic Protocol selection TCP/UDT • Early tests very promising especially over long, busy networks • Preliminary tests show ~5x speedup vs parallel streaming GridFTP • Requires further side-effect testing & analysis • Flexibility in monitoring • Enable Individual grids to track user information • UID, DN, IP address • Transfer Analysis • Detect bottlenecks (storage vs network) • Other Improvements • Best practices (striping parameter settings) • Performance improvements • Managed Object Placement Service (MOPS)
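One simple way to approach the “storage vs network” question under Transfer Analysis is to compare a memory-to-memory rate, which exercises only the network path, with a disk-to-disk rate over the same path. The heuristic, the function name and the 10% tolerance below are assumptions for illustration; the slides do not say how the planned analysis works.

```python
def classify_bottleneck(
    mem_to_mem_gbps: float,
    disk_to_disk_gbps: float,
    tolerance: float = 0.1,
) -> str:
    """Return 'storage' or 'network' from two measured throughputs.

    If disk-to-disk falls short of memory-to-memory by more than
    `tolerance` (a fraction), the storage system is the likely limit;
    otherwise the network path is.
    """
    if mem_to_mem_gbps <= 0:
        raise ValueError("mem_to_mem_gbps must be positive")
    if disk_to_disk_gbps < 0:
        raise ValueError("disk_to_disk_gbps must be non-negative")
    if not 0 <= tolerance < 1:
        raise ValueError("tolerance must be in [0, 1)")

    if disk_to_disk_gbps < mem_to_mem_gbps * (1.0 - tolerance):
        return "storage"
    return "network"


if __name__ == "__main__":
    # Hypothetical measurements, not figures from the slides.
    print(classify_bottleneck(mem_to_mem_gbps=27.0, disk_to_disk_gbps=17.0))
    print(classify_bottleneck(mem_to_mem_gbps=10.0, disk_to_disk_gbps=9.8))
```

A production analysis would need many samples and per-endpoint breakdowns; this only shows the comparison at its simplest.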