Independent Software Vendor (ISV) Remote Computing Primer
Steven Newhouse

The Problem…
• Growth of compute-oriented applications
  • Research applications with source available
  • Commercial applications from ISVs have no source
  • ISV: Independent Software Vendor
• Applications becoming bound by the desktop
  • Multi-core pushing out this boundary
  • But ensemble/parameter sweeps increase demand
• Exploit resources provided within the enterprise
  • Growing adoption of HPC into ISV applications
  • How to deal with different schedulers, clusters & operating systems?

Goals
• How can standards enable access to these distributed resources?
• What scenarios need to be supported?
• Deal with realistic network environments
  • Firewalls and NAT’ed networks
• Move from desktop to mobile clients
  • Clients are not always connected to the network
  • Clients will not have files on a shared network
• Embrace different scheduler interfaces

Specification Toolbox
• WS-Addressing
• Security specifications and profiles
• Job Submission Description Language (JSDL)
• JSDL Single Process Multiple Data Application Extension
• JSDL Parameter Sweep Extension
• Basic Execution Service (BES)
• HPCP-Application Extension
• HPC Basic Profile (HPCBP)
• File Staging Extension to the HPC Basic Profile
• ByteIO
• GLUE (Resource Description)
• Distributed Resource Management Application API (DRMAA)
• Resource Namespace Service (RNS)

Core Specifications
• WS-Addressing
  • Encapsulates service in Endpoint Reference (EPR) in XML
  • From a client perspective
• Security specifications and profiles
  • Builds on core standard WS-Security infrastructure
  • WS-Secure Addressing (GFD 131):
    • Profile on WS-Addressing and WS-Security Policy
    • Embeds into EPR how to access a service
  • WS-Secure Communication (GFD 132):
    • Profile on mechanisms to enable easier interoperability

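As a rough illustration of what an Endpoint Reference looks like from the client side, the sketch below parses a minimal WS-Addressing EPR and extracts the service address. The namespace is the W3C WS-Addressing 1.0 one; the service URL and the function name `epr_address` are made-up placeholders, and an EPR used with the secure addressing profile would carry further metadata that is not shown here.

```python
import xml.etree.ElementTree as ET

WSA_NS = "http://www.w3.org/2005/08/addressing"  # W3C WS-Addressing 1.0

# Hypothetical EPR: the address is a placeholder, not a real service.
EPR_XML = f"""
<wsa:EndpointReference xmlns:wsa="{WSA_NS}">
  <wsa:Address>https://cluster.example.org/hpcbp</wsa:Address>
</wsa:EndpointReference>
"""


def epr_address(epr_xml: str) -> str:
    """Return the service address held in an EPR document."""
    root = ET.fromstring(epr_xml.strip())
    address = root.find(f"{{{WSA_NS}}}Address")
    if address is None or not address.text:
        raise ValueError("EPR has no wsa:Address element")
    return address.text.strip()


if __name__ == "__main__":
    print(epr_address(EPR_XML))
```
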
Job Submission Description Language
• Core JSDL specification (GFD 56, 136) has many implementations
  • Describes what you want to run (the job)
  • Describes what you need to run it on (the resource)
• Extensions
  • JSDL Single Process Multiple Data (SPMD) Application Extension (GFD 115)
  • JSDL Parameter Sweep Extension
    • Now entering public comment
  • HPCP-Application Extension (GFD 111)
    • Supported in HPC Basic Profile

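To make the "what to run" and "what to run it on" split concrete, here is a minimal sketch that builds a JSDL document with an HPC Profile Application element and a CPU requirement. The namespaces and element names are given as best recalled from GFD 56 and GFD 111 and should be checked against those documents; the job name, executable path and arguments are invented for illustration, and `build_jsdl` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespaces as recalled from the JSDL and HPCP-Application specifications.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
HPCPA_NS = "http://schemas.ggf.org/jsdl/2006/07/jsdl-hpcpa"


def q(namespace: str, tag: str) -> str:
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{namespace}}}{tag}"


def build_jsdl(job_name, executable, args, cpu_count):
    """Return a JSDL job definition as an XML string (illustrative only)."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("hpcpa", HPCPA_NS)

    job_def = ET.Element(q(JSDL_NS, "JobDefinition"))
    desc = ET.SubElement(job_def, q(JSDL_NS, "JobDescription"))

    ident = ET.SubElement(desc, q(JSDL_NS, "JobIdentification"))
    ET.SubElement(ident, q(JSDL_NS, "JobName")).text = job_name

    # The job: what you want to run.
    app = ET.SubElement(desc, q(JSDL_NS, "Application"))
    hpc_app = ET.SubElement(app, q(HPCPA_NS, "HPCProfileApplication"))
    ET.SubElement(hpc_app, q(HPCPA_NS, "Executable")).text = executable
    for arg in args:
        ET.SubElement(hpc_app, q(HPCPA_NS, "Argument")).text = arg

    # The resource: what you need to run it on.
    resources = ET.SubElement(desc, q(JSDL_NS, "Resources"))
    cpus = ET.SubElement(resources, q(JSDL_NS, "TotalCPUCount"))
    ET.SubElement(cpus, q(JSDL_NS, "Exact")).text = str(cpu_count)

    return ET.tostring(job_def, encoding="unicode")


if __name__ == "__main__":
    print(build_jsdl("sweep-1", "/opt/app/solver", ["--steps", "100"], 4))
```
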
Running jobs
• Basic Execution Service (BES) (GFD 108)
  • Provides an interface to submit, monitor & manage an activity
• HPC Basic Profile (HPCBP) (GFD 114)
  • Specialises BES (& other specifications) for HPC
• File Staging Extension to the HPC Basic Profile (GFD 135)
  • Profiles JSDL to allow transfers in & out of a cluster

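Monitoring an activity amounts to tracking its state. The sketch below models a small activity life cycle that a client might keep locally. The five state names follow the basic activity state model as recalled from the BES specification; the transition table is a simplified assumption rather than a transcription of GFD 108, and `advance` and `is_terminal` are hypothetical helpers.

```python
from enum import Enum


class ActivityState(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    FINISHED = "Finished"
    TERMINATED = "Terminated"
    FAILED = "Failed"


# Assumed transitions: a simplification, to be checked against the spec.
TRANSITIONS = {
    ActivityState.PENDING: {
        ActivityState.RUNNING,
        ActivityState.TERMINATED,
        ActivityState.FAILED,
    },
    ActivityState.RUNNING: {
        ActivityState.FINISHED,
        ActivityState.TERMINATED,
        ActivityState.FAILED,
    },
    ActivityState.FINISHED: set(),
    ActivityState.TERMINATED: set(),
    ActivityState.FAILED: set(),
}


def is_terminal(state: ActivityState) -> bool:
    """A state is terminal when no further transition is allowed."""
    return not TRANSITIONS[state]


def advance(current: ActivityState, new: ActivityState) -> ActivityState:
    """Return the new state, rejecting transitions the table does not allow."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new


if __name__ == "__main__":
    state = ActivityState.PENDING
    state = advance(state, ActivityState.RUNNING)
    state = advance(state, ActivityState.FINISHED)
    print(state.value, is_terminal(state))
```
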
Other OGF Specifications
• ByteIO (GFD 87)
  • Access through a web service to a file abstraction
  • Support for random access & streaming patterns
• GLUE (Finishing public comment)
  • Information model for grids
  • Focus on services & resources for compute & storage
  • Expose virtual organization & usage controls
• Distributed Resource Management Application API (DRMAA) (GFD 22)
  • Client API to enable access to DRMs
• Resource Namespace Service (RNS) (GFD 101)
  • Naming of resources in a hierarchical namespace

Application Scenarios
• Run a job
• Run a job through a standard API
• Run a job through a web service
• Run a job through a web service with file staging
• Run a job through a web service and have bi-directional interaction with it

Run a job
• The current ‘state of the art’
  • Application calls an external script
  • Script invokes a local job submission command
• Requires:
  • Installation of local scheduler library
  • Customization of scripts to access local scheduler

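The script-based approach can be sketched as a thin wrapper that picks a scheduler-specific submission command and runs it. The mapping below is an assumption chosen for illustration (`qsub` and `sbatch` are common submission commands, but a given site may use others), and the function names are hypothetical. The need to maintain such a table per site is exactly the customization burden noted above.

```python
import subprocess

# Assumed mapping from scheduler family to its submit command.
SUBMIT_COMMANDS = {"pbs": "qsub", "sge": "qsub", "slurm": "sbatch"}


def build_submit_command(scheduler: str, script_path: str) -> list:
    """Return the argv list used to submit a job script."""
    try:
        command = SUBMIT_COMMANDS[scheduler]
    except KeyError:
        raise ValueError(f"no submit command known for {scheduler!r}") from None
    return [command, script_path]


def submit(scheduler: str, script_path: str) -> str:
    """Run the local submit command and return whatever it prints."""
    result = subprocess.run(
        build_submit_command(scheduler, script_path),
        capture_output=True,
        text=True,
        check=True,  # raise if the scheduler rejects the job
    )
    return result.stdout.strip()
```

Calling `submit("slurm", "run.sh")` only works on a machine where the scheduler's command-line tools are installed, which is the installation requirement listed on the slide.
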
Run a job using a standard API
• Application internally calls a standard API
  • DRMAA specification with multiple implementations
  • Within DRMAA invoke scheduler-specific plug-in
• Requires:
  • Installation of local scheduler library
  • Installation of DRMAA plug-in to call the library
[Diagram: Application -> DRMAA Interface -> DRMAA plug-in -> Scheduler]

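The layering in this scenario (application, standard interface, scheduler-specific plug-in) can be illustrated with a small sketch. This is not the DRMAA API: every class and method name below is hypothetical, and the plug-in is a fake that records submissions instead of calling a scheduler library.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple


class SchedulerPlugin(ABC):
    """Hypothetical plug-in contract hidden behind the standard interface."""

    @abstractmethod
    def submit(self, command: str, args: Sequence[str]) -> str:
        """Submit a job and return a scheduler-specific job identifier."""


class FakeSchedulerPlugin(SchedulerPlugin):
    """Stand-in plug-in: records jobs rather than calling a real scheduler."""

    def __init__(self) -> None:
        self.submitted: List[Tuple[str, Tuple[str, ...]]] = []

    def submit(self, command: str, args: Sequence[str]) -> str:
        self.submitted.append((command, tuple(args)))
        return f"job-{len(self.submitted)}"


class JobSession:
    """The only layer the application sees; it delegates to the plug-in."""

    def __init__(self, plugin: SchedulerPlugin) -> None:
        self._plugin = plugin

    def run_job(self, command: str, args: Sequence[str] = ()) -> str:
        return self._plugin.submit(command, list(args))


if __name__ == "__main__":
    session = JobSession(FakeSchedulerPlugin())
    print(session.run_job("/opt/app/solver", ["--steps", "100"]))
```

Swapping the scheduler then means swapping the plug-in object, while the application code that calls `run_job` stays unchanged.
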
Run a job through a web service
• HPCBP client embedded in the application
  • Formulates the job using JSDL & HPCP Application
• HPCBP service used to access the cluster
  • Available for many schedulers
• Requires:
  • Installation of HPCBP service on the cluster
  • Integration of the HPCBP client in the application
• Note: Cross-platform interoperability demonstrated at SC06 & SC07

Extension: File Staging
• Use the HPCBP File Staging specification
  • Can include ftps, ftp, scp, sftp, and other protocols
• By using an intermediary file store the client can disconnect & reconnect later
  • Client copies the files to the intermediary store
  • Client then submits the job to the cluster
  • The cluster reads/writes files to the intermediary store
  • Once the job is complete the files are retrieved
• Requires:
  • Support of standard file protocols on client, cluster & store

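The staging sequence can be walked through with a self-contained sketch in which local directories stand in for the client, the intermediary store and the cluster. Everything here is an assumption made for illustration: a real deployment would move files with ftp, scp or a similar protocol, and `run_on_cluster` merely upper-cases its input in place of a real job.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(local_file: Path, store: Path) -> Path:
    """Client copies an input file to the intermediary store."""
    return Path(shutil.copy(local_file, store))


def run_on_cluster(store: Path, input_name: str, output_name: str) -> None:
    """Stand-in for the job: read input from the store, write output back."""
    text = (store / input_name).read_text()
    (store / output_name).write_text(text.upper())


def stage_out(store: Path, output_name: str, dest_dir: Path) -> Path:
    """Client retrieves a result file from the intermediary store."""
    return Path(shutil.copy(store / output_name, dest_dir))


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        client, store = Path(tmp, "client"), Path(tmp, "store")
        client.mkdir()
        store.mkdir()
        (client / "input.txt").write_text("hello")

        stage_in(client / "input.txt", store)
        # The client could disconnect here; the job only needs the store.
        run_on_cluster(store, "input.txt", "output.txt")
        # After reconnecting, the client collects the results.
        result = stage_out(store, "output.txt", client)
        return result.read_text()


if __name__ == "__main__":
    print(demo())
```
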
Extension: Run-time interaction
• Use ByteIO in streaming mode to:
  • Send control requests from the client to the application
  • Return data from the application to the client
• ByteIO service needs to be accessible from the client and the cluster’s compute nodes
  • E.g. the cluster’s head nodes
• Requires:
  • ByteIO service and integration into application & client

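The two-way interaction can be pictured as a pair of byte streams held by a service that both the client and the compute nodes can reach. The sketch below uses an in-memory relay as a stand-in; it does not use the ByteIO interface, and the class, channel names and the `report` request are all hypothetical.

```python
from collections import deque


class StreamRelay:
    """In-memory stand-in for a streaming service reachable by both sides."""

    def __init__(self) -> None:
        self._channels = {"control": deque(), "data": deque()}

    def write(self, channel: str, chunk: bytes) -> None:
        self._channels[channel].append(chunk)

    def read(self, channel: str) -> bytes:
        """Return the next chunk, or b"" if nothing is waiting."""
        queue = self._channels[channel]
        return queue.popleft() if queue else b""


def application_step(relay: StreamRelay, iteration: int) -> int:
    """One polling step of the running application on a compute node."""
    request = relay.read("control")
    if request == b"report":
        # Answer the client's control request on the data channel.
        relay.write("data", f"iteration={iteration}".encode())
    return iteration + 1


if __name__ == "__main__":
    relay = StreamRelay()
    relay.write("control", b"report")   # client sends a control request
    application_step(relay, 7)          # application polls and answers
    print(relay.read("data"))           # client reads the returned data
```
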
Summary
• Core set of established interoperable specs.
  • Demonstrated interoperability
  • Commercial and open-source implementations
• ‘Toolbox’ of specifications enables applications
• Allows ISVs to build applications that:
  • Can connect to any ‘standard’ infrastructure
  • Run from mobile clients in network environments

Acknowledgements
• Use Case Workshop participants
  • University of Virginia, March 2008
  • Core scenario: Narfi Stefansson, Mathworks
  • Vigorous discussion over the two days
• Further contribution & feedback at OGF 23
  • And on mailing list and OGSA calls
• Resulting document is now being published
  • Editors: Steven Newhouse & Andrew Grimshaw