Service-Oriented Science: Scaling Science Services
Ian Foster
Argonne National Laboratory, University of Chicago, Univa Corporation
APAC Conference, September 28, 2005; iGrid Workshop, September 27, 2005
Two Questions • How do we scale the number of scientists benefiting from computational techniques? • What should be the role of infrastructure providers in enabling this scaling?
Computational Science
Computation joins theory & experiment as a third mode of scientific enquiry
• Increasingly sophisticated computational approaches
• Monolithic programs, databases
• Inflexible & hard to evolve
• Mismatch with reality of diverse & distributed teams, resources, & approaches
[Figure labels: Program & data; PC or supercomputer; GenBank]
Decompose over the Network
• Clients can then integrate dynamically
• Select & compose services
• Select “best of breed” providers
• Publish result as a new service
• Need not know implementation details
• Note: complements, not replaces, HPC
[Figure label: Service-Oriented Architecture]
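The select, compose, and publish pattern can be sketched in a few lines of Python. This is a minimal illustration only: the `compose` helper and the two stand-in services (`fetch_sequences`, `annotate`) are hypothetical names, and real science services would sit behind network interfaces rather than local functions.

```python
from typing import Callable, List

# A "service" here is any callable from a list of records to a list of records.
Service = Callable[[List[str]], List[str]]


def compose(*services: Service) -> Service:
    """Return a new service that runs the given services in order."""
    def composed(records: List[str]) -> List[str]:
        for service in services:
            records = service(records)
        return records
    return composed


def fetch_sequences(ids: List[str]) -> List[str]:
    # Hypothetical data service: a real client would call a remote archive.
    return [f"sequence-for-{i}" for i in ids]


def annotate(sequences: List[str]) -> List[str]:
    # Hypothetical analysis service: tags each record it receives.
    return [f"{s} [annotated]" for s in sequences]


# The composition is itself a service, so it can be published as one.
annotated_fetch = compose(fetch_sequences, annotate)

if __name__ == "__main__":
    print(annotated_fetch(["p1", "p2"]))
```

The client only depends on the call signature of each service, which mirrors the point that it need not know implementation details.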
For Example: Virtual Observatories
[Figure labels: Sloan vs. 2MASS brown dwarf candidates; Surveys; Observatories; Missions; Survey and Mission Archives; Digital libraries; Numerical Sim’s]
Having Decomposed, Integrate
• For example: registries, value-added services, workflows
• Issues: description, discovery, composition, adaptation & evolution, qualities of service (security, performance, reliability, …)
[Figure labels: Users; Discovery tools; Analysis tools; Data Archives]
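Description and discovery through a registry might look like the following Python sketch. Everything here is an assumption made for illustration: the `ServiceDescription` fields, the keyword-matching rule, and the `example.org` endpoints are placeholders, not part of any real registry standard.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class ServiceDescription:
    name: str
    endpoint: str              # placeholder URL, not a real service
    keywords: FrozenSet[str]   # hypothetical description vocabulary


class Registry:
    """In-memory registry: providers publish, clients discover."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._entries[description.name] = description

    def discover(self, *keywords: str) -> List[ServiceDescription]:
        # Return services whose description covers every requested keyword.
        wanted = set(keywords)
        matches = [d for d in self._entries.values() if wanted <= d.keywords]
        return sorted(matches, key=lambda d: d.name)


registry = Registry()
registry.publish(ServiceDescription(
    "sky-survey", "https://example.org/survey",
    frozenset({"catalog", "astronomy"})))
registry.publish(ServiceDescription(
    "cross-match", "https://example.org/match",
    frozenset({"analysis", "astronomy"})))

if __name__ == "__main__":
    for description in registry.discover("astronomy"):
        print(description.name, description.endpoint)
```

A production registry would also need to address the other issues on this slide, such as evolution of descriptions and quality-of-service metadata, which this sketch leaves out.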
Example Value Added Service: PUMA
PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
Analysis on Grid: involves millions of BLAST, BLOCKS, and other processes
Natalia Maltsev et al.
Operating Services, or SOA = Silo-Oriented Architecture?
• What about dynamic behaviors? Time-varying load; dynamically instantiated services
• What about operating costs? Software deployment & maintenance; security & other concerns
[Figure: plot over weeks 6 to 8]
We Need to Decompose in Two Dimensions
[Figure labels: Horizontal; Vertical]
Decomposition Enables On-Demand Provisioning
• Aggregate resources
• Deliver to services
• Separate production & consumption
• Issues: discovery, composition, qualities of service
[Figure labels: IPC Server 2; IPC Dispatcher; Globus; Provision New Worker Process]
SAP GlobusWorld Demo (IPC = Internet Pricing Configurator)
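The idea of a dispatcher that provisions worker processes from an aggregated resource pool as load changes can be sketched as follows. This is not the SAP or Globus implementation: the `Dispatcher` class, its sizing rule (one worker per `per_worker_capacity` pending requests), and all numbers are assumptions chosen for illustration.

```python
import math


class Dispatcher:
    """Toy dispatcher that grows and shrinks a worker set on demand."""

    def __init__(self, pool_size: int, per_worker_capacity: int) -> None:
        self.free_pool = pool_size            # resources not yet delivered to the service
        self.workers = 0                      # workers currently serving requests
        self.per_worker_capacity = per_worker_capacity

    def rebalance(self, pending_requests: int) -> int:
        # Assumed sizing rule: keep at least one worker, add one per capacity unit.
        target = max(1, math.ceil(pending_requests / self.per_worker_capacity))
        delta = target - self.workers
        if delta > 0:
            # Provision new workers, limited by what the pool can supply.
            granted = min(delta, self.free_pool)
            self.free_pool -= granted
            self.workers += granted
        elif delta < 0:
            # Release idle workers back to the shared pool.
            self.free_pool += -delta
            self.workers += delta
        return self.workers


if __name__ == "__main__":
    dispatcher = Dispatcher(pool_size=4, per_worker_capacity=10)
    for load in (25, 100, 5):
        print(load, dispatcher.rebalance(load), dispatcher.free_pool)
```

Keeping the pool separate from the dispatcher reflects the separation of resource production from consumption: the service only asks for capacity, and the pool decides what it can grant.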
The Globus-Based LIGO Data Grid
LIGO Gravitational Wave Observatory
• Replicating >1 Terabyte/day to 8 sites
• >30 million replicas so far
• MTBF = 1 month
[Map labels: Cardiff; AEI/Golm; Birmingham]
www.globus.org/solutions
Decomposition Enables Separation of Concerns & Roles
User: data D at sites S1, S2, S3
“Provide access to data D at S1, S2, S3 with performance P”
Service Provider: replica catalog, user-level multicast, …
“Provide storage with performance P1, network with P2, …”
Resource Provider: sites S1, S2, S3 holding D
Scaling Up
“Sometimes through heroism you can make something work. However, understanding why it worked, abstracting it, making it a primitive is the key to getting to the next order of magnitude of scale.” Robert Calderbank
We want to scale the number, robustness, & performance of services
Identifying Primitives: (1) Taking Services Seriously
• Model the world as a collection of services: computations, computers, instruments, storage, data, communities, agreements, …
• Focus on what these things have in common
• E.g., lifecycle management: negotiation, deployment/creation, modeling, monitoring, management, termination
• E.g., security: authentication, authorization, audit, …
Web Services-based Grid infrastructure
I. Foster, S. Tuecke, Describing the Elephant: The Many Faces of IT as Service, ACM Queue, 2005
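Treating lifecycle management as a shared primitive means that very different things (a computation, a storage allocation, an agreement) expose the same small set of operations. The Python sketch below illustrates that idea under stated assumptions: the `ManagedService` class, its three states, and the lifetime-based expiry rule are hypothetical and do not reproduce any Web services specification or Globus API.

```python
from enum import Enum


class State(Enum):
    NEGOTIATED = "negotiated"
    ACTIVE = "active"
    TERMINATED = "terminated"


class ManagedService:
    """Common lifecycle wrapper, independent of what the service does."""

    def __init__(self, name: str, lifetime_s: float, now: float = 0.0) -> None:
        self.name = name
        self.state = State.NEGOTIATED
        # Assumed rule: a service expires once its negotiated lifetime passes.
        self.expires_at = now + lifetime_s

    def deploy(self) -> None:
        if self.state is not State.NEGOTIATED:
            raise RuntimeError(f"cannot deploy from state {self.state.value}")
        self.state = State.ACTIVE

    def monitor(self, now: float) -> State:
        # Monitoring also enforces expiry of the negotiated lifetime.
        if self.state is State.ACTIVE and now >= self.expires_at:
            self.state = State.TERMINATED
        return self.state

    def terminate(self) -> None:
        self.state = State.TERMINATED


if __name__ == "__main__":
    job = ManagedService("analysis-job", lifetime_s=60.0)
    job.deploy()
    print(job.monitor(now=30.0).value)
    print(job.monitor(now=60.0).value)
```

Because callers pass the current time explicitly, the same wrapper can be driven by a real clock or by a simulated one in tests.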
Identifying Primitives: (2) Interface Specifications
Applications of the framework (compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …)
WS-Agreement (agreement negotiation)
WS Distributed Management (lifecycle, monitoring, …)
WS-Resource Framework & WS-Notification* (resource identity, lifetime, inspection, subscription, …)
Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)
*WS-Transfer, WS-Enumeration, WS-Eventing, WS-Management define similar functions
Foster, Czajkowski, Frey, et al., From OGSI to WSRF, Proc. IEEE, 93(3), 604-612, 2005
Identifying Primitives: (3) Open Source Implementation
www.globus.org
[Figure labels, components: Data Replication; Credential Mgmt; Replica Location; Grid Telecontrol Protocol; Delegation; Data Access & Integration; Community Scheduling Framework; WebMDS; Python Runtime; Reliable File Transfer; Community Authorization; Workspace Management; Trigger; C Runtime; Authentication Authorization; GridFTP; Grid Resource Allocation & Management; Index; Java Runtime]
[Figure labels, categories: Security; Data Mgmt; Execution Mgmt; Info Services; Common Runtime]
I. Foster, Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005
Open Science Grid
• 50 sites (15,000 CPUs) & growing
• 400 to >1000 concurrent jobs
• Many applications + CS experiments; includes long-running production operations
• Up since October 2003; few FTEs central ops
[Figure: Jobs (2004)]
www.opensciencegrid.org
Virtual OSG Clusters
[Figure labels: OSG cluster; Xen hypervisors; TeraGrid cluster; OSG]
Dynamic Service Deployment
[Figure labels: Community A; Community Z; …]
• Requirements:
• Community control
• Persistence
• Resource guarantees
• Non-interference
• Community scheduling logic
• Data distribution
• Community management
• Science services
• ...
Summary
• How do we scale the number of scientists benefiting from computational techniques?
  Construct powerful science services
  Simplify construction by decomposing roles: content, function, resource
• What should be the role of infrastructure providers in enabling this scaling?
  Service providers for communities wanting to deliver content
  Resource providers for service providers wanting to deliver services
Service-Oriented Science: Scaling by Separating Concerns
[Figure labels: Domain-dependent; Domain-independent; Enabled by; Hosted by]
[Figure labels: Simulation code; Expt design; Content; Simulation code; Expt output; Certificate authority; Electronic notebook; Telepresence monitor; Simulation server; Function; Portal server; Data archive; Metadata catalog; Resources; Servers, storage, networks; Experimental apparatus]
I. Foster, Service-Oriented Science, Science, 308, May 6, 2005
Acknowledgments
• NSF, DOE, NASA, IBM for financial support
• Numerous fine colleagues at Argonne, U.Chicago, USC/ISI, and elsewhere
• In particular: Steve Tuecke, Kate Keahey, Carl Kesselman, Bill Allcock, Ann Chervenak, Ewa Deelman, Jennifer Schopf, Mike Wilde
For More Information
• Globus Alliance: www.globus.org
• Papers: www.mcs.anl.gov/~foster
For those at APAC: Globus Toolkit Tutorial (Thursday, Friday)
For those at iGrid: Carl Kesselman’s Master Class (Thursday)