iSERVO International Solid Earth Research Virtual Observatory Grid/Web Services and Portals Supporting Earthquake Science. December 15 2004 AGU Fall Meeting San Francisco Geoffrey Fox, Marlon Pierce (Community Grids Lab, Pervasive Technologies Laboratories, Indiana University)
iSERVO: International Solid Earth Research Virtual Observatory
Grid/Web Services and Portals Supporting Earthquake Science
December 15 2004, AGU Fall Meeting, San Francisco
Geoffrey Fox, Marlon Pierce (Community Grids Lab, Pervasive Technologies Laboratories, Indiana University)
John Rundle (UC Davis), Andrea Donnellan, Robert Granat, Greg Lyzenga, Jay Parker (JPL)
Don McLeod (USC), Lisa Grant (UC Irvine)
Grid of Grids: Research Grid and Education Grid
[Diagram labels: Repositories, Federated Databases; Streaming Data, Sensors; Database; Sensor Grid; Database Grid; GIS Grid; Discovery Services; Compute Grid; Computer Farm; SERVOGrid (Research); Research Simulations; Data Filter Services; Customization Services (From Research to Education); Education Grid (Education); Analysis and Visualization Portal; Field Trip Data; ?]
iSERVO in a nutshell
• Designed to link datasets (repositories and real time), computations and earthquake scientists in the ACES (Asia Pacific) cooperation
• Australia, China, Japan, USA
• Exemplified by SERVOGrid in the USA, led by JPL
• Supports simulation and data mining as services
• Adopts conservative WS-I+ Web Service Interoperability standards
• Builds a full “Grid” in library fashion as a Grid of Grids
• GIS (Geographic Information System) Grid built as a set of OGC-compatible Web Services “talking” GML
• iSERVO federates separate Grids in each country/organization/function
• A Grid is “just” a collection of Services, aka distributed programs
• Multi-scale simulations supported by Grid workflow
• Portals based on the NSF Middleware Initiative (NMI) Open Grid Computing Environment (OGCE)
Characteristics of Computing for Solid Earth Science
• Widely distributed datasets in various formats
• GPS, fault data, seismic data sets, InSAR satellite data
• Many available in “state-of-the-art” tar files that can be FTP’d
• Provenance problems: faults have controversial parameters, such as slip rates, that have to be estimated
• Distributed models and expertise
• Lots of codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM)
• The simplest challenge is just making these codes usable by other researchers
• And hooking these codes to data sources
• Some codes also have export or IP restrictions
• Other codes are highly specialized to their deployment environments
• Decomposable problems requiring interoperability for linking full models
• The fidelity of the fault modeling can vary considerably
• Link codes (through data) to support multiple scales
(i)SERVO Web (Grid) Services
• Programs: all applications wrapped as Services using a proxy strategy
• Job submission: supports remote batch and shell invocations
• Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo) and visualization packages (RIVA, GMT)
• File management:
• Uploading, downloading, backend crossloading (i.e. moving files between remote machines)
• Remote copies, renames, etc.
• Job monitoring
• Workflow: Apache Ant-based remote service orchestration
• For coupling related sequences of remote actions, such as RIVA movie generation
• Data services: support remote databases and query construction
• XML data model being adopted for common formats, with translation services to “legacy” formats
• Migrating to Geography Markup Language (GML) descriptions
• Metadata services: for archiving user session information
SERVOGrid Applications
• Codes range from simple “rough estimate” codes to parallel, high performance applications.
• Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
• Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.
• GeoFEST: three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• Virtual California: program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
• RDAHMM: time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Preprocessors, mesh generators: AKIRA suite
• Visualization tools: RIVA, GMT, IDL
SERVOGrid Codes, Relationships
[Diagram labels: Elastic Dislocation Inversion; Viscoelastic FEM; Viscoelastic Layered BEM; Elastic Dislocation; Pattern Recognizers; Fault Model BEM]
This linkage is called Workflow in Grid/Web Service parlance
Role of Workflow
[Diagram labels: Service-1, Service-2, Service-3]
• Programming the Grid: workflow describes the linkage between services
• Because the services are distributed, linkage must be by messages
• Linkage is two-way and carries both control and data
• Applies to multi-scale (complexity) linkage, multi-program linkage, linking visualization to simulation, GIS to simulations, and visualization filters to each other
• The Microsoft-IBM specification BPEL is the currently preferred Web Service XML specification of workflow
• SERVOGrid uses Ant (a well-known XML build tool) to perform workflow, and this works well in our relatively simple cases
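The idea that a workflow is a description of the linkage between services can be sketched in a few lines of Python. This is a minimal illustration, not the Ant-based SERVOGrid implementation: the three functions are hypothetical in-process stand-ins for remote services (mesh generation, simulation, visualization), the message is a plain dictionary, and all names are assumptions made for the example. The dependency table plays the role of control, and the message passed along plays the role of data.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stand-ins for remote services.
# Each one receives a message (dict) and returns an extended message.
def make_mesh(msg):
    return {**msg, "mesh": f"mesh-for-{msg['fault_model']}"}

def run_simulation(msg):
    return {**msg, "displacements": [0.1, 0.2, 0.3]}

def render_movie(msg):
    return {**msg, "movie": f"{len(msg['displacements'])} frames"}

SERVICES = {
    "mesh": make_mesh,
    "simulate": run_simulation,
    "visualize": render_movie,
}

# The workflow itself: each service maps to the services it depends on.
WORKFLOW = {
    "mesh": set(),
    "simulate": {"mesh"},
    "visualize": {"simulate"},
}

def run_workflow(workflow, services, message):
    """Invoke services in dependency order, passing the message along."""
    for name in TopologicalSorter(workflow).static_order():
        message = services[name](message)
    return message

result = run_workflow(WORKFLOW, SERVICES, {"fault_model": "example-fault"})
print(result["movie"])
```

In a real deployment each call would be a message sent to a remote Web Service rather than a local function call; only the ordering logic would stay the same.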
Applications and Observational Data
• Several SERVO codes work directly with observational data.
• Scenarios include
• GeoFEST, VirtualCalifornia, Simplex, and Disloc all depend upon fault models.
• RDAHMM and Pattern Informatics codes use seismic catalogs.
• RDAHMM is primarily used with GPS data.
• Problem: we need to provide a way to integrate these codes with the online data repositories.
• QuakeTables Fault Database
• Existing GPS and earthquake catalogs
• Solution: use databases to store catalog data; use XML (GML) as the exchange data format; use OGC and WS-I+ compatible Web Services for data exchanges, invoking queries, and filtering data.
• Use Web Feature Service and Web Map Service from OGC
• Use UDDI (Discovery), WS-DAI (Database), WS-Context (Dynamic metadata) from WS-I+
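As an illustration of what invoking a query on an OGC Web Feature Service can look like, here is a minimal sketch that builds a GetFeature request in the key-value-pair form defined by the WFS specification. The endpoint `http://example.org/wfs`, the feature type name `fault`, and the bounding box are hypothetical placeholders, not SERVOGrid values, and no network call is made. A real service would answer such a request with GML.

```python
from urllib.parse import urlencode

def build_getfeature_url(endpoint, type_name, bbox):
    """Build an OGC WFS GetFeature request URL in key-value-pair form.

    bbox is (minx, miny, maxx, maxy) in the coordinate system of the layer.
    """
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
        "BBOX": ",".join(str(v) for v in bbox),
    }
    return f"{endpoint}?{urlencode(params)}"

# Hypothetical endpoint and feature type, for illustration only.
url = build_getfeature_url(
    "http://example.org/wfs", "fault", (-120.0, 32.0, -114.0, 36.0)
)
print(url)
```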
SERVOGrid and Semantic Grid
• SERVOGrid has many types of metadata
• We are designing RDFS descriptions for the following components:
• Simulation codes, mesh generators, etc.
• Visualization tools
• Data types
• Computing resources
• …
• These are easily expressed as RDFS (actually DAML) “nuggets” of information.
• Create instances of these
• Use properties to link instances.
Some Sample Relationships
[Diagram. Node labels: Danube, Computer, GMT, Viz Appl, Disloc, Application, Stress Map, DataFormat, Fault Data, DataType, USC Fault DB, Storage. Property labels: installedOn (twice), visualizedBy, createsOutput, usesInput, storedIn.]
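The approach of creating instances and linking them with properties can be sketched as a small set of subject-property-object triples. The instance and property names come from the sample-relationships slide, but which instance is linked to which is an assumption made for illustration, since the original diagram's edges are not recoverable here. The sketch uses plain Python tuples rather than an RDF library.

```python
# Each relationship is a (subject, property, object) triple.
# NOTE: the specific pairings below are assumed for illustration.
TRIPLES = [
    ("Disloc", "type", "Application"),
    ("GMT", "type", "VizAppl"),
    ("Danube", "type", "Computer"),
    ("Disloc", "installedOn", "Danube"),
    ("GMT", "installedOn", "Danube"),
    ("Disloc", "visualizedBy", "GMT"),
    ("Disloc", "usesInput", "FaultData"),
    ("Disloc", "createsOutput", "StressMap"),
    ("FaultData", "storedIn", "USCFaultDB"),
]

def objects(subject, prop):
    """Return everything that `subject` is linked to through `prop`."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]

def subjects(prop, obj):
    """Return everything linked to `obj` through `prop`."""
    return [s for s, p, o in TRIPLES if p == prop and o == obj]

# Which codes are installed on the machine called Danube?
print(subjects("installedOn", "Danube"))
# Which tool can visualize Disloc output?
print(objects("Disloc", "visualizedBy"))
```

Queries of this kind are what make the metadata useful: a portal could, for example, look up which visualization tool applies to the output of a given code.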
Expanding to iSERVO Strategy
• Agree on what (types of) resources and capabilities need to be put on the iSERVO Grid
• Computers, instruments, databases, visualization, maps, job submittal …
• Agree on interfaces to resources, from OGSA-DAI (databases) to particular data structures (GML/OpenGIS), and specify them in XML
• Implement resources and capabilities as Services
• The user interface should be a portlet that can be integrated by the portal into the web interface
• Make certain that overarching Grid capabilities such as workflow, federation and metadata are sufficient
• SERVOGrid is a prototype of this strategy, using several US sites rather than several countries
• Can be naturally extended to iSERVO, education, and emergency response by extending resources
• WS-I+ Web Service Architecture ensures continued interoperability and extensibility
Grid Syntax Controversies
• There are several proposals for the Web Service extensions needed for Grids: OGSI (GT3), WSRF (GT4), WS-GAF (Newcastle)
• We adopt a wait-and-see philosophy
• We use the WS-I+ pure Web Services approach, which adopts a minimum set of ~7 Web Service specifications chosen from the 60 or so proposed in the last few years
• Those adopted by the industry-wide WS-I Web Service Interoperability group
• Those declared by IBM and Microsoft
• Any extras that are absolutely essential
• This approach has been adopted by the next phase of the UK e-Science Program
Performance and Streaming
[Diagram labels: WS-1, WS-2]
• Web Services are meant to exchange messages using SOAP, which is very interoperable but very slow
• Drastically reduces effective bandwidth
• Most real programs exchange data by reading and writing binary files
• Increases latency
• All control messages should use classic SOAP
• All data messages use optimal binary
• Respect the “SOAP Infoset” (header and body of message)
• Use streaming rather than file-based infrastructure to give better latency and the same technology for files and streaming sensors
• Similar to using UNIX pipes rather than files directly
• http://www.naradabrokering.org
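One reason text-based messages reduce effective bandwidth for numerical data can be illustrated by encoding the same array of floating-point values two ways. This is a simplified sketch, not a measurement of SOAP or of NaradaBrokering: the XML element names and the binary layout (a 4-byte count followed by little-endian doubles) are assumptions chosen for the example.

```python
import struct

def encode_as_xml_text(values):
    """Encode floats as text inside XML elements (hypothetical element names)."""
    items = "".join(f"<v>{v!r}</v>" for v in values)
    return f"<data>{items}</data>".encode("utf-8")

def encode_as_binary(values):
    """Encode floats as a 4-byte count followed by little-endian doubles."""
    return struct.pack(f"<I{len(values)}d", len(values), *values)

def decode_binary(payload):
    """Inverse of encode_as_binary."""
    (count,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from(f"<{count}d", payload, 4))

values = [i / 7.0 for i in range(1000)]
xml_bytes = encode_as_xml_text(values)
bin_bytes = encode_as_binary(values)

print("text payload bytes:  ", len(xml_bytes))
print("binary payload bytes:", len(bin_bytes))
```

The binary payload here is a fixed 8 bytes per value plus a 4-byte header, while the text form also has to be parsed back into numbers on receipt. Keeping control messages in SOAP and moving bulk data in binary aims to preserve interoperability where it matters while avoiding this overhead for large arrays.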
SERVOGrid Web Portal
[Diagram layers: Aggregate Portals; Portlet User Interface Components; Application Web Services and Workflow; Core Web Services]
• Package every Web Service with its own user interface as a document fragment
• Portlets are the underlying technology
• OGCE (Open Grid Computing Environment) is developing lots of useful portlets
• Computing
• GIS
• Access Grid etc.
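The idea of packaging each service with its own user interface fragment, which a portal then aggregates into one page, can be sketched as follows. Real portlets in this setting are Java components running in a portlet container; this Python version is only a language-neutral illustration, and the class name, titles and markup are all hypothetical.

```python
from html import escape

class Portlet:
    """A user interface fragment belonging to one service (hypothetical)."""

    def __init__(self, title, body):
        self.title = title
        self.body = body

    def render(self):
        # Return a document fragment, not a complete page.
        return (
            '<div class="portlet">'
            f"<h2>{escape(self.title)}</h2>"
            f"<p>{escape(self.body)}</p>"
            "</div>"
        )

def aggregate(portlets):
    """Combine the fragments of several portlets into a single page."""
    fragments = "\n".join(p.render() for p in portlets)
    return f"<html><body>\n{fragments}\n</body></html>"

page = aggregate([
    Portlet("Job Submission", "Submit a simulation run"),
    Portlet("File Management", "Upload & download files"),
])
print(page)
```

Because each fragment is self-contained, a service and its interface can be added to or removed from the portal without touching the others.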
Portal Architecture
[Diagram labels: Clients (Pure HTML, Java Applet ..); Aggregation and Rendering; Portal Internal Services; Local Portlets; Remote or Proxy Portlets; Portlet Class: WebForm; Portlet Class; GridPort etc.; (Java) COG Kit; SERVOGrid (IU); Web/Grid service: Computing; Web/Grid service: Data Stores; Web/Grid service: Instruments. Hierarchical arrangement: Clients, Portal, Portlets, Libraries, Services, Resources.]
Each Service has its own portlet
[Screenshot annotations: Individual portlet for the Proxy Manager. Use tabs or choose different portlets to navigate through interfaces to different services. 2 Other Portlets.]