GML Data Models and Web Services for GPS and Earthquake Catalogs Marlon Pierce, Galip Aydin Community Grids Lab, Indiana University mpierce@cs.indiana.edu
QuakeSim Applications • Several QuakeSim codes work directly with observational data. • Examples discussed at ACES include • GeoFEST, VirtualCalifornia, Simplex, and Disloc all depend upon fault models. • RDAHMM and Pattern Informatics codes use seismic catalogs. • RDAHMM is primarily used with GPS data • Problem: We need to provide a way to integrate these codes with the online data repositories. • QuakeTables Fault Database was developed • What about GPS and Earthquake Catalogs? • Many formats, data available in tars or files, not searchable, not easy to integrate with applications • Solution: use databases to store catalog data; use XML (GML) as the data exchange format; use Web Services for data exchange, invoking queries, and filtering data.
What Are Web Services? • Web Services are not web pages, CGI, or Servlets • The Web Services framework is a way of doing distributed computing with XML. • WSDL: Defines interfaces to functions of remote components. • SOAP: Defines the message format that you exchange between components. • XML provides cross-language support • Suitable for both human and application clients
[Figure: diagram with components Browser, Appl, two Web Servers, and DB, connected via WSDL, SOAP, and JDBC.]
Geographical Information Service (GIS) Data Formats and Services • OpenGIS Consortium (OGC) is an international group that defines GIS data formats and services. • Main data format language is the XML-based GML. • Subdivided into schemas for drawing maps, representing features, observations, … • First Step: design GML schemas and build specialized Web Services for GPS and Earthquake data. • OGC also defines services. • Services include Web Feature Services, Web Map Services, and similar. • These are currently pre-Web Service, based on HTTP Post, but they are being revised to comply with WS standards. • Next Step: Implement OGC-compatible Web Services for this problem. • Also build services to interact with QuakeTables Fault DB.
GML and Existing Data Formats • GPS and seismic data used in this project are retrieved from different URLs and have different text formats. • Seismic data formats • SCSN, SCEDC, Dinger-Shearer, Hauksson • GPS data formats • JPL, SOPAC, USGS • We defined two GML schemas to unify these • http://grids.ucs.indiana.edu/~gaydin/servo • A summary of all supported formats and data sources can also be found there.
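A minimal sketch of the unification idea: one text record is read and re-expressed as XML with a GML point. The column order, the SeismicEvent element, and its child names are hypothetical and are not taken from the project's schemas or from any of the catalog formats listed above; only the GML namespace and the GML 2 style gml:Point/gml:coordinates elements are standard. The sample values are illustrative.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"  # standard GML namespace URI
ET.register_namespace("gml", GML_NS)


def record_to_xml(line):
    """Convert one whitespace-separated catalog record to an XML element.

    Assumed (hypothetical) column order:
    date time latitude longitude depth_km magnitude
    """
    date, time, lat, lon, depth, mag = line.split()
    event = ET.Element("SeismicEvent")  # hypothetical element name
    ET.SubElement(event, "time").text = f"{date}T{time}"
    ET.SubElement(event, "magnitude").text = mag
    ET.SubElement(event, "depthKm").text = depth
    point = ET.SubElement(event, f"{{{GML_NS}}}Point")
    # GML 2 style coordinate string: "x,y", here longitude,latitude
    ET.SubElement(point, f"{{{GML_NS}}}coordinates").text = f"{lon},{lat}"
    return event


# Illustrative record, not taken from a real catalog.
sample = "2004/01/01 12:00:00 34.00 -118.00 10.0 3.5"
xml_text = ET.tostring(record_to_xml(sample), encoding="unicode")
print(xml_text)
```

Each source format would need its own small parser like record_to_xml; everything downstream then works against a single XML representation.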
So We Built It • First version of the system is available • Tried XML databases but performance was awful • The database currently uses MySQL • Download results are in GML, but we can convert to appropriate text formats.
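The conversion from a GML download back to a flat text format might look like the sketch below. The Catalog and SeismicEvent elements and the output column order are assumptions made for illustration; the project's actual schemas and target text formats are not reproduced here.

```python
import xml.etree.ElementTree as ET

NS = {"gml": "http://www.opengis.net/gml"}

# Hypothetical download result; element names other than gml:* are made up.
DOC = """<Catalog xmlns:gml="http://www.opengis.net/gml">
  <SeismicEvent>
    <time>2004-01-01T12:00:00</time>
    <magnitude>3.5</magnitude>
    <gml:Point><gml:coordinates>-118.00,34.00</gml:coordinates></gml:Point>
  </SeismicEvent>
  <SeismicEvent>
    <time>2004-01-02T08:30:00</time>
    <magnitude>2.1</magnitude>
    <gml:Point><gml:coordinates>-117.50,33.75</gml:coordinates></gml:Point>
  </SeismicEvent>
</Catalog>"""


def to_text_rows(xml_text):
    """Return one 'time lat lon magnitude' string per event."""
    rows = []
    for ev in ET.fromstring(xml_text).findall("SeismicEvent"):
        # The coordinate string is "lon,lat"; swap to lat, lon for the text row.
        lon, lat = ev.find("gml:Point/gml:coordinates", NS).text.split(",")
        rows.append(f"{ev.findtext('time')} {lat} {lon} {ev.findtext('magnitude')}")
    return rows


rows = to_text_rows(DOC)
print("\n".join(rows))
```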
Use Ours or Set Up Your Own • URL to access our browser interface: • http://gf3.ucs.indiana.edu:6060/cce/sql/ • URL to download and set up your own • http://complexity.ucs.indiana.edu/~gaydin/cce/install/install.html
Fault Quest: QuakeTables+OGC Web Map Service Demo http://rio.ucs.indiana.edu:8080/wmsClient/
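For readers unfamiliar with Web Map Services, the sketch below assembles a GetMap request using the key-value parameters defined by the OGC WMS 1.1.1 specification. The endpoint http://example.org/wms and the layer name "faults" are placeholders; whether the demo server above accepts this request form, and which layers it publishes, are assumptions.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layers, bbox, width, height):
    """Build a WMS 1.1.1 GetMap URL for a lon/lat bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "SRS": "EPSG:4326",  # plain longitude/latitude
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and layer; the bounding box roughly covers California.
url = getmap_url(
    "http://example.org/wms", ["faults"], (-125.0, 32.0, -114.0, 42.0), 600, 400
)
print(url)
```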
Conclusions • This is a little discussion with a big conclusion: • If you want to build iSERVO or something like it, data access services are an important foundation.
GML Schemas as Data Models for Services • Fault and GPS Schemas are based on GML-Feature object. • Seismicity Schema is based on GML-Observation object. • Working schema available from http://grids.ucs.indiana.edu/~gaydin/schemas/
[Figure: architecture diagram. Labels: Browser Interface; JSP + Client Stubs; DB Service 1; Job Sub/Mon and File Services; Viz Service; JDBC; DB; Operating and Queuing Systems; RIVA; Host 1, Host 2, Host 3.]
Other Issues • We want to abstract the data storage system to allow simple federation of relational and XML databases • UK e-Science’s OGSA-DAI project is an interesting but complicated example. • We’d like to simplify this approach • Metadata is also important • Useful for capturing data pedigree and validation. • “This fault data generated with Simplex by Jay Parker using the parameters….” • “Those 1935 Fault measurements aren’t so good.” • We have developed some general applications for metadata management • Newsgroups, citations, references, glossaries as examples. • We would like to apply these to scientific metadata
Future Directions • We are interested in Semantic Web markups (particularly RDFS) to provide metadata descriptions of • Instruments • Data sets • Computing hardware • Applications/codes • We want this to form the basis for building composite services. • Infrastructure improvements: reliable, fault-tolerant grid infrastructure needed as grid components come and go. • Component-based portals: reuse portal interfaces between projects. • iSERVO: International collaborations with Australia, Japan, and possibly other countries • Through ACES: APEC Cooperation for Earthquake Simulation
Acknowledgements • Community Grids: Geoffrey Fox, Choonhan Youn, Galip Aydin, Mehmet Aktas • NASA JPL: Andrea Donnellan (PI), Jay Parker, Peggy Li, Robert Granat • UC-Davis: John Rundle • UC-Irvine: Lisa Grant • USC: Dennis McLeod • Brown: Terry Tullis
Problems: Data Access and Sharing, Code Integration • Codes all use custom text formats for describing input and output. • Input and output data often combined with code-specific information. • Number of iterations, array sizes, etc. • Data files often created by hand from journals, online repositories • Online repositories themselves use differing formats • Challenges are to develop common data formats, access services, and client query tools.
Web Services for Data Access and Computing Service Invocation • Web services: • WSDL: Interface definition language, describes your service • “GeoFEST may be invoked with these input types” • SOAP: Transport envelope for remote procedure calls/messages • “Invoke GeoFEST with this set of input” • Together, WSDL and SOAP are useful for manipulating and returning XML data values • So GML schemas act as our data models and return values • Status: built several general purpose services • Remotely executing codes, monitoring queuing systems, manipulating/moving files around, describing applications, storing portal session values, accessing databases of faults,… • Work underway to build data services
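To make the "Invoke GeoFEST with this set of input" idea concrete, here is a rough sketch of a SOAP 1.1 request message built with Python's standard library. Only the SOAP envelope namespace is standard; the service namespace urn:example:quakesim, the operation name runGeoFEST, and the parameters inputFile and iterations are invented for illustration and do not describe the project's actual WSDL interfaces.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quakesim"  # hypothetical service namespace
ET.register_namespace("soap", SOAP_NS)


def build_request(operation, args):
    """Wrap an operation call and its arguments in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in args.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter names.
message = build_request(
    "runGeoFEST", {"inputFile": "fault_model.xml", "iterations": 100}
)
print(message)
```

In practice a SOAP toolkit generates such messages from the WSDL, so client code rarely builds envelopes by hand; the sketch only shows what travels over the wire.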
QuakeSim Basics • Under development in collaboration with researchers at JPL, UC-Davis, USC, and Brown University. • Geoscientists develop simulation codes, analysis and visualization tools. • We need a way to bind distributed codes, tools, and data sets. • We need a way to deliver it to a larger audience • Instead of downloading and installing the code, use it as a remote service.
What’s the Problem? • Data sources typically were provided in single downloads • Tar bundles or text • This has changed for SCEC catalogs since we developed this project. • SCIGN is adopting a Web Services approach for GPS data. • Formats defined but presented as text • Use XML to re-format the data. • Lets us take advantage of many existing XML manipulation, validation, and messaging tools. • We wanted to use databases to store and manage the information. • This makes the data queryable • Retrieve all entries after 1970 • Retrieve all entries with M>3.0
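The two example queries can be sketched as follows. Python's built-in sqlite3 module stands in for the project's database so that the example runs standalone; the events table, its columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout and illustrative rows, not real catalog data.
conn.execute(
    "CREATE TABLE events (year INTEGER, magnitude REAL, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1965, 4.2, 34.0, -118.0),
        (1985, 2.5, 33.5, -117.0),
        (1999, 5.1, 34.6, -116.3),
    ],
)

# "Retrieve all entries after 1970"
after_1970 = conn.execute(
    "SELECT year, magnitude FROM events WHERE year > ? ORDER BY year", (1970,)
).fetchall()

# "Retrieve all entries with M > 3.0"
above_m3 = conn.execute(
    "SELECT year, magnitude FROM events WHERE magnitude > ? ORDER BY year", (3.0,)
).fetchall()

print(after_1970)
print(above_m3)
```

A data service would expose filters like these as operation parameters and return the matching rows as GML rather than as database tuples.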
Data Sources Summary • A summary of all supported formats can be found here • http://grids.ucs.indiana.edu/~gaydin/servo • Information about supported Earthquake catalog formats can be found at http://www.data.scec.org/ • Information about supported GPS data formats can be found at http://www.scign.org
Delivering Data for Human and Application Consumption • We still have to get the results to the (remote) client. • The client may be a user or an application. • Web Services provide a way to do this. • Note Web Services are NOT • Web pages • Servlets • CGI scripts