Community Earth Science Informatics Initiatives & Their Impacts. Lee Allison, Arizona Geological Survey Association of American State Geologists. 200 million+ websites – if you don’t have a website, you don’t exist. Prediction.
Community Earth Science Informatics Initiatives & Their Impacts Lee Allison, Arizona Geological Survey Association of American State Geologists
200 million+ websites – if you don’t have a website, you don’t exist. Prediction: In 5-10 years, if your data are not online in an integrated, interoperable network, you won’t exist.
1000’s of National and Regional Databases • topographic, orthoimagery, hydrography • mineral resources • water • geochemistry • geophysics (aeromag, gravity, aerorad) • earthquake catalogs • biological surveys • vegetation/speciation maps Tower of Babel
Conclusions: Growing Consensus for a National Geoinformatics System (NGS) • Goals – interoperable, distributed, Web-service based, synoptic 4-D system • Challenges • Technical – adapting and adopting existing capabilities • Cultural and organizational – controls, recognition • How do we get there? • Agreement on standards, protocols, architecture • Geological Surveys as data archives, providers • Parallel community efforts are linking • Implementation is underway • Sustainability is an issue
Most of the technology exists. Challenges are cultural and organizational.
With apologies to J.R.R. Tolkien: One system to rule them all, one system to find them, one system to bring them all, and in the darkness bind them.
How do we get there? NSF to the Solid Earth Sciences: how do you build a sustainable community system? - 2-year community engagement process underway
Earth science cyberinfrastructure • Early paradigm: central databases for each topic • Distributed, Web-based, Interoperable
Goal is making data interoperable (Ian Jackson, BGS)
Interoperability: "The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units." (ISO/IEC 2382-01; SC36 Secretariat, 2003)
Example: the electrical utility • Simple interface – put plug in wall, get electricity • Afghanistan 220 V 50 Hz; Andorra 230 V 50 Hz; Anguilla 110 V 60 Hz; Antigua 230 V* 60 Hz; Cayman Islands 120 V 60 Hz; Cyprus 240 V 50 Hz; Czech Republic 230 V 50 Hz; ……
National Geoinformatics System • “Killer applications” • Use cases & best practices in meeting stakeholder needs • Data discovery, catalogs, inventories, metadata profiles, metadata aggregation service(s) – 4D search engines, Informatics specifications, data model, interoperability, & standards • Web portal & Registry development and implementation • Accessing & licensing protocols, recognition & credit • Community of practice • Communication, dissemination, & awareness • Ontologies, vocabularies • Access to high-resolution spatial geological & applied datasets • “Big Iron” – high performance computing • Digitization of legacy data • Liaison and integration with related groups & initiatives • Sustainability
Computer printer services • Old days: each application has a driver for each printer (Word Processor: HP Driver1, Calcomp Driver1, Brothers Driver1; Spreadsheet: HP Driver2, Calcomp Driver2, Brothers Driver2; devices: HP printer, Calcomp plotter, Brothers printer) • Now: printing service uses Metafile = interchange format; each application (Word Processor, Spreadsheet) has a wrapper, and each device (Laserwriter, large format inkjet, film writer) has a Metafile interpreter and printer driver • Advantages • one driver (wrapper) per application • Application need know nothing about printer – separation of concerns
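
The printer analogy can be sketched in a few lines of Python. This is only an illustration of the idea on the slide, not any real printing system: the "metafile" is a plain list of drawing commands, and every function name below is hypothetical.

```python
from typing import List, Tuple

# Hypothetical interchange format: a list of (operation, argument) pairs.
Metafile = List[Tuple[str, str]]


def word_processor_wrapper(text: str) -> Metafile:
    """Application-side wrapper: turn document text into metafile commands."""
    return [("text", line) for line in text.splitlines()]


def spreadsheet_wrapper(rows: List[List[int]]) -> Metafile:
    """Application-side wrapper: turn rows of cells into metafile commands."""
    return [("text", "\t".join(str(v) for v in row)) for row in rows]


def laserwriter_interpreter(metafile: Metafile) -> str:
    """Device-side interpreter: render the text commands unchanged."""
    return "\n".join(arg for op, arg in metafile if op == "text")


def film_writer_interpreter(metafile: Metafile) -> str:
    """Device-side interpreter: this made-up device renders upper case."""
    return "\n".join(arg.upper() for op, arg in metafile if op == "text")


def drivers_old_days(n_apps: int, n_devices: int) -> int:
    """Old days: one driver per (application, device) pair."""
    return n_apps * n_devices


def components_now(n_apps: int, n_devices: int) -> int:
    """Now: one wrapper per application plus one interpreter per device."""
    return n_apps + n_devices


if __name__ == "__main__":
    page = spreadsheet_wrapper([[1, 2], [3, 4]])
    print(laserwriter_interpreter(page))
    print(drivers_old_days(2, 3), "drivers before;", components_now(2, 3), "components now")
```

Neither wrapper knows which device will print the page, and neither interpreter knows which application produced it. The gap between the two counts widens as applications and devices are added.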
Interoperability via web service • Diagram: GSC, USGS (NGMDB), BGS, and GA each expose their own schema through a wrapper to shared Web Services, which the Client uses • Communication between service providers and clients takes place using XML mark up • Use of standard markup language means schema mapping only needs to be done once • Wrapper implements interface to service – formulate requests, interpret results • Participants implement one interface for each service • Applications focus on application logic, not data access
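
A minimal sketch of the wrapper idea, assuming two providers whose native field names differ. The field names, sample rows, and shared record layout are invented for illustration and are not taken from GSC, USGS, or GeoSciML.

```python
from typing import Callable, Dict, List

# Hypothetical shared record layout that every client can rely on.
CommonRecord = Dict[str, str]

# Hypothetical native rows; each provider keeps its own schema.
GSC_ROWS = [{"UNIT_NAME": "Granite A", "AGE_TXT": "Cretaceous"}]
USGS_ROWS = [{"unitName": "Basalt B", "geologicAge": "Miocene"}]


def gsc_wrapper(row: Dict[str, str]) -> CommonRecord:
    """Map the (made-up) GSC schema onto the shared layout, done once."""
    return {"name": row["UNIT_NAME"], "age": row["AGE_TXT"], "source": "GSC"}


def usgs_wrapper(row: Dict[str, str]) -> CommonRecord:
    """Map the (made-up) USGS schema onto the shared layout, done once."""
    return {"name": row["unitName"], "age": row["geologicAge"], "source": "USGS"}


def harvest(
    rows: List[Dict[str, str]],
    wrapper: Callable[[Dict[str, str]], CommonRecord],
) -> List[CommonRecord]:
    """Apply a provider's wrapper to all of its native rows."""
    return [wrapper(row) for row in rows]


def units_of_age(records: List[CommonRecord], age: str) -> List[str]:
    """Client logic: written against the shared layout only."""
    return [r["name"] for r in records if r["age"] == age]


if __name__ == "__main__":
    records = harvest(GSC_ROWS, gsc_wrapper) + harvest(USGS_ROWS, usgs_wrapper)
    print(units_of_age(records, "Cretaceous"))
```

The client function never touches a provider-specific field name, which is the separation the slide describes: providers keep their schemas, and the mapping is written once per provider.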
Mark-up language “wrapper” translates your data • GeoSciML developers: Cocoon – Uppsala, Sweden; GeoServer – Keyworth, UK; Cocoon – Ottawa, Canada; Ionic – Orleans, France; Mapserver – Arizona; Cocoon – Virginia, USA; Tsukuba, Japan; GeoServer – Canberra; GeoServer – Melbourne, Australia
Using a web service – step 1 GeoSciML Web Services: Request
Web service request – step 2 GeoSciML Web Services: Request
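
The request screenshots did not survive in this transcript. As a rough stand-in, the sketch below builds a key-value request URL, assuming the service is an OGC Web Feature Service (WFS 1.1.0), which the slides do not state explicitly. The endpoint is a placeholder, and `gsml:MappedFeature` is used only as an example feature type name.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getfeature_url(endpoint: str, type_name: str, max_features: int = 10) -> str:
    """Assemble a WFS 1.1.0 GetFeature request as a key-value-pair URL."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "maxFeatures": str(max_features),
    }
    return f"{endpoint}?{urlencode(params)}"


def query_params(url: str) -> Dict[str, List[str]]:
    """Decode the query string back into a dict, handy for checking a request."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    # Placeholder endpoint; substitute a real service address before use.
    url = build_getfeature_url("https://example.org/geoserver/wfs", "gsml:MappedFeature")
    print(url)
    print(query_params(url)["typeName"])
```

The client only needs the service address and a feature type name. It does not need to know how the provider stores the data.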
Web service response – part 1 GeoSciML Web Services: Response
Web service response - part 2 GeoSciML Web Services: Response
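
The response screenshots are also missing. The sketch below shows how a client might read an XML response with Python's standard library. The sample document and its `urn:example:geology` namespace are invented and much simpler than real GeoSciML, so this is an illustration of the parsing step only.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical namespace and element names, not the real GeoSciML schema.
NS = {"ex": "urn:example:geology"}

SAMPLE_RESPONSE = """
<ex:FeatureCollection xmlns:ex="urn:example:geology">
  <ex:GeologicUnit>
    <ex:name>Granite A</ex:name>
    <ex:age>Cretaceous</ex:age>
  </ex:GeologicUnit>
  <ex:GeologicUnit>
    <ex:name>Basalt B</ex:name>
    <ex:age>Miocene</ex:age>
  </ex:GeologicUnit>
</ex:FeatureCollection>
""".strip()


def read_units(xml_text: str) -> List[Tuple[str, str]]:
    """Return (name, age) pairs for every unit in the response."""
    root = ET.fromstring(xml_text)
    units = []
    for unit in root.findall("ex:GeologicUnit", NS):
        name = unit.findtext("ex:name", default="", namespaces=NS)
        age = unit.findtext("ex:age", default="", namespaces=NS)
        units.append((name, age))
    return units


if __name__ == "__main__":
    for name, age in read_units(SAMPLE_RESPONSE):
        print(name, age)
```

Because every provider answers in the same mark-up, one parser like this serves all of them.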
ORGANIZATION: Unique missions of geological surveys - collect, archive, disseminate data • Geoscience Information Network (GIN) • Distributed, Web-based, Interoperable • 2,000 – 3,000 databases • 1000’s of collections • 80,000+ geologic maps
We agree on a data network that: • is distributed (vs centralized) • is interoperable • uses open source standards and common protocols (OGC, GeoSciML) • respects and acknowledges data ownership • fosters communities of practice to grow • facilitates development of new web services and clients
System overview GIN
Geologic map service scenario • Catalog: NGMDB? OneGeology? NDC? GEON? NGDS? • Registration • Survey map servers • OGC CSW • ArcMap • OGC WMS • ArcGIS
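
One way to read this scenario: surveys register their map servers in a catalog, and a client searches the catalog and then talks to the map server directly. The sketch below uses an in-memory class as a stand-in for an OGC CSW catalog. The class, the record fields, and the endpoint are all hypothetical; only the WMS `GetCapabilities` parameters come from the OGC standard.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlencode


@dataclass
class ServiceRecord:
    """Hypothetical catalog entry describing one survey's map service."""
    title: str
    provider: str
    wms_endpoint: str
    keywords: List[str] = field(default_factory=list)


class Catalog:
    """In-memory stand-in for a catalog service such as OGC CSW."""

    def __init__(self) -> None:
        self._records: List[ServiceRecord] = []

    def register(self, record: ServiceRecord) -> None:
        """Registration step: a survey announces its map server."""
        self._records.append(record)

    def search(self, keyword: str) -> List[ServiceRecord]:
        """Discovery step: case-insensitive keyword match."""
        wanted = keyword.lower()
        return [r for r in self._records if wanted in (k.lower() for k in r.keywords)]


def capabilities_url(record: ServiceRecord) -> str:
    """Build the WMS GetCapabilities request a map client would send first."""
    query = urlencode({"service": "WMS", "request": "GetCapabilities"})
    return f"{record.wms_endpoint}?{query}"


if __name__ == "__main__":
    catalog = Catalog()
    catalog.register(
        ServiceRecord(
            title="State geologic map",
            provider="Example Geological Survey",
            wms_endpoint="https://example.org/wms",  # placeholder address
            keywords=["geology", "map"],
        )
    )
    for hit in catalog.search("Geology"):
        print(hit.title, capabilities_url(hit))
```

The catalog holds only descriptions and addresses. The maps themselves stay on the survey's own server.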
National Geologic & Geophysical Data Preservation Program • $1M per year • National inventory • Metadata catalogue • National Digital Catalogue
Data discovery - • 79,000+ maps, images, data, and products from 350+ publishers • Lexicon of Geologic Names of the United States
Defining GIN • collections of service definitions, interchange formats, and vocabularies • independent of hardware, operating system, or lower-level network protocols • new technology will only require implementation of network elements in a new environment • architecture allows for the use of multiple conventions for different user groups
WWW • http – hypertext transfer protocol (& ftp, etc) • html – hypertext mark-up language • url – uniform resource locator • browser – built by others
GIN • Open source standards – Open Geospatial Consortium • data interchange tool – GeoSciML • distributed data catalogues (National Geologic Map DB; National Data Catalogue, etc) • Web services & applications – built by others
Challenges to building community Who sets the standards? Who controls the system? Who makes the decisions?
The network is voluntary, not imposed from above Interoperability
Don't panic! • We won’t take your data away – they stay with you • Your participation is voluntary • Keep your formats, system, servers
Will 3,000 interoperable databases become an 800-lb gorilla?
GIN is partnering with the global Earth science community AASG & USGS National Geoinformatics System OneGeology-Europe – 21 nations Marine Metadata Interoperability Initiative US DOE National Geothermal Data System (NGDS) US DOE Geothermal Technologies Program Energy Industry Metadata Standards Working Group - Energistics
MS SciScope – geospatial data discovery: “Welcome to SciScope! SciScope is a tool by Microsoft Research to help geoscientists discover data from numerous data repositories with ease through a single, intuitive interface. Users can display multiple map layers related to the scope of their study and interact with geographical features on the map including dams, rivers, water bodies, geology, aquifer systems, ecological regions and river basins.”
GIN DEMO PROGRAM NSF INTEROP GIN • 3 year development of standards, services • Demos in ~6 SGSs; ~$80K subcontracts
“Circuit Riders” • Part trainer, part management consultant, part computer expert • Write GeoSciML “wrappers” • Guide server configurations • Training, short courses • $80K for demos across AASG
ADOPTION & DEPLOYMENT • US Dept. of Energy (May, 2009) • National Geothermal Data System (NGDS) • GIN architecture, standards • $5M, 5 years • Adopted by US Geothermal Technologies Program
National Geothermal Data System • Distributed data sources • NGDS Legacy data repository • Desktop applications (GeoSciNet) • Ontologies, vocabularies • Discovery, access, exchange (GIN) • Portals (GeoSciNet, SciScope)
National Geothermal Data System • Data discovery, access, exchange: GIN • Distributed content: geothermal community • Legacy data repository: NGDS • Desktop applications (economic modeling tool, etc): GeoSciNet • Portals: GeoSciNet, SciScope
NATIONAL DEPLOYMENT • US DOE “Geothermal Data Development, Collection, and Maintenance” • $20M, 1-5 awards • AASG proposal submitted
29 countries and European organizations have committed to creating a geological map at 1:1,000,000 scale, integrated with metadata initially available in the following languages: English, French, Italian, Spanish, Swedish, Czech and Norwegian.
Network sustainability • tipping point at which users and providers will see the network as critical to their basic functions • populating and using the network becomes a necessary cost of doing business • how do we maintain network functions?