CSIG 10 Survey of Emerging IT Trends and Technologies. Chaitan Baru SDSC. Cyberinfrastructure. The “cyberinfrastructure” initiative is an attempt to provide explicit investments in IT for science & engineering research and education
CSIG 10: Survey of Emerging IT Trends and Technologies • Chaitan Baru, SDSC
Cyberinfrastructure • The “cyberinfrastructure” initiative is an attempt to provide explicit investments in IT for science & engineering research and education • From NSF’s Cyberinfrastructure Vision for 21st Century Discovery, www.nsf.gov/od/oci/ci-v7.pdf, July 20, 2006 • “The comprehensive infrastructure needed to capitalize on dramatic advances in information technology has been termed cyberinfrastructure.” • “…integrates hardware for computing, data and networks, digitally-enabled sensors, observatories and experimental facilities…” • “…an interoperable suite of software and middleware services and tools...” • “Investments in interdisciplinary teams and cyberinfrastructure professionals with expertise in algorithm development, system operations, and applications development are also essential…” • “In 1999, the PITAC released the seminal report ITR-Investing in our Future, prompting new and complementary NSF investments in CI projects, such as the Grid Physics Network (GriPhyN) and international Virtual Data Grid Laboratory (iVDGL) and the Geosciences Network, known as GEON.”
Geoinformatics • A vision for Geoinformatics, from the NSF Workshop on Envisioning a National Geoinformatics System for the United States, Denver, March 2007 • “…a future in which someone can sit at a terminal and have easy access to vast stores of data of almost any kind, with the easy ability to visualize, analyze and model those data.”
Geoinformatics • From David Lambert, NSF EAR/GEO • Presentation at GEON Annual Meeting, 2005
Geoinformatics: Cyberinfrastructure for the Solid Earth Sciences • Objectives: • Make data, tools, applications …and communities… easily accessible online • Provide an environment for integrating 3D and 4D geoscience data • Book to be published this year by Cambridge University Press. Co-editors: Randy Keller and Chaitan Baru
A Use Case for Geoinformatics • A user request of the form: “For a given region (i.e. lat/long extent, plus depth), return a 3D structural model with accompanying physical parameters of density, seismic velocities, geochemistry, and geologic ages, using a cell size of 10km”
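One way to make this request concrete is to write it down as a data structure. The sketch below is illustrative only: the class name, the field names, and the flat 111 km-per-degree approximation are assumptions, not part of any GEON interface. It shows how a lat/long extent, a depth, and a 10 km cell size translate into the dimensions of a 3D grid.

```python
import math
from dataclasses import dataclass

# Assumption: a rough spherical-Earth conversion, adequate for sizing a grid.
KM_PER_DEG_LAT = 111.0


@dataclass(frozen=True)
class RegionRequest:
    """Hypothetical container for the use-case request (names are illustrative)."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    depth_km: float
    cell_km: float = 10.0
    parameters: tuple = ("density", "seismic_velocity", "geochemistry", "geologic_age")

    def grid_shape(self):
        """Return (n_lat, n_lon, n_depth) cell counts for the requested volume."""
        mid_lat = math.radians((self.lat_min + self.lat_max) / 2.0)
        north_south_km = (self.lat_max - self.lat_min) * KM_PER_DEG_LAT
        # Longitude degrees shrink with latitude, so scale by cos(mid-latitude).
        east_west_km = (self.lon_max - self.lon_min) * KM_PER_DEG_LAT * math.cos(mid_lat)
        return (
            math.ceil(north_south_km / self.cell_km),
            math.ceil(east_west_km / self.cell_km),
            math.ceil(self.depth_km / self.cell_km),
        )


# Example: a 1 x 1 degree region, 100 km deep, at the default 10 km cell size.
request = RegionRequest(lat_min=31.0, lat_max=32.0, lon_min=-107.0, lon_max=-106.0, depth_km=100.0)
shape = request.grid_shape()
```

Each cell of the resulting grid would then need one value per requested parameter, which is why the request implies search, access, and integration services working together.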
Portal-based Science Environments • Support for resource sharing and collaborations
EarthScope Data Portal (portal.earthscope.org) • SDSC, San Diego • IRIS, Seattle • UNAVCO, Boulder • ICDP, Potsdam
CUAHSI Hydrologic Information System, HIS (http://his.cuahsi.org) • Data Discovery, Data Access, Data Publication
GEON: Geosciences Network* • Funded by NSF IT Research program • Multi-institution collaboration between IT and Earth Science researchers • GEON Cyberinfrastructure provides: • Authenticated access to data and Web services • Registration of data sets, tools, and services with metadata • Search for data, tools, and services, using ontologies • Scientific workflow environment and access to HPC • Data and map integration capability • Scientific data visualization and GIS mapping • * The network / grid concept has been evolving over the past several years
GEON: The Geosciences Network www.geongrid.org • GEON is a coalition among IT and Earth Science researchers with the goal of developing advanced information technologies to enable new modes of geosciences research • GEON is developing technologies for information integration and knowledge discovery • Project participants: 14 PI institutions, and partners including other projects, agencies, and industry • GEON has deployed a Web services-based, distributed computing infrastructure, called the GEONgrid, across PI and partner sites • GEONgrid provides access to data collections, tools, and applications that support geosciences research • Project funding: $11.25M, 2002-2007 • RESEARCH AND EDUCATION PRODUCTS AND RESULTS • Technologies for Ontology-Based Data Registration, GIS Map Integration, Distributed Portals, and 4D Visualization • Research on • 3D Lithospheric structure • Gravity Modeling • Remote Sensing Data Integration • Cyberinfrastructure Summer Institute for Geoscientists and graduate courses in Geoinformatics
GEON Partners • 14 PI institutions • Over 20 other partners, including universities, industry, and government agencies/labs • PI Institutions: • Arizona State University • Bryn Mawr College • Penn State University • Rice University • San Diego State University • San Diego Supercomputer Center/UCSD • University of Arizona • University of Idaho • University of Missouri, Columbia • University of Texas at El Paso • University of Utah • Virginia Tech • UNAVCO • Digital Library for Earth System Education (DLESE) • Partners: • Chronos • CUAHSI-HIS • ESRI • Calit2 • Georgia State University • Geological Survey of Canada • Georeference Online • HP • IBM • Lawrence Livermore Natl Laboratory • NASA Goddard, Earth System Division • SCEC • U.S. Geological Survey (USGS) • Purdue University • Affiliated Projects: • EarthScope, IRIS
Key Informatics Areas • Portals • Authenticated, role-based access to cyber resources: data, tools, models, model outputs, collaboration spaces, … • Data Integration • Search, discovery and integration of data from heterogeneous information sources (“mediation” and “semantic integration”) • Use of workflow systems, and access to HPC • Ability to “program” at a higher level of abstraction • Sharing of models, along with “provenance” information • Gateways to HPC environments • Management of Geospatial Information • Using GIS capabilities, map services, geospatial data integration • Visualization of 3D, 4D geospatial data and information
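The portal bullet above ("authenticated, role-based access to cyber resources") can be illustrated with a very small permission check. This is a minimal sketch under stated assumptions: the role names, action names, and the `is_allowed` helper are all hypothetical and do not describe how the GEON portal actually implements access control.

```python
# Hypothetical roles and the actions each may perform on portal resources.
ROLE_PERMISSIONS = {
    "guest": {"search"},
    "member": {"search", "download_data", "run_workflow"},
    "admin": {"search", "download_data", "run_workflow", "register_dataset"},
}


def is_allowed(role, action):
    """Return True if the given role is permitted to perform the action.

    Unknown roles get an empty permission set, so access is denied by default.
    """
    return action in ROLE_PERMISSIONS.get(role, set())


# A guest may search but may not launch a workflow on HPC resources.
guest_can_search = is_allowed("guest", "search")
guest_can_run = is_allowed("guest", "run_workflow")
```

Denying by default for unknown roles is a deliberate choice in this sketch: a resource stays closed unless a role explicitly grants access to it.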
GEON Portal portal.geongrid.org • Generic Capabilities: • Search • Workbench • Dynamic map services, map integration • Applications: • Paleo database integration • LiDAR data access and data processing • SYNSEIS: Online access to computational modeling system • Gravity and Magnetic database for US
GEON and Related Portals • National Ecological Observatory Network • Prototype Chesapeake Bay Environmental Observatory • CUAHSI Hydrologic Information System • Tropical Ecology Assessment and Monitoring Network • EarthScope
GEON Project and Funding Structure • [Diagram labels: GEON; NSF ITR; NSF EAR/IF Facility (GEO, OCI, CISE); OCI Software Development for Cyberinfrastructure (SDCI); OpenTopography; OpenEarth Framework; NSF CluE (GEO, CISE); NSF Geoinformatics; GEON Portal; CluE]
Integrated Cyberinfrastructure System • Application Domains: Geosciences, Engineering, Environmental Sciences, Physics, Astronomy, Archaeology, Neurosciences, Biomedicine, … • Domain-specific Cybertools (software) • Shared Cybertools (software) • Distributed Resources (computation, storage, communication, etc.) • Development Tools & Libraries • Middleware Services • Hardware • Education and Training • Discovery & Innovation • Source: Dr. Deborah Crawford, Chair, NSF CI Working Committee
Community Cyberinfrastructure Projects • Science Domains: Ecological Observatories (NEON), High Energy Physics (GriPhyN), Ocean Observing (ORION), Biomedical Informatics (BIRN), Geosciences (GEON), Earthquake Engineering (NEES) • Your Specific Tools & User Apps. • Shared Tools • Friendly Work-Facilitating Portals: Authentication, Authorization, Auditing, Resource Discovery, Workflows, Visualization, Analysis • Development Tools & Libraries • Middleware Services • Hardware • Distributed Computing, Instruments and Data Resources • Source: Prof. Mark Ellisman, UC San Diego
Services implied by the Geoinformatics use case “For a given region (i.e. lat/long extent, plus depth), return a 3D structural model with accompanying physical parameters of density, seismic velocities, geochemistry, and geologic ages, using a cell size of 10km”
Services implied by the use case • Search and discovery • Data access • Data integration, including transformations, model execution, and visualization • Result publication (and preservation, so that results can be searched and discovered) • Candidate technologies: Google, Bing, …? • Grid computing • Some database technologies • Supercomputers • Some scientific visualization • Digital libraries and archives • All in a distributed environment
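The four services listed above form a chain in which each step feeds the next. The following sketch wires them together over a tiny in-memory catalog. Everything here is assumed for illustration: the catalog entries, the dataset identifiers, the values, and the function names. "Integration" is reduced to averaging, standing in for the transformations and model execution a real system would perform.

```python
# Hypothetical in-memory catalog; a real system would query distributed registries.
CATALOG = [
    {"id": "gravity-01", "keywords": {"gravity", "density"}, "values": [2.6, 2.7, 2.9]},
    {"id": "seismic-07", "keywords": {"seismic", "velocity"}, "values": [5.8, 6.1, 6.5]},
]


def search(keyword):
    """Search and discovery: return ids of datasets tagged with the keyword."""
    return [entry["id"] for entry in CATALOG if keyword in entry["keywords"]]


def access(dataset_id):
    """Data access: fetch the dataset record for a given id."""
    return next(entry for entry in CATALOG if entry["id"] == dataset_id)


def integrate(datasets):
    """Data integration (placeholder): reduce each dataset to its mean value."""
    return {d["id"]: sum(d["values"]) / len(d["values"]) for d in datasets}


def publish(name, result):
    """Result publication: register the result so later searches can discover it."""
    CATALOG.append({"id": name, "keywords": {"model"}, "values": list(result.values())})


found = search("gravity") + search("seismic")
model = integrate([access(dataset_id) for dataset_id in found])
publish("region-model", model)
```

After `publish` runs, `search("model")` returns the new result, which is the point of the last bullet: published results re-enter the discovery step.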
Data “integration” • A priori integration • Consistent metadata and data standards and data “schema”/structure, and semantics are pre-defined across a set of data resources • User simply issues a query and receives a result versus • Ad hoc integration • Consistent standards for discovery and data access, but retrieved data are visualized in a common environment and user interactively integrates the data
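The contrast between the two styles can be sketched in a few lines. In this illustrative example, all source names, field names, and values are invented. The a priori case assumes every source already follows one agreed schema, so a single query spans them. The ad hoc case assumes a source with its own field names and units, and models the user's interactive step as a mapping supplied after the data are retrieved.

```python
# A priori: both sources already share one agreed schema (hypothetical fields).
SOURCE_A = [{"site": "A1", "age_ma": 120.0}, {"site": "A2", "age_ma": 65.0}]
SOURCE_B = [{"site": "B1", "age_ma": 250.0}]


def query_a_priori(max_age_ma):
    """One query over all sources; no per-source handling is needed."""
    return [rec for source in (SOURCE_A, SOURCE_B) for rec in source
            if rec["age_ma"] <= max_age_ma]


# Ad hoc: this source uses different field names and units (years, not Ma).
SOURCE_C = [{"station": "C1", "age_years": 9.0e7}]


def integrate_ad_hoc(records, field_map, converters):
    """Apply a user-chosen field mapping and unit conversions to retrieved data."""
    integrated = []
    for rec in records:
        row = {}
        for src_field, dst_field in field_map.items():
            convert = converters.get(src_field, lambda value: value)
            row[dst_field] = convert(rec[src_field])
        integrated.append(row)
    return integrated


young_sites = query_a_priori(130.0)
mapped = integrate_ad_hoc(
    SOURCE_C,
    field_map={"station": "site", "age_years": "age_ma"},
    converters={"age_years": lambda years: years / 1.0e6},
)
```

The trade-off the slide points at shows up directly: the a priori path costs effort up front (agreeing on the schema) and nothing at query time, while the ad hoc path defers that effort to each user and each retrieval.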
Evolution of distributed environments • Mainframes • with distributed “synchronous” terminals • Networked minicomputers • with proprietary computer networking protocols • The Web • Engineering workstations with open communications protocols
Evolution of distributed environments • The Grid • Distributed computational and storage resources owned by organizations, orchestrated together to form “metacomputers” • The Cloud • On-demand computational and storage resources provided as a service over the Internet, with incremental cost models
Clients in a distributed environment • “Dumb” terminals • IBM 3270, vt100 • “Thick” clients • Workstations as clients in a client-server system • “Thin” clients • Original PC desktops • Thick clients • Modern PCs with powerful capabilities (64-bit, multicore, large memory) • Thin clients • Mobile devices
Distributed environments…contd. • Service-oriented architecture, SOA • A programming style for distributed computing • Services may be distributed in wide area (Internet scale) • or local area (within a datacenter) • Data inertia • Moving data to computation vs • Computation to data
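The "data inertia" trade-off comes down to simple arithmetic: transfer time is size divided by bandwidth. The sketch below compares shipping a dataset to the computation against shipping the code to the data. The sizes and bandwidth are hypothetical, and the comparison deliberately ignores latency, protocol overhead, and whether compute capacity exists at the data's location.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Idealized transfer time: bytes converted to bits, divided by link speed."""
    return size_bytes * 8 / bandwidth_bits_per_s


def cheaper_to_move(data_bytes, code_bytes, bandwidth_bits_per_s):
    """Return which of the two is faster to ship over the same link."""
    data_time = transfer_seconds(data_bytes, bandwidth_bits_per_s)
    code_time = transfer_seconds(code_bytes, bandwidth_bits_per_s)
    return "computation" if code_time < data_time else "data"


# Hypothetical numbers: a 1 TB dataset, a 10 MB program, a 1 Gbit/s link.
data_time_s = transfer_seconds(1e12, 1e9)
choice = cheaper_to_move(10**12, 10**7, 10**9)
```

Under these assumed numbers the dataset needs about 8,000 seconds on the wire, while the program needs a fraction of a second, which is the usual argument for moving computation to large data collections.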
Virtual Organizations (VOs) • A socio-technical concept • A distributed collection of entities and resources that come together to solve a specific problem • Multiple participants • Distributed sites • Participants are from different “administrative domains” • Policies, rules, systems of the VO may be different from those of the participating organizations • Requires agreement on basic standards and protocols to enable resource and data sharing
Other Geoinformatics Efforts • OneGeology.org • International initiative of geological surveys to make dynamic geological map data available via the web. • USGS initiative • Presentation by Dr. Linda Gundersen, at Geoinformatics 2007, San Diego.
USGS: 1000s of National and Regional Databases • The National Map: topographic, elevation, orthoimagery, transportation, hydrography, etc. • Geospatial One Stop portal • MRDATA: Mineral Resources and Related Data • The National Geologic Map Database: standardized community collection of geologic mapping • National Water Information System (NWISWeb) • National Geochemical Survey Database (PLUTO, NURE) • National Geophysical Database (aeromag, gravity, aerorad) • Earthquake Catalogs • North American Breeding Bird Survey • National Vegetation/speciation maps • National Oil and Gas Assessment • National Coal Quality Inventory • Source: Presentation by Dr. Linda Gundersen, USGS, at Geoinformatics 2007, San Diego, CA.