NBII Ecoinformatics Technical Working Group • Gladys Cotter • Mike Frame • Ispra, Italy • January 2006
Why are Data Standards Important? • Facilitate discovery of information • Allow for information exchange • Enhance information management
[Diagram: Services Overview. The www.NBII.gov and My.NBII.gov portal (integrated view, content management, collaboration services, integrated/federated search) sits on top of distributed services (ITIS, Thesaurus, DIGR, catalog, mapping, geoparsing and georeferencing, model services) and catalogs (Dublin Core (plus) resource catalog, FGDC/ISO clearinghouse, OGC/ISO geospatial services, UDDI/WSDL ?? for Web services). Together these describe, discover, and consume distributed resources: applications, databases, websites, tools, and models.]
CSA/NBII Biocomplexity Thesaurus • Provides a common language to describe the scope and content of all resources in the NBII system • Using a common language to characterize these resources facilitates their retrieval by users • Created as an NBII partnership with Cambridge Scientific Abstracts (CSA) • Five (5) existing thesauri combined into one Biocomplexity Thesaurus
Biocomplexity Thesaurus: Using a Controlled Vocabulary http://thesaurus.nbii.gov
Latest Thesaurus Developments • Thesaurus Web service developed by ORNL; released to the public on thesaurus.nbii.gov in Spring 2005 • Thesaurus maintenance ongoing • Evaluating ~750 terms suggested by nodes for inclusion in the Biocomplexity Thesaurus • 2005-2006 focus: • Adding ~3000 terms for fire ecology and management to support FRAMES and nodes addressing fire issues • Joint Web service with EPA/EU Ecoinformatics
Project Background • Project “birthed” at the DC Ecoinformatics meeting in May 2005 • Goals • Identify specific collaborative efforts beneficial to everyone • Improve access to environmental information • Leverage expertise of participants • Begin to address the multi-lingual challenges • Develop a “working tool”
Current Capabilities http://thesaurus.nbii.gov/SearchNBIIThesaurus/
Current Capabilities - NBII “endangered species” http://thesaurus.nbii.gov/SearchNBIIThesaurus/
Current Capabilities - EIONET “endangered species” http://thesaurus.nbii.gov/SearchNBIIThesaurus/
Challenges • Thesaurus scope, intent, purpose, and coverage differ • NBII = sub-discipline of environment • Endangered species • Broader Terms: Species, Special status species, Taxa • EIONET = broad environment • Broader Terms: environmental protection
Potential areas to pursue - thesaurus • Pulling results from both sources simultaneously • Addressing “scope” and level issues • Deploying to a test audience • Implementing within cataloging tools • Implementing within querying tools
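As a rough illustration of pulling results from both sources at once, the sketch below looks up one term in two in-memory tables and keeps the source name attached to each hit, so the difference in scope stays visible. The broader terms are the ones shown for “endangered species” on the Challenges slide; the class, function, and table names are hypothetical, and the real thesaurus Web services are not called here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ThesaurusEntry:
    """One hit for a term, tagged with the thesaurus it came from."""
    source: str
    term: str
    broader: Tuple[str, ...]


# Broader terms as listed on the Challenges slide (stand-ins for live services).
NBII = {"endangered species": ("Species", "Special status species", "Taxa")}
EIONET = {"endangered species": ("environmental protection",)}

# Hypothetical registry of sources to query together.
SOURCES: Dict[str, Dict[str, Tuple[str, ...]]] = {"NBII": NBII, "EIONET": EIONET}


def lookup_all(term: str, sources: Dict[str, Dict[str, Tuple[str, ...]]]) -> List[ThesaurusEntry]:
    """Look the term up in every source and return one entry per source that has it."""
    key = term.strip().lower()
    results = []
    for name, table in sources.items():
        if key in table:
            results.append(ThesaurusEntry(name, key, table[key]))
    return results


if __name__ == "__main__":
    for entry in lookup_all("Endangered species", SOURCES):
        print(f"{entry.source}: {entry.term} -> {', '.join(entry.broader)}")
```

Keeping the source on each entry is one simple way to defer the “scope” and level question: the two sets of broader terms are shown side by side instead of being merged into a single hierarchy.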
Joint SPIRE Activities • Research project in the area of invasive species [Image: Water hyacinth clogging the Ortega River. Photo: Don Schmitz]
Additional Areas • Web-services • Registries • Deployment • Geospatial Standards (OGIS) • SPIRE activities
Questions, Comments • Give us feedback on Search: http://www.nbii.gov/search/search.html • Take a look at the site: http://thesaurus.nbii.gov
What is SKOS? • Simple Knowledge Organisation System (SKOS) • Sponsored by the W3C • Specifications and standards to support the use of thesauri, classification schemes, subject heading lists, taxonomies, terminologies, glossaries and other types of controlled vocabulary within the framework of the semantic web • An application of RDF
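To make “an application of RDF” concrete, the sketch below writes one thesaurus concept as SKOS statements in Turtle syntax, using plain string formatting. The `skos:Concept`, `skos:prefLabel`, and `skos:broader` terms and the SKOS namespace come from the W3C vocabulary; the `ex:` namespace, the helper names, and the way labels are turned into identifiers are assumptions for illustration, not how the NBII thesaurus is actually published.

```python
from typing import List

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
EX_NS = "http://example.org/thesaurus/"  # hypothetical namespace for the concepts


def slug(label: str) -> str:
    """Turn a label into a simple local name, e.g. 'Special status species' -> 'special-status-species'."""
    return label.strip().lower().replace(" ", "-")


def concept_to_turtle(label: str, broader: List[str]) -> str:
    """Describe one concept: its type, preferred label, and broader concepts.

    Labels are assumed to contain no quote characters; no escaping is done.
    """
    lines = [
        f"ex:{slug(label)} a skos:Concept ;",
        f'    skos:prefLabel "{label}"@en',
    ]
    for term in broader:
        lines[-1] += " ;"  # more statements follow for the same subject
        lines.append(f"    skos:broader ex:{slug(term)}")
    lines[-1] += " ."  # close the description of this concept
    return "\n".join(lines)


def to_turtle(label: str, broader: List[str]) -> str:
    """Prefix declarations followed by the concept description."""
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{EX_NS}> .\n"
    return header + "\n" + concept_to_turtle(label, broader) + "\n"


if __name__ == "__main__":
    print(to_turtle("endangered species", ["species", "special status species", "taxa"]))
```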
USGS Research Needs • Semantic Searching and Analysis • Distributed modeling • Consuming data • Model reuse • Modeling standards for sharing • Mini/multi thesaurus applications
USGS Research Needs – continued • Geospatial technologies • Gazetteer aided searching • Geo-parsing of data • Semantic models for species analysis • Adoption & intersection of registries
NBII BioBot Features • Simple single search box interface • All relevant resources are returned (the user doesn’t have to know where the resource comes from or what type of resource it is) • e.g., whether the resource is housed at a university or a federal agency, or whether it is a mapping application, data set, teacher lesson plan, etc. • Tabbed interface allows users to view “All” results by default, or select specific resource types (e.g., Maps, Images, Journal articles, etc.) • Scientifically credible resources cataloged and identified by NBII partners using the Dublin Core standard are “weighted” higher than any crawled/harvested resource found on the internet • Controlled Subjects, species names, and place names are given the highest weights; Title, Creator, Publisher, and Description are also weighted • The NBII Biocomplexity Thesaurus operates in the background to supplement user search terms with synonymous terms, so the user does not need to know all of the variations of a term
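The weighting and synonym expansion described above can be sketched in a few lines. This is a minimal illustration, not the BioBot implementation: the numeric weights, the boost for cataloged resources, the synonym table, and the sample records are all made-up values chosen only to show the idea that controlled fields outrank free-text fields and that cataloged resources outrank crawled ones.

```python
from typing import Dict, List, Set

# Hypothetical weights: controlled fields highest, descriptive fields lower.
FIELD_WEIGHTS = {
    "subject": 5.0, "species": 5.0, "place": 5.0,
    "title": 2.0, "creator": 2.0, "publisher": 2.0, "description": 2.0,
}
CATALOGED_BOOST = 2.0  # hypothetical multiplier for partner-cataloged resources

# Hypothetical stand-in for the thesaurus synonym lookup.
SYNONYMS: Dict[str, Set[str]] = {
    "alien species": {"invasive species", "nonnative species"},
}


def expand_query(query: str) -> Set[str]:
    """Return the query term plus any synonyms the thesaurus knows about."""
    q = query.strip().lower()
    return {q} | SYNONYMS.get(q, set())


def score(record: dict, terms: Set[str]) -> float:
    """Add a field's weight once if any term appears in it, then boost cataloged records."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        text = str(record.get(field, "")).lower()
        if any(term in text for term in terms):
            total += weight
    if record.get("cataloged"):
        total *= CATALOGED_BOOST
    return total


def search(query: str, records: List[dict]) -> List[str]:
    """Return titles of matching records, highest score first."""
    terms = expand_query(query)
    scored = [(score(r, terms), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r["title"] for _, r in hits]


# Made-up sample records for illustration.
RECORDS = [
    {"title": "Crawled page on invasive species", "description": "news item", "cataloged": False},
    {"title": "Water hyacinth fact sheet", "subject": "Invasive species", "cataloged": True},
    {"title": "Lesson plan on wetlands", "subject": "Wetlands", "cataloged": True},
]

if __name__ == "__main__":
    print(search("alien species", RECORDS))
```

In this toy data neither matching record contains the words “alien species”; both are found only because the query is expanded, and the cataloged fact sheet ranks first because its match is in a controlled subject field.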
NBII Search: with & without • Example: “alien species”
Summary • Google’s mission: Make the entire Web universe findable • Google cannot interpret the context of a user’s search • What a document contains • Google’s ranking algorithm is based upon inbound links; resources that are linked to more often are ranked higher • NBII’s mission: Make the biological Web findable • NBII interprets context with metadata (subject, taxonomic, geographic) • What a document is about • NBII’s ranking algorithm is based upon weightings of metadata fields, and weighting of “Reliable/High Quality/Trusted” sources