NBII’s Biocomplexity Thesaurus: Adding Value to Data and Information for Biodiversity Science EPA System of Registries Conference May 20, 2009 Lisa Zolly NBII Knowledge Manager, USGS. The National Biological Information Infrastructure (NBII).
NBII’s Biocomplexity Thesaurus: Adding Value to Data and Information for Biodiversity Science. EPA System of Registries Conference, May 20, 2009. Lisa Zolly, NBII Knowledge Manager, USGS.
The National Biological Information Infrastructure (NBII) A broad, collaborative program, managed by the U.S. Geological Survey’s Biological Informatics Office, to provide access to data and information on the Nation’s biological resources, and tools for their integration and analysis.
Govspeak-free translation: • NBII is • An information system that • Allows dynamic search capabilities • For the discovery of natural resource data and information • And provides tools for interacting with data
What kinds of content does NBII provide access to? • Geospatial metadata • Datasets • Databases • Native narrative content on priority and emerging biological issues • Map services • Monitoring projects • Monitoring protocols • Human-reviewed external Web sites • Scientific publications • Images
What topical areas does this data and information cover? • Invasive species • Wildlife & zoonotic diseases • Bird conservation • Fisheries & aquatic resources • Pollinators & pollinator declines • Species data & information • Data standards • Specialized products & services addressing particular regional issues • Many other topics….
Who generates this content? • Program Office staff at USGS • Biologists, computer scientists, librarians, project managers • NBII partners • More than 200 federal, state, local, non-profit, private sector, academic, and international entities • Other external entities • Work with other data creators and data providers to access their information
How do we provide access to this content? • Host it ourselves • Harvest it from external sources • Access through Web services • RSS and other Web 2.0 services • Scripted calls / portlets to access data remotely • Map services
Whom do we serve with our data & information? • Field biologists & research scientists • Resource managers in the field • Decision-makers in agencies • Policy-makers • Educators • Citizen scientists • General public All of these groups have different information needs, different understandings of the content, and different ways of expressing their information needs.
Our Challenge How can we effectively characterize and present - and how can our users find - data and information that span: • Many file formats • Many data types • Many data standards • Many originators • Many subjects • Many physical locations • Many user types
Standards - the foundation of NBII’s success • Federal Geographic Data Committee (FGDC) Metadata Standard + Biological Data Profile • For data and map services • Dublin Core Metadata • For cataloguing of other resources • JSR168-compliant portlets • Allows external sites to use our sites’ portlets • OpenGIS • Shared map layers and map services • Integrated Taxonomic Information System (ITIS) • Authority file for species names • NBII Gazetteer • Authority file for geographic names • NBII Biocomplexity Thesaurus • Authority file for subject terms • Used to facilitate end-user searching
Data & information challenges common to many agencies • Your manager needs an existing presentation on a specific topic from your Intranet, but after searching on thousands of items in the CMS, fails to find it. • A user of your public Web site complains that she can’t find information on your Web site related to a very important topic that’s within your mission and scope. • A researcher is looking on your site for data from existing datasets related to her project, and is finding it impossible to determine if relevant data exists.
What your boss is thinking (sanitized version)… “Which !*#(!&@?! presentation is the one I need? Stupid content management system! When I find out who procured this, heads will roll!”
Applying thesaurus terms to internal documents: a shared, controlled vocabulary for searchable document metadata
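As a rough illustration of what a shared, controlled vocabulary buys you when tagging documents, the sketch below normalizes free-text keywords to a single preferred term before they are stored as searchable metadata. The `ENTRY_TERMS` table, the `tag_document` helper, and the sample keywords are all hypothetical and are not drawn from the NBII thesaurus or any particular CMS.

```python
# Hypothetical entry-term table: free-text variants -> preferred thesaurus term.
# These mappings are illustrative only, not actual NBII thesaurus content.
ENTRY_TERMS = {
    "invasive species": "invasive species",
    "invasives": "invasive species",
    "alien species": "invasive species",
    "pollinator decline": "pollinator declines",
    "pollinator declines": "pollinator declines",
}


def tag_document(keywords, entry_terms=ENTRY_TERMS):
    """Split raw keywords into controlled subject terms and unmatched leftovers."""
    subjects, unmatched = [], []
    for raw in keywords:
        preferred = entry_terms.get(raw.strip().lower())
        if preferred is None:
            # Keep unknown keywords visible so a cataloguer can review them.
            unmatched.append(raw)
        elif preferred not in subjects:
            subjects.append(preferred)
    return {"subjects": subjects, "unmatched": unmatched}


record = tag_document(["Invasives", "alien species", "budget FY09"])
print(record)
```

Two authors who type "Invasives" and "alien species" end up with the same subject term, so a single search finds both documents.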
What your end-user is thinking… “Stupid Web site! None of these has to do with uses of controlled burning in Alabama! They spent my tax dollars on this??”
Applying thesaurus-aided searching to your search interface maps synonyms and relationships • Synonyms are ORed in the background • Related subjects that also may interest the user • Resource the user wanted
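The idea of ORing synonyms in the background and offering related subjects can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `THESAURUS` dictionary, the `find_concept` and `expand_query` helpers, and the query syntax are hypothetical, and the terms are illustrative rather than taken from the NBII thesaurus or its search interface.

```python
# Hypothetical mini-thesaurus; terms are illustrative, not actual NBII content.
THESAURUS = {
    "prescribed burning": {
        "synonyms": ["controlled burning", "prescribed fire"],
        "related": ["fire ecology", "fuel management"],
    },
}


def find_concept(user_term, thesaurus=THESAURUS):
    """Return the preferred term whose label or synonym matches user_term."""
    needle = user_term.strip().lower()
    for preferred, entry in thesaurus.items():
        labels = [preferred] + entry["synonyms"]
        if needle in (label.lower() for label in labels):
            return preferred
    return None


def expand_query(user_term, thesaurus=THESAURUS):
    """Build an ORed query string plus a list of related subjects to suggest."""
    preferred = find_concept(user_term, thesaurus)
    if preferred is None:
        # Unknown term: search it as typed, with nothing extra to suggest.
        return f'"{user_term}"', []
    entry = thesaurus[preferred]
    labels = [preferred] + entry["synonyms"]
    query = " OR ".join(f'"{label}"' for label in labels)
    return query, list(entry["related"])


query, related = expand_query("controlled burning")
print(query)    # the ORed search string sent to the search engine
print(related)  # related subjects shown to the user as suggestions
```

The user types one phrase; the search engine receives every equivalent label, and the related terms are shown as suggestions rather than added to the query.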
What the researcher is thinking… “Computer! Give me data related to heat release in combustible plants found in the United States!” What the computer says: [not captured in this transcript]
Exploiting metadata beyond Title and Keywords… • Now: extract data entity and data attribute information from FGDC metadata, and map the contents to thesaurus concepts at the time of end-user search • Soon: query the database itself, using our thesaurus Web service as a search mediator
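The "Now" step might look roughly like the sketch below: pull entity and attribute labels out of the Entity and Attribute Information section of an FGDC record, then match them against thesaurus phrases. The element short names (`eainfo`, `detailed`, `enttypl`, `attrlabl`, `attrdef`) follow the FGDC CSDGM standard; the sample record, the `CONCEPT_PHRASES` table, and both helper functions are invented for illustration and are not NBII's implementation.

```python
import xml.etree.ElementTree as ET

# Invented sample record using FGDC CSDGM short element names.
SAMPLE_FGDC = """<metadata>
  <eainfo>
    <detailed>
      <enttyp>
        <enttypl>burn_plots</enttypl>
        <enttypd>Plots sampled after prescribed fires</enttypd>
      </enttyp>
      <attr>
        <attrlabl>heat_release</attrlabl>
        <attrdef>Heat released during combustion of plant samples</attrdef>
      </attr>
      <attr>
        <attrlabl>plot_id</attrlabl>
        <attrdef>Unique plot identifier</attrdef>
      </attr>
    </detailed>
  </eainfo>
</metadata>"""

# Hypothetical phrase -> concept table standing in for a thesaurus lookup.
CONCEPT_PHRASES = {
    "heat release": "heat release",
    "combustion": "combustion",
    "prescribed fire": "prescribed burning",
}


def extract_entities(xml_text):
    """Return one record per attribute, carrying its entity label and definition."""
    root = ET.fromstring(xml_text)
    results = []
    for detailed in root.iter("detailed"):
        entity = detailed.findtext("enttyp/enttypl", default="")
        for attr in detailed.findall("attr"):
            results.append({
                "entity": entity,
                "attribute": attr.findtext("attrlabl", default=""),
                "definition": attr.findtext("attrdef", default=""),
            })
    return results


def map_to_concepts(record, phrases=CONCEPT_PHRASES):
    """Match thesaurus phrases against the attribute label and its definition."""
    text = f'{record["attribute"]} {record["definition"]}'
    text = text.replace("_", " ").lower()
    return sorted({concept for phrase, concept in phrases.items() if phrase in text})


records = extract_entities(SAMPLE_FGDC)
for rec in records:
    print(rec["entity"], rec["attribute"], map_to_concepts(rec))
```

Plain substring matching is used here only to keep the sketch short; a production system would more likely call the thesaurus service and handle word forms and multi-word terms more carefully.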
NBII Biocomplexity Thesaurus: current topical coverage (and sources) • General life sciences (CSA/ProQuest) • Pollution & contaminants (CSA/ProQuest) • Fisheries & aquatic resources (CSA/ProQuest) • Social sciences (CSA/ProQuest) • Ecotourism (CSA/ProQuest) • General environment (CERES/NBII) • Fire ecology & management (Tall Timbers Research Station; National Wildfire Coordinating Group; Southern Fire Encyclopedia; Lessons Learned Center; Fire Effects Information System)
What’s in the pipeline • Browse-tree view of major thematic categories • Spanish and Portuguese translations • Expansion of terminologies in specific content areas • Wildlife diseases • Information sciences • Climate change • Evolutionary biology • Collaboration towards ontology development
Don’t reinvent the wheel! • If you need a terminology system, you may not have to start from scratch to get one. • Terminology registries list what systems already exist • Great places to start: EPA, UN-FAO registries, and the CENDI terminology locators • Some terminologies are accessible via Web services • Seek out other agencies and organizations (public & private sector) in your line of business with an existing terminology system • Can you license theirs? • Can you enter into a co-management agreement?
What’s available from us online • Simple term look-up tool • Web service to our thesaurus for public use • SKOS implementation of our thesaurus, available to anyone • KEY MESSAGE: it’s free • <http://thesaurus.nbii.gov>
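For readers who have not worked with SKOS, here is a minimal sketch of reading preferred labels, alternate labels, and relationships out of a SKOS file in RDF/XML. The SKOS and RDF namespaces and property names are standard; the sample concept, the `example.org` URIs, and the `load_concepts` helper are hypothetical, and the sketch assumes concepts are serialized as typed `skos:Concept` nodes, which is only one of several valid RDF/XML forms. It does not call or describe the NBII Web service.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Invented sample concept; URIs and labels are placeholders.
SAMPLE_SKOS = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/concept/1">
    <skos:prefLabel xml:lang="en">prescribed burning</skos:prefLabel>
    <skos:altLabel xml:lang="en">controlled burning</skos:altLabel>
    <skos:broader rdf:resource="http://example.org/concept/2"/>
    <skos:related rdf:resource="http://example.org/concept/3"/>
  </skos:Concept>
</rdf:RDF>"""


def load_concepts(rdf_xml):
    """Index skos:Concept nodes by URI with their labels and relationships."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for node in root.findall("skos:Concept", NS):
        concepts[node.get(RDF_ABOUT)] = {
            "prefLabel": node.findtext("skos:prefLabel", default="", namespaces=NS),
            "altLabels": [e.text for e in node.findall("skos:altLabel", NS)],
            "broader": [e.get(RDF_RESOURCE) for e in node.findall("skos:broader", NS)],
            "related": [e.get(RDF_RESOURCE) for e in node.findall("skos:related", NS)],
        }
    return concepts


concepts = load_concepts(SAMPLE_SKOS)
for uri, concept in concepts.items():
    print(uri, concept["prefLabel"], concept["altLabels"])
```

For anything beyond a quick look, a full RDF library is the safer choice, since it handles every serialization of the same graph.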
Thanks for your time! Lisa Zolly NBII Knowledge Manager USGS Biological Informatics Office lisa_zolly@usgs.gov