Build EPA’s Synaptica in the Cloud: An Enterprise Vocabulary Catalog for Data.gov/semantic. Brand Niemann U.S. EPA July 8, 2010 http://semanticommunity.net. Disclaimer: These slides do not reflect the views of the U.S. Environmental Protection Agency
Build EPA’s Synaptica in the Cloud: An Enterprise Vocabulary Catalog for Data.gov/semantic Brand Niemann U.S. EPA July 8, 2010 http://semanticommunity.net Disclaimer: These slides do not reflect the views of the U.S. Environmental Protection Agency and do not constitute endorsement by the EPA of the standards or products mentioned.
Overview • The Challenge • EPA’s Synaptica Program • The Expert and His Advice • The Cloud Tools • The Inspiration • The Data Sources • Other Sources of Data • The Process • The Results • Comments • Acknowledgements • References
The Challenge • EPA's Synaptica (Vocabulary Catalog) is a closed system for terminology services whose valuable vocabularies need to be usable in Semantic Web applications. Semantic Web applications require harmonized enterprise vocabularies that are referenced by well-defined web addresses (URIs or URLs) and used in data models written in RDF Schema (RDFS). EPA's Synaptica is part of EPA's System of Registries (SoR), which is also moving towards a Semantic Web application (URIs and metadata). • See jahendler: "Good Data.gov meeting with @georgethomas and others - lots of cool stuff in #semweb space - starting to focus on URIs and metadata" • Data science and data forensics take a systematic approach to understanding and auditing data resources so that non-experts can use those data resources more easily and confidently. http://epadata.wik.is/EPA's_Synaptica#The_Challenge
EPA’s Synaptica Program • 1. Go to the System of Registries at http://www.epa.gov/sor • 2. Click on "Login for EPA and Partners" (on the left in the blue navigation bar) and provide your EPA Portal User ID and password (same as your Novell login). • Do not use your Synaptica User ID/password for this login. • 3. Once you are logged in, click on Terminology Services • 4. Go to "Manage Terminology" tab, select "Access Terminology Tool" • 5. Click "Launch Terminology Services Tool" • 6. Click "Continue to this Web Site (not recommended)" • 7. Login to Synaptica using your User ID and password http://epadata.wik.is/EPA's_Synaptica#EPA's_Synaptica_Program
EPA’s Synaptica Program http://www.epa.gov/sor
EPA’s Synaptica Program https://iaspub.epa.gov/sor_internet/registry/sysofreg/login/login.do
EPA’s Synaptica Program https://iaspub.epa.gov/sor_extranet/registry/sysofreg/home/overview/home.do
EPA’s Synaptica Program https://iaspub.epa.gov/sor_extranet/registry/termreg/home/overview/home.do
EPA’s Synaptica Program https://iaspub.epa.gov/sor_extranet/registry/termreg/manageterminology/accessterminologytool/
EPA’s Synaptica Program https://etss.epa.gov/home/homepage.asp
EPA’s Synaptica Program https://etss.epa.gov/home/homepage.asp?vtvpid=1000041
EPA’s Synaptica Program Show all importable files (.XLS, .CSV, .TXT, .XML) See next two slides https://etss.epa.gov/tools/XMLUploadForm.asp?ext=ALL
EPA’s Synaptica Program https://etss.epa.gov/incoming/AG%20101%20load%20file.xls
EPA’s Synaptica Program https://etss.epa.gov/incoming/Aquatic%20Biodiversity%20Glossary_Term_LOAD%2020091116.csv
EPA’s Synaptica Program None! ZThes, RDF/SKOS and RDF/OWL formatted XML files (.XML) https://etss.epa.gov/tools/XMLUploadForm.asp?ext=XML
The Expert and His Advice • Edward Tufte Presidential appointment announced by White House, March 5, 2010. • Tufte Comment on iPhone interface design: Better to have users looking over material adjacent in space within our eyespan rather than stacked in time. This is especially the case for statistical data, where the fundamental analytical task is to make comparisons. Also see page 159 in the book reference below. • http://epadata.wik.is/EPA's_Synaptica#References http://epadata.wik.is/EPA's_Synaptica#The_Expert_and_His_Advice
The Cloud Tools http://cloud.mindtouch.com/
The Cloud Tools http://epadata.wik.is/EPA's_Synaptica#The_Cloud_Tools
The Cloud Tools http://spotfire.tibco.com/
The Cloud Tools http://ondemand.spotfire.com/public/Help/index.htm
The Inspiration H1N1 Spread Courtesy of TIBCO Spotfire. See Web Player. http://epadata.wik.is/EPA's_Synaptica#The_Inspiration
The Data Sources http://epadata.wik.is/EPA's_Synaptica/File_Import_Manager
Other Sources of Data http://epadata.wik.is/System_of_Registries/Environmental_Terminology_System_and_Services
The Process • Linked Data Design Principles: • Use URIs as names for things • Use HTTP URIs so that people can look up those names • When someone looks up a URI, provide useful information, using the standards (RDF and SPARQL) • Include links to other URIs so that they can discover more things • A nice summary of the 5 star scheme: • make your stuff available on the web (whatever format) • make it available as structured data (e.g. Excel instead of an image scan of a table) • use a non-proprietary format (e.g. CSV instead of Excel) • use URLs to identify things, so that people can point at your stuff • link your data to other people’s data to provide context Source: http://www.w3.org/DesignIssues/LinkedData http://epadata.wik.is/EPA's_Synaptica#The_Process
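As an illustration of the four design principles, the following minimal Python sketch gives a glossary term an HTTP URI, describes it with standard SKOS properties, and links it to a term in another vocabulary. It is not taken from the slides or from Synaptica: the `http://example.org/vocab/` namespace, the function names, and the sample term are all hypothetical, and the output is plain N-Triples built with the standard library only.

```python
from urllib.parse import quote

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/vocab/"  # hypothetical namespace, not an EPA address


def term_uri(label, base=BASE):
    """Principles 1 and 2: name the term with an HTTP URI."""
    slug = "-".join(label.lower().split())
    return base + quote(slug, safe="-")


def literal(text, lang="en"):
    """Escape a string as an N-Triples language-tagged literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'


def term_to_ntriples(label, definition, related=()):
    """Principle 3: describe the term in RDF. Principle 4: link to other URIs."""
    subject = f"<{term_uri(label)}>"
    lines = [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f"{subject} <{SKOS}prefLabel> {literal(label)} .",
        f"{subject} <{SKOS}definition> {literal(definition)} .",
    ]
    for uri in related:
        lines.append(f"{subject} <{SKOS}closeMatch> <{uri}> .")
    return lines


if __name__ == "__main__":
    # Sample term and external link are made up for illustration.
    for line in term_to_ntriples(
        "Acid Rain",
        "Rain with a low pH.",
        related=["http://example.com/other-vocab/acid-rain"],
    ):
        print(line)
```

Serving these triples when someone dereferences the term's URI would be a separate step that this sketch does not cover.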
The Process • The Basic Steps: • Inventory Data Sources and Plan Application • Prepare and Import Data and Metadata • Implement Layout and Analytics • Add Bookmarks and Create Data Stories • Publish and Test in Web Player • Get Feedback and Improve • First create visualizations, faceted search (filters), and analytics for each individual data source and then look for relationships between the data sources. http://epadata.wik.is/EPA's_Synaptica#The_Process
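The last step above, looking for relationships between data sources, can be sketched outside Spotfire as a simple overlap check. The Python below is a hedged illustration only: the source names, the sample terms, and the `shared_terms` helper are hypothetical, and matching on normalized labels is an assumed, deliberately naive way to spot candidate links between vocabularies.

```python
def normalize(label):
    """Lowercase a label and collapse whitespace so near-identical labels match."""
    return " ".join(label.lower().split())


def shared_terms(sources):
    """Return terms that occur in two or more sources.

    `sources` maps a source name to an iterable of term labels.
    The result maps each normalized term to the sorted list of
    source names that contain it.
    """
    index = {}
    for name, labels in sources.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(name)
    return {term: sorted(names) for term, names in index.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up sample vocabularies, for illustration only.
    sources = {
        "Glossary A": ["Acid Rain", "Aquifer"],
        "Glossary B": ["acid  rain", "Wetland"],
    }
    print(shared_terms(sources))
```

Terms that appear in more than one source are natural candidates for harmonization or cross-linking.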
The Results • Reproduced EPA’s Synaptica File Manager in the Wiki. • Dragged and dropped EPA’s Synaptica File Manager files into Spotfire to make a selection for Spotfire. • See right-hand side. http://epadata.wik.is/EPA's_Synaptica#The_Results
The Results Spotfire on PC http://epadata.wik.is/EPA's_Synaptica#The_Results
The Results Search Spotfire on PC http://epadata.wik.is/EPA's_Synaptica#The_Results
The Results Spotfire Web Player http://epadata.wik.is/EPA's_Synaptica#The_Results
Comments • The initial objective was to see how fast one could re-create EPA's Synaptica (Vocabulary Catalog) using Spotfire's Multiple Visualizations, making it an open system for terminology services whose valuable vocabularies can be used in Semantic Web applications (URIs and metadata). Now it is ready to integrate with other vocabularies for Data.gov/semantic. • Please use the Add Comment feature at the bottom of this wiki page to provide feedback and suggest additional analyses you would like to see. To use the Add Comment feature you first need to register by providing your email address. Your privacy will be respected and your email address will not be available to others or used for any other purpose. You can also download the Spotfire File (<1 MB) from this Wiki and a 30-day free evaluation copy from http://spotfire.tibco.com/ and reuse these analyses, or add your own data to this file or to new Spotfire files that you create. Have fun and give us your feedback! http://epadata.wik.is/EPA's_Synaptica#Comments
Acknowledgements • The author acknowledges gratefully Dean Allemang, Cory Casanave, Sean Connors, Mills Davis, Li Ding, David Eng, Lee Feigenbaum, Aaron Fulkerson, Jim Hendler, Ralph Hodgson, Kevin Kirby, Kevin Jackson, Bob Marcus, John McMahon, Richard Murphy, Brand Niemann, Jr., Barry Nussbaum, Matthew Phoenix, Tony Shaw, Jeff Stein, George Strawn, George Thomas, Pete Tseronis, and Edward Tufte. http://epadata.wik.is/EPA's_Synaptica#Acknowledgements
References • Brand L. Niemann, Put Your Desktop in the Cloud to Support the Open Government Directive and Data.gov/semantic, April 19, 2010, Semantic Universe. • Brand L. Niemann, Build Your Own Data.gov (Spotfire) and EPA Microsite (Spotfire) with Semantics and Statistics in the Cloud, May 15, 2010. Slides. • Brand L. Niemann, Build Your Community Health Information "Design for America" Using Mindtouch and Spotfire, May 17, 2010. Slides. • Brand L. Niemann, Build EPA’s CASTNET In the Cloud, May 21 and 30, 2010. Slides, Mindtouch, and Spotfire. • Brand L. Niemann, Build Your Own Data.gov/semantic with Mindtouch and Spotfire in the Cloud: The White House Visitor Database, May 22, 2010. Slides. See Data.gov takes the 'Mumsy' test, FCW, May 26, 2010. • Brand L. Niemann, Build EPA's Facility Registry System (FRS) and Locational Reference Database with Mindtouch and Spotfire in the Cloud: Virginia, June 1, 2010. • Brand L. Niemann, Build the UK’s COINS in the Data Science Library Cloud. Mindtouch and Slides. June 9, 2010. • Brand L. Niemann, Build EPA's Envirofacts in the Cloud: Virginia FRS, NPL, and TRI. Mindtouch and Slides. June 14, 2010. • Brand L. Niemann, Build the SemTech 2010 in the Cloud (Mindtouch and Spotfire), No Slides, July 2, 2010. • Brand L. Niemann, Build the White House Staff Salaries in the Cloud (Mindtouch and Spotfire). Slides, July 3, 2010. http://epadata.wik.is/EPA's_Synaptica#References