Under the hood • 3500+ source files in Java • J2SE, Tomcat, Pellet, MySQL, Spring • 851 MB code base
How come? Instrument Date Parameter
Open-source Project for a Network Data Access Protocol test site
Scientific Data Portal Modeling: what exactly is a scientific data portal and how can we generate one? Linyun Fu Computer Science 2015 Advisor: Peter Fox 2013-04-30
Anything we could reuse? Modules, themes, templates, nodes, contents, content types…
How did we build our portals? We just built them from scratch
NOAA Operational Model Archive Distribution System [Alpert AGU2009]
http://coreref.org/projects/and1-1b/viewer/ [Reed AGU2009]
http://databasin.org/search/ [Comendant AGU2009]
Gulf of Mexico Coastal Ocean Observing System: Information for Boaters, Fishermen and Divers
GCOOS is a Regional Association of the U.S. Integrated Ocean Observing System. Members represent the private sector, government agencies, academia, and education and outreach sectors. GCOOS is concerned with sustained observations, integration of existing observing systems, sharing of non-proprietary data, and developing products and services.
BOATING: Information on waves, tides, currents, wind speed and direction, and air and water temperature aid navigation and improve marine forecasts.
FISHING: Information on bathymetry, water temperature, chlorophyll, ocean currents, salinity and dissolved oxygen can be used to locate favorable habitats and frontal systems where food and animals aggregate.
DIVING: Information on surface and subsurface currents, temperature throughout the water column, water quality (e.g. red tide conditions) and water clarity can aid planning efforts and promote safer, more enjoyable diving.
Figure captions: Multibeam map of the Pinnacles Region, Gulf of Mexico, showing a drowned reef complex along the 70 m isobath. Image credit: James Gardner, Peter Dartnell and Kenneth Sulak, USGS. • Wave height and wave period data from WAVCIS. • Satellite false-color image showing chlorophyll (ug/l) distribution in the Gulf of Mexico. Image credit: USF Institute of Marine Remote Sensing. • Ocean current data showing profiles between the surface and 21 meters. Courtesy of Bob Weisberg, USF COMPS.
www.gcoos.org
Poster credit: Christina Simoniello, USM Gulf Coast Research Laboratory
Then, someone said “Let’s reuse” Interoperability and reuse
Interoperability - Definition • in·ter·op·er·a·bil·i·ty: ability of a system (as a weapons system) to work with or use the parts or equipment of another system. – Merriam-Webster's Collegiate Dictionary, Eleventh Edition. • Between system and system. • Ability to work together. • Ability to share resources. • As users of the Data Systems, we are more aware of and concerned with interoperability among data products. [Kuo AGU2010] Versatile tools and standardized frameworks
But we say “Why not generate?” After we understand data portals
So a data portal is… [Concept map: the nodes Task, User Interface, Service and Dataset, linked by the relations "fulfills", "presents" and "supplies"]
How to use the model? • As a checklist of design issues for portal engineers • To compile best practices into a well-organized recipe for quick reference • Sample dishes coming soon • To facilitate choice among frameworks • As the brain of the data portal generator • Inspiring ideas to follow…
Service description
spec:api a api:API;
    ...
    api:sparqlEndpoint <http://localhost:3030/cmspv/query>;
    #api:sparqlEndpoint <local:data/example-data.ttl>
    api:base "http://aquarius.tw.rpi.edu:8047/elda/cmspv";
    ...
    api:variable [api:name "base"; api:value "http://cmspv.tw.rpi.edu/rdf"], ...
URI template
spec:vocabulariesEndpoint a api:ListEndpoint;
    api:uriTemplate "/vocabs";
    api:exampleRequestPath "/vocabs";
    api:selector [
        api:where "?item rdf:type skos:ConceptScheme. ?item skos:prefLabel ?label.";
        api:orderBy "?label";
    ];
    .
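To make the selector concrete, here is a minimal Python sketch of how an api:where pattern and an api:orderBy clause could be turned into a SPARQL query for the list endpoint. It is an illustration only, not ELDA's actual query generation: the function name selector_to_sparql, the paging defaults and the exact query layout are assumptions, while the rdf and skos namespace URIs are the standard W3C ones.

```python
# Illustrative only: a rough approximation of the kind of query a
# Linked Data API implementation might derive from a selector.
# selector_to_sparql and its paging defaults are hypothetical.

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def selector_to_sparql(where, order_by=None, page_size=10, page=0):
    """Build a SELECT query for ?item from a where pattern and an ordering."""
    lines = [f"PREFIX {name}: <{uri}>" for name, uri in PREFIXES.items()]
    lines += ["SELECT ?item", "WHERE {", f"  {where}", "}"]
    if order_by:
        lines.append(f"ORDER BY {order_by}")
    # Paging: each page is a fixed-size window over the ordered results.
    lines.append(f"LIMIT {page_size} OFFSET {page * page_size}")
    return "\n".join(lines)


query = selector_to_sparql(
    "?item rdf:type skos:ConceptScheme. ?item skos:prefLabel ?label.",
    order_by="?label",
)
print(query)
```

The resulting string could then be sent to the endpoint named by api:sparqlEndpoint using any SPARQL client.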
Default viewer and formatter http://cmspv.tw.rpi.edu/rdf/vocabs
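The address on this slide is consistent with joining the value of the "base" variable from the service description with the "/vocabs" URI template. The Python sketch below shows that join under the assumption that the public address is formed this way; expand_uri_template is a hypothetical helper, and the "/vocabs/{scheme}" template in the second call is invented purely to show variable substitution.

```python
import re
from urllib.parse import quote


def expand_uri_template(base, template, variables=None):
    """Join a base URI with a template, filling {name} placeholders."""
    variables = variables or {}

    def substitute(match):
        # Percent-encode each value so it is safe inside a path segment.
        return quote(str(variables[match.group(1)]), safe="")

    path = re.sub(r"\{(\w+)\}", substitute, template)
    return base.rstrip("/") + path


# Values taken from the slides.
print(expand_uri_template("http://cmspv.tw.rpi.edu/rdf", "/vocabs"))

# Hypothetical template with a variable, for illustration only.
print(expand_uri_template("http://cmspv.tw.rpi.edu/rdf",
                          "/vocabs/{scheme}", {"scheme": "sea ice"}))
```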
But we want more • More data source choices • Easier specification syntax • Or specification editor/wizard • More flexible data formats • Visualization and analysis support
GRASS GIS functionalities Geographic Resources Analysis Support System [Neteler 2008]
PyWPS Web Processing Service implemented in Python
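As a rough illustration of what a client of such a service sends, the Python sketch below builds key-value request URLs in the style of the OGC WPS 1.0.0 interface. The endpoint address and the process identifier "buffer" are hypothetical, and wps_request_url is a helper written for this example, not part of PyWPS.

```python
from urllib.parse import urlencode


def wps_request_url(endpoint, request, **params):
    """Build a WPS key-value-pair request URL for the given operation."""
    query = {"service": "WPS", "request": request}
    # In WPS 1.0.0, every operation except GetCapabilities carries a version.
    if request != "GetCapabilities":
        query["version"] = "1.0.0"
    query.update(params)
    return endpoint + "?" + urlencode(query)


ENDPOINT = "http://example.org/cgi-bin/pywps.cgi"  # hypothetical server

print(wps_request_url(ENDPOINT, "GetCapabilities"))
print(wps_request_url(ENDPOINT, "DescribeProcess", identifier="buffer"))
```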
VisKo [Del Rio 2012] Open Source Visualization Knowledge System
A sample VisKo query
VISUALIZE http://rio.cs.utep.edu/ciserver/ciprojects/GravityMapProvenance/gravityDataset.txt
AS views:2D_ContourMap
IN visko:mozilla-firefox
WHERE FORMAT = formats:SPACESEPARATEDVALUES.owl#SPACESEPARATEDVALUES
  AND TYPE = types:d19
  AND C = 10
  AND A = 20
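The sample query follows a VISUALIZE ... AS ... IN ... WHERE ... shape, so a portal generator could assemble such queries from a few fields. The Python sketch below does only that string assembly; build_visko_query is a hypothetical helper, it does not validate the terms or contact a VisKo server, and the clause order is taken from the sample rather than from a formal grammar.

```python
def build_visko_query(dataset_url, view, viewer_set, fmt, data_type, params=None):
    """Assemble a VisKo-style query string from its parts."""
    conditions = [f"FORMAT = {fmt}", f"TYPE = {data_type}"]
    # Extra parameters (such as C and A in the sample) become AND clauses.
    conditions += [f"{name} = {value}" for name, value in (params or {}).items()]
    return "\n".join([
        f"VISUALIZE {dataset_url}",
        f"AS {view}",
        f"IN {viewer_set}",
        "WHERE " + "\n  AND ".join(conditions),
    ])


query = build_visko_query(
    "http://rio.cs.utep.edu/ciserver/ciprojects/GravityMapProvenance/gravityDataset.txt",
    view="views:2D_ContourMap",
    viewer_set="visko:mozilla-firefox",
    fmt="formats:SPACESEPARATEDVALUES.owl#SPACESEPARATEDVALUES",
    data_type="types:d19",
    params={"C": 10, "A": 20},
)
print(query)
```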
Roadmap for a portal generator • Content-oriented portal software • Understand scientific data portals • Interoperability and reuse • Define portals with specification files • Embrace specialized tools • Open source • Open standards • High-level coding • Allow testing as you go
Project page: http://tw.rpi.edu/web/project/SeSF/workinggroups/ScientificDataPortalGenerator • Point of contact: Linyun Fu ful2@rpi.edu
Selected references • VSTO: Fox, P., McGuinness, D.L., Cinquini, L., West, P., Garcia, J., Benedict, J., and Middleton, D. 2009. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. Computers & Geosciences, pages 724–738. • OPeNDAP: West, P., et al. "OPeNDAP Hyrax: An extensible data access framework within the Earth System Grid Federation." AGU Fall Meeting Abstracts. Vol. 1. 2011. • SemantEco: Wang, P., Fu, L., Patton, E.W., McGuinness, D.L., Dein, J., and Bristol, S. 2012. Towards Semantically-enabled Exploration and Analysis of Environmental Ecosystems. In Proceedings of 8th IEEE International Conference on eScience (October 8-12 2012, Chicago, IL). • NOMADS: Alpert, J. C., and J. Wang. "NOAA Operational Model Archive Distribution System (NOMADS): High Availability Applications for Reliable Real Time Access to Operational Model Data." AGU Fall Meeting Abstracts. Vol. 1. 2009. • ANDRILL MIS: Reed, J., et al. "Web-based Collaboration and Visualization in the ANDRILL Program." AGU Fall Meeting Abstracts. Vol. 1. 2009. • Data Basin: Comendant, T., et al. "Data Basin: Expanding Access to Conservation Data, Tools, and People." AGU Fall Meeting Abstracts. Vol. 1. 2009.
About user needs survey: Meyer, D. J., and K. P. Gallo. "Enabling data access and interoperability at the EOS Land Processes Distributed Active Archive Center." AGU Fall Meeting Abstracts. Vol. 1. 2009. • GCOOS: • Jochens, Ann. "Gulf of Mexico Coastal Ocean Observing System." Ocean Views: News from the Ocean. US Office. Issue 29 (2006): 1. • Howard, M. K., F. C. Gayanilo, and A. E. Jochens. "Regional Ocean Data Portal: Transforming Information to Knowledge." AGU Fall Meeting Abstracts. Vol. 1. 2009. • About interoperability: Kuo, K. "Interoperability Barriers in NASA Earth Science Data Systems from the Perspective of a Science User." AGU Fall Meeting Abstracts. Vol. 1. 2010. • About reuse: Truslove, I., et al. "A Software Architecture To Encourage Internal And External Software Reuse." AGU Fall Meeting Abstracts. Vol. 1. 2011. • CmapTools: A Knowledge Modeling and Sharing Environment, A. J. Cañas, G. Hill, R. Carff, N. Suri, J. Lott, T. Eskridge, G. Gómez, M. Arroyo, R. Carvajal (2004). In: Concept Maps: Theory, Methodology, Technology, Proceedings of the First International Conference on Concept Mapping, A. J. Cañas, J. D. Novak, and F. M. González, Editors, Universidad Pública de Navarra: Pamplona, Spain. pp. 125-133. • Semantic Application Design Language (SADL): http://sadl.sourceforge.net/
ELDA: http://elda.googlecode.com/hg/deliver-elda/src/main/webapp/lda-assets/docs/E1.2.19-index.html • GRASS GIS: Neteler, Markus, and Helena Mitasova. "Open Source GIS: A GRASS GIS Approach. The International Series in Engineering and Computer Science." (2008): 406. • VisKo: Nicholas Del Rio and Paulo Pinheiro da Silva. Capturing and Using Knowledge about the Use of Visualization Toolkits. AAAI Fall Symposium on Discovery Informatics: The Role of AI Research in Innovating Scientific Processes, Arlington, Virginia, November 2, 2012. • IPython Notebook: https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks