Joint agINFRA & SCI-BUS workshop agINFRA Open data infrastructures for research in agriculture. 30th of May, 2013, Budapest, Hungary. Babis Thanopoulos Agro-Know Technologies. agINFRA in a few sentences. Based on a linked open data architecture harmonizing semantics and ontologies.
Joint agINFRA & SCI-BUS workshop
agINFRA: Open data infrastructures for research in agriculture
30th of May, 2013, Budapest, Hungary
Babis Thanopoulos, Agro-Know Technologies
agINFRA in a few sentences
• Based on a linked open data architecture harmonizing semantics and ontologies.
• Aggregating data of existing systems and taking advantage of advanced Grid & Cloud services and infrastructure.
• Devised for scalability and maximum interoperability by adapting existing widely used components.
• Fostering diverse communities of heterogeneous providers and users.
• Providing researcher-centric services.
Why sharing data? (1/2)
• Sharing research data is “an intricate and difficult problem” (Borgman, 2011, JASIST).
• Not much data sharing may be taking place, with exceptions in some domains.
• Sharing takes different forms, from private data exchange to posting online, and including journal supplementary materials.
• There are few standards for giving shared data the computational semantics required to build automated tools.
Why sharing data? (2/2)
• … however, reusing data is at the core of the principles of the scientific method.
• … and a major concern for scientists and policy makers.
Agricultural research (+) content
• publications, theses, reports, other grey literature
• educational material and content, courseware
• primary data, such as measurements & observations
  • structured (e.g. datasets as tables) / digitized (e.g. images)
• secondary data, such as processed elaborations
  • e.g. dendrograms, pie charts
• provenance information, incl. authors, their organizations and projects
• experimental protocols & methods
• social data, tags, ratings, etc.
[Figure, shown three times: stats, gene banks, GIS data, blogs, journals, open archives, raw data, technologies, learning objects, … as seen from the educators’ view, the researchers’ view, and the practitioners’ view]
is great … but it's not the answer
we create data silos
we need data pools [Figure: stats, gene banks, GIS data, blogs, journals, open archives, raw data, technologies, learning objects, …]
actually share metadata [Figure: catalog record with the fields Author, Subject, ID, Title, Publisher, Date]
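The idea of moving from silos to a pool by sharing metadata can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CatalogRecord` type, the `pool_records` helper, and all sample values are hypothetical and do not come from agINFRA; only the six field names are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """Hypothetical catalog entry using the six fields named on the slide."""
    id: str
    title: str
    author: str
    subject: str
    publisher: str
    date: str  # ISO 8601 date string, e.g. "2013-05-30"


def pool_records(*silos):
    """Merge records from several silos into one pool keyed by ID.

    If the same ID appears in more than one silo, the first occurrence wins.
    """
    pool = {}
    for silo in silos:
        for record in silo:
            pool.setdefault(record.id, record)
    return pool


# Hypothetical sample data standing in for two separate silos.
gene_bank = [
    CatalogRecord("gb-001", "Wheat accession passport data", "A. Author",
                  "gene banks", "Example Gene Bank", "2012-11-02"),
]
open_archive = [
    CatalogRecord("oa-042", "Soil moisture measurements", "B. Author",
                  "raw data", "Example Archive", "2013-01-15"),
    # Duplicate ID: already contributed by the gene bank silo.
    CatalogRecord("gb-001", "Wheat accession passport data (copy)", "A. Author",
                  "gene banks", "Example Archive", "2013-02-01"),
]

pool = pool_records(gene_bank, open_archive)

# Once the metadata is pooled, a single query spans every source.
raw_data_ids = sorted(r.id for r in pool.values() if r.subject == "raw data")
print(len(pool), raw_data_ids)
```

The point of the sketch is that the silos only need to agree on a small set of shared fields; the underlying content can stay where it is.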
OMEKA repository tool (1/3)
• An agricultural learning repository tool
• Based on Omeka, the open-source management system for Web content, collections and digital archives (http://omeka.org)
• Helps individuals and organisations create their own collection(s) of educational material
OMEKA repository tool (2/3)
Individual content developers
• Teachers, trainers, tutors, ...
• Hosting, managing and online storage of their resources
• No technical background required
Educational institutions
• Publish and share their resources online
• No additional funding required
• No technical support required
Aggregators
• Organic.Edunet, GLN
• Creation of digital repositories
• Exposure of their content to the network
OMEKA repository tool (3/3)
[Screenshots: current version (for Administrator); agINFRA empowered version]
Why a data infrastructure?
• Enable relating data, and combining and contrasting them in novel ways.
• Enable scalable processing of research data.
• Provide services that are easy to adopt and deploy.
• Support a data-centric, integrated view of research.
• Give coherent support to a variety of research objects (Bechhofer et al., 2010).
agINFRA values
A | Open | Must be open and interlinked
  NOT subject to barriers, based on standard formats, and avoiding the data silos that result from lack of interrelatedness and ad-hoc APIs.
B | Meaningful | Must be meaningful through explicit semantics
  Reusing the semantics already provided in mature terminologies and ontologies that are exposed and interlinked through the Web.
C | Reliable | Must be reliable, traceable and accessible
  Any kind of research object can be stored in the data infrastructure, and there are NO barriers to expressing relations between these objects to capture the context of research activities.
D | Actionable | Must be actionable through services that empower research
  Data is not useful without flexible and adaptable services that allow researchers to act on the data in the ways they need.
agINFRA principles
Infrastructure
• Be sustainable in the long term
• Allow for heterogeneous and rich kinds of data featuring semantics
• Expose everything as linked open data
People
• Know and adapt to the needs of researchers
• Provide out-of-the-box, easy to adopt components
• Foster collaboration and sharing of data, via search but also casual discovery
Services
• Use existing components supported by strong communities
• Create open services that can be easily composed
• Adapt services to research workflows
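As a rough illustration of the "expose everything as linked open data" principle, the sketch below serializes one metadata record as N-Triples using Dublin Core terms (`http://purl.org/dc/terms/`). It is a minimal example under assumptions, not agINFRA's implementation: the base URI `http://example.org/resource/`, the field-to-term mapping, the `to_ntriples` helper, and the sample record are all hypothetical, and record IDs are assumed to be URI-safe already.

```python
DCTERMS = "http://purl.org/dc/terms/"
BASE_URI = "http://example.org/resource/"  # hypothetical namespace for records

# Assumed mapping from local field names to Dublin Core term names.
FIELD_TO_TERM = {
    "title": "title",
    "author": "creator",
    "subject": "subject",
    "publisher": "publisher",
    "date": "date",
}


def escape_literal(value):
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(record):
    """Return one N-Triples line per mapped field present in the record."""
    subject = f"<{BASE_URI}{record['id']}>"  # assumes the ID is URI-safe
    lines = []
    for field, term in FIELD_TO_TERM.items():
        if field in record:
            literal = escape_literal(record[field])
            lines.append(f'{subject} <{DCTERMS}{term}> "{literal}" .')
    return lines


# Hypothetical sample record.
record = {
    "id": "oa-042",
    "title": 'Soil "moisture" measurements',
    "author": "B. Author",
    "date": "2013-01-15",
}

triples = to_ntriples(record)
for line in triples:
    print(line)
```

Plain-string serialization keeps the example dependency-free; a real deployment would more likely rely on an RDF library and on URIs from shared vocabularies for subjects and authors rather than string literals.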
References
• Borgman, Christine L. (2011, submitted). The conundrum of sharing research data. Journal of the American Society for Information Science and Technology.
• Bechhofer, S., Ainsworth, J., Bhagat, J., Buchan, I., Couch, P., Cruickshank, D., Delderfield, M., Dunlop, I., Gamble, M., Goble, C., Michaelides, D., Missier, P., Owen, S., Newman, D., De Roure, D. and Sufi, S. (2010). Why Linked Data is Not Enough for Scientists. In: Sixth IEEE e-Science conference (e-Science 2010), December 2010, Brisbane, Australia.
Thank you!
Dr. Babis Thanopoulos
cthanopoulos@agroknow.gr
http://aginfra.eu