Tutorial on data aggregation and accessing datasets. Nikos Manolis Agro-Know Technologies. There is a lot of data. Need for data aggregation and harmonization. Objectives. This presentation aims to provide information on: How to use a service for aggregating datasets
Tutorial on data aggregation and accessing datasets Nikos Manolis Agro-Know Technologies
Objectives
This presentation aims to provide information on:
• How to use a service for aggregating datasets
• How to get already processed datasets
• How to search processed datasets with a search API
  • Educational – GLN API (21008 res)
  • Bibliographic – ABN API (451602 res)
The agDataHarvester service
• Implements the OAI-PMH protocol to harvest metadata records from open data providers
• REST-based API
• Harvested datasets are available through HTTP
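To make the OAI-PMH step concrete, the sketch below shows what a harvest request looks like at the protocol level. It is not agDataHarvester's own code; `list_records_url` and `next_token` are hypothetical helper names, and only the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, and the OAI-PMH 2.0 XML namespace come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken of a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# First page of a harvest in the "mets" format.
first_url = list_records_url("http://repository.ias.ac.in/cgi/oai2",
                             metadata_prefix="mets")
print(first_url)
```

A harvester would fetch `first_url`, store the records, and keep requesting pages with the returned token until `next_token` yields `None`.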
agDataHarvester parameters
{
  "document_type": "harvesting_target",
  "harvesting_target": {
    "name": "Repository name",
    "description": "ShortRepositoryDescription",
    "url": "OAI-PMH target URL",
    "type": "metadata format prefix",
    "frequency": hours
  }
}
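The parameter document can also be generated rather than written by hand. The following is a minimal sketch: `make_harvesting_target` is a hypothetical helper, and its two checks (an HTTP(S) URL and a positive whole number of hours) are assumptions about sensible input, not rules documented for the service.

```python
import json


def make_harvesting_target(name, description, url, metadata_prefix, frequency_hours):
    """Return a harvesting_target document with the fields shown on the slide."""
    # Assumed sanity checks; the service itself may accept other values.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url should be an HTTP(S) OAI-PMH endpoint")
    if not isinstance(frequency_hours, int) or frequency_hours <= 0:
        raise ValueError("frequency should be a positive number of hours")
    return {
        "document_type": "harvesting_target",
        "harvesting_target": {
            "name": name,
            "description": description,
            "url": url,
            "type": metadata_prefix,
            "frequency": frequency_hours,
        },
    }


doc = make_harvesting_target(
    name="Indian Academy of Science",
    description="Indian Academy of Science",
    url="http://repository.ias.ac.in/cgi/oai2",
    metadata_prefix="mets",
    frequency_hours=24,
)
# Serialized form, suitable for saving as param.json.
param_json = json.dumps(doc, indent=2)
print(param_json)
```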
param.json
{
  "document_type": "harvesting_target",
  "harvesting_target": {
    "name": "Indian Academy of Science",
    "description": "Indian Academy of Science",
    "url": "http://repository.ias.ac.in/cgi/oai2",
    "type": "mets",
    "frequency": 24
  }
}
• curl -X POST -d@param.json http://'demo001':aginfra@agro.ipb.ac.rs/agcouchdb
• { "ok": true, "id": "5c56a3fa18fa21d2a85fd63cc9eb78ac", "rev": "1-19ef1210376df8f1695a32b53ecb963a" }
curl -X POST -d@INDUS.json http://'demo001':aginfra@agro.ipb.ac.rs/agcouchdb
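The same submission can be scripted. This is a sketch using only the Python standard library; the endpoint and the demo credentials are those shown on the slide, while `build_submit_request` and `submit` are hypothetical helpers. The explicit `Content-Type: application/json` header is an assumption, since the curl command on the slide does not set one.

```python
import base64
import json
from urllib.request import Request, urlopen

COUCHDB_URL = "http://agro.ipb.ac.rs/agcouchdb"


def build_submit_request(document, user, password, url=COUCHDB_URL):
    """Prepare (but do not send) the POST that registers a harvesting target."""
    body = json.dumps(document).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",  # assumption, not shown on the slide
        "Authorization": f"Basic {token}",   # HTTP Basic auth, as in user:pass@host
    }
    return Request(url, data=body, headers=headers, method="POST")


def submit(document, user, password):
    """Send the request and return the id of the stored parameter document."""
    with urlopen(build_submit_request(document, user, password)) as response:
        reply = json.load(response)
    if not reply.get("ok"):
        raise RuntimeError(f"submission failed: {reply}")
    return reply["id"]


# Usage (performs a network call, so it is left commented out):
# with open("param.json") as fh:
#     process_parameter_id = submit(json.load(fh), "demo001", "aginfra")
```

The returned `id` is the value to pass as `dataset.process_parameter_id` when asking for details on the harvested dataset.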
Get details on the dataset
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process_parameter_id=5c56a3fa18fa21d2a85fd63cc9eb78ac
Get details on the dataset
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_view/list_by_process?key=agdataharvester
{
  "id": "6796259b52d79e4797e210c06e6a0aee",
  "key": "6796259b52d79e4797e210c06e6a0aee",
  "value": {
    "_id": "6796259b52d79e4797e210c06e6a0aee",
    "_rev": "1-d55d7bc90d26db64dae328c9328e4e4a",
    "document_type": "harvesting_target",
    "harvesting_target": {
      "name": "WorldBank",
      "description": "The World Bank - Open Knowledge Repository",
      "url": "https://openknowledge.worldbank.org/oai/request",
      "type": "mets",
      "frequency": 24
    },
    "document_publisher": {
      "address": "83.212.96.169",
      "author": "demo001",
      "utc_datetime": "Wed Dec 11 11:58:45 2013",
      "utc_timestamp": 1386763125
    }
  }
}
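A short sketch of how a client might build the details URL from a submission id and pull the useful fields out of one returned row. `details_url` and `summarize_row` are hypothetical helpers; the field names are the ones visible in the response above, and the row is assumed to have already been parsed from JSON.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

DATASETS = "http://agro.ipb.ac.rs/agcouchdb/_design/datasets"


def details_url(process_parameter_id):
    """URL listing the dataset produced for a given parameter document id."""
    query = urlencode({"dataset.process_parameter_id": process_parameter_id})
    return f"{DATASETS}/_list/search/list?{query}"


def summarize_row(row):
    """Flatten one view row into the fields a user usually needs."""
    value = row["value"]
    target = value["harvesting_target"]
    publisher = value.get("document_publisher", {})
    timestamp = publisher.get("utc_timestamp")
    return {
        "id": row["id"],
        "name": target["name"],
        "oai_url": target["url"],
        "format": target["type"],
        "every_hours": target["frequency"],
        "submitted_by": publisher.get("author"),
        # Convert the epoch seconds to an ISO 8601 UTC string when present.
        "submitted_at": (
            datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
            if timestamp is not None else None
        ),
    }


print(details_url("5c56a3fa18fa21d2a85fd63cc9eb78ac"))
```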
The agWorkflow service
I want all datasets with educational resources processed by the agINFRA-powered aggregation workflow!
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_lom&dataset.accuracy=true
I want all datasets with bibliographic resources processed by the agINFRA-powered aggregation workflow!
http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list?dataset.process=agworkflow&dataset.type=oai_agris&dataset.accuracy=true
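The two queries differ only in `dataset.type`, so they can be produced from one helper. A minimal sketch, in which `workflow_datasets_url` and the `DATASET_TYPES` mapping are hypothetical names; the parameter names and values are taken from the two URLs on the slide.

```python
from urllib.parse import urlencode

SEARCH_LIST = "http://agro.ipb.ac.rs/agcouchdb/_design/datasets/_list/search/list"

# Resource kind -> dataset.type value, as used in the two example queries.
DATASET_TYPES = {
    "educational": "oai_lom",
    "bibliographic": "oai_agris",
}


def workflow_datasets_url(kind):
    """URL for all datasets of one kind processed by the aggregation workflow."""
    params = {
        "dataset.process": "agworkflow",
        "dataset.type": DATASET_TYPES[kind],
        "dataset.accuracy": "true",
    }
    return SEARCH_LIST + "?" + urlencode(params)


for kind in DATASET_TYPES:
    print(kind, workflow_datasets_url(kind))
```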
Search API
• REST-based queries over harmonized information (the result of metadata processing)
• Two data models supported
  • akif: describing educational resources for agriculture, http://54.228.180.124:8080/search-api/v1/akif/?q=*
  • agrif: describing bibliographic resources for agriculture (mainly from FAO’s data), http://212.189.145.241:8080/search-api/v1/agrif/?q=*
Search options
• Simple search: http://BASE_URL/search-api/v1/akif/?q=tomato
• Searching within specific fields: http://BASE_URL/search-api/v1/akif/?languageBlocks.en.description=tomato
• Temporal: http://BASE_URL/search-api/v1/akif/?creationDate=2013-04-16
• Fetching specific items: http://BASE_URL/search-api/v1/akif/COLLECTION/20296
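The four search options map onto two URL shapes: a query against a data model and a direct path to an item. The sketch below builds them; `search_url` and `item_url` are hypothetical helpers, and `BASE_URL` and `COLLECTION` stay as the placeholders used on the slide.

```python
from urllib.parse import urlencode

BASE_URL = "http://BASE_URL"  # placeholder, as on the slide


def search_url(base_url, model, params):
    """Query URL for a data model ("akif" or "agrif") with the given parameters."""
    # safe="*" keeps the wildcard readable (q=*) instead of encoding it as %2A.
    return f"{base_url}/search-api/v1/{model}/?" + urlencode(params, safe="*")


def item_url(base_url, model, collection, item_id):
    """Direct URL of a single item inside a collection."""
    return f"{base_url}/search-api/v1/{model}/{collection}/{item_id}"


print(search_url(BASE_URL, "akif", {"q": "tomato"}))                              # simple search
print(search_url(BASE_URL, "akif", {"languageBlocks.en.description": "tomato"}))  # field search
print(search_url(BASE_URL, "akif", {"creationDate": "2013-04-16"}))               # temporal
print(item_url(BASE_URL, "akif", "COLLECTION", 20296))                            # specific item
```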
Managing results
• Sorting results, e.g. ?q=*&sort_by=creationDate&sort_order=desc
• Facets, e.g. ?facets=set&facet_size=3
• Pagination, e.g. ?q=sea&page_size=25&page=3
Full documentation: 54.228.180.124:8080/search-api/
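A sketch of how the result-management parameters might be assembled in a client. `results_url` and `page_urls` are hypothetical helpers and the endpoint uses the `BASE_URL` placeholder; whether sorting, facets and pagination can all be combined in one request, and which number the first page has, are assumptions to check against the full documentation.

```python
from urllib.parse import urlencode

ENDPOINT = "http://BASE_URL/search-api/v1/akif/"  # placeholder host


def results_url(endpoint, **options):
    """Build a query URL, skipping any option left as None."""
    params = {name: value for name, value in options.items() if value is not None}
    # safe="*" keeps the wildcard query readable as q=*.
    return endpoint + "?" + urlencode(params, safe="*")


def page_urls(endpoint, q, page_size, pages):
    """Yield one URL per requested page number."""
    for page in pages:
        yield results_url(endpoint, q=q, page_size=page_size, page=page)


print(results_url(ENDPOINT, q="*", sort_by="creationDate", sort_order="desc"))
print(results_url(ENDPOINT, facets="set", facet_size=3))
for url in page_urls(ENDPOINT, q="sea", page_size=25, pages=[1, 2, 3]):
    print(url)
```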
Nikos Manolis Agro-Know Technologies manolisn@agroknow.gr