CLODA: A Crowdsourced Linked Open Data Architecture
Georgios Larkou, Julia Metochi, Georgios Chatzimilioudis and Demetrios Zeinalipour-Yazti
Data Management Systems Laboratory, Department of Computer Science, University of Cyprus, http://dmsl.cs.ucy.ac.cy/
First IEEE Intl. Workshop on Mobile Data Management, Mining and Computing on Social Networks (MobiSocial), co-located with IEEE MDM'13, June 3, 2013, Milan, Italy.
Crowdsourcing Definitions
• Crowdsourcing = Crowd + Outsourcing
• Jeff Howe (2006). "The Rise of Crowdsourcing". Wired.
• Definition from Wikipedia: "Crowdsourcing refers to a distributed problem-solving model in which a crowd of undefined size is engaged in the task of solving a complex problem through an open call"
"Crowdsourcing with Smartphones", Georgios Chatzimilioudis, Andreas Konstantinidis, Christos Laoudias, Demetrios Zeinalipour-Yazti, IEEE Internet Computing, Special Issue: Sep/Oct 2012 - Crowdsourcing, May 2012. IEEE Press, Volume 16, Pages: 36-44, 2012.
Crowdsourcing Stakeholders
[Diagram: Requester (Crowdsourcer), Platform and Workers (Solvers), connected by Open Call (Task), Solutions and Rewards]
Crowdsourcing Incentives
• Tangible (Monetary) Incentives
• Cash, Credit or Gifts (MTurk, Kickstarter)
• Unintended or as-a-by-product (reCAPTCHA)
• Ethical Incentives
• Socialize & Fun
• Earn Prestige
• Altruism
• Learn something New
• Usually a combination of several incentives
CLODA Motivation
Collect & link open and closed data with smartphones through an open call for ethical benefit.
• (a) Sensor / Geo Data
• (b) Closed Data: Web 2.0 APIs (Google, Twitter, Facebook)
• (c) Linked Open Data (LOD) (Freebase, DBpedia)
• (d) CLODA (LOD)
• CLODA Prototype: Collect, Reward, Verification
Linked Open Data (LOD)
• Linked data refers to web-accessible data (HTTP/URI addressable) that is structured (RDF) to allow computers to link and query (SPARQL) the data [ aka the 4 principles ]
• Conceived by Tim Berners-Lee and realized by the Semantic Web Community.
• Tim Berners-Lee (2006). "Linked Data - Design Issues". W3C.
• Linked Open Datasets (LOD):
• DBpedia (3.6M things describing Wikipedia)
• GeoNames (describing 7.5M geographic features)
• YAGO & YAGO2 (combining Wikipedia, GeoNames and others)
• Freebase (39M things - Google's LOD project!)
• FOAF (describing People - Relationships - Open Social Net!)
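The idea of URI-addressable, structured data that a computer can link and query can be sketched in a few lines. This is only an in-memory illustration: the `match` helper is hypothetical and mimics a single triple pattern with `null` as a wildcard, it is not a SPARQL engine, and the triples are illustrative rather than copied from DBpedia.

```javascript
// Well-known RDF/RDFS property URIs used as predicates.
const RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// Each fact is a (subject, predicate, object) triple.
// Illustrative data only; not taken from a real dataset.
const triples = [
  ["http://dbpedia.org/resource/Manchester", RDFS_LABEL, "Manchester"],
  ["http://dbpedia.org/resource/Manchester", RDF_TYPE, "http://example.org/City"],
  ["http://dbpedia.org/resource/Milan", RDFS_LABEL, "Milan"],
];

// Hypothetical helper: return all triples matching a pattern,
// where null in any position matches anything.
function match(store, [s, p, o]) {
  return store.filter(([ts, tp, to]) =>
    (s === null || s === ts) &&
    (p === null || p === tp) &&
    (o === null || o === to));
}

// "Give me every label": subject and object are left open.
const labels = match(triples, [null, RDFS_LABEL, null]).map(([, , o]) => o);
console.log(labels);
```

Because subjects are URIs, a triple in one dataset can point at a resource in another, which is what makes the data linkable.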
Traditional LOD (RDF)
• DBpedia Example: LOD describing 3.64M Wikipedia things, of which 1.83M are classified in a consistent ontology
• 416,000 persons, 526,000 places, 106,000 music albums, 60,000 films, 17,500 video games, 169,000 organizations, 183,000 species and 5,400 diseases
[Figure: Infobox example]
LOD the Google (JSON) Way!
• Freebase Example: another LOD by Google.
[Figure: Instances and Relationships]
LOD is Interlinked & Annotated
• LOD is interlinked, e.g., Freebase points to DBpedia predicates … http://dbpedia.org/page/Manchester
LOD is Highly Interlinked! http://www.stateofsearch.com/search-in-the-knowledge-graph-era/
LOD can be Queried!
• LOD can be queried, e.g., Freebase with MQL queries (JSON-encoded, like Web 2.0 APIs)
• Resembles XPath querying
[Figure: MQL Queries]
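The JSON-encoded, fill-in-the-blanks style of such queries can be sketched as follows. This is a simplified query-by-example matcher over a small in-memory array, not the Freebase service or its full query language: the `queryByExample` helper is hypothetical, concrete values in the template act as constraints, and `null` marks a field whose value should be returned.

```javascript
// Small in-memory sample; a real LOD store would hold millions of such topics.
const topics = [
  { type: "/music/album", artist: "The Police", name: "Outlandos d'Amour" },
  { type: "/music/album", artist: "The Police", name: "Synchronicity" },
  { type: "/music/album", artist: "Michael Jackson", name: "Thriller" },
];

// Hypothetical helper: keep items whose fields equal every non-null value
// in the template, then project each item onto the template's keys.
function queryByExample(data, template) {
  const keys = Object.keys(template);
  return data
    .filter((item) => keys.every((k) => template[k] === null || item[k] === template[k]))
    .map((item) => Object.fromEntries(
      keys.map((k) => [k, k in item ? item[k] : null])));
}

// "All albums by The Police, and tell me their names."
const albums = queryByExample(topics, {
  type: "/music/album",
  artist: "The Police",
  name: null,
});
console.log(albums.map((a) => a.name));
```

The query has the same shape as the answer, which is what makes this style feel close to Web 2.0 JSON APIs.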
Web1.0: The Unstructured Web
http://books.google.com/ (content in HTML, comprehensible only to a human user)
Web2.0: The Structured but Closed Web
https://www.googleapis.com/books/v1/volumes?q=databases
• Content in XML/JSON, comprehensible to a computer
• This web is closed: it requires keys to access (OAuth) and has download quotas
Web2.0: The Structured but Closed Web
• In fact, Web 2.0 services are omnipresent! (Google, Twitter, Facebook, YouTube, LinkedIn, …)
• http://www.programmableweb.com/ - 7800 APIs and 6800 mashups
• Quota: https://code.google.com/apis
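What "structured but closed" means for a client can be sketched with the Google Books URL shown earlier. The sketch only builds the request URL and reads titles out of a response-shaped object; it makes no network call. The `key` query parameter and the `items[].volumeInfo.title` response layout are assumptions about the service, and the sample response is made-up placeholder data.

```javascript
// Build the request URL; the optional API key is what makes the service "closed".
// The "key" parameter name is an assumption about the API.
function buildVolumesUrl(query, apiKey) {
  const url = new URL("https://www.googleapis.com/books/v1/volumes");
  url.searchParams.set("q", query);
  if (apiKey) url.searchParams.set("key", apiKey);
  return url.toString();
}

// Pull titles out of a parsed JSON response.
// Assumes the layout { items: [ { volumeInfo: { title } } ] }.
function extractTitles(response) {
  return (response.items || [])
    .filter((item) => item.volumeInfo && item.volumeInfo.title)
    .map((item) => item.volumeInfo.title);
}

// Placeholder response, standing in for what a real call would return.
const sampleResponse = {
  items: [
    { volumeInfo: { title: "Example Title One" } },
    { volumeInfo: { title: "Example Title Two" } },
  ],
};

console.log(buildVolumesUrl("databases"));
console.log(extractTitles(sampleResponse));
```

Unlike the HTML page, the JSON can be consumed by a program directly, but only within the provider's keys and quotas.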
CLODA: Indoor Localization
• There are numerous ways to localize without power-hungry GPS, but most solutions rely on additional hardware (RFID, sensor networks, etc.)
• Smartphones can nowadays localize off-the-shelf with onboard sensors and WiFi signal fingerprints (coined Hybrid Localization)
• These solutions require that services acquire local data through crowdsourcing (e.g., Google Indoor)
• Building RadioMaps, MagnetometerMaps, etc.
• [Airplace] "The Airplace Indoor Positioning Platform for Android Smartphones", C. Laoudias et al., Best Demo Award at IEEE MDM'12.
• [HybridCywee] "Demo: the airplace indoor positioning platform", C.-L. Li, C. Laoudias, G. Larkou, Y.-K. Tsai, D. Zeinalipour-Yazti and C. G. Panayiotou, in ACM Mobisys'13. Video at: http://youtu.be/DyvQLSuI00I
• Wifislam.com (bought recently by Apple for 20M)
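How a crowdsourced RadioMap can be used for WiFi fingerprint localization is sketched below with a plain nearest-neighbor match. This is a textbook illustration, not the algorithm used by Airplace or Anyplace; the radiomap entries, access point names and the `MISSING` default of -100 dBm are all hypothetical.

```javascript
// Hypothetical radiomap: each entry pairs a known location with the
// received signal strength (dBm) observed there from each access point.
const radioMap = [
  { location: "Room A", rss: { ap1: -40, ap2: -70, ap3: -85 } },
  { location: "Room B", rss: { ap1: -75, ap2: -45, ap3: -60 } },
  { location: "Corridor", rss: { ap1: -60, ap2: -60, ap3: -50 } },
];

// Assumed placeholder strength for an access point that was not heard.
const MISSING = -100;

// Euclidean distance between two fingerprints in signal space.
function distance(a, b) {
  const aps = new Set([...Object.keys(a), ...Object.keys(b)]);
  let sum = 0;
  for (const ap of aps) {
    const d = (ap in a ? a[ap] : MISSING) - (ap in b ? b[ap] : MISSING);
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Return the location whose stored fingerprint is closest to the observation.
function localize(map, observed) {
  let best = { location: null, d: Infinity };
  for (const entry of map) {
    const d = distance(entry.rss, observed);
    if (d < best.d) best = { location: entry.location, d };
  }
  return best.location;
}

const observed = { ap1: -72, ap2: -48, ap3: -63 };
console.log(localize(radioMap, observed));
```

The quality of the answer depends directly on how densely and how recently the radiomap was collected, which is where crowdsourcing comes in.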
CLODA: Scanning Items
[Screenshots: Preview LOD; Scan & Link LOD]
CLODA: Indoor Localization
• Founded on prior work: Airplace and Anyplace
• Anyplace: navigate seamlessly indoors or outdoors, http://anyplace.cs.ucy.ac.cy/
• Cywee / Airplace: http://youtu.be/DyvQLSuI00I
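NoSQL DataStore: CouchDB
[Figure: Document in CouchDB, Map Function, Results (through REST/HTTP or Futon)]
Map Function:
    function (doc) {
      // Skip documents that have no authors array.
      if (!Array.isArray(doc.authors)) return;
      // Emit one row per author, keyed by the document id.
      doc.authors.forEach(function (author) {
        emit(doc._id, author);
      });
    }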
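To see what rows a map function like the one on this slide yields, it can be run outside CouchDB with a stand-in `emit`. This is only a local sketch: inside CouchDB, `emit` is provided by the view server, and the documents below are hypothetical.

```javascript
// Collected view rows, in the { id, key, value } shape of CouchDB view results.
const rows = [];
let currentDocId = null;

// Stand-in for CouchDB's emit(); records which document produced each row.
function emit(key, value) {
  rows.push({ id: currentDocId, key: key, value: value });
}

// Map function: one row per author, keyed by the document id.
function map(doc) {
  if (!Array.isArray(doc.authors)) return;
  doc.authors.forEach(function (author) {
    emit(doc._id, author);
  });
}

// Hypothetical documents.
const docs = [
  { _id: "book-1", title: "Example Book", authors: ["A. Author", "B. Author"] },
  { _id: "book-2", title: "Another Book", authors: ["C. Author"] },
  { _id: "note-1", text: "no authors here" },
];

// The view server calls the map function once per document.
for (const doc of docs) {
  currentDocId = doc._id;
  map(doc);
}
console.log(rows);
```

A document with two authors contributes two rows, and a document without an authors array contributes none.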
NoSQL DataStore: CouchDB
• Export JSON data to RDF with Sessel (a CouchApp that generates RDF triples from CouchDB documents)
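The general idea of turning a JSON document into RDF triples can be sketched as below, with output in an N-Triples-like form. This is not Sessel's actual mapping: the `docToNTriples` helper, the `http://example.org/` URIs and the sample document are hypothetical, and only scalar fields and arrays of scalars are handled.

```javascript
// Escape characters that may not appear raw inside a quoted literal.
function escapeLiteral(value) {
  return String(value)
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n");
}

// Hypothetical helper: one triple per field value.
// The document id becomes the subject URI, each field name a predicate URI.
function docToNTriples(doc, baseUri, vocabUri) {
  const subject = "<" + baseUri + encodeURIComponent(doc._id) + ">";
  const lines = [];
  for (const [field, value] of Object.entries(doc)) {
    if (field.startsWith("_")) continue; // skip CouchDB metadata (_id, _rev)
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      const predicate = "<" + vocabUri + encodeURIComponent(field) + ">";
      lines.push(subject + " " + predicate + ' "' + escapeLiteral(v) + '" .');
    }
  }
  return lines;
}

const ntriples = docToNTriples(
  { _id: "book-1", _rev: "1-abc", title: "Example Book", authors: ["A. Author", "B. Author"] },
  "http://example.org/doc/",
  "http://example.org/vocab#"
);
console.log(ntriples.join("\n"));
```

Once the subject is a URI, other datasets can refer to the document, and its values can in turn be replaced by links to existing LOD resources.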
CLODA Motivation
• The last step essentially provides interlinking with existing LOD, giving the CLODA LOD
CLODA on the LOD Graph!
[Figure: CLODA placed on the LOD graph] http://www.stateofsearch.com/search-in-the-knowledge-graph-era/
CLODA Incentives
• Ethical Benefit
• Similar to people-centric sensing / wardriving
• Enhance collaboration between users
• Richer querying possibilities in the future
• Might be imposed by an organization
• E.g., inventory management in a hospital
CLODA Quality Issues
• Data Quality / Data Freshness
• LOD data suffers in both of these aspects
• Crowdsourcing offers freshness but still lacks explicit quality guarantees (e.g., repeat a task N times, then take a majority vote)
• Possible solutions: integrate location-aware techniques to validate data that is added and linked
• Task of continuously identifying the neighbors of all users with Proximity, see IEEE MDM'12.
• Task of identifying similarly moving users with SmartTrace, see IEEE TKDE, June 2013.
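The "repeat N times then majority vote" check mentioned above can be sketched as follows. The `majorityVote` helper and the sample answers are hypothetical, and requiring a strict majority (more than half of the answers) is an assumption of this sketch rather than something the slides specify.

```javascript
// Return the answer given by more than half of the workers, or null if none is.
function majorityVote(answers) {
  const counts = new Map();
  for (const answer of answers) {
    counts.set(answer, (counts.get(answer) || 0) + 1);
  }
  let winner = null;
  let best = 0;
  for (const [answer, count] of counts) {
    if (count > best) {
      winner = answer;
      best = count;
    }
  }
  // Strict majority: ties and fragmented answers are rejected.
  return best > answers.length / 2 ? winner : null;
}

// Three workers label the same scanned item with its location.
const accepted = majorityVote(["Room 101", "Room 101", "Room 102"]);
const rejected = majorityVote(["Room 101", "Room 102"]);
console.log(accepted, rejected);
```

Returning `null` for a split vote lets the platform reissue the task instead of storing an unverified link.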
CLODA Testbed Issues
• Currently, there are no testbeds (like MoteLab, PlanetLab) for realistically prototyping smartphone network applications and protocols at a large scale.
• Currently, applications are tested in emulators.
• Sensors are not emulated.
• Reprogramming is difficult.
• SmartLab (http://smartlab.cs.ucy.ac.cy/) is a first-of-a-kind programmable cloud of 40+ smartphones deployed at our department, enabling a new line of systems-oriented research on smartphones.
"Crowdsourcing with Smartphones", Georgios Chatzimilioudis, Andreas Konstantinidis, Christos Laoudias, Demetrios Zeinalipour-Yazti, IEEE Internet Computing (IC '12), Special Issue: Sep/Oct 2012 - Crowdsourcing, May 2012. IEEE Press, 2012.
"Demo: A Programming Cloud of Smartphones", A. Konstantinidis, C. Costa, G. Larkou and D. Zeinalipour-Yazti, Demo at the 10th International Conference on Mobile Systems, Applications and Services (Mobisys '12), Low Wood Bay, Lake District, UK, 2012.
CLODA Testbed Issues
• SmartLab: massive smartphone simulations with our first global open smartphone IaaS cloud, http://smartlab.cs.ucy.ac.cy/
[Figure: Static Androids, Mobile Androids]
• [SmartLab] "Demo: a programming cloud of smartphones", A. Konstantinidis, C. Costa, G. Larkou, D. Zeinalipour-Yazti, in ACM Mobisys '12. [ By our Group ]
Thanks! Questions?
Georgios Larkou, Julia Metochi, Georgios Chatzimilioudis and Demetrios Zeinalipour-Yazti, http://dmsl.cs.ucy.ac.cy/
CLODA: A Crowdsourced Linked Open Data Architecture
First IEEE Intl. Workshop on Mobile Data Management, Mining and Computing on Social Networks (MobiSocial), co-located with IEEE MDM'13, June 3, 2013, Milan, Italy.