Anatomy of a Semantic Virus
Peyman Nasirifard
peyman.nasirifard@deri.org
Nature inspired Reasoning for the Semantic Web (NatuReS), 7th International Semantic Web Conference (ISWC 2008)
Karlsruhe, Germany, 27th October 2008
What Do We Have Now?
• We currently have Semantic-Web-oriented applications and APIs
  • Semantic digital libraries
  • SIOC-enabled shared workspaces
  • Semantic URL-shortening tools
  • Semantic wikis
  • Semantic blogs
  • Lots more...
• The applications "talk" in RDF
  • Importing and exporting RDF
• These applications provide part of the "food" for Semantic search engines
What Do We Have Now? (2)
• Semantic-Web-oriented researchers (including me :-) encourage others
  • Use RDF, publish RDF, talk RDF!
• Semantic search engines
  • Find RDF-related material on the Web
  • Index it
  • Query and reason over the data
• Semantic search engines are RDF-hungry
  • "Submit RDF to us"
  • Crawl the deep Web
  • "Tell us where you saw an RDF document"
  • They monitor services like "pingthesemanticweb.com"
What Do We Have Now? (3)
• Users can submit their RDF data using services like Ping The Semantic Web (PTSW)
  • The PTSW feeds are then used further
  • Search engines follow the links and index the RDF data
• We have services like DBpedia
  • DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web (source: http://dbpedia.org/About)
  • Can be used for reasoning
Real World
• Common-sense facts
  • Milk is white
  • Lions eat meat
• The Web (e.g. Wikipedia) is for humans, whereas the Semantic Web (e.g. DBpedia) aims to be for machines.
• Humans have wisdom and can recognize ridiculous "common-sense" statements, but machines cannot detect them and will use them in reasoning.
• Do you trust Wikipedia articles?
  • How much?
  • Why is Wikipedia not cited in scientific articles?
• What about DBpedia?
  • Can we really benefit from the generated RDF?
  • If we cannot trust Wikipedia articles, how can we use DBpedia for further reasoning?
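The point about machines using a ridiculous statement in reasoning can be illustrated with a minimal sketch. It assumes facts are plain (subject, predicate, object) tuples and uses a single toy rule; the rule, the predicate names, and the injected statement are all invented for illustration and are not taken from the talk.

```python
# Facts are plain (subject, predicate, object) tuples; all names are illustrative.
facts = {
    ("Lion", "eats", "Meat"),
    ("Milk", "hasColor", "White"),
}

def infer_carnivores(triples):
    """Toy rule: anything that eats Meat is classified as a Carnivore."""
    return {
        (s, "isA", "Carnivore")
        for (s, p, o) in triples
        if p == "eats" and o == "Meat"
    }

# Conclusions drawn from the clean data.
clean = infer_carnivores(facts)

# An injected false statement is syntactically indistinguishable from a true one.
poisoned = facts | {("Cow", "eats", "Meat")}
derived = infer_carnivores(poisoned)

# The reasoner happily derives a new, wrong conclusion from the fake fact.
print(sorted(derived - clean))  # [('Cow', 'isA', 'Carnivore')]
```

A human reader would reject "cows eat meat" on sight; the rule above has no such check, so the false input silently becomes a false conclusion.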
Some Facts and Discussions
• Fake knowledge can exist on the Semantic Web
  • Maliciously: Semantic Virus
  • Non-maliciously: human faults (machines will not have faults)
• The Semantic Web is NOT just FOAF and FOAF-based computing
• The Semantic Web does not grow as fast as the Web
  • Google has found one trillion unique URLs on the Web (source: Google official blog)
• Such attacks are not feasible on the Web
  • Because we as humans can understand some common-sense facts, but machines do not have the common-sense facts that we have
Some Facts and Discussions (2)
• Trust and Proof (and perhaps Logic) are the layers that a possible virus targets
• Digital signatures cannot address such issues.
  • Information quality issues (e.g. validity)
• Trusting RDF sources?
  • We mostly trust sources (e.g. we trust LiveJournal RDF files because LiveJournal is a trusted party)
  • We trust SIOC plugins that generate SIOC
  • But can we limit knowledge providers to just some sources?
  • The Internet does not do it, so we need to accept RDF from everyone!
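The limits of source-based trust can be sketched as follows. This is only an illustration under stated assumptions: statements carry the host they were fetched from, trust is a hand-maintained whitelist, and the host names `trusted.example` and `unknown.example` as well as the statements themselves are hypothetical.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # host the RDF was fetched from (hypothetical values below)

# Assumption: trust is decided purely by a hand-maintained whitelist of hosts.
TRUSTED_SOURCES = {"trusted.example"}

def accept(statements, trusted=TRUSTED_SOURCES):
    """Keep only statements whose source is on the whitelist."""
    return [s for s in statements if s.source in trusted]

incoming = [
    # True statement, but from a host nobody has whitelisted.
    Statement("Milk", "hasColor", "White", "unknown.example"),
    # False statement, published by some user of a whitelisted host.
    Statement("Milk", "hasColor", "Black", "trusted.example"),
]

accepted = accept(incoming)
for s in accepted:
    print(s.subject, s.predicate, s.obj, "from", s.source)
```

In this sketch the whitelist keeps the false statement and drops the true one. A check on who published the data, like a digital signature, says nothing about whether the content is valid.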
Conclusions
• Future work based on developing the virus is not really recommended!
• The paper opens some research areas in the trust layer of the Semantic Web tower
  • How much do you trust DBpedia?
  • How can we ensure that RDF is not fake?
  • Should we verify all RDF statements using some references?
    • Where do we get the references?
  • Do we need a global, peer-reviewed and always up-to-date repository of common-sense facts?
    • Sounds very difficult or even impossible
  • Can we benefit from nature-inspired reasoning?
  • Can we use statistical approaches?
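One possible reading of "statistical approaches" is to count how many distinct sources agree on a value before accepting it. The sketch below is an assumption about what such an approach could look like, not a method proposed in the talk; the observations, host names, and the `min_share` threshold are invented.

```python
from collections import defaultdict

# (subject, predicate, object, source) observations; all values are invented.
observations = [
    ("Milk", "hasColor", "White", "a.example"),
    ("Milk", "hasColor", "White", "b.example"),
    ("Milk", "hasColor", "White", "c.example"),
    ("Milk", "hasColor", "Black", "d.example"),
]

def support(obs):
    """Map each (subject, predicate) to {object: number of distinct sources}."""
    sources = defaultdict(lambda: defaultdict(set))
    for s, p, o, src in obs:
        sources[(s, p)][o].add(src)
    return {
        key: {o: len(srcs) for o, srcs in objs.items()}
        for key, objs in sources.items()
    }

def corroborated(obs, min_share=0.6):
    """Accept the most supported object only if its share of source votes is high enough."""
    accepted = {}
    for key, counts in support(obs).items():
        total = sum(counts.values())
        best, n = max(counts.items(), key=lambda kv: kv[1])
        if n / total >= min_share:
            accepted[key] = best
    return accepted

print(corroborated(observations))  # {('Milk', 'hasColor'): 'White'}
```

Such voting only helps while false statements are rare. A virus that publishes the same fake triple from many hosts would win the vote, so this is at best a partial answer to the questions above.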
Thank you for your attention! Questions? Comments? Please contact Peyman: peyman.nasirifard@deri.org