The Large Knowledge Collider: A Platform for Scalable Semantic Web Reasoning

Dive into the world of semantic web technologies with The Large Knowledge Collider, a configurable platform for infinite scalability in reasoning. Join the consortium to explore innovative solutions and revolutionize data integration and search at web scale. With a mixture of logic, parallelization, and cluster computing, LarKC aims to break barriers and provide an open platform for experimentation. Join us on this journey of discovery and exploration!


Presentation Transcript


  1. The Large Knowledge Collider • Frank van Harmelen, Vrije Universiteit Amsterdam • Creative Commons License: allowed to share & remix, but must attribute & non-commercial

  2. The vision • The project • The consortium • The plan Yes! Oh Shit…

  3. The Vision “a configurable platform for infinitely scalable semantic web reasoning”

  4. Why we need The Large Knowledge Collider • Gartner (May 2007): “By 2012, 70% of public Web pages will have some level of semantic markup, 20% will use more extensive Semantic Web-based ontologies” • Semantic Technologies at Web Scale? • 20% of 30 billion pages @ 1000 triples per page = 6 trillion triples • 30 billion and 1000 are underestimates; imagine 6 years from now… • data integration and semantic search at web scale?
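
The estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's own figures (30 billion pages, a 20% share, 1000 triples per page); the function and constant names are illustrative and not part of the presentation.

```python
# Back-of-the-envelope check of the slide's estimate.
# All three inputs are the figures quoted on the slide; the names are made up here.
PAGES = 30_000_000_000        # "30 billion pages"
ONTOLOGY_SHARE_PERCENT = 20   # "20% will use more extensive ... ontologies"
TRIPLES_PER_PAGE = 1_000      # "1000 triples per page"


def estimated_triples(pages, share_percent, triples_per_page):
    """Return the estimated triple count, using integer arithmetic only."""
    return pages * share_percent // 100 * triples_per_page


total = estimated_triples(PAGES, ONTOLOGY_SHARE_PERCENT, TRIPLES_PER_PAGE)
print(f"{total:,} triples")  # 6,000,000,000,000 triples, i.e. 6 trillion
```

Since the slide calls both 30 billion and 1000 underestimates, the result is best read as a lower bound under those assumptions.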

  5. 1 triple:

  6. 10^7 Triples [OWLIM] Suez Canal • Denny Vrandečić – AIFB, Universität Karlsruhe (TH), http://www.aifb.uni-karlsruhe.de/WBS

  7. RDF Store, subsecond querying • 10^8 Triples [Ingenta] Moon

  8. ~10^9 Triples Earth

  9. [LarKC proposal] ~10^10 Triples ≈ 1 triple per web-page • Jupiter

  10. ~10^11 Triples

  11. ~10^14 Triples • Distance Sun – Pluto • Fensel / Harmelen estimate: 10^14 Triples

  12. Infinitely scalable (1/2) • by giving up 100% correctness: • trading quality for size • often completeness is not needed • sometimes even correctness is not needed • “A logician’s nightmare” (Dieter Fensel) • Diagram axes: precision (soundness) vs. recall (completeness); labelled regions: logic, Semantic Web, IR
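
The slide maps the logical notions of soundness and completeness onto the information-retrieval measures precision and recall. A minimal sketch of that mapping follows; the answer sets are invented for illustration and do not come from the presentation or from any LarKC component.

```python
def precision_recall(returned, correct):
    """Compare a reasoner's returned answers against the full correct answer set.

    precision = share of returned answers that are correct (soundness);
    recall    = share of correct answers that were returned (completeness).
    """
    returned, correct = set(returned), set(correct)
    hits = len(returned & correct)
    precision = hits / len(returned) if returned else 1.0
    recall = hits / len(correct) if correct else 1.0
    return precision, recall


# Hypothetical data: what a complete, sound reasoner would return.
complete_answers = {"a", "b", "c", "d"}

# A sound but incomplete approximation: nothing wrong, one answer missing.
incomplete = {"a", "b", "c"}
p1, r1 = precision_recall(incomplete, complete_answers)

# An unsound approximation: one wrong answer ("x") slipped in.
unsound = {"a", "b", "c", "d", "x"}
p2, r2 = precision_recall(unsound, complete_answers)

print(p1, r1)  # 1.0 0.75
print(p2, r2)  # 0.8 1.0
```

Classical logic sits at precision 1.0 and recall 1.0; the slide's point is that accepting values below 1.0 on either axis is what buys scale.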

  13. Infinitely scalable (2/2) • by parallelisation: • cluster computing • wide area distribution “Thinking@home”, “self-computing semantic Web” • cloud computing? (Amazon now, Google soon?)
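
As a rough illustration of the parallelisation idea, the sketch below splits a small triple set into partitions and matches a pattern on each partition concurrently. It is an assumption-laden toy, not LarKC's architecture: threads stand in for cluster nodes, and the partitioning scheme, function names, and `ex:` example triples are all invented here.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition(triples, n_parts):
    """Assign each triple to a partition by a stable hash of its subject."""
    parts = [[] for _ in range(n_parts)]
    for s, p, o in triples:
        parts[zlib.crc32(s.encode("utf-8")) % n_parts].append((s, p, o))
    return parts


def match(part, predicate):
    """Return the triples in one partition that use the given predicate."""
    return [t for t in part if t[1] == predicate]


def parallel_match(triples, predicate, n_parts=4):
    """Match on every partition concurrently, then merge the partial results."""
    parts = partition(triples, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        chunks = list(pool.map(match, parts, [predicate] * n_parts))
    return sorted(t for chunk in chunks for t in chunk)


# Hypothetical example data.
triples = [
    ("ex:Amsterdam", "ex:locatedIn", "ex:Netherlands"),
    ("ex:Karlsruhe", "ex:locatedIn", "ex:Germany"),
    ("ex:Amsterdam", "rdf:type", "ex:City"),
]
located = parallel_match(triples, "ex:locatedIn")
print(located)
```

Because each partition is matched independently, the same split-then-merge shape would carry over to processes or machines; only the executor would change.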

  14. “Configurable platform” “a configurable platform for infinitely scalable semantic web reasoning”

  15. Why “LarKC”? • The Large Knowledge Collider • A configurable platform for experimentation by others

  16. Why “LarKC”? But also a “lark”: 1. a merry, carefree adventure. 2. innocent or good-natured mischief; a prank. 3. something extremely easy to accomplish.

  17. The vision • The consortium • The project • The plan

  18. The consortium 50 people present

  19. The Consortium • Combining consortium competence • IR, Cognition • ML, Ontologies • Statistics, ML, Cognition, DB • Logic, DB, Probabilistic Inference • Economics, Decision Theory

  20. The Consortium

  21. The vision • The consortium • The project • The plan Oh Shit…

  22. The project • 10M€ budget • 3.5 years • 80 person years • 3 case studies • 14 partners • obtained in FP7 Call 1: • overall < 10% funding rate • LarKC has highest funding, longest runtime

  23. Project Workpackages & timeline • WP 1: Conceptual Framework & Evaluation • WP 2: Retrieval and Selection • WP 3: Abstraction and Learning • WP 4: Reasoning and Deciding • WP 5: Collider Platform • WP 6: Use case: Real Time City • WP 7a: Use case: Early Clinical Development • WP 7b: Use case: Carcinogenesis Reference Production • WP 8: Training, dissemination, community building • WP 9: Exploitation and standards • WP 10: Project Management

  24. Use case: Drug Discovery • Problem: pharmaceutical R&D in early clinical development is stagnating • FDA white paper Innovation or Stagnation (March 2004): “developers have no choice but to use the tools of the last century to assess this century's candidate solutions.” “industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs” • “Show me all liver toxicity associated with compounds with similar structure” • “Show me all liver toxicity associated with the target or the pathway.” • “Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population” • “Show me any potential liver toxicity associated with the compound’s drug class, target, structure and disease.” (Q1Q2Q3) • Data sources: Genetics, Chemistry, Literature • Current NCBI: linking but no inference

  25. Use Case: City on-line • Our cities face many challenges • Urban Computing is the ICT way to address them • Is public transportation where the people are? • How can we redevelop existing neighborhoods and business districts to improve the quality of life? • How can we create more choices in housing, accommodating diverse lifestyles and all income levels? • How can we reduce traffic congestion yet stay connected? • How can we include citizens in planning their communities rather than limiting input to only those affected by the next project? • How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security? • Which landmarks attract more people? • Where are people concentrating? • Where is traffic moving?

  26. The vision • The consortium • The project • The plan Oh Shit…

  27. Project Timeline • Surveys (plugins, platform) • Requirements (use cases) • Milestones: Prototype, Internal Release, Public Release, Final Release • Month markers on the time axis: 0, 6, 10, 18, 33, 42 • Use Cases V1, Use Cases V2, Use Cases V3

  28. Communication • Early Access Group • Usage Competition • “we will win if we start to lose” • We deliver: • software • publications • not “deliverables”

  29. And Finally… • People are already looking at us: • “Damn... the EU is where all the cool semweb work is happening these days” • “This kind of infrastructure is exactly the kind of rocket fuel that is needed at this stage of semweb maturity.” • “The LarKC-inspired workshop on new forms of reasoning for the semantic web was a conference highlight for me” • “With the current growth rates of RDF on the Web, LarKC which started out as technologically possible will quickly become operationally necessary” • “this project really has it all (potentially) in terms of both science and impact” • projects already seeking collaboration: OKKAM, MUSING • “This project has the potential to change the way people work in this area”
