Dive into the world of semantic web technologies with The Large Knowledge Collider, a configurable platform for infinite scalability in reasoning. Join the consortium to explore innovative solutions and revolutionize data integration and search at web scale. With a mixture of logic, parallelization, and cluster computing, LarKC aims to break barriers and provide an open platform for experimentation. Join us on this journey of discovery and exploration!
The Large Knowledge Collider • Frank van Harmelen, Vrije Universiteit Amsterdam • Creative Commons License: allowed to share & remix, but must attribute & non-commercial
The vision • The project • The consortium • The plan Yes! Oh Shit…
The Vision “a configurable platform for infinitely scalable semantic web reasoning”
Why we need The Large Knowledge Collider Gartner (May 2007): “By 2012, 70% of public Web pages will have some level of semantic markup, 20% will use more extensive Semantic Web-based ontologies” • Semantic Technologies at Web Scale? • 20% of 30 billion pages @ 1000 triples per page = 6 trillion triples • 30 billion and 1000 are underestimates, imagine 6 years from now… • data-integration and semantic search at web-scale?
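The estimate on this slide can be checked with a few lines of arithmetic. The sketch below only restates the slide's own figures (30 billion pages, 20%, 1000 triples per page); the function and constant names are illustrative, not part of LarKC.

```python
# Back-of-the-envelope check of the slide's estimate.
# All inputs are the slide's figures; the names are made up for illustration.
PAGES_ON_WEB = 30_000_000_000   # 30 billion public web pages
ONTOLOGY_SHARE_PERCENT = 20     # share using more extensive ontologies (Gartner quote)
TRIPLES_PER_PAGE = 1_000        # triples per annotated page


def estimate_triples(pages, share_percent, triples_per_page):
    """Return the estimated triple count, using integer arithmetic only."""
    return pages * share_percent // 100 * triples_per_page


total = estimate_triples(PAGES_ON_WEB, ONTOLOGY_SHARE_PERCENT, TRIPLES_PER_PAGE)
print(f"{total:,} triples")  # 6,000,000,000,000, i.e. 6 trillion (6 x 10^12)
```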
10^7 Triples [OWLIM] Suez Canal Denny Vrandečić – AIFB, Universität Karlsruhe (TH) http://www.aifb.uni-karlsruhe.de/WBS
10^8 Triples [Ingenta] RDF Store, subsecond querying Moon
~10^9 Triples Earth
~10^10 Triples [LarKC proposal] ≈ 1 triple per web-page Jupiter
~10^14 Triples [Fensel / Harmelen estimate] Distance Sun – Pluto
Infinitely scalable (1/2) • by giving up 100% correctness: • trading quality for size • often completeness is not needed • sometimes even correctness is not needed • “A logician’s nightmare” (Dieter Fensel) [Figure: precision (soundness) against recall (completeness), with labels: logic, Semantic Web, IR]
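The trade-off on this slide maps soundness to precision and completeness to recall. A minimal sketch of how the two could be measured for an approximate reasoner, assuming the full set of correct answers is known for a test query; the function name and the toy answer sets are hypothetical, not taken from LarKC.

```python
def precision_recall(returned, correct):
    """Compare a reasoner's answers against the known correct answers.

    precision = fraction of returned answers that are correct (soundness)
    recall    = fraction of correct answers that were returned (completeness)
    """
    hits = len(returned & correct)
    precision = hits / len(returned) if returned else 1.0
    recall = hits / len(correct) if correct else 1.0
    return precision, recall


# Toy data (hypothetical): the four answers a complete, sound reasoner would give.
correct = {"a", "b", "c", "d"}

# Sound but incomplete: everything returned is right, half is missing.
sound_incomplete = precision_recall({"a", "b"}, correct)

# Neither sound nor complete: one wrong answer, one missing answer.
unsound_incomplete = precision_recall({"a", "b", "c", "x"}, correct)

print(sound_incomplete)    # (1.0, 0.5)
print(unsound_incomplete)  # (0.75, 0.75)
```

Classical logic sits at (1.0, 1.0); giving up completeness lowers recall only, while giving up correctness lowers precision as well.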
Infinitely scalable (2/2) • by parallelisation: • cluster computing • wide area distribution “Thinking@home”, “self-computing semantic Web” • cloud computing? (Amazon now, Google soon?)
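As a rough illustration of the partition-and-merge idea behind parallelisation, the sketch below splits a tiny triple set across workers and merges the matches. It is not LarKC code: the data, the `ex:` names and both functions are invented for this example, and Python threads stand in for cluster nodes only to show the shape of the computation, not to gain real speed.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy triple set (hypothetical data).
TRIPLES = [
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:aspirin", "ex:treats", "ex:Headache"),
    ("ex:ibuprofen", "rdf:type", "ex:Drug"),
    ("ex:liver", "rdf:type", "ex:Organ"),
]


def match(partition, pattern):
    """Return the triples in one partition that match the pattern (None = wildcard)."""
    return [
        triple
        for triple in partition
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


def parallel_match(triples, pattern, workers=3):
    """Split the triples round-robin, match each partition in a worker, merge results."""
    partitions = [triples[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(match, partitions, [pattern] * workers))
    return sorted(triple for part in parts for triple in part)


drugs = parallel_match(TRIPLES, (None, "rdf:type", "ex:Drug"))
print(drugs)
```

Simple pattern matching like this partitions cleanly; inference that joins triples across partitions would need extra communication between workers, which this sketch does not attempt.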
“Configurable platform” “a configurable platform for infinitely scalable semantic web reasoning”
Why “LarKC” ? • The Large Knowledge Collider A configurable platform for experimentation by others
Why “LarKC” ? But also: and also: 1. a merry, carefree adventure. 2. innocent or good-natured mischief; a prank. 3. something extremely easy to accomplish
The vision • The consortium • The project • The plan
The consortium 50 people present
The Consortium • Combining consortium competence • IR, Cognition • ML, Ontologies • Statistics, ML, Cognition, DB • Logic, DB, Probabilistic Inference • Economics, Decision Theory
The vision • The consortium • The project • The plan Oh Shit…
The project • 10M€ budget • 3.5 years • 80 person years • 3 case studies • 14 partners • obtained in FP7 Call1: • overall < 10% funding rate • LarKC has highest funding, longest runtime
Project Workpackages & timeline • WP 1: Conceptual Framework & Evaluation • WP 2: Retrieval and Selection • WP 3: Abstraction and Learning • WP 4: Reasoning and Deciding • WP 5: Collider Platform • WP 6: Use case: Real Time City • WP 7a: Use case: Early Clinical Development • WP 7b: Use case: Carcinogenesis Reference Production • WP 8: Training, dissemination, community building • WP 9: Exploitation and standards • WP 10: Project Management
Use case: Drug Discovery • Problem: pharmaceutical R&D in early clinical development is stagnating • FDA white paper Innovation or Stagnation (March 2004): “developers have no choice but to use the tools of the last century to assess this century's candidate solutions.” “industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs” • Current NCBI: linking but no inference • Sources: Genetics, Chemistry, Literature • “Show me all liver toxicity associated with compounds with similar structure” • “Show me all liver toxicity associated with the target or the pathway.” • “Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population” • “Show me any potential liver toxicity associated with the compound’s drug class, target, structure and disease.” (Q1Q2Q3)
Use Case: City on-line • Our cities face many challenges: • Is public transportation where the people are? • How can we redevelop existing neighborhoods and business districts to improve the quality of life? • How can we create more choices in housing, accommodating diverse lifestyles and all income levels? • How can we reduce traffic congestion yet stay connected? • How can we include citizens in planning their communities rather than limiting input to only those affected by the next project? • How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security? • Which landmarks attract more people? • Where are people concentrating? • Where is traffic moving? • Urban Computing is the ICT way to address them
The vision • The consortium • The project • The plan Oh Shit…
Project Timeline • Surveys (plugins, platform) • Requirements (use cases) • Prototype • Internal Release • Public Release • Final Release • Use Cases V1, V2, V3 • Months: 0, 6, 10, 18, 33, 42
Communication • Early Access Group • Usage Competition • “we will win if we start to lose” • We deliver: • software • publications • not “deliverables”
And Finally… • People are already looking at us: • “Damn... the EU is where all the cool semweb work is happening these days” • “This kind of infrastructure is exactly the kind of rocket fuel that is needed at this stage of semweb maturity.” • “The LarKC-inspired workshop on new forms of reasoning for the semantic web was a conference highlight for me” • “With the current growth rates of RDF on the Web, LarKC which started out as technologically possible will quickly become operationally necessary” • “this project really has it all (potentially) in terms of both science and impact” • “This project has the potential to change the way people work in this area” • projects already seeking collaboration: OKKAM, MUSING