An architecture for peer-to-peer reasoning George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam
Obsolete motivation slides • Why do we need distribution… • Why do we need anytime behavior… • Why should it be (very) scalable… • Why should we drop consistency and completeness… • Why do we need trust/ontology ranking… • etc.
Talk outline • What is P2P? (1 slide) • Relationship between P2P and SW (3 slides) • Our goal (1 slide) • Distributed SW stores (1 slide) • Structured P2P stores (3 slides) • Federated stores (2 slides) • Our approach (6 slides) • Future work (1 slide)
Peer-to-Peer • A class of distributed systems • Most important characteristics • Same functionality across peers • Peer autonomy • Formation of overlay networks • Common interface: peers respect some agreed-upon way to organize • File-sharing networks are NOT the only Peer-to-Peer systems.
What is the relationship between Peer-to-Peer and the Semantic Web?
Peer-to-peer benefits from the Semantic Web: • A source of semantic information for self-organization • Interoperability
Semantic Web benefits from Peer-to-Peer: • Scalable infrastructure for • Storage • Reasoning • Collaboration • Self-organization • Autonomy – control of data • Privacy • Scalable algorithms • Robustness • No censorship • No preferential treatment of information • Common misconception: that all Peer-to-Peer systems can offer the above
Our Goal • Global-scale semantic web storage and reasoning • Scalability • Computation • Administration
Two strands of distributed semantic web stores • Structured peer-to-peer • Use DHTs • One global distributed store • Peers do not maintain their own data • Federated stores • Each peer maintains its own store • Stores are interconnected • Either global schema or mappings between schemata
Distributed Hash Tables • The mathematical abstraction for hash tables is a map • Functionality: • put(key, value) • get(key) • Similar to normal hash tables, with the difference that each bucket is now a peer • Accessing a different bucket involves network traffic • Routing to a bucket contacts approximately log(N) peers, where N is the network size
Toy DHT (diagram): 24 peers labelled a–x; a value is stored at the peer whose ID starts with the first letter of the key. Example: <Key=horse, Value=the horse is an animal> is routed to peer h.
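The first-letter routing rule from the toy DHT can be sketched in a few lines. This is an illustrative model only (class and method names are my own, not from the talk); real DHTs hash keys into a large ID space rather than using first letters.

```python
import string

class ToyDHT:
    """Toy DHT from the slide: one bucket per peer labelled a..x,
    and a key is routed to the peer matching its first letter."""

    def __init__(self):
        # 24 peers labelled a..x, as in the diagram
        self.peers = {c: {} for c in string.ascii_lowercase[:24]}

    def _route(self, key):
        # routing rule from the slide: first letter of the key
        return self.peers[key[0].lower()]

    def put(self, key, value):
        self._route(key).setdefault(key, []).append(value)

    def get(self, key):
        return self._route(key).get(key, [])

dht = ToyDHT()
dht.put("horse", "the horse is an animal")
print(dht.get("horse"))  # stored at (and fetched from) peer 'h'
```

In a real DHT the `_route` step would involve O(log N) network hops rather than a local dictionary lookup.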
RDF storage on top of DHTs (diagram): Peer 1 holds <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>; Peer 2 holds <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>. Each triple is inserted into the DHT three times, keyed on its subject, predicate and object, so e.g. <animal, lives_in, habitat> ends up at the peers responsible for a, l and h.
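The triple-indexing scheme above can be sketched as follows (a minimal model assuming the toy first-letter routing; `store_triple` is an illustrative name, not from the talk):

```python
from collections import defaultdict

# peer label -> set of triples held by that peer
peers = defaultdict(set)

def store_triple(s, p, o):
    # each triple is stored three times: under its subject,
    # predicate and object (routed by first letter, as in the toy DHT)
    for term in (s, p, o):
        peers[term[0].lower()].add((s, p, o))

for t in [("monk_seal", "subClassOf", "seal"),
          ("mseal1", "type", "monk_seal"),
          ("rabbit", "subClassOf", "animal"),
          ("seal", "subClassOf", "animal"),
          ("animal", "lives_in", "habitat")]:
    store_triple(*t)

# peer 's' now holds every triple mentioning 'seal' or 'subClassOf'
print(sorted(peers["s"]))
```

This triple replication is what makes pattern queries on any position answerable with one DHT lookup, at the cost of three inserts per triple.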
Reasoning and inferred triples. RDFS class axioms: (1) <X, subClassOf, Z> <- <X, subClassOf, Y>, <Y, subClassOf, Z> (2) <X, type, Z> <- <X, type, Y>, <Y, subClassOf, Z>. (Diagram: applying rule 1 to <monk_seal, subClassOf, seal> and <seal, subClassOf, animal> yields the inferred triple <monk_seal, subClassOf, animal>, which must itself be stored at the peers responsible for m, s and a.)
Reasoning and inferred triples (cont.). The same axioms in rule syntax: (1) FORALL O,V O[rdfs:subClassOf->V] <- EXISTS W (O[rdfs:subClassOf->W] AND W[rdfs:subClassOf->V]). (2) FORALL O,T O[rdf:type->T] <- EXISTS S (S[rdfs:subClassOf->T] AND O[rdf:type->S]). (Diagram: a second round of inference combines <mseal1, type, monk_seal> with the inferred <monk_seal, subClassOf, animal> to produce <mseal1, type, animal>, again replicated across the peers responsible for its terms.)
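The two RDFS rules can be applied by forward chaining to a fixpoint; a minimal sketch (the function name and naive nested-loop strategy are mine, not the talk's implementation):

```python
def rdfs_closure(triples):
    """Forward-chain the two RDFS class axioms until no new
    triples are derived: (1) subClassOf transitivity,
    (2) propagation of type along subClassOf."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (x, p, y) in triples:
            for (y2, p2, z) in triples:
                if y != y2 or p2 != "subClassOf":
                    continue
                if p == "subClassOf":
                    new.add((x, "subClassOf", z))   # rule (1)
                elif p == "type":
                    new.add((x, "type", z))         # rule (2)
        if not new <= triples:
            triples |= new
            changed = True
    return triples

closure = rdfs_closure([
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
    ("seal", "subClassOf", "animal"),
])
print(("monk_seal", "subClassOf", "animal") in closure)  # True
print(("mseal1", "type", "animal") in closure)           # True
```

On a DHT triple store, every triple derived here triggers three more network inserts, which is exactly the cost the next slide criticizes.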
Limitations • As shown, the transitive closure has to be materialized up front, since backward chaining would require many DHT messages • This does not scale to a large number of ontologies. E.g. in an animal hierarchy, adding the triple <animal, subClassOf, living_organism> means that for every triple mentioning animal, an additional inferred triple must be inserted. • Control over ontologies • Provenance of information • Ontologies and instance data are made public • Publishers are not in control of their ontologies/data • One super-user inserts all data
Federated Stores • Each peer maintains its own ontology and instance data • Mappings are (manually) defined between ontologies, creating a semantic topology • Queries are posed against such a schema and forwarded along these mappings • The Semantic Web counterpart of Federated Databases
Limitations • Bootstrapping • New peers have to manually map their ontologies to the ontology of a peer already in the network • Finding relevant ontologies requires flooding • Routing • The overlay is created according to the ontologies understood by peers, not the data they contain. Possible scalability problem. • Searching for instances requires flooding
Our approach • Effort to combine both approaches • Use a DHT to efficiently find ontologies and instance data • Exploit semantic locality by keeping ontologies local to the publisher • Whenever possible, perform reasoning peer-to-peer
Indexing (diagram): Peer 1 holds <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>; Peer 2 holds <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>. Only term-to-peer pointers go into the DHT: animal:P1, habitat:P1, lives_in:P1, monk_seal:P2, mseal1:P2, rabbit:P1, seal:P1,P2, subClassOf:P1,P2.
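Building that term-to-peer index can be sketched as follows (the function name is illustrative; triples stay at their publishing peer and only the pointers are shipped into the DHT):

```python
from collections import defaultdict

def build_index(peer_triples):
    """Map each term (subject, predicate or object) to the set of
    peers whose local store mentions it."""
    index = defaultdict(set)
    for peer, triples in peer_triples.items():
        for triple in triples:
            for term in triple:
                index[term].add(peer)
    return index

index = build_index({
    "P1": [("rabbit", "subClassOf", "animal"),
           ("seal", "subClassOf", "animal"),
           ("animal", "lives_in", "habitat")],
    "P2": [("monk_seal", "subClassOf", "seal"),
           ("mseal1", "type", "monk_seal")],
})
print(sorted(index["seal"]))  # ['P1', 'P2']
```

Note how much smaller the published state is: one pointer per distinct term per peer, instead of three full copies of every triple.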
Querying (diagram): Peer 3 poses the queries <seal, subClassOf, X?> and <Y?, subClassOf, seal>. It looks up seal in the DHT index, which returns P1 and P2, and sends the queries to both peers. P1 answers <seal, subClassOf, animal>; P2 answers <monk_seal, subClassOf, seal>.
Querying 2 (diagram): Peer 3 poses <monk_seal, subClassOf, X?>. The DHT lookup on monk_seal returns P2, which answers <monk_seal, subClassOf, seal> and issues the follow-up query <seal, subClassOf, X?>. The lookup on seal leads to P1, which answers <seal, subClassOf, animal>.
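The querying walkthrough can be sketched as a recursive resolution over the index (a toy model with the index and peer stores hard-coded from the slides; `superclasses` is my name for the traversal, not the system's API):

```python
# term -> peers that mention it (from the indexing slide)
index = {"monk_seal": {"P2"}, "seal": {"P1", "P2"}, "animal": {"P1"}}

# each peer's local triple store
data = {
    "P1": {("rabbit", "subClassOf", "animal"),
           ("seal", "subClassOf", "animal"),
           ("animal", "lives_in", "habitat")},
    "P2": {("monk_seal", "subClassOf", "seal"),
           ("mseal1", "type", "monk_seal")},
}

def superclasses(cls, seen=None):
    """Resolve <cls, subClassOf, X?> by looking cls up in the
    DHT index, asking the returned peers, and recursively
    following the subClassOf chain peer-to-peer."""
    seen = seen if seen is not None else set()
    for peer in index.get(cls, ()):          # DHT lookup: who mentions cls?
        for (s, p, o) in data[peer]:
            if s == cls and p == "subClassOf" and o not in seen:
                seen.add(o)
                superclasses(o, seen)        # follow the chain
    return seen

print(sorted(superclasses("monk_seal")))  # ['animal', 'seal']
```

The transitive closure is computed on demand across the peers that actually hold the data, instead of being materialized into the DHT up front.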
Advantages • Control • Access Control • Select which data is published on the index • Trust – ban spammers, remember good peers • Privacy • It is possible to obfuscate descriptors stored in the DHT • Responsibility • Publisher has the responsibility to maintain own data • Scalability • DHTs can scale to millions of nodes • Data is up-to-date
Performance indicators • Based on Swoogle data, there is currently little overlap between ontologies • The distribution of ontology popularity follows a power law • If most answers reside on the same peer, our approach outperforms approaches that distribute triples over a DHT
Current and future work • Simulations using Semantic Web documents from Swoogle and Watson (around 25,000) • Integration of privacy into the index • Selecting the right ontologies/peers
The end… ?