Search in Distributed Networks

Search in Distributed Networks • Lecture: Peer-to-peer networks • Professor: Dr. Robert Tolksdorf • Elena Antonenko elena.Antonenko@web.de • Malte Münchert muencher@inf.fu-berlin.de • Jing Zhao zhao@inf.fu-berlin.de • Shunfeng Zhang zhang@inf.fu-berlin.de

Language of the talk: • English instead of German! • Comment: German is also a very beautiful language! • Questions can be asked in German!

Structure of our talk: • Introduction • Content-Agnostic Search (Shunfeng) • Content-Based Search (Elena) • Pastry (Malte) • JXTA Search (Jing)

Introduction • Most applications (file sharing, instant messaging, chat) involve • finding objects and resources of interest • exchanging resources with other peers • This is accomplished by a system of advertisements and queries

Introduction • Advertisement/query model: • Resource providers publish resources and resource consumers send search queries; or • Resource seekers advertise their needs on the network and resource providers query the network for such requests

Introduction • The problem reduces to: • querying a dynamic and distributed directory of advertisements by advertisement consumers • The distributed directory is built using a subset of all the peers in the network

Content-Agnostic Search >>> basic concept • The organization of the peers does not depend on the resources they index or point to

Content-Agnostic Search >>> central mediator • Register content with the central server; • Query the central server for Information; • Roles of central server: • Matchmaker • Broker;

Content-Agnostic Search >>> central mediator as Matchmaker • [Figure: message flow between Requester, Matchmaker and Peer. Message labels: Advertise, Unadvertise, ASK-ALL: "who can help?", Reply: name1 + info1…, STREAM-ALL: "request", REPLY…]

Content-Agnostic Search >>> central mediator as Matchmaker • Requester: an agent with an objective that it wants to be achieved by some other agent. • Matchmaker: an agent that • knows the names of many agents • and their corresponding capabilities. • Server: an agent that has committed itself to fulfilling objectives on behalf of other agents.
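
The matchmaker role described above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the class name `Matchmaker`, the method names `advertise`, `unadvertise` and `ask_all`, and the capability strings are hypothetical and do not come from the slides. The point it shows is that the matchmaker only returns names; the requester contacts the servers itself.

```python
class Matchmaker:
    """Hypothetical central mediator that only answers 'who can help?'."""

    def __init__(self):
        # capability -> set of peer names that advertised it
        self._registry = {}

    def advertise(self, peer, capability):
        self._registry.setdefault(capability, set()).add(peer)

    def unadvertise(self, peer, capability):
        self._registry.get(capability, set()).discard(peer)

    def ask_all(self, capability):
        # Return the names of all peers advertising the capability;
        # the requester is expected to contact those peers directly.
        return sorted(self._registry.get(capability, set()))


mm = Matchmaker()
mm.advertise("peer-a", "mp3")
mm.advertise("peer-b", "mp3")
mm.unadvertise("peer-a", "mp3")
candidates = mm.ask_all("mp3")  # names of the remaining providers
```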


Content-Agnostic Search >>> central mediator as Broker • [Figure: message flow between Requester, Broker and Peer. Message labels: Advertise, Unadvertise, STREAM-ALL: "request", REPLY]

Content-Agnostic Search >>> central mediator as Broker • Requester: an agent that has an objective that it wants to have achieved by another agent. • Broker: • an agent that knows the names of some other agents and their corresponding capabilities, • and advertises its own capabilities as some function of the capabilities of these other agents. • Brokered Server: an agent that has committed to the broker to take on a predetermined class of objectives.
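
In contrast to a matchmaker, a broker forwards the request itself and relays the replies. The following is a minimal sketch of that idea, not a definitive design: the class `Broker`, its methods, and the use of plain Python callables as stand-ins for brokered servers are all assumptions made for illustration.

```python
class Broker:
    """Hypothetical broker: forwards requests to brokered servers itself."""

    def __init__(self):
        # capability -> list of (server name, handler callable)
        self._servers = {}

    def advertise(self, name, capability, handler):
        self._servers.setdefault(capability, []).append((name, handler))

    def unadvertise(self, name, capability):
        self._servers[capability] = [
            (n, h) for n, h in self._servers.get(capability, []) if n != name
        ]

    def capabilities(self):
        # The broker advertises a function of its servers' capabilities:
        # here simply the set of capabilities that still have a server.
        return {c for c, servers in self._servers.items() if servers}

    def stream_all(self, capability, request):
        # Forward the request to every matching server and relay the replies.
        return [(n, h(request)) for n, h in self._servers.get(capability, [])]


broker = Broker()
broker.advertise("s1", "upper", str.upper)
broker.advertise("s2", "upper", lambda text: text.upper() + "!")
replies = broker.stream_all("upper", "hello")
```

The requester never learns how to reach "s1" or "s2" directly; it only sees the relayed replies.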

Content-Agnostic Search >>> central mediator • Advantages: • Comprehensive • Fast updates • Minimized message exchange • Disadvantages: • Central point of failure • Not scalable • Needs a central authority • Comment: can be solved with decentralized mediators


Content-Agnostic Search >>> Networks forming randomly connected graphs • Nodes are connected to a few random neighbors • Example: Gnutella network • Already covered in the 2nd talk of the lecture • Power Law Networks • The search takes advantage of the power-law link distribution of naturally occurring networks


Content-Agnostic Search >>> Power Law Networks • Power law distribution: • few nodes have very high connectivity • many nodes have very low connectivity


Content-Agnostic Search >>> Power Law Networks • Rule: at each step, one new node joins with two edges, which connect preferentially to nodes with higher degree


Content-Agnostic Search >>> Power Law Networks • Power law graphs are constructed dynamically • The rewiring of nodes does not occur randomly; nodes attach preferentially to the most connected nodes.
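
A minimal sketch of such a growth process is given below. It assumes a simple preferential-attachment rule (each new node brings two edges and picks targets with probability proportional to their current degree); the function name, the fully connected starting core and the fixed seed are illustrative choices, not details taken from the slides.

```python
import random


def preferential_attachment_graph(n, edges_per_node=2, seed=0):
    """Grow a graph: each new node links to `edges_per_node` existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core so every node has degree > 0.
    core = edges_per_node + 1
    adj = {i: set() for i in range(core)}
    for i in range(core):
        for j in range(i + 1, core):
            adj[i].add(j)
            adj[j].add(i)
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [v for v, nbrs in adj.items() for _ in nbrs]
    for new in range(core, n):
        targets = set()
        while len(targets) < edges_per_node:
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend((new, t))
    return adj


graph = preferential_attachment_graph(1000)
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
```

Sorting the degrees makes the skew easy to inspect: a handful of early nodes end up with many links, while most nodes keep only the two edges they joined with.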

Content-Agnostic Search >>>Power Law Networks • Power law search algorithm • needs modification to the basic Gnutella approach;

Content-Agnostic Search >>> Power Law Networks • The Gnutella approach: • Broadcasts the query to all neighbors • Can exchange with every neighbor • Modified Gnutella: • Passes the query to the neighbor with the highest number of connections • Exchanges with the first- and second-degree neighbors

Content-Agnostic Search >>>Power Law Networks • Advantages of PLN • Networks of decentralized mediators • Broadcasting queries to all neighbors avoided • Search cost reduced
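
The modified search can be sketched as a walk that always moves to the best-connected neighbor not yet visited, while each visited node also checks what its first- and second-degree neighbors hold. This is a simplified illustration: the function name, the `items` dictionary, the tie-breaking rule and the toy graph are assumptions.

```python
def high_degree_search(adj, items, start, wanted, max_hops=100):
    """Return (holder, hops); holder is None if the item was not found."""
    current, visited, hops = start, {start}, 0
    while True:
        # Neighborhood index: the node itself plus neighbors up to 2 hops away.
        scope = {current} | adj[current]
        for nbr in adj[current]:
            scope |= adj[nbr]
        for node in sorted(scope):
            if wanted in items.get(node, ()):
                return node, hops
        candidates = [v for v in adj[current] if v not in visited]
        if not candidates or hops == max_hops:
            return None, hops
        # Forward to the unvisited neighbor with the most connections.
        current = max(candidates, key=lambda v: (len(adj[v]), v))
        visited.add(current)
        hops += 1


adj = {
    "a": {"b"}, "b": {"a", "hub"}, "hub": {"b", "c", "d", "e"},
    "c": {"hub"}, "d": {"hub", "f"}, "e": {"hub"},
    "f": {"d", "g"}, "g": {"f"},
}
items = {"g": {"song.mp3"}}
result = high_degree_search(adj, items, "a", "song.mp3")
```

Only one message is forwarded per hop, instead of one per neighbor as in plain broadcasting.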

Content-Based Search: Introduction • Content of queries is used to efficiently route the messages to the most relevant peers • Search techniques include: • Content-mapping networks; • Some variations of publish/subscribe networks; Content-Based Search

Content-Mapping Search Networks • Each peer in the network indexes a "zone" of the advertisement space • The zone is dynamic • The size of the zone depends on the number of peers • Peers map advertisement content to the space • Mapping is performed using hash functions • Examples include: CAN, Chord, Tapestry, Pastry

Distributed Hash Table (DHT) • A DHT provides the same functionality as a traditional hash table • A DHT stores key-value pairs • The data structure is distributed over different nodes • Provides the functions: • insert(id, item); • item = query(id); • An item can be anything: a data object, document, file, pointer to a file
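
The `insert`/`query` interface can be illustrated with a toy, single-process stand-in. It assumes a fixed set of nodes and places each id by hashing it modulo the number of nodes; this placement rule is a deliberate simplification and is not how CAN, Chord or Pastry assign keys. The class name `ToyDHT` is hypothetical.

```python
import hashlib


class ToyDHT:
    """Toy DHT: every node holds one slice of a single logical hash table."""

    def __init__(self, node_names):
        self._names = sorted(node_names)
        # Per-node local storage: node name -> {id: item}
        self.stores = {name: {} for name in self._names}

    def _responsible_node(self, item_id):
        # Hash the id and map it onto one of the nodes.
        digest = hashlib.sha1(item_id.encode("utf-8")).digest()
        return self._names[int.from_bytes(digest, "big") % len(self._names)]

    def insert(self, item_id, item):
        self.stores[self._responsible_node(item_id)][item_id] = item

    def query(self, item_id):
        return self.stores[self._responsible_node(item_id)].get(item_id)


dht = ToyDHT(["n1", "n2", "n3"])
dht.insert("jingle-bells.mp3", "pointer-to-file")
found = dht.query("jingle-bells.mp3")
```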

Content Addressable Network (CAN) • CAN is based on a virtual d-dimensional coordinate space • Each node and each item is associated with a unique id in the d-dimensional space • Goals • Scale to hundreds of thousands of nodes • Handle rapid arrival and failure of nodes

CAN Example: Two-Dimensional Space • The space is divided between the nodes • Together, the nodes cover the entire space • Each node covers either a square or a rectangular area • Example: node n1:(1, 2) is the first node to join, so it covers the entire space

CAN Example: Two-Dimensional Space • Node n2:(4, 2) joins • The space is divided between n1 and n2

CAN Example: Two-Dimensional Space • Node n3:(3, 5) joins too

CAN Example: Two-Dimensional Space • Nodes n4:(5, 5) and n5:(6, 6) join

CAN Example: Two-Dimensional Space • Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6) • Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)

CAN Example: Two-Dimensional Space • Each item is stored by the node that owns its mapping in the space

CAN: Query Example • Each node knows its neighbors in the d-dimensional space • The query is forwarded to the neighbor that is closest to the query id • Example: assume n1 queries f4

CAN Routing • For d dimensions with n equal zones, each node has 2d neighbors • Routing table size: O(d) • Guarantees that a file is found in at most d × n^(1/d) steps, where n is the total number of nodes • Algorithm: choose the neighbor nearest to the destination
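
The greedy rule can be tried out on an idealized layout. The sketch below assumes the space is split into equal zones arranged as a d-dimensional grid with `k` zones per side and wrap-around edges, so that n = k^d and every zone has exactly 2d neighbors (for k of at least 3); the function names are illustrative.

```python
def torus_distance(a, b, k):
    # Sum of per-dimension distances, allowing wrap-around at the edges.
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


def neighbors(zone, k):
    # One step forward and one step backward in each dimension: 2d neighbors.
    result = []
    for dim in range(len(zone)):
        for step in (-1, 1):
            nxt = list(zone)
            nxt[dim] = (nxt[dim] + step) % k
            result.append(tuple(nxt))
    return result


def route(source, target, k):
    # Greedy forwarding: always hand the query to the neighbor nearest
    # to the destination until the destination zone is reached.
    path = [source]
    while path[-1] != target:
        path.append(min(neighbors(path[-1], k),
                        key=lambda z: torus_distance(z, target, k)))
    return path


k, d = 8, 2                      # 8 x 8 zones, so n = 64
path = route((0, 0), (5, 6), k)
hops = len(path) - 1
```

In this layout the hop count stays well inside the d × n^(1/d) = 16 bound quoted on the slide.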

CAN: Multi-Dimension • Increasing the number of dimensions reduces the path length
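
To see the effect numerically, the d × n^(1/d) bound from the CAN routing slide can be evaluated for a fixed number of nodes and a growing number of dimensions. The value n = 4096 and the helper name are arbitrary choices for illustration.

```python
def can_path_bound(n, d):
    # Upper bound on routing steps for n equal zones in d dimensions.
    return d * n ** (1.0 / d)


n = 4096
bounds = {d: can_path_bound(n, d) for d in (2, 3, 4, 6)}
for d, bound in bounds.items():
    print(f"d={d}: at most about {bound:.0f} steps, {2 * d} neighbors per node")
```

The shorter paths are paid for with a larger routing table, since each node keeps 2d neighbors.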

Chord: Introduction • Chord is a distributed lookup protocol • Given a key (data item), it maps the key onto a node (peer) • A hash function assigns each node and key an m-bit identifier • A node's identifier is defined by hashing the node's IP address • A key identifier is produced by hashing the key • ID(node) = hash(196.178.0.1) • ID(key) = hash("jingle-bells.mp3")
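
The two ID formulas can be made concrete with a short sketch. It assumes SHA-1 as the hash function and shortens the digest to m bits by taking it modulo 2^m; the value m = 3 and the function name `chord_id` are chosen only to keep the numbers small.

```python
import hashlib

M = 3  # assumed identifier length in bits, giving ids 0..7


def chord_id(text, m=M):
    # Hash the text and keep an m-bit identifier.
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = chord_id("196.178.0.1")        # ID(node)
key_id = chord_id("jingle-bells.mp3")    # ID(key)
```

With such a small m, different nodes or keys can easily collide; a real deployment would use a much larger identifier space.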

Chord: Data Structure • Identifiers are ordered on a virtual ring of size 2^m • Each node maintains • a finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i (mod 2^m), i.e. the successor of that id • its predecessor node • An item identified by id is stored on the successor node of id

Chord: Example • Assume an identifier space 0..7 • Node n1:(1) joins; all entries in its finger table are initialized to itself

Chord: Example • Nodes n2:(2), n0:(0), n6:(6) join
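
The finger tables for this example can be computed directly from the definition: entry i of node n is the first node at or after n + 2^i on the ring. The sketch below assumes m = 3 (identifier space 0..7) and zero-based entries i = 0, 1, 2; the helper names are illustrative.

```python
M = 3
RING = 2 ** M  # identifier space 0..7


def successor(ident, nodes):
    # First node whose id is equal to or follows `ident` on the ring.
    ident %= RING
    for node in sorted(nodes):
        if node >= ident:
            return node
    return min(nodes)  # wrap around past the largest id


def finger_table(n, nodes):
    return [successor(n + 2 ** i, nodes) for i in range(M)]


nodes = [0, 1, 2, 6]
tables = {n: finger_table(n, nodes) for n in nodes}
alone = finger_table(1, [1])  # a node that is alone on the ring
```

With only node 1 on the ring, every entry points back to node 1, matching the initial state before the other nodes join.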

Chord: Example • Nodes: n0:(0), n1:(1), n2:(2), n6:(6) • Items: f1:(1), f7:(7)

Chord: Example • Upon receiving a query for item id, a node • checks whether it stores the item locally • if not, forwards the query to the largest node in its successor table that does not exceed id
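
The forwarding rule can be simulated for the example ring with nodes 0, 1, 2 and 6. The sketch makes several assumptions: m = 3, "does not exceed id" is interpreted in clockwise ring order, a node with no suitable table entry hands the query to its immediate successor, and the local "do I store it?" check is simulated with global knowledge of the node list. All function names are hypothetical.

```python
M = 3
RING = 2 ** M
NODES = [0, 1, 2, 6]


def successor(ident):
    # First node at or after `ident` on the ring, wrapping around if needed.
    ident %= RING
    return min((n for n in NODES if n >= ident), default=min(NODES))


def fingers(n):
    return [successor(n + 2 ** i) for i in range(M)]


def clockwise(a, b):
    # Number of steps from a to b going clockwise round the ring.
    return (b - a) % RING


def lookup(start, item_id):
    """Return the nodes visited, ending at the node that stores item_id."""
    path = [start]
    while successor(item_id) != path[-1]:   # item is not stored locally
        here = path[-1]
        # Table entries that lie between this node and the id (no overshoot).
        preceding = [f for f in fingers(here)
                     if 0 < clockwise(here, f) <= clockwise(here, item_id)]
        if preceding:
            nxt = max(preceding, key=lambda f: clockwise(here, f))
        else:
            nxt = fingers(here)[0]          # immediate successor holds the item
        path.append(nxt)
    return path


path_f7 = lookup(1, 7)   # n1 asks for item f7
path_f1 = lookup(0, 1)   # n0 asks for item f1
```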

Chord: Properties • Routing table size is O(log(N)), where N is the total number of nodes • Guarantees that a file is found in O(log(N)) steps

Pastry - Introduction • Decentralized and scalable DHT-network • Designed for efficient message routing between nodes

What does DHT mean? • Distributed Hash Table • Hash value for every peer • Every peer has knowledge of some other peers (stored in a hash table) • All hash tables from all peers represent a complete map for all peers

The Pastry namespace • Peers reside on a virtual circle made up of all possible addresses • Blue points represent peers • [Figure: circular namespace, labelled 2^128 and 2^0]

Pastry routing • The message is sent to the (known) node that is numerically closest to the target node • The procedure is repeated until the target node is reached • [Figure labels: Origin, Closest to target, Distance, Destination]
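
The procedure on this slide can be sketched as repeated forwarding to the numerically closest known node. This is a simplified illustration of the slide's description only: it assumes circular distance in a 2^128 namespace and a hand-written table of which node knows which, and it leaves out the prefix-based routing tables and leaf sets that Pastry uses in practice. The names `ring_distance`, `route` and `known` are hypothetical.

```python
SPACE = 2 ** 128  # size of the circular namespace


def ring_distance(a, b):
    # Shortest distance between two ids on the circle.
    diff = abs(a - b) % SPACE
    return min(diff, SPACE - diff)


def route(known, origin, target):
    """known: node id -> set of node ids that this node knows about."""
    path = [origin]
    while path[-1] != target:
        here = path[-1]
        # Pick the known node (or this node itself) closest to the target.
        best = min(known[here] | {here},
                   key=lambda n: (ring_distance(n, target), n))
        if best == here:
            break  # no known node is closer; forwarding stops here
        path.append(best)
    return path


known = {
    10: {40, 200},
    40: {10, 90},
    90: {40, 100},
    100: {90},
    200: {10},
}
path = route(known, 10, 100)
```

Each hop strictly reduces the remaining distance, so the loop always terminates, either at the target or at the closest node reachable with the available knowledge.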

