Intrinsic References in Distributed Systems
Presented by: Nimish Pachapurkar
ScaLAB seminar 21st October 2002
Snapshot:
• Compare and contrast intrinsic references with physical references.
• Storage and retrieval mechanism using intrinsic references: Elephant Store
• Use of intrinsic references in hierarchical data structures

Terminology:
• Collision resistance:
  • It is extremely difficult to find two sequences with the same hash.
  • Implies that the hash is unique (sufficiently so…)
• One-way hash:
  • Given the hash of a sequence, it is difficult to reconstruct the sequence.
• Reference => Hash AND Referent => byte sequence
  • (ex. Memory addresses and data, URLs and web pages etc.)
Physical References –
• The relationship between reference and referent is defined by the state of the physical system.
• A change in the state changes the referent.
• All accesses to the referent have to go through the system.
  • Bottleneck and potential failure point

Intrinsic References –
• A collision-resistant (unique) and one-way hash value
• State independence: the relationship between a sequence S and its reference R depends only on the hash function.
• Uniqueness: a given R refers only to the particular S from which it was obtained.
• Physical storage is still required to store and retrieve the referents.
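The two properties of an intrinsic reference can be illustrated with a minimal Python sketch. The slides do not name a hash function, so SHA-256 is an assumption here, and `intrinsic_ref` is a hypothetical helper name.

```python
import hashlib


def intrinsic_ref(referent: bytes) -> str:
    """Return the intrinsic reference R (hex digest) of a byte sequence S.

    SHA-256 is an assumption; any collision-resistant one-way hash would do.
    """
    return hashlib.sha256(referent).hexdigest()


# State independence: R depends only on the bytes and the hash function,
# so computing it twice (on any machine) gives the same reference.
ref_a = intrinsic_ref(b"hello")
ref_b = intrinsic_ref(b"hello")

# Uniqueness (in practice): a different sequence gets a different reference.
ref_c = intrinsic_ref(b"hellp")

print(ref_a == ref_b)
print(ref_a == ref_c)
```

No lookup table or server is involved in computing the reference, which is the contrast with a physical reference such as a memory address or URL.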
Intrinsic References and Distributed Storage –
• Useful for distributed, replicated storage mechanisms.
• No reference-referent inconsistency (the hash gives the reference)
• A simple hash computation can check the correctness of the data

Opaque Storage –
• Used for storing an instance of a data structure in Elephant Store
• Serialize the data structure, store the byte sequence.
• Called the OPAQUE representation because the data structure is hidden behind the byte sequence.
• The hash of the sequence is the reference (digest).
• Retrieval: retrieve the byte sequence from the store, then de-serialize

Figure labels: Data Structure; Serialization (makes the structure opaque); Opaque Reference (Hash digest)
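A minimal sketch of opaque storage, under stated assumptions: this is not the Elephant Store API (which the slides do not show). `OpaqueStore`, the in-memory dictionary, JSON as the serialization format, and SHA-256 as the hash are all illustrative choices.

```python
import hashlib
import json


class OpaqueStore:
    """Hypothetical in-memory store keyed by the digest of the stored bytes."""

    def __init__(self):
        self._blobs = {}  # digest -> serialized byte sequence

    def put(self, obj) -> str:
        # Serialize the data structure; the bytes hide its shape (opaque).
        data = json.dumps(obj, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest  # the intrinsic (opaque) reference

    def get(self, digest: str):
        data = self._blobs[digest]
        # Re-hashing checks that the bytes really belong to this reference.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored bytes do not match their reference")
        return json.loads(data.decode("utf-8"))


store = OpaqueStore()
ref = store.put({"name": "inbox", "messages": [1, 2, 3]})
restored = store.get(ref)
print(restored)
```

Because the check in `get` needs only the bytes and the reference, any replica can serve the data and the client can still verify it.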
HDAGs –
• Hash-based Directed Acyclic Graph.
• Nodes are directories
• Arcs are directory – sub-directory relationships.
• The root digest of a rooted HDAG is used as the intrinsic reference to the whole HDAG.
• Application: can be used to represent a file system or mail system.
• The root digest uniquely represents the state of the whole directory structure, not just the root directory
Versions and Change (Problems with the Opaque Representation) –
• For a file system, an example of an opaque representation is a tarball of the directory structure.
• A change in any file causes the opaque representation to change.
• The hash digest also changes.
• There is no relationship between the old and new representations.
• Solution: use HDAGs
  • Adding a file to a directory is the same as a new mail arriving in the Inbox.
  • The representation of all other files and directories is not changed.
  • More efficient than the opaque representation.
  • Saves communication cost among replicas in distributed storage.
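A rough sketch of how an HDAG limits the effect of a change. The node encoding (a JSON map from entry name to child digest), the SHA-256 hash, and the names `put` and `put_tree` are assumptions made for illustration, not details taken from the talk.

```python
import hashlib
import json

store = {}  # digest -> serialized node bytes (files and directories)


def put(data: bytes) -> str:
    """Store a byte sequence under its own digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest


def put_tree(tree) -> str:
    """Store a file (bytes) or a directory (dict) and return its digest.

    A directory node holds only the digests of its children, so its own
    digest changes whenever anything beneath it changes.
    """
    if isinstance(tree, bytes):
        return put(tree)
    entries = {name: put_tree(child) for name, child in tree.items()}
    return put(json.dumps(entries, sort_keys=True).encode("utf-8"))


# Version 1: three files in two directories under one root (6 nodes).
v1 = {"inbox": {"m1": b"hello"}, "docs": {"a.txt": b"alpha", "b.txt": b"beta"}}
root_v1 = put_tree(v1)
nodes_after_v1 = len(store)

# Version 2: one new mail arrives in the inbox; docs is untouched.
v2 = {"inbox": {"m1": b"hello", "m2": b"new mail"}, "docs": v1["docs"]}
root_v2 = put_tree(v2)
new_nodes = len(store) - nodes_after_v1

print(root_v1 != root_v2)  # the root digest identifies the whole version
print(new_nodes)           # only the new file, its directory and the root
```

Only the nodes on the path from the changed file to the root are new; the `docs` subtree keeps its digest and is shared by both versions, which is what a replica would avoid re-sending.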
Advantages of HDAGs –
• Efficient for distributed systems (version management)
• Every version is represented by a unique intrinsic reference that is independent of the physical system.
• Replication and caching never lead to inconsistencies
• Two versions of an object share most of their storage and communication costs.

Conclusions –
• HDAGs promise to be a useful mechanism for building and maintaining distributed storage systems.
OS Support for P2P Programming: a Case for TPS
Presented by: Nimish Pachapurkar
Introduction –
• Need for an RPC-like interaction mechanism for P2P infrastructures
  • Must be decoupled
  • Anonymous and asynchronous
  • Layers over RPC would certainly hamper performance
• Type-based Publish/Subscribe (TPS) as a candidate
  • Abstraction of a low-level P2P library – JXTA
• What’s in the paper:
  • Comparison of the implementation of TPS with pure JXTA
  • A “first” experience
  • Design and source code of applications
JXTA
• Three layers
  • Core Layer: several protocols ensuring basic communication between peers, message routing or peer group creation
  • Service Layer: ready-made services such as a content management system and a wire service
  • Application Layer: all the code written by the programmer
• Six concepts:
  • ID: for any resource (peer, pipe, peergroup, codat)
  • Peer: any device with an electronic pulse (normal and special)
    • Rendez-vous and routers
  • Pipe: virtual communication channel – asynchronous and uni-directional (wire for many-to-many) – independent of IP
  • PeerGroup: collection of peers
  • Advertisement: XML message with information about a new resource
  • Message: any kind of communication (using XML)
Protocols for JXTA –
• PDP – Peer Discovery Protocol
  • Allows different peers to find each other
• PRP – Peer Resolver Protocol
  • Just above the transport layer; dispatches a JXTA message to the right service
• PIP – Peer Information Protocol
  • Reports the status of a peer (time the peer was up, channels available)
• PMP – Peer Membership Protocol
  • Obtains group membership requirements (credentials, password, etc.)
• PBP – Peer Binding Protocol
  • Keeps different peers in a pipe bound together (even when they move)
• ERP – Endpoint Routing Protocol
  • For routing messages between the peers
  • Enables communication between 2 peers even when they do not know how to connect to each other (due to a firewall etc.)
TPS over JXTA –
• Publish/Subscribe paradigm
  • Time decoupling: publisher and subscriber do not need to be up at the same time
  • Space decoupling: publisher and subscriber do not need to know each other
  • Flow decoupling: sending or receiving messages does not block the participants.
• This decoupling suits server-less architectures.
• Subscription based on subject and content
• Type-based:
  • Subject => event object type
  • Content => state of an instance of that type
• Type safety
  • Subscriber knows the event type in advance
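The type-based idea (subject is the event type, content is the state of the instance) can be sketched in a few lines of Python. This is an in-process illustration only: it does not use JXTA, delivery is synchronous, and so it shows space decoupling but not time or flow decoupling. `TPSBus`, `SkiRental`, and the shop names and prices are hypothetical; the event type is borrowed from the talk's ski-rental example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SkiRental:
    """Hypothetical event type; its fields are the event's content."""
    shop: str
    price_per_day: float


class TPSBus:
    """Hypothetical in-process bus that routes events by their type."""

    def __init__(self):
        self._subs = defaultdict(list)  # event type -> [(filter, callback)]

    def subscribe(self, event_type, callback, content_filter=lambda e: True):
        # Subject: the event type. Content: a predicate on the instance.
        self._subs[event_type].append((content_filter, callback))

    def publish(self, event):
        # The publisher never names a subscriber (space decoupling).
        for event_type, subs in self._subs.items():
            if isinstance(event, event_type):
                for content_filter, callback in subs:
                    if content_filter(event):
                        callback(event)


bus = TPSBus()
cheap_offers = []
# The subscriber knows the event type in advance, so the filter can use
# its fields directly.
bus.subscribe(SkiRental, cheap_offers.append, lambda e: e.price_per_day < 30)

bus.publish(SkiRental("Alpine Sports", 25.0))
bus.publish(SkiRental("Summit Gear", 45.0))
print(cheap_offers)
```

Matching with `isinstance` means a subscription to a type would also receive events of its subtypes; that is a design choice of this sketch.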
Example –
• Ski renting application
  • Need to find ski rentals with reasonable rates
  • Must surf the net for a long time
• Alternative: use the TPS-based P2P infrastructure
  • Subscribe to the ski-rental type and wait for answers
• Publisher: (a new shop is opened)
  • Search launched for a ski-rental advertisement
  • If not found, a new one is created
• Programming phases –
Performance –
• Invocation time: time for sendMessage()
  • Publisher produces 50 events
  • JXTA-WIRE is quicker
  • No difference between SR-JXTA and SR-TPS
• Throughput: similar trends!

Conclusion –
• TPS is a viable alternative abstraction to RPC for future Internet-wide operating systems to support P2P applications
• Simple to use, type-safe, preserves the decoupled nature of P2P.
• Makes programming easier than with pure JXTA.