Tycho: A Resource Discovery and Messaging Framework for Distributed Applications • Matthew Grove • matthew.grove@port.ac.uk • Cluster 2006
Outline • Introduction and motivation, • The architecture of Tycho, • Implementation details, • Updated benchmarking results, • Content distribution (Tycho swarm utility), • Summary.
Introduction • Tycho is a reference implementation of an extensible wide-area messaging framework combined with a built-in distributed registry. • The Tycho components are: • Mediators, • Clients (Producers and Consumers). • Tycho provides services that allow clients to discover each other using a Virtual Registry (VR) made up of a network of mediators – this also aids communication over both LAN and WAN. • Tycho aims to simplify and speed up application development by freeing developers from the need to combine separate software packages to provide discovery and messaging services.
General Design Philosophy • Reuse existing software components, if possible, rather than reinvent services or functionality. • Try to make use of existing software infrastructure. • Make Tycho simple to install, configure and use. • Provide a ‘basic release’ whose functionality can be extended with further, more sophisticated components (Tycho utilities). • Java was used for portability and interoperability with other distributed systems, plus rapid development.
The Two Parts of Tycho • Messaging: • Secure asynchronous communications between Consumers, Producers and Mediators. • Virtual Registry: • Boot-strapping – allows mediators to discover each other and form the VR with minimal hardwiring. • Communications - secure routing of queries between Virtual Registries. • Caching: keep a temporary local copy of some information to reduce the amount of communications between peers.
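The caching bullet above can be sketched as a small time-limited cache. This is only an illustration under stated assumptions: the class name QueryCache, the lookup method, the string-based query and result types, and the fixed time-to-live policy are all hypothetical and are not taken from Tycho's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical time-limited cache for registry query results (not Tycho's actual API). */
class QueryCache {
    private static final class Entry {
        final String result;
        final long expiresAtMillis;

        Entry(String result, long expiresAtMillis) {
            this.result = result;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;

    QueryCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /**
     * Returns the cached result while it is still fresh; otherwise asks the
     * remote peer, stores the answer with a new expiry time, and returns it.
     * The clock value is passed in so the behaviour is easy to test.
     */
    synchronized String lookup(String query, long nowMillis, Function<String, String> remotePeer) {
        Entry cached = entries.get(query);
        if (cached != null && nowMillis < cached.expiresAtMillis) {
            return cached.result; // served locally, no network traffic
        }
        String fresh = remotePeer.apply(query);
        entries.put(query, new Entry(fresh, nowMillis + ttlMillis));
        return fresh;
    }
}
```

Repeated identical queries inside the time-to-live window never reach the remote peer, which is the traffic reduction the slide describes; the price is that results may be stale by up to the chosen window.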
Tycho Mediator Implementation • Tycho provides a choice of implementations for each core service.
Tycho Clients & Utilities • The Tycho Connector provides the API for building producers and consumers. • Extra functionality can be added as utilities.
Tycho Core Services • Transport handler, allows different protocols to be used for communications (HTTP(S), Sockets, IRC). • Local store, for a mediator and VR information (JDBC, Java simple store). • Boot service, used by the VR within a mediator to locate and join the rest of the VR (HTTP(S), IRC). • Query parser and result annotator, to support different query and markup languages (SQL, LDIF).
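One way to picture the pluggable transport handlers listed above is a small interface plus a lookup table keyed by protocol. This is a sketch under assumptions: the names TransportHandler, InMemoryHandler and TransportRegistry are hypothetical, selecting a handler by the scheme prefix of the destination is an illustrative choice, and Tycho's real interfaces may look quite different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plug-in contract; Tycho's real handler interface may differ. */
interface TransportHandler {
    void send(String destination, String message);
}

/** Stand-in handler that records messages in memory instead of using a network. */
class InMemoryHandler implements TransportHandler {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String destination, String message) {
        sent.add(destination + " <- " + message);
    }
}

/** Chooses a handler from the scheme prefix of the destination, e.g. "http://host/path". */
class TransportRegistry {
    private final Map<String, TransportHandler> handlers = new HashMap<>();

    void register(String scheme, TransportHandler handler) {
        handlers.put(scheme, handler);
    }

    void send(String destination, String message) {
        int idx = destination.indexOf("://");
        if (idx < 0) {
            throw new IllegalArgumentException("No scheme in: " + destination);
        }
        String scheme = destination.substring(0, idx);
        TransportHandler handler = handlers.get(scheme);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for scheme: " + scheme);
        }
        handler.send(destination, message);
    }
}
```

With this shape, adding a new protocol means registering one more handler; the code that sends messages does not change.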
Tycho Benchmarks • Three rounds of benchmarking were performed: • Communications (A) - measured the performance of inter-client and inter-mediator messaging for Tycho and NaradaBrokering. • Virtual Registry tests (B) - measured and compared the performance of the Tycho VR to MDS4 and R-GMA. • Component Tests (C) - different components of the VR were tested in various configurations – these tests are discussed elsewhere.
Communications - Latency • The latency of communication for LAN and simulated WAN messaging was measured. • The tests used two clients with varying message size (ping-pong tests). • An eight-node cluster was used to run the tests.
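A ping-pong test of the kind described above can be sketched with plain TCP sockets on the loopback interface. This is not the benchmark code used for the slides: the class PingPong and its method are hypothetical, and it measures raw socket round trips on one machine rather than Tycho or NaradaBrokering messaging across a cluster.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical ping-pong timer over loopback TCP; rounds must be greater than zero. */
class PingPong {
    /** Returns the mean round-trip time in nanoseconds for messages of the given size. */
    static long meanRoundTripNanos(int messageSize, int rounds) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Echo side: read one full message, send it straight back.
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept()) {
                    DataInputStream in = new DataInputStream(s.getInputStream());
                    DataOutputStream out = new DataOutputStream(s.getOutputStream());
                    byte[] buf = new byte[messageSize];
                    for (int i = 0; i < rounds; i++) {
                        in.readFully(buf);
                        out.write(buf);
                        out.flush();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            echo.setDaemon(true);
            echo.start();

            // Ping side: send a message and wait for the full echo before the next round.
            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                client.setTcpNoDelay(true); // avoid batching delays on small messages
                DataInputStream in = new DataInputStream(client.getInputStream());
                DataOutputStream out = new DataOutputStream(client.getOutputStream());
                byte[] msg = new byte[messageSize];
                long start = System.nanoTime();
                for (int i = 0; i < rounds; i++) {
                    out.write(msg);
                    out.flush();
                    in.readFully(msg);
                }
                return (System.nanoTime() - start) / rounds;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling the method for a range of message sizes gives a latency-versus-size curve; one-way latency is usually estimated as half the round-trip time.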
Communication Tests - Summary • Tycho has a lower latency and higher bandwidth than NaradaBrokering in all the tests. • With respect to scalability of producers and consumers, when either system is saturated, its performance remains stable under heavy load; however: • NaradaBrokering needs the JVM heap size to be increased as the number of clients increases (due to internal buffers), • Tycho used the default heap for all of the tests.
Virtual Registry Tests (B) • Two tests were used to measure aspects of the performance of Tycho’s VR, MDS4 and R-GMA: • Number of records in a registry (100,000 records), • Number of simultaneous client queries (1000 clients). • The tests were repeated with two different queries: • (S1) a single random record was selected, • (S2) all of the records were selected (worst case scenario).
VR Tests - Records (S1) • [Graph; annotation: MDS4 ran out of memory]
VR Tests - Records (S2) • [Graph; annotation: MDS4 ran out of memory]
VR Tests - Clients (S1) • [Graph; annotation: R-GMA ran out of memory]
Summary of VR Tests • Tycho has better performance and client scalability than both R-GMA and MDS4. • The heap for R-GMA and MDS4 had to be set to 1.5 Gbytes (the maximum we could set) to carry out the tests. • Memory management in Java is an issue: • Without bounded buffering or flow control, exhausting the Java heap is a problem. • Storing information internally using XML seems to be a source of some of these memory problems: • Java database solutions such as HSQLDB can provide a high-performance solution for off-loading some of the storage requirements to disk.
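The point about buffering and flow control can be illustrated with a fixed-capacity queue from java.util.concurrent. This is a minimal sketch, assuming a hypothetical BoundedOutbox class; it is not how Tycho, R-GMA or MDS4 buffer messages.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical outgoing-message buffer with a hard capacity limit. */
class BoundedOutbox {
    private final BlockingQueue<byte[]> queue;

    BoundedOutbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Tries to queue a message without blocking. A false return value is the
     * flow-control signal: the sender must slow down, retry later, or drop.
     */
    boolean tryEnqueue(byte[] message) {
        return queue.offer(message);
    }

    /** Removes and returns the oldest queued message, or null if the buffer is empty. */
    byte[] dequeue() {
        return queue.poll();
    }

    int pending() {
        return queue.size();
    }
}
```

Because the queue never holds more than its configured number of messages, heap use is capped at roughly capacity times the largest message size, however fast producers run.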
Tycho Core – Future Work • Some performance improvements: • Caching of local mediator queries to reduce response times, • Use of a hybrid VR-interconnect to use IRC for query routing and HTTP for transporting large responses. • Additional functionality can be added to provide advanced services: • WS-based transport handlers for interoperability.
Tycho Applications • We developed a number of applications to further validate the implementation. • These include: • Demonstrations of publishing and discovering distributed webcams, • Remote resource discovery for the VOTechBroker project: • As part of the European Virtual Observatory project, Tycho provides automatic resource discovery for job submission, • Binding together components of the Semantic Log Analyser (Slogger) project: • Here Tycho helps locate and gather distributed logs for analysis.
The Tycho Swarm Utility • The swarm utility is a tool for distributing content across peers. • The utility was developed to test the potential of Tycho utilities and also to further stress test the overall infrastructure: • By simultaneously utilising the VR and messaging functions, • Storing and updating thousands of records in the VR, • Sending thousands of multi-megabyte messages between clients. • Its potential uses include: • Distributing files for collaboration purposes, • Staging data for computation, • Mirroring and managing large data sets.
Swarm Utility Overview • The swarm utility provides peer-to-peer content distribution similar to BitTorrent. • Content is split into ‘chunks’ and the VR is used to store chunk availability. • Peers use the VR to locate each other and decide which chunks to download. • Tycho messages are used to transfer the chunks between peers, and peers cooperate to distribute the content throughout the swarm.
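The chunking and chunk-availability ideas above can be sketched as follows. Everything here is an assumption made for illustration: the class SwarmSketch, the in-memory map standing in for the VR, and the rarest-first selection rule (a heuristic associated with BitTorrent). The slides do not say how the swarm utility actually chooses chunks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of chunking plus an availability table (standing in for the VR). */
class SwarmSketch {
    /** Splits content into fixed-size chunks; the last chunk may be shorter. */
    static List<byte[]> split(byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < content.length; offset += chunkSize) {
            int end = Math.min(content.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(content, offset, end));
        }
        return chunks;
    }

    // chunk index -> peers that have advertised that chunk
    private final Map<Integer, Set<String>> availability = new HashMap<>();

    /** Records that a peer holds a chunk (in Tycho this would be a VR entry). */
    void advertise(String peer, int chunkIndex) {
        availability.computeIfAbsent(chunkIndex, k -> new HashSet<>()).add(peer);
    }

    /**
     * Picks the missing chunk held by the fewest peers (rarest first), lowest
     * index winning ties. Returns -1 if no missing chunk is available anywhere.
     */
    int nextChunk(Set<Integer> alreadyHave, int totalChunks) {
        int best = -1;
        int bestCount = Integer.MAX_VALUE;
        for (int i = 0; i < totalChunks; i++) {
            if (alreadyHave.contains(i)) {
                continue;
            }
            Set<String> holders = availability.get(i);
            int count = (holders == null) ? 0 : holders.size();
            if (count > 0 && count < bestCount) {
                best = i;
                bestCount = count;
            }
        }
        return best;
    }
}
```

Fetching rare chunks first spreads scarce data through the swarm early, so the content stays available even if the original source leaves.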
Summary • The initial reference implementation of Tycho has been completed. • It can be downloaded from: • http://dsg.port.ac.uk/projects/tycho/ • Both the messaging code and VR have been benchmarked and perform well. • The focus now is on developing Tycho utilities to provide more feature-rich functionality.
Webcam Browser Demo http://dsg.port.ac.uk/projects/tycho/demos/web/
Links • Project Web page: • http://dsg.port.ac.uk/projects/tycho/ • The DSG Web page: • http://dsg.port.ac.uk/ • The ACET Web page: • http://acet.port.ac.uk/