Going Large-Scale in P2P Experiments Using the JXTA Distributed Framework Mathieu Jan & Sébastien Monnet Projet PARIS Paris, 13 February 2004
Outline • How to test P2P systems at a large-scale? • The JDF tool • Experimenting with various network configurations • Experimenting with various volatility conditions • Ongoing and future work
How to test P2P systems at a large-scale? • How to reproduce and test P2P systems? • Volatility • Heterogeneous architectures • Large-scale • Many papers on Gnutella, KaZaA, etc. • Behavior not yet fully understood • Experiments on CFS, PAST, etc. • Mostly simulation • Real experiments up to a few tens of physical nodes • Large-scale (thousands of nodes) via emulation • The methodology for testing is not discussed • Deployment • How to control the volatility? • A need for infrastructures
Solutions used for testing P2P prototypes • Simulation • Results are reproducible • May require significant adaptations • Simplified model compared to reality • Emulation • Networks can be configured with various characteristics • Heterogeneity not fully captured • Results are not reproducible • Deployment and management • Experiments on real testbeds • A necessary step when validating software • Real heterogeneity • Results are not reproducible • Deployment and management
Conducting JXTA-based experiments with JDF (1/2) • A framework for automated testing of JXTA-based systems from a single node (the control node) • http://jdf.jxta.org/ • Two modes • Run one distributed test • Run multiple tests (batch mode, useful with crontab) • We added support for PBS
Conducting JXTA-based experiments with JDF (2/2) • Hypothesis • All the nodes must be "visible" to the control node • Requirements • Java Virtual Machine • Bourne shell • SSH/RSH configured to run without a password on each node • JDF: several shell scripts • Deploy the resources needed for one or several tests • Jar files and scripts used on each node • Configure JXTA peers • Launch peers • Collect log and result files from each node • Analyze results on the control node • Clean up deployed and generated files for the test • Kill remaining processes • Update resources for a test
How to define a test using JDF? • An XML file describing the JXTA-based network • Type of peers (rendezvous, edge peers) • How peers are interconnected, etc. • A set of Java classes describing the behavior of each peer • Extend JDF's framework (start, stop JXTA, etc.), see the sketch below • A Java class for analyzing collected results • A file listing the nodes and the path of the JVM on each node
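As an illustration, the per-peer behavior could look like the following Java sketch. The class name, method names, and property keys are assumptions made for this example, not JDF's actual API; only the idea of extending a framework class that starts/stops JXTA and reports results comes from the slide above.

import java.util.Properties;

// Hypothetical per-peer behavior class; the methods below stand in for
// the start/stop hooks that a real JDF test class would override.
public class ProviderTest /* would extend a JDF-provided base class */ {

    // Invoked once the local JXTA peer has been configured and started.
    public void run(Properties config) throws Exception {
        long providedMemory = Long.parseLong(config.getProperty("memory", "1048576"));
        // ... publish the provider advertisement and serve allocation requests ...
    }

    // Invoked before the peer is stopped; the returned values are later
    // collected and fed to the analysis class on the control node.
    public Properties results() {
        Properties res = new Properties();
        res.setProperty("requests.served", "0");
        return res;
    }
}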
Describing a simple JuxMem network (1/2) • Notion of profile • A set of peers having the same behavior • instances attribute of profile • Specifies the total number of nodes hosting this type of peer • instances attribute of peer • Specifies the number of peers of this type on one node • Simplest example: one cluster manager and one provider • [Diagram: cluster A group containing Cluster Manager A and Provider A]
Describing a simple JuxMem network (2/2)
<profile name="clusterManagerA" instances="1">
  <peer base-name="clusterManagerA" instances="1" />
  <rdvs is-rdv="true" />
  <transports>
    <tcp enabled="true" base-port="13000" />
  </transports>
  <bootstrap class="juxmem.service.test.load.ClusterManager">
    <jvmarg value="xxxx" />
    <arg value="xxx" />
  </bootstrap>
</profile>
<profile name="providerA" instances="1">
  <peer base-name="providerA" instances="1" />
  <rdvs is-rdv="false">
    <rdv cluster="clusterManagerA" />
  </rdvs>
  <transports>
    <tcp enabled="true" base-port="13000" />
  </transports>
  <bootstrap class="juxmem.service.test.load.Provider">
    <jvmarg value="xxx" />
    <arg value="xxxx" /> <!-- memory provided, for example -->
  </bootstrap>
</profile>
A more complex JuxMem network (1/2) • [Diagram: a juxmem group containing the cluster A, cluster B, and cluster C groups]
A more complex JuxMem network (2/2)
<profile name="clusterManagerA" instances="1"> … </profile>
<profile name="clusterManagerB" instances="1"> … </profile>
<profile name="clusterManagerC" instances="1"> … </profile>
<profile name="providerA" instances="42">
  <peer base-name="providerA" instances="4" />
  <rdv cluster="clusterManagerA" />
  …
</profile>
<profile name="providerB" instances="42">
  <peer base-name="providerB" instances="5" />
  <rdv cluster="clusterManagerB" />
  …
</profile>
<profile name="providerC" instances="35">
  <peer base-name="providerC" instances="6" />
  <rdv cluster="clusterManagerC" />
  …
</profile>
Usage of JDF’s scripts • runAll.sh [<flags>] <list-of-hosts> <network-descriptor> • -debug: show all script commands executed • -unsecure: use rsh instead of ssh • -cleanup: cleanup JDF directory on each host • -bundle: create bundle for distribution • -install: install distribution bundle • -update: update files on each peer • -config: configure JXTA network • -kill: kill existing JDF processes • -run: run test • -nohup: run and return without waiting for peers to exit • -analyze: analyze test results • -log: keep test results and log4j logs from peers • batchAll.sh [<flags>] <file-listing-each-test-directories>
Experimental results with JDF (1/2) • Experimental setup • Distributed ASCI Supercomputer 2 (DAS-2), managed by PBS (The Netherlands) • 5 clusters, for a total of 200 dual 1-GHz Pentium-III nodes • Site mainly used: 72 nodes • SSH/SCP used • Experiments with JDF on up to 64 nodes • Deployment of JXTA + JDF + JuxMem • Configuration of JuxMem peers • Update of JuxMem only
Launching peers • A JVM is started for each peer • Several JXTA peers cannot share the same JVM • How to deal with connections between edge and rendezvous peers? • Rendezvous peers must be started before edge peers • JDF uses the notion of delay (see the sketch below) • Time to wait before launching peers • A mechanism for distributed synchronization is needed
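A minimal sketch of this delay idea, assuming a hypothetical jdf.launch.delay system property (the property name and the class are illustrative, not part of JDF):

// Illustrative only: edge peers wait a configured delay so that the
// rendezvous peers they depend on have time to start first.
public class DelayedStart {
    public static void main(String[] args) throws InterruptedException {
        long delayMs = Long.getLong("jdf.launch.delay", 0L);  // hypothetical property
        if (delayMs > 0) {
            Thread.sleep(delayMs);  // crude substitute for true distributed synchronization
        }
        // ... start the JXTA platform and connect to the rendezvous peer ...
    }
}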
Getting the logs and the results • JDF's framework • Start and stop JXTA (the net peergroup as well as custom groups, as in JuxMem) • Store the results in a property file (see the sketch below) • Retrieve the log files generated on each node • Library used: Log4j • Files starting with log. • Retrieve the result files from each node • The specified analysis class is called • Display results
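For example, a peer-side test class might write its measurements to a property file along these lines. The file name results.properties and the key/value passed in are assumptions; only the general use of a property file and Log4j is taken from the slide above.

import java.io.FileOutputStream;
import java.util.Properties;
import org.apache.log4j.Logger;

// Illustrative result storage: one property file per peer, later fetched
// by the control node and fed to the analysis class.
public class ResultWriter {
    private static final Logger LOG = Logger.getLogger(ResultWriter.class);

    public static void store(String key, String value) throws Exception {
        Properties results = new Properties();
        results.setProperty(key, value);
        FileOutputStream out = new FileOutputStream("results.properties");
        try {
            results.store(out, "per-peer results");
        } finally {
            out.close();
        }
        LOG.info("stored " + key + "=" + value);
    }
}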
Experimenting with various volatility conditions • Goals • Provide multiple failure conditions • Experiment with various failure detection techniques • Experiment with various replication strategies • Identify classes of applications and system states • Adapt fault tolerance mechanisms
Providing multiple failure conditions • Go large scale • Control faults on thousands of nodes • Precision • Possibility to kill a node at a given time/state • Some nodes may be "fail-safe" • Easy to use • Changing the failure model should not affect the code being tested
Failure injection: going large scale • Using statistical distributions • Advantages • Ease of use: multiple failure dates can be generated automatically (see the sketch below) • Suitable for large scale • Which statistical distributions? • Exponential (to model life expectancy) • Uniform (to choose among numerous nodes)
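As a sketch of how such failure dates could be generated, the example below uses inverse-transform sampling with java.util.Random instead of the PSOL library used by the actual tools; the output key names are illustrative.

import java.util.Random;

// Illustrative generation of per-node failure dates: each node's lifetime
// is drawn from an exponential law so that the whole system reaches the
// desired MTBF (e.g. 64 nodes and a 1-minute system MTBF give a
// 64-minute mean lifetime per node).
public class FailureDates {
    public static void main(String[] args) {
        long systemMtbfMs = 60000;
        int nodes = 64;
        double perNodeMeanMs = (double) systemMtbfMs * nodes;
        Random random = new Random();
        for (int i = 0; i < nodes; i++) {
            // Inverse-transform sampling of the exponential distribution.
            double lifetimeMs = -perNodeMeanMs * Math.log(1.0 - random.nextDouble());
            System.out.println("node" + i + ".suicideDate=" + Math.round(lifetimeMs));
        }
    }
}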
Failure injection: precision • Why? • Play the role of the enemy • Kill a node that holds a lock • Kill multiple nodes while data is being replicated • Model reality • Some nodes may be almost "fail-safe" • A particular node may have a very high MTBF • How? • Combine statistical distributions with a more precise configuration file
Failure injection in JDF: design • Add a single configuration file • Generated by a set of tools • Using "The Probability/Statistics Object Library" (http://www.math.uah.edu/psol) • Deployed on each node by JDF • Launch a new Java thread (see the sketch below) • Reads the configuration file • Sleeps for a while • Kills its node at a given time
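A minimal sketch of such a killer thread, assuming fi.properties contains a per-node suicideDate key (the key name and file-reading details are assumptions; the read/sleep/kill structure is the one described above):

package fi;

import java.io.FileInputStream;
import java.util.Properties;

// Illustrative failure-injection thread: read the suicide date from
// fi.properties, sleep until then, and crash the JVM abruptly.
public class Killer extends Thread {
    public void run() {
        try {
            Properties props = new Properties();
            FileInputStream in = new FileInputStream("fi.properties");
            try {
                props.load(in);
            } finally {
                in.close();
            }
            long suicideDelayMs = Long.parseLong(props.getProperty("suicideDate", "-1"));
            if (suicideDelayMs < 0) {
                return;                      // this node is "fail-safe"
            }
            Thread.sleep(suicideDelayMs);    // wait until the chosen failure date
            Runtime.getRuntime().halt(1);    // abrupt crash, no cleanup
        } catch (Exception e) {
            // a problem in the killer itself must not disturb the test
        }
    }
}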
Failure injection: execution flow • [Diagram: the main flow (test class) starts the killer thread with new Killer().start(); the Killer thread reads the configuration file (fi.properties) and kills the node at the suicide date; results are written to the result file (fi.results)]
Failure injection: sample experiment • 64 peers running on 64 nodes • Creating fi.properties for an initial MTBF of 1 minute • Each node's lifetime follows an exponential law with a rate of 1/64 • With JDF this becomes easy to use • java -cp .:PSOL.jar CreateFiProperties 60000 • new fi.Killer().start(); // in the test class • runAll.sh -cleanup -with-nfs -install -config -run -analyze -log paraci_01-64 test.xml
Failure injection: ongoing work • Time deviation • Initial time (t0) • Clock drift • Tools to specify fi.properties more precisely • Suicide interface (event handler) • More flexibility
Failure detection and replication strategies • Running the same test multiple times • Failure detection • Change the failure detection techniques • Tune Δ (the delay between heartbeats) • Which Δ for which MTBF? • Replication strategies • Adapt the replication degree to the "current" MTBF (level of risk) • Experiment with multiple replication strategies under various conditions (failures/detection)
Fault tolerance in JuxMem: road map • Finalize the failure injection tools • Experiment with Marin Bertier's failure detectors using JXTA/JDF • Integrate the failure detectors into JuxMem • Experiment with various replication strategies • Automatic adaptation
Ongoing work • Improving JDF • There is a lot to do • Enable concurrent tests via PBS • Submit issues to Bugzilla • Write more tests for JuxMem • Measure the cost of elementary operations in JuxMem • Various consistency protocols at large scale • Benchmark other elementary steps of JDF • Launching peers • Collecting result and log files • Use of emulation tools like Dummynet or NIST Net • Visit of Fabio Picconi at IRISA
Future work • Hierarchical deployment • Ka-run/Taktuk-like (ID IMAG) • Distributed synchronization mechanism • Support for more complex tests • Allow the use of JDF over Globus • Support protocols other than SSH/RSH • Especially when updating resources