1 / 19

Distributed simulation with MPI in ns-3

Distributed simulation with MPI in ns-3. Joshua Pelkey and Dr. George Riley. Wns3 March 25, 2011. Overview. Standard sequential simulation techniques with substantial network traffic Lengthy execution times Large amount of computer memory

shilah
Download Presentation

Distributed simulation with MPI in ns-3

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Distributed simulation with MPI in ns-3 Joshua Pelkey and Dr. George Riley Wns3 March 25, 2011

  2. Overview • Standard sequential simulation techniques with substantial network traffic • Lengthy execution times • Large amount of computer memory • Parallel and distributed discrete event simulation [1] • Allows single simulation program to run on multiple interconnected processors • Reduced execution time! Larger topologies!

  3. Overview (cont.) • Important Note • It is mandatory that distributed simulations produce the same results as identical sequential simulations

  4. Overview: terminology • Logical Process (LP) • An individual sequential simulation • Rank or system id • The unique number assigned to each LP Figure 1. Simple point-to-point topology, distributed

  5. Overview: related work • Parallel/Distributed ns (PDNS) [2] • Georgia Tech Network Simulator (GTNetS) [3] • Both use a federated approach and a conservative (blocking) mechanism

  6. Implementation Details in ns-3 • LP communication • Message Passing Interface (MPI) standard • Send/Receive time-stamped messages • MpiInterface in ns-3 • Synchronization • Conservative algorithm using lookahead • DistributedSimulator in ns-3

  7. Implementation Details in ns-3 (cont.) • Assigning rank to nodes • Handled manually in simulation script • Remote point-to-point links • Created automatically between nodes with different ranks through point-to-point helper • When a packet is set to cross a remote point-to-point link, the packet is transmitted via MPI using our interface • Merged since ns-3.8

  8. Implementation Details in ns-3: limitations • All nodes created on all LPs, regardless of rank • It is up to the user to only install applications on the correct rank • Nodes are assigned rank manually • An MpiHelper class could be used to assign rank to nodes automatically. This would enable easy distribution of existing simulation scripts. • Pure distributed wireless is currently not supported • At least one point-to-point link must exist in order to divide the simulation

  9. Performance Study • DARPA NMS campus network simulation • Using nms-p2p-nix-distributed example available in ns-3 • Allows creation of very large topologies • Any number of campus networks are created and connected together • Different campus networks can be placed on different LPs • Tested with 2 CNs, 4 CNs, 6 CNs, 8 CNs, and 10 CNs

  10. Performance Study: campus network topology 200 ms, 10 us Figure 2. Campus network topology block [4]

  11. Performance Study: Georgia Tech clusters used • Hogwarts Cluster • 6 nodes, each with 2 quad-core processors and 48GB of RAM • Ferrari Cluster • Mix of machines, including 3 quad-core nodes and 8 dual-core nodes

  12. Performance Study: simulations on Hogwarts Figure 3. Campus network simulations on Hogwarts with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs

  13. Performance Study: simulations on Ferrari Figure 4. Campus network simulations on Ferrari with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs

  14. Performance Study: speedup Figure 5. Speedup using distributed simulation for campus network topologies on the (A) Hogwarts cluster and (B) Ferrari cluster

  15. Performance Study: speedup (cont.) • Linear speedup for Hogwarts, not for Ferrari. Further investigation revealed Ferrari consisted of a mix of machines, with the first two nodes considerably faster Table 1: Speedup for Hogwarts and Ferrari

  16. Performance Study: changing the lookahead • By changing the delay between campus networks, the lookahead was varied (200ms to 10 µs) • For Hogwarts and Ferrari, the 10 µs simulations ran, on average, 25% and 47% slower, respectively • As expected, a smaller lookahead time decreases the potential speedup, as the simulators must synchronize with a greater frequency

  17. Future Work • MpiHelper class to facilitate creating distributed topologies • Nodes assigned rank automatically • Existing simulation scripts could be distributed easily • Distributing the topology could occur at the node level, rather than the application • Ghost nodes, save memory • Pure distributed wireless support

  18. Summary • Distributed simulation in ns-3 allows a user to run a single simulation in parallel on multiple processors • Very large-scale simulations can be run in ns-3 using the distributed simulator • Distributed simulation in ns-3 offers potentially optimal linear speedup compared to identical sequential simulations

  19. References [1] R.M. Fujimoto. Parallel and Distributed Simulation Systems. Wiley Interscience, 2000. [2] PDNS - Parallel/Distributed ns. http://www.cc.gatech.edu/computing/compass/pdns, March 2004. [3] G. F. Riley. The Georgia Tech Network Simulator. In Proceedings of the ACM SIGCOMM workshop on Models, methods and tools for reproducible network research, MoMeTools ’03, pages 5-12, New York, NY, USA, 2003 ACM. [4] Standard baseline NMS challenge topology. http://www.ssfnet.org/Exchange/gallery/baseline, July 2002

More Related