Implementing Babel RMI with ARMCI
Jian Yin, Khushbu Agarwal, Daniel Chavarría, Manoj Krishnan, Ian Gorton, Vidhya Gurumoorthi, Patrick Nichols
Motivation
• Remote Method Invocation (RMI) provides a useful abstraction for distributed computing
  • Example: event service for the CCA framework
• The existing TCP/IP-based implementation has performance problems
• Question: can we speed up Babel RMI with high-performance communication protocols?
Objectives
• Demonstrate that it is feasible to build a high-performance Babel RMI
• Prototype a Babel RMI implementation with ARMCI and measure its performance experimentally
• Produce a quality implementation of high-performance RMI
Outline
• Motivation
• Objectives
• Background
  • Babel RMI
  • ARMCI
• Preliminary performance results
• Future work
Babel RMI
• Babel supports Remote Method Invocation
  • Transparent
  • Flexible
• Implemented with extensive code marshalling and a runtime library
• The existing TCP/IP-based implementation incurs high overhead
  • Multiple copying
  • Context switching
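To make the marshalling step concrete, the following is a minimal sketch of packing one invocation into a single contiguous buffer, so that it could be handed to the transport with one copy. The wire layout and the names `rmi_pack_call` and `rmi_unpack_call` are hypothetical and chosen for illustration only; they are not Babel's actual format or API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire layout (NOT Babel's actual format):
 *   [u32 name_len][name bytes][u32 argc][argc x i32 args]
 * Integers are copied in host byte order, which assumes that
 * both ends of the connection share the same endianness. */

/* Append n bytes to buf at *off; returns -1 if they do not fit. */
static int put_bytes(unsigned char *buf, size_t cap, size_t *off,
                     const void *src, size_t n)
{
    if (n == 0) return 0;
    if (n > cap - *off) return -1;      /* *off never exceeds cap */
    memcpy(buf + *off, src, n);
    *off += n;
    return 0;
}

/* Read n bytes from buf at *off; returns -1 if the buffer is too short. */
static int get_bytes(const unsigned char *buf, size_t len, size_t *off,
                     void *dst, size_t n)
{
    if (n == 0) return 0;
    if (n > len - *off) return -1;
    memcpy(dst, buf + *off, n);
    *off += n;
    return 0;
}

/* Pack a method name and its int32 arguments into buf.
 * Returns the number of bytes written, or -1 on overflow. */
long rmi_pack_call(unsigned char *buf, size_t cap, const char *method,
                   const int32_t *args, uint32_t argc)
{
    size_t off = 0;
    uint32_t name_len;

    assert(buf != NULL && method != NULL);
    name_len = (uint32_t)strlen(method);

    if (put_bytes(buf, cap, &off, &name_len, sizeof name_len) ||
        put_bytes(buf, cap, &off, method, name_len) ||
        put_bytes(buf, cap, &off, &argc, sizeof argc) ||
        put_bytes(buf, cap, &off, args, (size_t)argc * sizeof *args))
        return -1;
    return (long)off;
}

/* Unpack a buffer produced by rmi_pack_call.
 * Returns 0 on success, -1 if the buffer is malformed or the
 * caller-provided storage is too small. */
int rmi_unpack_call(const unsigned char *buf, size_t len,
                    char *method, size_t method_cap,
                    int32_t *args, uint32_t args_cap, uint32_t *argc)
{
    size_t off = 0;
    uint32_t name_len;

    assert(buf != NULL && method != NULL && argc != NULL);

    if (get_bytes(buf, len, &off, &name_len, sizeof name_len)) return -1;
    if (name_len >= method_cap) return -1;   /* keep room for '\0' */
    if (get_bytes(buf, len, &off, method, name_len)) return -1;
    method[name_len] = '\0';
    if (get_bytes(buf, len, &off, argc, sizeof *argc)) return -1;
    if (*argc > args_cap) return -1;
    if (get_bytes(buf, len, &off, args, (size_t)*argc * sizeof *args)) return -1;
    return 0;
}
```

A caller would pack on the client side, transfer the buffer, and unpack on the server side. Because the whole call lives in one contiguous region, a remote-memory transport could move it without the intermediate copies that a socket-based path typically adds.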
ARMCI
• Middleware for remote memory access (RMA)
• Supports many networks and HPC systems
  • Myrinet, InfiniBand, Quadrics, Giganet, …
  • Cray XT4, XT, X1, IBM BlueGene, …
• Efficient
  • Minimal copying
  • Truly one-sided communication protocol
• Put, get, accumulate
• Atomic read-modify-write, mutex
• Blocking and non-blocking interfaces
Experiment Setup
• Hardware
  • Cluster with 11 nodes
  • 4-core 2.4 GHz Intel Xeon processor
  • InfiniBand DDR network
• Software
  • Babel 1.4.0
  • ARMCI 1.4
  • OpenMPI 1.2.6
Implementation
• Implemented an extensive set of functions in the runtime library
  • InstanceHandle, Server, Invocation, Response, Call, Return, …
• Usage examples
  • hello_World h = hello_World__createRemote("armcihandler://<process_id>:<mutex_id>", &_ex);
  • hello_World h2 = hello_World__connect("armcihandler://<process_id>:<mutex_id>/<object_id>", &_ex);
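The URL forms in the usage examples can be illustrated with a small parser. This is a sketch only: it assumes that `<process_id>` and `<mutex_id>` are non-negative decimal integers and that `<object_id>` is a short string, none of which the slides state. The type `armci_url` and the function `armci_url_parse` are hypothetical names, not part of Babel or ARMCI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of an armcihandler:// URL. */
typedef struct {
    int  process_id;     /* process that hosts the object (assumed integer) */
    int  mutex_id;       /* mutex associated with the target (assumed integer) */
    char object_id[64];  /* empty string when the URL names no object */
} armci_url;

/* Parse "armcihandler://<process_id>:<mutex_id>[/<object_id>]".
 * Returns 0 on success and -1 if the URL does not match that shape. */
int armci_url_parse(const char *url, armci_url *out)
{
    static const char prefix[] = "armcihandler://";
    const size_t plen = sizeof prefix - 1;
    const char *rest;
    int consumed = 0;

    assert(out != NULL);  /* passing no output struct is a caller bug */

    if (url == NULL || strncmp(url, prefix, plen) != 0)
        return -1;

    /* %n stores how many characters the "<int>:<int>" part consumed. */
    if (sscanf(url + plen, "%d:%d%n",
               &out->process_id, &out->mutex_id, &consumed) != 2)
        return -1;
    if (out->process_id < 0 || out->mutex_id < 0)
        return -1;

    rest = url + plen + consumed;
    out->object_id[0] = '\0';

    if (*rest == '\0')
        return 0;                        /* createRemote form: no object */

    if (*rest != '/' || rest[1] == '\0' ||
        strlen(rest + 1) >= sizeof out->object_id)
        return -1;

    strcpy(out->object_id, rest + 1);    /* connect form: object named */
    return 0;
}
```

With this sketch, a URL of the `createRemote` form yields an empty `object_id`, while a URL of the `connect` form also fills in the object to attach to. Any other scheme, such as a TCP/IP URL, is rejected.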
Next Steps
• Reduce protocol overhead
  • Reduce function call overhead
  • Reduce copying
  • Batch RMI calls
• Reduce RDMA overhead
• Prefetch in the background
  • Preload libraries
  • Prefetch arguments
Where to Use High Performance Babel RMI
• Applications for high-performance RMI
  • Fine-grained distribution
  • Hybrid computing
• Suggestions …