Towards high-performance communication layers for JXTA on grids. Mathieu Jan. GDS meeting, Lyon, 17 February 2006. Outline. Context & JXTA Communication layers Performance improvements and new features in JXTA-C Transparent use of networks available on grid infrastructures: PadicoTM
Towards high-performance communication layers for JXTA on grids Mathieu Jan GDS meeting, Lyon, 17 February 2006
Outline • Context & JXTA Communication layers • Performance improvements and new features in JXTA-C • Transparent use of networks available on grid infrastructures: PadicoTM • Latest news on JuxMem
Context of this work • Goal: towards high performance for JuxMem • Initial performance evaluations of JXTA (C and J2SE) in grid environments • GP2PC 2005 • HPCC 2005 • Possible performance improvements have been identified • Case of direct communications between peers • Internship at Sun Microsystems • 3 months (August-October)
JXTA communication layers • JXTA Socket: data-stream interface, reliability • Pipe service: dynamic point-to-point communications, independent of the underlying network topology, unreliable • Endpoint service: static point-to-point communications over TCP, HTTP, etc.
The bottom layer: the endpoint service • Abstracts the available underlying transport protocols (TCP, HTTP, etc.) • Called endpoints • Asynchronous, unidirectional and unreliable static point-to-point communications • Endpoint address: Peer ID • The Endpoint Router Protocol resolves the route • Message elements used by the endpoint service (required JXTA headers): EndpointSourceAddress, EndpointDestinationAddress
Core communication layer: the pipe service • Illusion of a virtual endpoint independent of any single peer location and network topology • Called pipes, identified by a Pipe ID • Resolved through the use of the Pipe Binding Protocol • Asynchronous, unidirectional and unreliable dynamic communications • [Figure: source peer, pipe, destination peers]
Core communication layer: the pipe service • Illusion of a virtual endpoint independent of any single peer location and network topology • Called pipes, identified by a Pipe ID • Resolved through the use of the Pipe Binding Protocol • Asynchronous, unidirectional and unreliable dynamic communications • Secure communication available via TLS • Message elements: required JXTA headers (EndpointSourceAddress, EndpointDestinationAddress) and EndpointRouterMsg (XML document)
Performance improvements, why? • High latency of JXTA-J2SE in SANs • Enforced limited message size • Limited bandwidth • Long-standing poor performance and reliability issues in JXTA sockets • PadicoTM does not support the JVMs required by JXTA-J2SE • Improvements required in JXTA-C to fully exploit the possibilities of PadicoTM
Performance improvements, how? (1/2) • Reduced size for EndpointDestinationAddress • Only the name of the local listener • Removed EndpointSourceAddress • Duplicates information already in the welcome message • EndpointRouterMsg is unneeded when peers are directly connected • Contains the pipe ID • Rewritten code for many parts
Performance improvements, how? (2/2) • Rewritten code • Large patch under review by the Sun JXTA team • Tools used: callgrind & kcachegrind
Zero-copy architecture • Copy when accessing data of a JXTA-C message • Callback mechanism to ask services where to store data • Used inside JuxMem-C for data chunks
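The callback idea on this slide can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the JXTA-C API: `storage_cb`, `deliver_payload`, `struct chunk` and `chunk_storage` are hypothetical names, and the `memcpy` stands in for reading from the network directly into the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: the service returns the address where
 * `len` bytes of payload should be stored, or NULL to refuse them. */
typedef void *(*storage_cb)(size_t len, void *ctx);

/* Hypothetical receive path: ask the service for the destination first,
 * then place the payload there once, instead of filling a message-owned
 * buffer that the service would have to copy out of later. */
int deliver_payload(const void *wire, size_t len, storage_cb cb, void *ctx)
{
    assert(wire != NULL && cb != NULL);
    void *dst = cb(len, ctx);
    if (dst == NULL)
        return -1;              /* the service has no room for this payload */
    memcpy(dst, wire, len);     /* stand-in for a read from the transport into dst */
    return 0;
}

/* Example service state (assumption): one preallocated data chunk. */
struct chunk {
    unsigned char data[64];
    size_t used;
};

/* Example callback: hand out the chunk's own storage if the payload fits. */
void *chunk_storage(size_t len, void *ctx)
{
    struct chunk *c = ctx;
    if (len > sizeof c->data)
        return NULL;
    c->used = len;
    return c->data;
}
```

A caller would pass `chunk_storage` and a `struct chunk` to `deliver_payload`; the payload then ends up in the chunk without an intermediate message buffer.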
Fully exploiting networks available on grids • SAN capabilities: OS-bypass mode • Myrinet: 2 Gb/s and 7 μs • Quadrics: • Infiniband: • WAN enhancements • Parallel streams • On-the-fly compression • Solution: PadicoTM • High-performance framework for multithreading and networking • Virtual sockets
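To illustrate the "parallel streams" item, here is a small C sketch of how one buffer could be striped across several streams. It is only an assumption about one possible striping scheme, not PadicoTM's implementation; `struct slice` and `stripe_slice` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* A contiguous part of the buffer assigned to one stream. */
struct slice {
    size_t offset;
    size_t length;
};

/* Split `total` bytes into `nstreams` contiguous slices whose lengths
 * differ by at most one byte, and return the slice for stream `index`. */
struct slice stripe_slice(size_t total, size_t nstreams, size_t index)
{
    assert(nstreams > 0 && index < nstreams);
    size_t base = total / nstreams;
    size_t extra = total % nstreams;   /* the first `extra` slices get one more byte */
    struct slice s;
    s.length = base + (index < extra ? 1 : 0);
    s.offset = index * base + (index < extra ? index : extra);
    return s;
}
```

Each stream would then send `length` bytes starting at `offset`, and the receiver reassembles the slices by offset.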
JXTA-C on top of PadicoTM • Requires the use of the Marcel thread library • JXTA-C relies on the Apache Portable Runtime (APR) • “Predictable and consistent interface to underlying platform-specific implementations” • APR 1.2.x • Modifications inside APR to change pthread to marcel • Sed command • + patch for recursive locks • Getting a working PadicoTM is hard • Evaluation in progress. Included in PadicoTM 0.3.0beta3
New features in JXTA-C • JXTA-C 2.2 Palau • Initial rdv server support • New CM (SQLite) and XPath queries • JXTA-C 2.3 Bali • Improved rdv server support • Dynamic loading of services • Use of private and custom peergroups • Code freeze (15/2) for next release (Kenting) • Wrapper for .NET • Fix for the TCP latency issue of JXTA-C still under review
Latest news on JuxMem • New version of JuxMem 0.2 • Mainly JuxMem-C/C++ • Features • New API • juxmem_malloc, juxmem_mmap, juxmem_attach, juxmem_free, etc. • C++ wrapper • New memory allocation process • JuxMem managers based on JXTA-C • Use of resolver service • Improved performance • Communication layers • Consistency protocols
Large-scale deployment: ADAGE • Lessons learned from JDF • Improved description language • Deployment of JXTA-C and JXTA-J2SE based applications • Target application: JuxMem (C and J2SE) • Initial test on Grid’5000 • 1 cluster -> 1010 peers (10 cluster groups) on 50 nodes • 3 clusters -> 300 peers on 300 nodes • Evaluation of JXTA-C and JuxMem-C at a large scale
The JXTA plugin for ADAGE • Description of resources (G5k.xml) • Use of OARGrid, GridPrems • Description of application • Profile of peers • Overlay • Not specific to JuxMem • Control parameters • Number of peers • Where to put peers: on which physical cluster
Latest news on DIET/JuxMem • Use of the C++ wrapper • Modifications in DIET_client and SeDImpl • Test with dmat_manip while waiting for Grid-TLSE • Client side: idA = juxmem_attach(A, lenA); idB = juxmem_attach(B, lenB); diet_solve(multiply, idA, idB); juxmem_mmap(C, lenC, idC); juxmem_acquire_read(C); juxmem_release(C) • Server side: local_ptrA = juxmem_mmap(NULL, lenA, idA); juxmem_acquire_read(local_ptrA); local_ptrB = juxmem_mmap(NULL, lenB, idB); juxmem_acquire_read(local_ptrB); C = multiply(A, B); juxmem_release(local_ptrA & local_ptrB); idC = juxmem_attach(C, lenC) • Deployment with GoDIET • Status of the patch?
Conclusion • Improved performance for JXTA-C/JuxMem-C communication layers • JXTA-C/JuxMem-C on top of PadicoTM • JuxMem-C/C++ 0.2 • On-going work • Large-scale evaluation of JXTA/JuxMem • Evaluation of JuxMem in GridRPC model (DIET)