Department of Computer Science Courant Institute of Mathematical Sciences New York University. Filterfresh: Hot Replication of Java RMI Server Objects Arash Baratloo, P. Emerald Chung, Yennun Huang, Sampath Rangarajan, and Shalini Yajnik. Bell Laboratories Lucent Technologies.
Filterfresh Goals • Support highly available RMI services in the presence of failures • Handle crash failures • Mask failures transparently • Integrate easily into Java RMI
Roadmap • Goals • RMI Registry architecture & crash failures • RMI architecture & crash failures • Process group approach to fault tolerance • Highly available registry service • “Reverse lookup” for masking failures of stateless servers • Towards highly available servers • Conclusions
RMI in a nutshell • Step 1: a server object registers with the RMI registry running on the local host • Steps 2-3: Clients get server’s remote reference by performing a lookup operation at a known registry • Step 4: Given a remote reference, clients invoke server’s methods through RMI
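The four steps can be tried inside a single JVM. The following is a minimal sketch, not code from Filterfresh: the `Greeter` interface, its implementation, and the name `"greeter"` are hypothetical, and the loopback host name setting is an assumption made only so the example does not depend on host name resolution. It uses the standard `java.rmi` calls `LocateRegistry.createRegistry`, `UnicastRemoteObject.exportObject`, `Registry.rebind`, and `Registry.lookup`.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RmiNutshell {
    /** Hypothetical remote interface; every remote method declares RemoteException. */
    public interface Greeter extends Remote {
        String greet(String who) throws RemoteException;
    }

    /** Hypothetical server object. */
    static class GreeterImpl implements Greeter {
        public String greet(String who) {
            return "hello, " + who;
        }
    }

    /** Runs steps 1-4 in one JVM and returns the server's reply. */
    static String run() throws Exception {
        // Assumption: force loopback so the sketch works without host name resolution.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        // Port 0 asks for an anonymous port; a real registry usually listens on 1099.
        Registry registry = LocateRegistry.createRegistry(0);
        GreeterImpl server = new GreeterImpl();
        try {
            // Step 1: export the server object and register its stub under a name.
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(server, 0);
            registry.rebind("greeter", stub);
            // Steps 2-3: a client looks the name up and receives a remote reference.
            Greeter ref = (Greeter) registry.lookup("greeter");
            // Step 4: the client invokes a method through the remote reference.
            return ref.greet("client");
        } finally {
            // Unexport both objects so the RMI runtime threads can shut down.
            UnicastRemoteObject.unexportObject(server, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

In a real deployment the client runs in a different JVM and obtains the registry with `LocateRegistry.getRegistry(host, port)`, which is exactly where the a priori knowledge of the registry host comes in.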
Limitations of RMI Registry • Single point of failure • Clients need to know a priori which registry to contact • Does not allow multiple RMI servers to register under the same service name • Not suited for replicated highly-available RMI server objects
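The restriction on duplicate names is visible directly in the standard registry API: `Registry.bind` throws `AlreadyBoundException` when the name is taken. A minimal sketch, assuming a hypothetical marker class `Replica` that is bound but never invoked (it is not exported, so a remote lookup of it would not work; the sketch only exercises `bind` on the in-process registry):

```java
import java.rmi.AlreadyBoundException;
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class DuplicateBindDemo {
    /** Hypothetical marker object; it is never invoked, so it declares no methods. */
    static class Replica implements Remote { }

    /** Returns true if the second bind under the same name was rejected. */
    static boolean secondBindRejected() throws Exception {
        Registry registry = LocateRegistry.createRegistry(0); // anonymous port
        try {
            registry.bind("service", new Replica());
            try {
                // A second server trying to register under the same service name.
                registry.bind("service", new Replica());
                return false;
            } catch (AlreadyBoundException expected) {
                return true;
            }
        } finally {
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

`Registry.rebind` would not help a replicated service either, since it replaces the earlier binding rather than adding a second one.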
RMI Architecture • The programmer writes the client and server application code • The RMI compiler (rmic) generates the client stub and server skeleton • The RMI package implements the RRL and transport layers • Transparent masking of failures must occur below the stub/skeleton level
A unified solution • Fault tolerance based on the process group approach • Non-faulty processes form a logical group • Members interact using a set of group primitives • Group primitives are guaranteed to be reliable (all or nothing) • Group primitives are guaranteed to be ordered • Group members have a consistent view of the other group members • Applications built on process groups view events in a synchronous fashion: • The group view changes for all members as though the change were instantaneous • Events (e.g., send and receive of multicasts) occur in a logical order, within the same view • Members have the same view of the group
Fortunately • The process group approach • Is well studied • Has well-defined protocols • It has been used to build general-purpose fault-tolerant • Middleware systems, such as Horus/Ensemble, Transis, etc. • Services, such as FT directory and file servers • OO systems, such as ISIS+ORBIX, Electra, Orca • Java middleware systems, such as iBus • It seems a good candidate for FT RMI services
Basis for process groups • A GroupManager Class • 100% Pure Java • built on top of UDP/IP • Implements • Group creation • Join operation (with atomic state transfer) • Leave operation • Group multicast operation • Failure detection and recovery • All events are reliable and totally ordered
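To make the listed operations concrete, here is a minimal in-memory sketch of a group with join (including state transfer), leave, and totally ordered multicast. It is not the Filterfresh GroupManager, whose API the slides do not show: the names `Group`, `Member`, and `LoggingMember` are hypothetical, a single lock stands in for the UDP-based ordering protocol, and failure detection is not modelled (a detected failure would be handled like `leave`).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupSketch {
    /** Hypothetical callback interface implemented by group members. */
    interface Member {
        void deliver(String message);        // ordered multicast delivery
        void viewChanged(List<String> view); // new membership list
        List<String> getState();             // snapshot handed to a joiner
        void setState(List<String> state);   // snapshot installed by the joiner
    }

    /** In-memory stand-in for a process group; one lock serializes every event. */
    static class Group {
        private final Map<String, Member> members = new LinkedHashMap<>();

        synchronized void join(String id, Member joiner) {
            // State transfer happens under the lock, so no multicast can be
            // delivered between the snapshot and the membership change.
            if (!members.isEmpty()) {
                joiner.setState(members.values().iterator().next().getState());
            }
            members.put(id, joiner);
            announceView();
        }

        synchronized void leave(String id) {
            members.remove(id);
            announceView();
        }

        synchronized void multicast(String message) {
            // Every current member receives the message, in the same order.
            for (Member m : members.values()) {
                m.deliver(message);
            }
        }

        private void announceView() {
            List<String> view = new ArrayList<>(members.keySet());
            for (Member m : members.values()) {
                m.viewChanged(new ArrayList<>(view));
            }
        }
    }

    /** Hypothetical member that records delivered messages and the current view. */
    static class LoggingMember implements Member {
        final List<String> log = new ArrayList<>();
        List<String> view = new ArrayList<>();

        public void deliver(String message) { log.add(message); }
        public void viewChanged(List<String> newView) { view = newView; }
        public List<String> getState() { return new ArrayList<>(log); }
        public void setState(List<String> state) { log.clear(); log.addAll(state); }
    }
}
```

A member that joins after some multicasts ends up with the same log as the members that were present from the start, which is the property the registry and server replicas rely on.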
Performance of group multicast • Test setup: Pentium Pro 200, Linux 2.0.30, Fast Ethernet connected by a hub • JDK 1.1.1 • Open question: did thread and object serialization influence the performance?
FT Registry architecture • Embeds a GroupManager class to ensure reliable, ordered events • Reliable and ordered group operations ensure consistent state • Replicated registry service for high availability • Supports dynamic joins with state transfer • Detects and removes failed registry servers
Bind operation • Bind operations are sent to every replica • Reliable multicast ensures every replica receives the event • Ordered group operation ensures consistency even if a new replica joins
Lookup operation • Lookup operations are handled locally • Provides location transparency to clients • able to locate servers registered at unknown hosts • no need to have a priori knowledge of server’s host
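A minimal sketch of how replicated binds and local lookups could fit together. It is an illustration under stated assumptions, not the FT Registry implementation: `Replica` and `RegistryGroup` are hypothetical names, server locations are plain strings rather than remote references, and a lock stands in for the reliable, ordered multicast. Several locations may be bound under one name, as the FT Registry permits for replicated servers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FtRegistrySketch {
    /** One registry replica; each name maps to the locations bound under it. */
    static class Replica {
        private final Map<String, List<String>> bindings = new HashMap<>();

        void applyBind(String name, String location) {
            bindings.computeIfAbsent(name, k -> new ArrayList<>()).add(location);
        }

        /** Lookups read local state only; no group communication is involved. */
        List<String> lookup(String name) {
            List<String> found = bindings.getOrDefault(name, Collections.emptyList());
            return new ArrayList<>(found);
        }
    }

    /** Stand-in for the registry group: the lock imposes one order on binds and joins. */
    static class RegistryGroup {
        private final List<Replica> replicas = new ArrayList<>();

        /** A new replica copies the full binding table before it becomes visible. */
        synchronized Replica join() {
            Replica fresh = new Replica();
            if (!replicas.isEmpty()) {
                Replica donor = replicas.get(0);
                for (Map.Entry<String, List<String>> e : donor.bindings.entrySet()) {
                    for (String location : e.getValue()) {
                        fresh.applyBind(e.getKey(), location);
                    }
                }
            }
            replicas.add(fresh);
            return fresh;
        }

        /** A bind is applied at every replica, in the same order everywhere. */
        synchronized void bind(String name, String location) {
            for (Replica r : replicas) {
                r.applyBind(name, location);
            }
        }

        /** Called when a replica is detected as failed. */
        synchronized void remove(Replica failed) {
            replicas.remove(failed);
        }
    }
}
```

Because joins and binds are ordered against each other, a replica that joins between two binds sees the first through state transfer and the second through the multicast, so every replica answers a lookup identically.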
Performance of FT Registry • Test setup: Pentium Pro 200, Linux 2.0.30, Fast Ethernet connected by a hub • JDK 1.1.1
RMI & FT Registry • Allows multiple replicated servers to register under the same service name • Object references remain valid after the associated object has failed
In the event of server failure • The failure is detected below the stub level, and ...
Failure recovery for stateless servers • A “reverse” lookup returns the name associated with a given wire connection • The old connection is patched with a connection to a non-faulty server • The operation is re-attempted • Transparent to the client: illusion of a valid object reference
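The recovery sequence (reverse lookup, patch, retry) can be sketched with a dynamic proxy. This is a loud simplification: Filterfresh performs the patching below the stub level, on the wire connection, whereas the sketch below does it in an application-level `java.lang.reflect.Proxy`. The `Echo` service, the `Directory` class, and the `ServerCrashed` exception that simulates a crash are all hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class FailoverSketch {
    /** Hypothetical stateless service. */
    public interface Echo { String echo(String s); }

    /** Hypothetical exception used to simulate a crashed server. */
    static class ServerCrashed extends RuntimeException { }

    /** Hypothetical directory: name -> replicas, plus the reverse mapping. */
    static class Directory {
        private final Map<String, List<Object>> replicas = new HashMap<>();
        private final Map<Object, String> names = new IdentityHashMap<>();

        synchronized void bind(String name, Object replica) {
            replicas.computeIfAbsent(name, k -> new ArrayList<>()).add(replica);
            names.put(replica, name);
        }
        synchronized String reverseLookup(Object replica) { return names.get(replica); }
        synchronized void remove(Object replica) {
            String name = names.remove(replica);
            if (name != null) replicas.get(name).remove(replica);
        }
        synchronized Object lookup(String name) {
            List<Object> live = replicas.get(name);
            return (live == null || live.isEmpty()) ? null : live.get(0);
        }
    }

    /** Forwards calls to the current target and fails over when it crashes. */
    static class FailoverHandler implements InvocationHandler {
        private final Directory directory;
        private Object target;

        FailoverHandler(Directory directory, Object target) {
            this.directory = directory;
            this.target = target;
        }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            while (true) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    if (!(e.getCause() instanceof ServerCrashed)) throw e.getCause();
                    // Reverse lookup: which service name did the dead target serve?
                    String name = directory.reverseLookup(target);
                    directory.remove(target);
                    // Patch: switch to a non-faulty replica bound under that name.
                    Object next = directory.lookup(name);
                    if (next == null) throw e.getCause();
                    target = next;
                    // The loop re-attempts the call; safe because the server is stateless.
                }
            }
        }
    }

    /** Returns a reference that stays usable as long as one replica is alive. */
    static Echo connect(Directory directory, String name) {
        Object first = directory.lookup(name);
        if (first == null) throw new IllegalStateException("no server bound under " + name);
        return (Echo) Proxy.newProxyInstance(Echo.class.getClassLoader(),
                new Class<?>[] { Echo.class }, new FailoverHandler(directory, first));
    }
}
```

Retrying blindly is acceptable here only because the servers are stateless; with stateful servers a retried operation could be applied twice, which is why replicated servers need ordered group operations instead.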
FT server architecture • The client has the illusion of a single server • In reality, a group of servers processes client requests • Operations are performed at each server, in the same order for consistency • Replicated servers for high availability
Conclusions and future work • To provide high availability there is a need for • A reliable registry service • A reliable RMI architecture • Showed the suitability of the process group approach by • Transparently masking failures • Easily integrating our services into Java RMI • Future work • Complete work on general-purpose FT services • Address nested RMI calls for replicated servers