The WebLogic Messaging Kernel: Design Challenges


Presentation Transcript


  1. The WebLogic Messaging Kernel: Design Challenges
     Greg Brail, Senior Staff Software Engineer, BEA Systems

  2. Background
     • JMS has been part of WebLogic Server since version 5.1
     • New requirements came up for WLS 9.0
       • Store-and-forward messaging
       • Asynchronous web services messaging (WS-RM)
       • Smaller-footprint messaging
       • Improved performance
       • Clean up our growing code base

  3. Messaging Kernel
     • The Messaging Kernel provides in-process messaging services divorced from any particular API or protocol
     • Used in WebLogic Server 9.0
       • JMS 1.1
       • WebServices-Reliable Messaging (WS-RM)
     • Kernel itself is invisible to end users
     • Implementation
       • Design work started in mid-2003
       • Implementation in 2004
       • Shipped with WebLogic Server 9.0 in August 2005
     (Credit for the kernel idea goes to “Zach” Zachwieja, now at Microsoft)

  4. Messaging Kernel Features
     • Message Queuing
       • Persistent and non-persistent queuing
       • Transactional queuing, with two-phase commit support
       • Pluggable message sort order
       • Message redirection for expiration and error handling
       • Message ordering and sequencing features
     • Publish/Subscribe
       • Topic subscribers are queues
       • Pluggable subscription filter
     • Core Features
       • Thresholds, quotas, statistics, pause/resume for manageability
       • Paging to control memory usage

  5. Three Design Challenges
     • #1: Persistence Mechanism
     • #2: Message Ordering for User Applications
     • #3: Cluster-Wide Message Routing

  6. Challenge #1: Persistence Mechanism
     • What are some ways we could handle message persistence?
     • RDBMS approach:
       • INSERT each new message as a row in the table
       • SELECT every time we want to get a message and flag it somehow
       • DELETE each message when it has been acknowledged
     • Log approach:
       • Use a linear transaction log
       • Append records for “send” and “acknowledge” and recover by reading the log
       • Truncate the log when it gets too big and move messages elsewhere
     • Classical approach:
       • Create persistent data structures on disk blocks just like a database
       • Use ARIES or some other well-documented algorithm
       • Traverse the persistent data structures to get and update each message
     • Heap approach:
       • Write each message to disk in a convenient location
       • Remember that location and mark messages deleted when acknowledged
       • Recover by reading back the whole file

  7. Persistence Pros and Cons
     • RDBMS
       • Simple to implement, at least at first
       • But running a SELECT to find each new message is not great
       • WebLogic JMS 5.1 did this and it was very slow
       • Low memory usage, but also low throughput and high latency
     • Log
       • Transaction logs are well-understood and relatively simple to implement
       • But once the log is full, performance drops dramatically
       • Good memory usage, throughput, and latency
       • Recovery time depends on the (configurable) size of the log
     • Classical
       • Well-understood, although not simple to implement
       • Memory usage and recovery time depend on size of the cache and log
       • Throughput is lower due to overhead of persistence algorithms
         • At least for us, in Java!

  8. Heap-Based Persistence
     • The Heap method works well for us
       • More memory than Classical or RDBMS approaches
       • Potentially longer recovery time than Log or Classical approaches
       • But best throughput
       • Allows for both file- and RDBMS-based implementations
     • File-based heap used in WLS 6.1 through 8.1
       • Not very sophisticated; relied too much on the filesystem
     • Latest version is a lot smarter
       • Make sure we do no more than one I/O per commit
       • Platform-specific code to access “direct I/O” feature in most O/Ss
       • New records are located on disk to reduce rotational latency
       • Runs faster than a traditional transaction log on a single disk
     Reference: “A High-Performance, Transactional File Store for Application Servers”, Gallagher, Jacobs, and Langen, SIGMOD 2005
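
The heap approach described above (write each message somewhere convenient, remember the location, mark it deleted on acknowledge, recover by reading the whole file) can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not the WebLogic file store: the class name `HeapStoreSketch`, the record layout (one flag byte, a length, the body), and the append-at-end placement are all hypothetical, and the sketch does not force writes to disk, batch I/O per commit, or reuse freed space.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a heap-style message store; not the WebLogic implementation.
public class HeapStoreSketch {
    private static final byte LIVE = 1;
    private static final byte DELETED = 0;

    private final RandomAccessFile file;
    // In-memory index of live messages: record offset -> message body.
    private final Map<Long, String> live = new LinkedHashMap<>();

    public HeapStoreSketch(File path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        recover();
    }

    // Recovery reads back the whole file and keeps records not marked deleted.
    private void recover() throws IOException {
        long pos = 0;
        long end = file.length();
        file.seek(0);
        while (pos < end) {
            byte flag = file.readByte();
            byte[] body = new byte[file.readInt()];
            file.readFully(body);
            if (flag == LIVE) {
                live.put(pos, new String(body, StandardCharsets.UTF_8));
            }
            pos = file.getFilePointer();
        }
    }

    // Write the message at a convenient location (here: end of file) and remember it.
    public long send(String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        long offset = file.length();
        file.seek(offset);
        file.writeByte(LIVE);
        file.writeInt(bytes.length);
        file.write(bytes);
        live.put(offset, body);
        return offset;
    }

    // Acknowledge: overwrite the record's flag byte in place to mark it deleted.
    public void acknowledge(long offset) throws IOException {
        if (live.remove(offset) != null) {
            file.seek(offset);
            file.writeByte(DELETED);
        }
    }

    public List<String> liveMessages() {
        return new ArrayList<>(live.values());
    }

    public void close() throws IOException {
        file.close();
    }

    // Sends two messages, acknowledges the first, then reopens the file to recover.
    public static List<String> demo() {
        try {
            File path = File.createTempFile("heap-store", ".dat");
            path.deleteOnExit();
            HeapStoreSketch store = new HeapStoreSketch(path);
            long first = store.send("first");
            store.send("second");
            store.acknowledge(first);
            store.close();
            HeapStoreSketch reopened = new HeapStoreSketch(path);
            List<String> recovered = reopened.liveMessages();
            reopened.close();
            return recovered;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [second]
    }
}
```

The in-memory index is why this approach uses more memory than the RDBMS or classical approaches, and the full-file scan in `recover()` is why recovery time can be longer.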

  9. File-Based Persistence Performance
     • Performance test results:
       • One JMS queue, with persistent 100-byte messages
       • Messages sent and received simultaneously
       • JMS clients on separate machines from the server
       • Result based on receive throughput
     (Hardware: Dual-CPU 3.4 GHz Intel, 4GB RAM, 15,000 RPM SCSI disk, Windows Server 2003. One such machine used for WLS server, two for clients.)

  10. Challenge #2: Message Ordering
     • Problem:
       • Applications require certain sets of messages to be processed in order
       • Queuing systems usually give you two choices:
         • One thread and one queue for all processing (poor throughput)
         • Or, lots of threads and lots of queues (poor manageability)
     • Solution: the “Unit of Order” feature
       • Controls concurrency of message delivery based on application requirements
       • Messages are tagged with a “unit of order” (UOO) name
       • Only one message for each UOO is delivered at a time
         • Next message not available until previous message has been processed
         • Processing ends with transaction resolution
     • Result: For each UOO, messages are processed in the order they arrived on the queue
       • Better throughput
       • Less lock contention in the database

  11. Unit of Order Example
     • When first blue message is dequeued, blue UOO has an “owner”
     • Next consumer skips blue messages and gets the green message
     • When blue message is acknowledged, next blue message available for consumption
     • Throughput is excellent when messages are well-interleaved (like above)
     • In theory, throughput drops when a consumer must skip many messages because they are not well-interleaved (like below)
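
The blue/green example can be traced with a small in-memory queue. The sketch below is a hypothetical illustration of the unit-of-order rule (at most one in-flight message per UOO name), not the kernel's code: `UnitOfOrderQueue`, `Message`, and the method names are invented here, and acknowledgement stands in for transaction resolution.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of unit-of-order delivery; names are illustrative only.
public class UnitOfOrderQueue {
    public static final class Message {
        public final String unitOfOrder; // null means no ordering constraint
        public final String body;

        Message(String unitOfOrder, String body) {
            this.unitOfOrder = unitOfOrder;
            this.body = body;
        }
    }

    private final List<Message> pending = new ArrayList<>();
    // UOO names that currently have an "owner" (a delivered, unacknowledged message).
    private final Set<String> owned = new HashSet<>();

    public synchronized void enqueue(String unitOfOrder, String body) {
        pending.add(new Message(unitOfOrder, body));
    }

    // Returns the first message in arrival order whose UOO has no owner, or null.
    public synchronized Message dequeue() {
        Iterator<Message> it = pending.iterator();
        while (it.hasNext()) {
            Message m = it.next();
            // Set.add returns true only if the name was not already owned.
            if (m.unitOfOrder == null || owned.add(m.unitOfOrder)) {
                it.remove();
                return m;
            }
        }
        return null;
    }

    // Processing finished: release the UOO so its next message becomes available.
    public synchronized void acknowledge(Message m) {
        if (m.unitOfOrder != null) {
            owned.remove(m.unitOfOrder);
        }
    }

    public static void main(String[] args) {
        UnitOfOrderQueue queue = new UnitOfOrderQueue();
        queue.enqueue("blue", "blue-1");
        queue.enqueue("blue", "blue-2");
        queue.enqueue("green", "green-1");

        Message first = queue.dequeue();  // blue-1; blue now has an owner
        Message second = queue.dequeue(); // green-1; blue-2 is skipped
        queue.acknowledge(first);         // blue is released
        Message third = queue.dequeue();  // blue-2
        System.out.println(first.body + " " + second.body + " " + third.body);
    }
}
```

The linear scan in `dequeue()` also shows where the cost comes from when messages are poorly interleaved: a consumer may have to step over a long run of messages from owned units before finding one it can take.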

  12. Unit of Order Performance
     • Test receives non-persistent messages from a queue and sleeps 5 milliseconds per message to simulate actual processing.
     (*With zero units of order, messages are processed out of order. So, this number is just on the chart as a baseline.)

  13. Challenge #3: Cluster Routing
     • UOO is great for a single queue
     • What if new messages are load balanced across many queues?
       • Each UOO must see that all messages go to the same queue
       • And other problem domains have similar requirements
     • We implemented two solutions:
       • Hashing: Hash on the unit of order name to determine where in the cluster each message should go
         • Hashing is based on number of servers configured in the cluster (C)
       • “Path Service”: A persistent cluster-wide database
         • Maps keys (UOO names) to values (cluster members)
         • One master database for the whole cluster
         • Caches on all other cluster nodes
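
The hashing solution amounts to mapping a UOO name onto one of C configured servers. A minimal sketch, assuming Java's built-in `String.hashCode()` as the hash function and a zero-based server index (the slides do not say which hash WebLogic uses, and `UooHashRouter` is a hypothetical name):

```java
// Hypothetical sketch of hash-based unit-of-order routing.
public class UooHashRouter {
    // Maps a UOO name to a server index in [0, configuredServers).
    // floorMod keeps the result non-negative even when hashCode() is negative.
    public static int route(String uooName, int configuredServers) {
        return Math.floorMod(uooName.hashCode(), configuredServers);
    }

    // Counts how many of the names "order-0".."order-(n-1)" change server
    // when the configured cluster size changes from oldC to newC.
    public static int countMoved(int names, int oldC, int newC) {
        int moved = 0;
        for (int i = 0; i < names; i++) {
            String name = "order-" + i;
            if (route(name, oldC) != route(name, newC)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println("order-42 with C=4 goes to server " + route("order-42", 4));
        System.out.println("names remapped when C goes 4 -> 5: "
                + countMoved(100, 4, 5) + " of 100");
    }
}
```

Because the result depends only on the name and C, every sender picks the same server without any coordination. The `countMoved` helper illustrates the cost of that simplicity: with plain modulo hashing, changing C reassigns many names to different servers.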

  14. Cluster Routing Issues
     • Both approaches have flaws
     • Hashing is fast and scales well
       • But if any one server is down, 1 / C units of order cannot be handled
         • C is the number of configured servers, not running servers
         • Queuing elsewhere decreases throughput and adds race conditions
       • If C changes, messages may fall out of order
         • This makes it difficult to grow and shrink the cluster based on load
     • Path Service is much more flexible
       • One server in the cluster is the master and handles all updates
       • So if it is down, new paths cannot be created
     • Future: We would like to do better
       • Use Paxos to elect a master, with replicated persistent state?
       • Generate UOO names on the server rather than on the client?
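
The Path Service trade-off can be made concrete with a toy model: one master map that owns all updates, plus a local cache on a non-master node. Everything here is an assumption made for illustration (`PathServiceSketch`, `resolve`, the `masterUp` flag, and the single-process layout); the real service is persistent and distributed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical single-process model of a path service with one master and one cache.
public class PathServiceSketch {
    // Master database: UOO name -> cluster member. All updates go through it.
    private final Map<String, String> master = new ConcurrentHashMap<>();
    // Cache held by a non-master node; existing paths can be served from here.
    private final Map<String, String> cache = new HashMap<>();
    private volatile boolean masterUp = true;

    public void setMasterUp(boolean up) {
        masterUp = up;
    }

    // Returns the member that owns this UOO, creating the path on first use.
    // candidateMember is used only if no path exists yet.
    public synchronized String resolve(String uooName, String candidateMember) {
        String cached = cache.get(uooName);
        if (cached != null) {
            return cached; // known paths keep working without the master
        }
        if (!masterUp) {
            throw new IllegalStateException("master unavailable: cannot create a new path");
        }
        // First writer wins; later callers see the member already recorded.
        String member = master.computeIfAbsent(uooName, key -> candidateMember);
        cache.put(uooName, member);
        return member;
    }

    public static void main(String[] args) {
        PathServiceSketch paths = new PathServiceSketch();
        System.out.println(paths.resolve("blue", "server-1")); // path created on server-1
        System.out.println(paths.resolve("blue", "server-2")); // still server-1
        paths.setMasterUp(false);
        System.out.println(paths.resolve("blue", "server-3")); // served from cache
        try {
            paths.resolve("green", "server-2");
        } catch (IllegalStateException e) {
            System.out.println("green: " + e.getMessage());
        }
    }
}
```

Unlike hashing, the mapping here does not depend on C, so the cluster can grow or shrink without moving existing units of order; the price is the single master on the path-creation route.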

  15. Conclusion
     • With the messaging kernel, we have implemented some old solutions, and found a few new ones
     • We think “unit of order” is quite useful and bears future research
     • We have quite a few more problems to solve
       • Continuously-available cluster routing
       • Continuously-available messaging
       • Performance, performance, performance
     • The messaging world needs to pay more attention to the research world to help solve these kinds of problems
     • The research world might have more to study in the messaging world too!
