
Cougaar Design Case Study: Mandelbrot GUI Application

Todd Wright, Feb 5th, 2007


  1. Cougaar Design Case Study: Mandelbrot GUI Application • Todd Wright • Feb 5th, 2007

  2. Overview • These slides present ten alternate designs for an example Cougaar application, a “Mandelbrot” fractal GUI • Each design explores various tradeoffs: • Design complexity • Modularity and how the modules (plugins/agents) interact with one another • Parallel / Distributed processing support • As we go, we’ll summarize what we’ve learned and outline basic design patterns

  3. Mandelbrot Application • Basic idea: • The user submits an image calculation request • Cougaar application code (plugins & agents) computes the image data • The image is displayed to the user in a GUI • The example image is a “Mandelbrot” fractal • Given an (x, y) range and image size, e.g. • Range: (-1.5, -1.0) to (1.5, 1.0) • Image: 1024 x 768 • Compute the image using the “Mandelbrot” algorithm • Simple math • Entirely compute-bound (possibly network-bound if we make it distributed) • [Diagram labels: Nodes, Agents, Plugins, GUI, I/O]
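As a point of reference for the "simple math" above, here is a minimal Java sketch of the standard escape-time Mandelbrot computation. The class name, the iteration cap of 255, and the one-byte-per-pixel output are assumptions made for illustration; the slides do not show the actual calculator code.

```java
/** Minimal sketch of the per-pixel Mandelbrot "escape time" computation. */
final class MandelbrotSketch {
    /** Iteration cap; an assumed value, not taken from the slides. */
    static final int MAX_ITERATIONS = 255;

    /** Counts iterations of z = z^2 + c, with c = (cx, cy), until |z| exceeds 2 or the cap is hit. */
    static int iterations(double cx, double cy) {
        double zx = 0.0;
        double zy = 0.0;
        int i = 0;
        while (i < MAX_ITERATIONS && zx * zx + zy * zy <= 4.0) {
            double nextZx = zx * zx - zy * zy + cx;
            zy = 2.0 * zx * zy + cy;
            zx = nextZx;
            i++;
        }
        return i;
    }

    /** Computes one byte per pixel, row-major, for the given (x, y) range and image size. */
    static byte[] compute(double minX, double minY, double maxX, double maxY,
                          int width, int height) {
        byte[] data = new byte[width * height];
        for (int row = 0; row < height; row++) {
            double cy = minY + (maxY - minY) * row / height;
            for (int col = 0; col < width; col++) {
                double cx = minX + (maxX - minX) * col / width;
                data[row * width + col] = (byte) iterations(cx, cy);
            }
        }
        return data;
    }
}
```

For the example on this slide, the call would be `MandelbrotSketch.compute(-1.5, -1.0, 1.5, 1.0, 1024, 768)`. Every pixel is independent of every other pixel, which is what later designs exploit for parallelism.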

  4. Design Comparison Matrix • For all the following designs, we’ll rank each design based on the following scales of 1-to-10, with 10 being ideal: • Simplicity • How easy is the code to understand? • Modularity • Can we easily replace parts of our solution with alternative implementations? • Scalability • Can we distribute our solution across multiple hosts? • Inter-job Parallelism • Can separate jobs run in parallel? • Intra-job Parallelism • Can a single job be subdivided and run in parallel? • Adaptability • Can we customize the behavior, e.g. using policies or runtime metrics? • This will allow us to better see the tradeoffs between the designs.

  5. Design 1: Just a Servlet • Design: Do everything in a self-contained Servlet: • Listens for browser HTTP requests • Computes image data in the servlet “doGet” thread • Writes the image result as a JPG • Characteristics: • Easy to implement and configure • Few Cougaar dependencies (no need for a blackboard or other plugins) • No synchronization or threading issues (runs in the Servlet request thread) • [Diagram: HTTP request into a Servlet inside a Node] • Servlet pseudocode: public void doGet(..) { read params; compute image data; write image as JPG }

  6. Analysis: Design 1

  7. Design 2: Servlet UI + Calculator Service • Design: Move the “compute()” code out of the Servlet and into a separate Component • Primarily a refactor of the prior design • Use a service to advertise the “compute()” method • This is a typical solution for wrapping library code • Characteristics: • Still fairly easy to implement and configure • Improved modularity: • Can replace UI code while keeping calculator code (e.g. make popup Swing UI) • Can replace calculator code while keeping UI code (e.g. compute different fractal design) • No threading or synchronization issues (runs in the Servlet request thread) • [Diagram: HTTP request into a Servlet that calls a Calculator component, both inside one Node] • Servlet pseudocode: public void load() { calc = getService(Calc..); } public void doGet(..) { read params; calc.compute(..); write image as JPG } • Calculator pseudocode: public void load() { advertise(Calc, this); } public byte[] compute(..) { compute image data }
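The advertise/obtain interaction in Design 2 can be sketched in plain Java as follows. `ToyServiceRegistry` is a hypothetical stand-in written for this example and is not the Cougaar service API; `CalculatorService`, `ToyUi`, and the `compute(width, height)` signature are likewise assumed names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service API for the calculator; the name and signature are illustrative. */
interface CalculatorService {
    byte[] compute(int width, int height);
}

/** Toy stand-in for a service broker; NOT the Cougaar API. */
final class ToyServiceRegistry {
    private final Map<Class<?>, Object> services = new HashMap<>();

    /** A component advertises an implementation under its service interface. */
    <T> void advertise(Class<T> type, T implementation) {
        services.put(type, implementation);
    }

    /** Another component looks the implementation up by interface; null if none was advertised. */
    <T> T getService(Class<T> type) {
        return type.cast(services.get(type));
    }
}

/** UI-side component: obtains the service once at load time, then calls it per request. */
final class ToyUi {
    private CalculatorService calc;

    void load(ToyServiceRegistry registry) {
        calc = registry.getService(CalculatorService.class);
    }

    /** Runs in the caller's thread and blocks until compute(..) returns. */
    int handleRequest(int width, int height) {
        return calc.compute(width, height).length;
    }
}
```

Because the UI depends only on the interface, a different fractal calculator can be advertised without touching the UI code, which is the modularity gain this slide describes.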

  8. Analysis: Design 2

  9. Design Point: Inlined Code vs. Services • Key design points: • Design 1: • Summary: The plugin directly calls the inlined / library code • Benefit: Easy to implement, self-contained • Downside: Difficult to switch between alternate library implementations, awkward to share non-static library instances between plugins • Design 2: • Summary: One plugin advertises a service, other plugin(s) obtain and use it • Benefit: Supports shared, pluggable services, cleans up the code • Downside: Must refactor / wrap library code into service API(s), plus add new plugins to advertise these services • Example of interest: • Plugin “A” advertises a “WindowManagerService” and pops up an empty Swing Panel • Subsequent plugins obtain this service and add their Swing “JComponent” panels to the service by calling an “add(..)” method (instead of popping up their own windows) • The “window manager” plugin decides where to place the sub-frames

  10. Design 3: Servlet UI & Blackboard-based Calculator Plugin • Design: Instead of a service, publish the request on the blackboard • Use non-blocking blackboard operations (pub/sub) instead of a blocking method call • Characteristics: • The “calculate()” method runs in a separate plugin thread • We’re using the blackboard as both a communication and thread-switch layer • We no longer have a simple, blocking “calculate()” service API • We now have a blackboard representation of the Job • Defines our data-oriented “API” between our plugins • Other plugins can observe this interaction (e.g. for debugging, management, etc.) • [Diagram: HTTP request into a Servlet; the Servlet and Calculator plugins share a Job on the Blackboard of one Agent inside a Node] • Servlet pseudocode: public void doGet(..) { Job job = new Job(params); publishAdd(job); job.waitForCompletion(); write image as JPG } • Calculator pseudocode: public void setupSubs() { subscribe to Jobs } public void execute() { for all added Jobs { compute job; notify of completion } }
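The "job.waitForCompletion()" call in the servlet pseudocode implies a Job object that a blocking thread can wait on and a calculator thread can complete. One way to sketch that in Java uses `java.util.concurrent.CountDownLatch`; the `LatchedJob` class, its method names, and the timeout parameter are assumptions for illustration, not the original Job class.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical Job shared between the blocking request thread and the calculator thread. */
final class LatchedJob {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile byte[] result;

    /** Called by the calculator when the image data is ready ("notify of completion"). */
    void complete(byte[] imageData) {
        result = imageData;
        done.countDown();
    }

    /**
     * Called by the blocking request thread.
     * Returns the image data, or null on timeout or interruption.
     */
    byte[] waitForCompletion(long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS) ? result : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }
}
```

The timeout is an addition of this sketch: a request thread that waits without bound would hang if the calculator never completes the job.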

  11. Analysis: Design 3

  12. Design Point: Services vs. Blackboards • Key design points: • Design 2: • Summary: Plugins interact through blocking service method calls • Benefit: Easy blocking method APIs • Downside: Method calls run in caller’s thread and are blocking. Use of callbacks to support non-blocking APIs requires awkward thread switching. • Design 3: • Summary: Plugins interact through asynchronous blackboard pub/sub operations • Benefit: Non-blocking and parallelized, plugin “execute()” methods are single-threaded, “Job” state is visible on the blackboard • Downside: Must reorganize code to fit the pub/sub “execute()” pattern. This can introduce “bookkeeping” state, where a service-based design would keep this state “for free” on the method-call stack. • The prior example shows an awkward mixed design: • Servlet “doGet()” callbacks are blocking and must complete in that thread • The blackboard is an asynchronous pub/sub interaction • Hence the odd “job.waitForCompletion()” solution.

  13. Design 4: Non-Blocking UI • Design: Replace Servlet UI with Plugin “Screensaver” UI • The servlet case is odd, in that the “doGet(..)” request method is a blocking, external Thread call • As a point of comparison, create a Plugin-based UI client that uses a non-blocking Cougaar thread and standard blackboard pub/sub operations • Characteristics: • The UI plugin listens for a subscription change instead of a lock “notify()” • This is a more standard Cougaar interaction pattern • This approach isn’t applicable for our Servlet-based UI (but might fit a Swing UI) • [Diagram: Requestor and Calculator plugins share a Job on the Blackboard of one Agent inside a Node; the Requestor writes /tmp/out.jpg] • Requestor pseudocode: public void setupSubs() { subscribe to Jobs; publishAdd(new Job) } public void execute(..) { for all changed Jobs { write image as JPG } } • Calculator pseudocode: public void setupSubs() { subscribe to Jobs } public void execute() { for all added Jobs { compute image data; publishChange(job) } }

  14. Analysis: Design 4

  15. Design Point: Mixed Services/BB vs. all BB • Key design points: • Design 3: • Summary: Servlet “doGet()” callback uses awkward lock wait/notify to detect blackboard work completion instead of an asynchronous subscription • Benefit: Required due to limitations of blocking Servlet callback API • Downside: Awkward mixed-metaphor of wait/notify + subscription changes • Design 4: • Summary: All interaction is through blackboard pub/sub operations • Benefit: Easy integration via subscriptions, completely asynchronous • Downside: Not applicable in the Servlet case. • Most applications fit entirely into the blackboard-friendly pub/sub pattern • The design often gets awkward when plugins must interact with both blocking/callback services plus blackboard pub/sub operations • Typically results in awkward “todo” lists to switch threads • Ideally this can be avoided

  16. Design 5: Separate Job/Result Objects • Design: Instead of changing the Job, publish a separate Result object • This makes it clear that the result is a separate data structure • We’ll assume that we’re using the non-servlet “Requestor”, as in design 4 • Characteristics: • The subscriptions now look for different data structures (notice that arrows are “one-way”) • The Result object should have a pointer to the Job, or have a shared “unique job identifier” • [Diagram: Requestor and Calculator plugins in one Agent inside a Node; a Job flows one way from Requestor to Calculator and a Result flows one way back over the Blackboard; the Requestor writes /tmp/out.jpg] • Requestor pseudocode: public void setupSubs() { subscribe to Results; publishAdd(new Job) } public void execute(..) { for all added Results { write image as JPG } } • Calculator pseudocode: public void setupSubs() { subscribe to Jobs } public void execute() { for all added Jobs { compute image data; publishAdd(new Result) } }
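The one-way Job and Result flow can be sketched with a toy publish/subscribe board. Everything here is a simplification for illustration: `ToyBlackboard` delivers additions synchronously in the publisher's thread, whereas the slides describe the Cougaar blackboard as asynchronous, and all class names are hypothetical. The `jobId` field plays the role of the shared "unique job identifier".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical request object. */
final class ToyJob {
    final long id;
    ToyJob(long id) { this.id = id; }
}

/** Hypothetical result object; jobId links it back to its Job. */
final class ToyResult {
    final long jobId;
    final byte[] data;
    ToyResult(long jobId, byte[] data) { this.jobId = jobId; this.data = data; }
}

/** Toy blackboard: type-filtered subscriptions, synchronous delivery. NOT the Cougaar API. */
final class ToyBlackboard {
    private final List<Consumer<Object>> subscribers = new ArrayList<>();

    /** Registers a callback for every later-added object of the given type. */
    <T> void subscribe(Class<T> type, Consumer<T> onAdd) {
        subscribers.add(o -> {
            if (type.isInstance(o)) {
                onAdd.accept(type.cast(o));
            }
        });
    }

    /** Delivers the object to all matching subscribers. */
    void publishAdd(Object o) {
        // Iterate over a copy so callbacks may subscribe or publish safely.
        for (Consumer<Object> s : new ArrayList<>(subscribers)) {
            s.accept(o);
        }
    }
}
```

A calculator would subscribe to `ToyJob` and publish a `ToyResult` with the same id, while the requestor subscribes only to `ToyResult`. Neither side modifies the other's object, which avoids the multiple-writer issue of Design 4.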

  17. Analysis: Design 5

  18. Design Point: Separate Results Object • Key design points: • Design 4: • Summary: Job has field for results data • Benefit: Fewer blackboard objects • Downside: Multiple writers to the same object, to fill in result slot • Design 5: • Summary: Calculator publishes a separate Results object • Benefit: Finer-grain subscriptions, “publishAdd” driven • Downside: More blackboard objects • This is more a matter of style

  19. Design 6: Remote Processing • Design: Transfer the job to a remote agent • Wrap the job in a relay • We’ll assume that the “master” knows a priori about the single “slave” • Characteristics: • Can run the slave on a remote host (supports remote processing) • Adds layer of Relay “wrapping” and processing code to do our data transfer • Must transfer both the Job and its result-data (two-way comms instead of shared memory) • [Diagram: Node 1 hosts the Master Agent (Servlet, reached via http) and Node 2 hosts the Slave Agent (Calculator); the Job is wrapped in a Relay on the master’s Blackboard, and a Relay copy carrying the Job is delivered to the slave’s Blackboard via Messages]

  20. Analysis: Design 6

  21. Design Point: Centralized vs. Distributed • Key design points: • Design 5: • Summary: Single agent with shared blackboard • Benefit: Plugins can assume that everything is on their local blackboard • Downside: Limited to single host • Design 6: • Summary: Wrap job in Relay, transfer to remote agent for processing • Benefit: Distributed, partitions work and memory across hosts • Downside: Clutters plugin code with Relay “wrapping” and “addressing”. No longer a shared memory, so Relays must transfer data back & forth. • Relays (or similar mechanism) are used to transfer data between blackboards • Required because agents don’t support shared-memory blackboards • Anytime you make something distributed you run into well-known distributed processing limitations (latency, robustness, etc.) • The next design separates the Relay wrapping/addressing from the non-transfer-related plugin work

  22. Design 7: Remote Processing with Dispatcher • Design: Introduce concept of “Dispatcher” Plugin • Separates Servlet/Calculator code from remote transfer code • Still use relays to transfer jobs (an equivalent option is to use task/allocation) • Characteristics: • Can implement different kinds of dispatch policies as pluggable “Dispatcher”s • Adds job management control in the Dispatcher code • One more layer of thread switching & indirection (but that’s often a good thing) • [Diagram: Node 1 hosts the Master Agent (Servlet, reached via http, and Dispatcher) and Node 2 hosts the Slave Agent (Receiver and Calculator); the Job is wrapped in a Relay on the master’s Blackboard, and a Relay copy carrying the Job is delivered to the slave’s Blackboard via Messages]

  23. Analysis: Design 7

  24. Design Point: Use of “Dispatcher” Plugins • Key design points: • Design 6: • Summary: Domain plugins do Relay wrapping and addressing • Benefit: Fewer plugins • Downside: Clutters domain code, difficult to enhance • Design 7: • Summary: Introduce “Dispatcher / Receiver” plugins to handle Relay details • Benefit: Cleans up design, supports pluggable dispatch options • Downside: Adds more indirection, more objects on blackboard. • The “Dispatcher” design is often a good idea, except in trivial cases where the added flexibility would be overkill.

  25. Design 8: Load-balancing • Design: Support multiple worker agents • Dispatcher can choose between slaves • A job can be sent to any slave • This allows us to balance work between our slaves • Allow multiple, dynamic slaves • Slaves “register” with the master agent via a Relay • Slave “pulls” down job, replies with results, and pulls next job • Add concept of separate relays for master-to-slave vs. slave-to-master comms • Slave sends registration & results via its relay • Master sends new jobs via its relay • Creates more of a unified “comms channel”, for better error processing • Characteristics: • Can balance jobs between slaves (if we have more jobs than slaves) • Ideally one agent per CPU, distributed across hosts according to per-host CPU count • If we only have one job then this doesn’t help, since (in this design) we can’t reduce jobs into smaller tasks • Simple configuration via slave “register”, instead of hard-coding slave names in the master • More adaptive – we can dynamically support added/removed slaves • Illustration on next slide.

  26. Design 8: Load balancing (2) • [Diagram: Node 0 hosts the Master Agent (Servlet, reached via http, and Dispatcher); its Blackboard holds Job A and Job B plus “to:” and “from:” relays for Slave1 and Slave2. Node 1 and Node 2 host the Slave1 and Slave2 Agents (Calculator and Receiver each); each slave Blackboard holds “to: Master” and “from: Master” relays and one of the jobs (Job A on Slave1, Job B on Slave2)]
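The "pull" style of load balancing, where each slave takes the next job as soon as it is free, can be sketched inside a single JVM with a shared queue. Threads stand in for remote slave agents here, and the class and worker names are hypothetical; the Relay-based registration and transfer described on the slides are not modeled.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch of pull-based dispatch: idle workers pull the next job from a shared backlog. */
final class PullDispatcherSketch {

    /** Runs jobCount jobs on workerCount workers; returns which worker completed each job. */
    static Map<Integer, String> run(int jobCount, int workerCount) {
        BlockingQueue<Integer> backlog = new LinkedBlockingQueue<>();
        for (int j = 0; j < jobCount; j++) {
            backlog.add(j);
        }
        Map<Integer, String> completedBy = new ConcurrentHashMap<>();
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            final String name = "Slave" + (w + 1);
            workers.execute(() -> {
                Integer job;
                // Pull until the backlog is empty; a real worker would compute the image here.
                while ((job = backlog.poll()) != null) {
                    completedBy.put(job, name);
                }
            });
        }
        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completedBy;
    }
}
```

Which worker ends up with which job varies from run to run, but every job is completed exactly once. As the slide notes, with a single job in the backlog only one worker does any work.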

  27. Analysis: Design 8

  28. Design Point: Single vs. Load-balanced • Key design points: • Design 7: • Summary: Work is offloaded to a single remote worker • Benefit: Offloads work, relatively simple design • Downside: Only computes one job at a time, only supports a single worker • Design 8: • Summary: Work is dispatched to one of many workers • Benefit: Load-balances work, supports an arbitrary number of workers • Downside: More complex design, must choose which slave to send work to. Parallelism is limited to our job backlog. • The load-balanced solution is a general-purpose, parallelized “grid” computer • However, we’re still limited by the granularity of our Jobs.

  29. Design 9: Fine-Grained Parallel Processing • Design: Divide the job into subtasks, allocate tasks to remote agents • Add concept of Job-to-Task decomposition • New Expander plugin decomposes Job into Tasks • These Tasks are published on the blackboard • The Expander detects when all tasks have been completed, aggregates the result, and completes the job • Can divide our Job into an arbitrary number of Tasks, but ideally this is guided by the Dispatcher’s knowledge of how many slaves we have • Characteristics: • Maximum parallelism. • We can split a single Job across an arbitrary number of slaves • We are no longer limited by our Job backlog • Note that a complex Job representation is required to support Task decomposition & incremental result updates • Illustration on next slide.

  30. Design 9: Fine-Grained Parallel Processing (2) • [Diagram: Node 0 hosts the Master Agent (Servlet, reached via http, Dispatcher, and Expander); its Blackboard holds the Job, its Tasks 0 through N, and “to:” and “from:” relays for Slave1 and Slave2. Node 1 and Node 2 host the Slave1 and Slave2 Agents (Calculator and Receiver each); each slave Blackboard holds “to: Master” and “from: Master” relays and a Task]
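One plausible decomposition for an image job is by horizontal bands of rows, since each pixel is computed independently. The Java sketch below shows such an expand step and the matching aggregation step; the band-based split and the names `RowBandTask` and `ExpanderSketch` are assumptions for illustration, as the slides do not specify how the Expander divides a Job.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical subtask: a contiguous band of image rows. */
final class RowBandTask {
    final int firstRow;
    final int rowCount;
    RowBandTask(int firstRow, int rowCount) {
        this.firstRow = firstRow;
        this.rowCount = rowCount;
    }
}

/** Sketch of Job-to-Task decomposition and result aggregation. */
final class ExpanderSketch {

    /** Splits height rows into at most taskCount bands whose sizes differ by at most one row. */
    static List<RowBandTask> expand(int height, int taskCount) {
        List<RowBandTask> tasks = new ArrayList<>();
        int base = height / taskCount;
        int extra = height % taskCount;
        int row = 0;
        for (int t = 0; t < taskCount; t++) {
            int rows = base + (t < extra ? 1 : 0);
            if (rows > 0) {
                tasks.add(new RowBandTask(row, rows));
            }
            row += rows;
        }
        return tasks;
    }

    /** Copies one completed band (row-major, one byte per pixel) into the full image buffer. */
    static void aggregate(byte[] image, int width, RowBandTask task, byte[] bandData) {
        System.arraycopy(bandData, 0, image, task.firstRow * width, task.rowCount * width);
    }
}
```

For the 1024 x 768 example, `expand(768, 5)` yields bands of 154, 154, 154, 153, and 153 rows. The job is complete once `aggregate` has been called for every band, which is the bookkeeping the Expander must track.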

  31. Analysis: Design 9

  32. Design Point: Load-balanced vs. Parallel • Key design points: • Design 8: • Summary: Entire jobs are load-balanced between workers • Benefit: Offloads work, relatively simple design • Downside: No intra-job parallelism (but separate jobs may run in parallel) • Design 9: • Summary: Uses task decomposition and balanced remote task allocation • Benefit: Highly parallelized, can parallelize a single job across multiple workers • Downside: More complex design, must track and re-assemble subtask results. Only works if the job can be decomposed into independent, parallelizable subtasks. • The primary tradeoff in this case is design complexity. • This also assumes that we can decompose our Jobs into arbitrarily small subtasks, which is not true for all applications.

  33. Design 10: Support Dispatch Policies • All the prior designs featured hard-coded behaviors: • Hard-coded or parameterized list of “slave” agents • Simple allocation rule: allocate to next available slave • As an enhancement, we could modify our plugins to support more complex, policy-based behaviors. • Example Policies: • Time out calculations and re-allocate to an alternate slave • Send same task to multiple slaves, to reduce latency and add fail-over • Send multiple outstanding subtasks per slave, to reduce network latency effects (i.e. keep working while the results are being sent on the wire) • Allocate according to slave host metadata (e.g. CPU speed, network latency, scheduling relative to other work) • All of the above examples illustrate QoS adaptation
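A pluggable policy could be expressed as a small interface that the Dispatcher consults. The sketch below is one hypothetical shape for it: the `DispatchPolicy` interface, its two methods, and the "fewest outstanding tasks" rule are all assumptions chosen to illustrate the timeout and metadata-based examples above, not part of the original design.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical policy interface consulted by a Dispatcher. */
interface DispatchPolicy {
    /** Picks a slave given the number of outstanding tasks per slave; null if there are none. */
    String chooseSlave(Map<String, Integer> outstandingBySlave);

    /** Decides whether a task sent at sentAtMillis should be re-allocated at nowMillis. */
    boolean shouldReallocate(long sentAtMillis, long nowMillis);
}

/** Example policy: prefer the least-loaded slave, re-allocate after a fixed timeout. */
final class LeastLoadedWithTimeoutPolicy implements DispatchPolicy {
    private final long timeoutMillis;

    LeastLoadedWithTimeoutPolicy(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public String chooseSlave(Map<String, Integer> outstandingBySlave) {
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        // Sorted iteration makes tie-breaking deterministic (alphabetically first name wins).
        for (Map.Entry<String, Integer> e : new TreeMap<>(outstandingBySlave).entrySet()) {
            if (e.getValue() < bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    @Override
    public boolean shouldReallocate(long sentAtMillis, long nowMillis) {
        return nowMillis - sentAtMillis > timeoutMillis;
    }
}
```

Swapping in a different implementation, for example one that weighs CPU speed or network latency, would change the dispatch behavior without touching the Dispatcher's transfer code.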

  34. Analysis: Design 10

  35. Design Point: Hard-coded Behavior vs. Policies • Key design points: • Design 9 (and prior): • Summary: Behavior is hard-coded or only supports trivial parameterization. • Benefit: Good enough for most applications. • Downside: Inflexible behavior. • Design 10: • Summary: Add plugin behavior options controlled through policies • Benefit: Pluggable / adaptive behavior • Downside: More complex to implement. • The introduction of policies and behavior options allows for a “smarter” application.

  36. Conclusions

  37. Design Analysis Summary

  38. Conclusions • The prior slides showed many different ways to build the same application but with different system properties • The first couple of designs are relatively simple • Subsequent designs support parallelism but are more complex • Each design is valid and ideal in certain environments • Each “split” of code/data introduces design complexity: • Splitting a plugin into multiple plugins requires data coordination between the plugins, requiring: • Coordination API (either a service or blackboard pub/sub) • Data structures (must be internally synchronized) • Splitting data across agents requires data partitioning and transfer code • Must decide which data resides on which agent(s) • Must transfer the data, typically via the blackboard (e.g. Relays)

  39. Conclusions (2) • Service-based APIs are useful in limited cases • Ideal for wrapping simple libraries (e.g. log4j) • Should be non-blocking and not require blackboard access • Don’t block pooled threads • Requires a thread switch, otherwise you’ll get blackboard “nested transaction” problems • See the “todo” pattern and other (awkward) workarounds • In contrast, blackboard interactions are non-blocking • This is good in that it switches threads and avoids blocking the plugin when performing remote I/O, which increases parallelism • It’s bad in that the plugin code must support an asynchronous call and a subsequent “execute()” method resume when the result is published • The result is sometimes added “bookkeeping” state in the plugin, to remember where prior async calls left off. This is effectively a “continuation”.
