1. Use of SPEEDES for BMDSsim
2. Overview
Metron
SPEEDES
Approach to BMDSsim: Clustering
Summary
3. Metron, Inc.
4. SPEEDES: Synchronous Parallel Environment for Emulation and Discrete-Event Simulation
Powerful parallel processing engine with optimistic processing
Developed, maintained, and distributed by Metron since 1996
Open source code downloadable to qualified users
On-line documentation
On-line change request system
Primary users: MDWAR and related MDA projects
IMDSE
Conservative only
Windows NT
SPEEDES just works and lets them model
C2BMC
ABL
Other SPEEDES-related efforts:
Air Force Research Lab (Rome Labs)
Rome Labs funded iterator improvement, incorporated in Version 2
Distributed Information Enterprise Modeling and Simulation (DIEMS)
Parallel multiple-course-of-action SPEEDES enhancement
NASA KSC
Independent IV&V of the SPEEDES variant developed for JSIMS
5. Metron’s SPEEDES Team
6. SPEEDES Early Development and Modern Versions
Chosen as the framework for MDWAR in late 1996
Early beta versions concentrated on functionality rather than reliability
Frequently buggy, undocumented, and poorly performing
Version 1.0 (November 2000)
Completed the Unified API
Added the SPEEDES User’s Guide*
Added the API Reference Manual*
Much of the obsolete code was removed
Version 2.0 (September 2001)
Added object proxy attribute subscription
Added automatic lazy re-evaluation
General code optimization (size and speed)
7. Modern Versions of SPEEDES
Version 2.1 (September 2003)
Second port to NT: Ported to Microsoft Visual Studio
Removed novel event queue design
Resulted in net performance improvement
General code optimization
Reduced memory requirements
Reduced executable size
Fixed numerous Data Declaration Management bugs
Added optimized conservative algorithm
Version 2.2 (August 2004)
Initial implementation of shared memory host router
Standard Template Library (STL) containers
RB_map, RB_list, RB_vector, RB_multimap
All are significantly (4-10x) faster than current rollbackable containers
Work with the non-rollbackable STL algorithms library
General performance improvements
8. Approach to BMDSsim: Clustering
Clustering can lead to high-performance federations
Retains ability for easy debugging modes
Can link up through shared memory or TCP/IP
Design allows for MDWAR Standard Gateway (MSG) connections
Elements could hook together in variety of fashions
Optimistic: Full optimistic time management with rollbacks
Includes the option of connecting through shared memory on the same machine or TCP/IP for remote elements
Conservative: Linked through MSGs
Playback: An element could be replaced by an MSG Playback for standalone testing/debugging
Any combination of the above
9. How clustering works
10. Summary
Main SPEEDES focus is and will remain stability and reliability
Performance has already been proven
Continuing use in wargames provides rigorous test environment
Mature set of tools help optimize performance, minimize overhead
SPEEDES instrumentation
MDWAR simulation instrumentation and analysis tools
Rules of thumb
Continuing improvement
Changes for usability
Reduction in memory and CPU footprint
AFRL funded parallel course of action simulation (due March 2006)
MDA can have confidence in high performance, low risk for BMDSsim
11. Back-ups (Lessons Learned)
12. Lessons learned
SPEEDES has been extraordinarily resilient
Almost all performance problems have been due to improper modeling
Framework bugs are now rare
Significantly impacted development in the early (< v 0.8) years
Proxy mechanism is solid but tightly couples models
Use of proxy updates has decreased significantly
Proxy use often indicates incorrect modeling
Not communicating through message sets
Unnecessary or excessive notifications
Attribute subscription carries a small penalty
Often used simply to unsubscribe entirely from proxy updates
13. Lessons learned (cont)
Performance tuning requires analysis tools
Real time performance does not come for free
Built up suite of analysis tools
SPEEDES instrumentation has been extensive and varied
MDWAR has many tools to analyze the instrumentation files
Rules of thumb learned about modeling
New APIs added to improve parallelism
SPEEDES overhead is minimal (usually 10s of micro-seconds/event)
Recent tests using MDWAR 5.0 (SPEEDES 2.1) on 1 node (optimistic) show a ~15% framework overhead
BTW is within 20% of sequential on SPEEDES 2.1, should be 10-15% with 2.2
I/O is a killer.
Data collection has a minor impact
Biggest problem is making sure we collect enough