Writeup
Fault tolerance
• Given that FT is critical, what could/should be done at hw/os/runtime/app level?
• ALL
Structure of Scalable OS
• What are the entities?
• How do we define local/global OS functions?
• What is the functionality of the local OS services? Is none an answer?
• What are global functions?
• Can we adapt PVM to the app it supports?
• Protection boundaries and virtualization with OS
• What’s the OS/runtime split?
• ALL
APIs
• Runtime/OS
• Application/runtime
• Tool interfaces (including debugging)
• Interfaces to environment info
• 10
Specific functions
• Process management 9
• File system 18
• Scheduling 10
• Security 2
• QoS 2
• Debugging – invariants 9
OS scalability
• What OS services could/should scale?
• How do we define scalability?
  • Performance nearly independent of machine size?
  • Reliability nearly independent of machine size?
• 10
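One possible reading of "performance nearly independent of machine size" can be sketched as a check on how the cost of an OS service grows with node count. This is only an illustration under stated assumptions: the function names, the 10% tolerance, and the timing values are all hypothetical and do not come from the slides.

```python
def scaling_efficiency(t_base: float, t_n: float) -> float:
    """Ratio of baseline time to time at larger size; 1.0 means no slowdown."""
    if t_base <= 0 or t_n <= 0:
        raise ValueError("times must be positive")
    return t_base / t_n


def is_nearly_size_independent(times_by_nodes: dict[int, float],
                               tolerance: float = 0.1) -> bool:
    """True if every measurement stays within `tolerance` of the smallest run.

    `tolerance` is an assumed threshold, not a value given in the slides.
    """
    t_base = times_by_nodes[min(times_by_nodes)]
    return all(scaling_efficiency(t_base, t) >= 1.0 - tolerance
               for t in times_by_nodes.values())


# Made-up latencies (seconds) of one OS service at growing node counts.
launch_times = {1: 2.0, 64: 2.1, 4096: 2.2}
flat = is_nearly_size_independent(launch_times)
```

The same shape of check could be applied to a reliability measure instead of a latency, which is the second candidate definition on the slide.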
OS for heterogeneous hw
• How do we build runtime/OS support for “crazy” architectures?
  • FPGAs, PIMs, …
• Do we adapt one parallel OS to very different hw architectures? Do we need different OS/runtime solutions?
• What is the spectrum of hw architectures that we can support with one common OS/runtime design?
• 15
Interactive systems
• How do we move HEC into interactive environments?
• What are interactive HEC apps?
• How do we do interactive debugging? Interactive tools? Interactive computational steering? Short shell commands? WS acceleration model? Visualization?
• 12
Hw support for OS
• Study which hw features are important to future scalable OS/runtime, so as to influence hw design, e.g.:
  • Protection
  • Reliable networks
  • Collective ops
  • Atomic memory ops
  • Transactional memory
• 16
Application requirements
• What OS calls are now used by High Perf Apps?
• What requirements can we derive for OS/runtime in future systems from apps?
• Identify critical apps we care about
• 14
OS metrics
• What benchmarks and metrics do we use to measure success?
• 8
Programmatic
• How do we get organized to do research in scalable OS?
  • Multiple approaches
  • Extreme alternatives
  • Vendor involvement
• 12
Vendors
• How can we use existing OS sw?
  • Proprietary and/or open source
• 8
Testbeds
• How do we establish testbeds to support scalable OS/runtime research?
• Who funds them?
• What is a testbed? Architecture specific? Simulator?
• 15