
Towards Elastic Operating Systems




Presentation Transcript


  1. Towards Elastic Operating Systems. Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller. University of Colorado, Boulder

  2. OS + Cloud Today (diagram labels: OS/Process, ELB/Cloud Manager) • Resources limited: thrashing; CPUs limited; I/O bottlenecks (network, storage) • Present workarounds: additional scripting/code changes; extra modules/frameworks; coordination; synchronizing/aggregating state

  3. Stretch Process (diagram label: OS/Process) • Advantages: expands available memory; extends the scope of multithreaded parallelism (more CPUs available); mitigates I/O bottlenecks (network, storage)

  4. ElasticOS: Our Vision

  5. ElasticOS: Our Goals • “Elasticity” as an OS service • Elasticize all resources: memory, CPU, network, … • Single-machine abstraction • Apps unaware whether they’re running on 1 machine or 1000 machines • Simpler parallelism • Compatible with an existing OS (e.g., Linux, …)

  6. “Stretched” Process (diagram labels: Unified Address Space, OS/Process, Elastic Page Table, Location)

  7. Movable Execution Context OS/Process • OS handles elasticity – Apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism

  8. Replicate Code, Partition Data (diagram labels: CODE, CODE, CODE; Data 1, Data 2) • Unique copy of data (unlike DSM) • Execution context follows data (unlike Process Migration, SSI)

  9. Exploiting Elastic Locality • We need an adaptive page clustering algorithm • LRU, NSWAP, i.e., “always pull” • Execution follows data, i.e., “always jump” • Hybrid (initial): pull pages, then jump
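
The hybrid "pull pages, then jump" idea on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' algorithm: the class name `HybridPolicy`, the `jump_threshold` parameter, and the per-node pull counter are all hypothetical, chosen to show the shape of the decision (pull a remote page while pulls from a node stay cheap, jump the execution context once a threshold is crossed).

```python
from dataclasses import dataclass, field


@dataclass
class HybridPolicy:
    """Toy model of 'pull pages, then jump' (all names are hypothetical)."""

    jump_threshold: int  # assumed tunable: pulls tolerated from one node before jumping
    current_node: int = 0  # node where the execution context currently runs
    page_home: dict = field(default_factory=dict)  # page number -> node holding it
    pulls_from: dict = field(default_factory=dict)  # node -> pulls since the last jump

    def access(self, page: int) -> str:
        """Return 'local', 'pull', or 'jump' for one page access."""
        home = self.page_home.get(page, self.current_node)
        if home == self.current_node:
            return "local"
        count = self.pulls_from.get(home, 0) + 1
        if count > self.jump_threshold:
            # Too many pulls from this node: move execution to the data instead.
            self.current_node = home
            self.pulls_from.clear()
            return "jump"
        # Otherwise pull the page to the current node (data stays a unique copy).
        self.pulls_from[home] = count
        self.page_home[page] = self.current_node
        return "pull"
```

Setting `jump_threshold` very high degenerates to the "always pull" behaviour, and setting it to 0 degenerates to "always jump", which is why a single threshold is a convenient knob for this illustration.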

  10. Status and Future Work • Complete our initial prototype • Improve our page placement algorithm • Improve context jump efficiency • Investigate Fault Tolerance issues

  11. Contact: amit.gupta@colorado.edu. Thank You. Questions?

  12. Algorithm Performance (1)

  13. Algorithm Performance (2)

  14. Page Placement: Multinode Adaptive LRU (diagram labels: Pull First; Pulls Threshold Reached! Jump Execution Context; each node: Mem, CPUs, Swap)

  15. Locality in a Single Thread (diagram labels: Temporal Locality; each node: Mem, CPUs, Swap)

  16. Locality across Multiple Threads (diagram labels: each node: Mem, CPUs, Swap)

  17. Unlike DSM…

  18. Exploiting Elastic Locality • Assumptions • Replicate Code Pages, Place Data Pages (vs DSM) • We need an adaptive page clustering algorithm • LRU, NSWAP • Us (Initial): Pull pages, then Jump

  19. Replicate Code, Distribute Data (diagram labels: CODE, CODE, CODE; Data 1, Data 2; Accessing Data 1, Accessing Data 2, Accessing Data 1) • Unique copy of data (vs DSM) • Execution context follows data (vs Process Migration)

  20. Benefits • OS handles elasticity – Apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism

  21. Benefits (delete) • OS handles elasticity • Application ideally runs unmodified • Application is naturally partitioned … • By Page Access locality • By seamlessly exploiting multithreaded parallelism • By intelligent page placement

  22. How should we place pages?

  23. Execution Context Jumping: A single thread example (diagram labels: Process; Address Space on Node 1; Address Space on Node 2; TIME)

  24. “Stretch” a Process (diagram labels: Unified Address Space; Process; Address Space on Node 1; Address Space on Node 2; Page Table; IP Addr)

  25. Operating Systems Today • Resource Limit = 1 Node (diagram labels: Process, OS, Mem, Disks, CPUs)

  26. Cloud Applications at Scale (diagram labels: More Queries?; Cloud Manager; Load Balancer; More Resources?; three Processes, each with Partitioned Data; Framework, e.g., MapReduce)

  27. Our findings • Important tradeoff • Data page pulls vs. execution context jumps • Latency cost is realistic • Our algorithm: worst-case scenario • “always pull” == NSWAP • marginal improvements

  28. Advantages • Natural Groupings: Threads & Pages • Align resources with inherent parallelism • Leverage existing mechanisms for synchronization

  29. “Stretch” a Process: Unified Address Space. A “Stretched” Process = Collection of Pages + Other Resources {Across Several Machines} (diagram labels: Page Table; IP Addr; each node: Mem, Swap, CPUs)

  30. delete Exec. context follows Data • Replicate Code Pages • Read-Only => No Consistency burden • Smartly distribute Data Pages • Execution context can jump • Moves towards data • *Converse also allowed*

  31. Elasticity in Cloud Apps Today (diagram labels: Input Data; D1, D2, …, Dx; Mem, Disk, CPUs; Output Data)

  32. (diagram labels: Input Queries; Load Balancer; D1, D2, Dy, Dx; Mem, …, Disk, CPUs; Output Data)

  33. (delete) Goals: Elasticity dimensions • Extend elasticity to • Memory • CPU • I/O • Network • Storage

  34. Thank You

  35. Bang Head Here !

  36. Stretching a Thread

  37. Overlapping Elastic Processes

  38. *Code Follows Data*

  39. Application Locality

  40. Possible Animation?

  41. Multinode Adaptive LRU

  42. Possible Animation?

  43. Open Topics • Fault tolerance • Stack handling • Dynamic Linked Libraries • Locking

  44. Elastic Page Table (diagram labels: Local Mem, Swap space, Remote Mem, Remote Swap)
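
The four page locations named on this slide suggest a page table whose entries record where a page currently lives, plus a node address for remote pages (other slides label this "IP Addr"). The following is a minimal sketch of such a structure, not the ElasticOS implementation: `Location`, `PageEntry`, `ElasticPageTable`, and their methods are hypothetical names, and the address used in the usage note is a documentation-only placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Location(Enum):
    """The four places a page can live, as listed on the slide."""
    LOCAL_MEM = "local memory"
    LOCAL_SWAP = "local swap space"
    REMOTE_MEM = "remote memory"
    REMOTE_SWAP = "remote swap"


@dataclass
class PageEntry:
    location: Location
    node_addr: Optional[str] = None  # IP address of the remote node; None for local pages


class ElasticPageTable:
    """Hypothetical page table mapping page numbers to a location (and node)."""

    def __init__(self) -> None:
        self.entries: Dict[int, PageEntry] = {}

    def map_page(self, page: int, location: Location,
                 node_addr: Optional[str] = None) -> None:
        remote = location in (Location.REMOTE_MEM, Location.REMOTE_SWAP)
        if remote and node_addr is None:
            raise ValueError("remote pages need a node address")
        # Local entries never carry a node address.
        self.entries[page] = PageEntry(location, node_addr if remote else None)

    def lookup(self, page: int) -> PageEntry:
        return self.entries[page]

    def needs_fetch(self, page: int) -> bool:
        """True if the page is anywhere other than local memory."""
        return self.entries[page].location is not Location.LOCAL_MEM
```

For example, `table.map_page(7, Location.REMOTE_MEM, "192.0.2.10")` would record page 7 as held in another node's memory, and `table.needs_fetch(7)` would then report that an access must either pull the page or move execution to that node.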

  45. “Stretch” a Process • Move beyond resource boundaries of ONE machine • CPU • Memory • Network, I/O

  46. (diagram labels: Input Data; two nodes, each with CPUs, Mem, Disk; …; D1, D2; Output Data)

  47. (diagram labels: Data; two nodes, each with CPUs, Mem, Disk; D1, D2)

  48. Reinventing Elasticity Wheel
