Towards Elastic Operating Systems. Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller. University of Colorado, Boulder. OS + Cloud Today. OS/Process. ELB / Cloud Mgr. Resources Limited. Thrashing. CPUs limited. I/O bottlenecks: Network, Storage. Present Workarounds
Towards Elastic Operating Systems • Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller • University of Colorado, Boulder
OS + Cloud Today [Diagram labels: OS/Process, ELB / Cloud Mgr] • Resources Limited • Thrashing • CPUs limited • I/O bottlenecks • Network • Storage • Present Workarounds • Additional Scripting/Code changes • Extra Modules/Frameworks • Coordination • Synch/Aggregating State
Stretch Process OS/Process • Advantages • Expands available Memory • Extends the scope of Multithreaded Parallelism (More CPUs available) • Mitigates I/O bottlenecks • Network • Storage
ElasticOS: Our Goals • “Elasticity” as an OS Service • Elasticize all resources: Memory, CPU, Network, … • Single machine abstraction • Apps unaware whether they’re running on 1 machine or 1000 machines • Simpler Parallelism • Compatible with an existing OS (e.g., Linux, …)
“Stretched” Process: Unified Address Space [Diagram labels: OS/Process, Elastic Page Table, Location]
Movable Execution Context OS/Process • OS handles elasticity – Apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism
Replicate Code, Partition Data [Diagram labels: CODE, CODE, CODE, Data 1, Data 2] • Unique copy of data (unlike DSM) • Execution context follows data (unlike Process Migration, SSI)
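The "replicate code, partition data" rule on this slide can be sketched as a small placement function. This is only an illustration, not the ElasticOS implementation: the `Page` type, the `place_pages` helper, the node names, and the round-robin choice of owner for data pages are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    number: int
    is_code: bool  # True for read-only code pages


def place_pages(pages, nodes):
    """Return {node: set of page numbers}.

    Code pages are replicated on every node; each data page gets
    exactly one owner, so there is a unique copy of the data.
    """
    placement = {node: set() for node in nodes}
    data_index = 0
    for page in pages:
        if page.is_code:
            for node in nodes:
                placement[node].add(page.number)
        else:
            # Round-robin ownership is an illustrative assumption only.
            owner = nodes[data_index % len(nodes)]
            placement[owner].add(page.number)
            data_index += 1
    return placement


# Hypothetical process: one code page (0) and three data pages (1, 2, 3).
pages = [Page(0, True), Page(1, False), Page(2, False), Page(3, False)]
placement = place_pages(pages, ["node1", "node2"])
```

Because code pages are read-only, replicating them adds no consistency burden, while the single-owner rule for data pages is what separates this scheme from DSM-style replication.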
Exploiting Elastic Locality • We need an adaptive page clustering algorithm • LRU, NSWAP, i.e., “always pull” • Execution follows data, i.e., “always jump” • Hybrid (Initial): Pull pages, then Jump
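A minimal sketch of the hybrid "pull pages, then jump" idea, assuming the decision is driven by a per-node count of remote page pulls. The class name `PullThenJump`, the `jump_threshold` parameter, and the reset-on-jump behavior are hypothetical; the slides do not specify how the threshold is chosen or maintained.

```python
from collections import Counter


class PullThenJump:
    """Pull remote pages until one node has served too many pulls, then jump."""

    def __init__(self, jump_threshold):
        self.jump_threshold = jump_threshold
        self.pulls_from = Counter()  # remote node -> pulls since the last jump

    def on_remote_fault(self, owner_node):
        """Decide what to do when the faulting page lives on owner_node."""
        self.pulls_from[owner_node] += 1
        if self.pulls_from[owner_node] >= self.jump_threshold:
            # Threshold reached: move the execution context to the data.
            self.pulls_from.clear()
            return "jump"
        return "pull"


policy = PullThenJump(jump_threshold=3)
decisions = [policy.on_remote_fault("node2") for _ in range(3)]
# decisions == ["pull", "pull", "jump"]
```

Setting the threshold very high degenerates into "always pull", and setting it to 1 degenerates into "always jump", which is why the hybrid sits between the two policies named on the slide.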
Status and Future Work • Complete our initial prototype • Improve our page placement algorithm • Improve context jump efficiency • Investigate Fault Tolerance issues
Contact: amit.gupta@colorado.edu • Thank You • Questions?
Page Placement: Multinode Adaptive LRU • Pull First • Pulls Threshold Reached! • Jump Execution Context [Diagram labels: two nodes, each with Mem, CPUs, Swap]
Locality in a Single Thread • Temporal Locality [Diagram labels: two nodes, each with Mem, CPUs, Swap]
Locality across Multiple Threads [Diagram labels: Mem, CPUs, Swap on each node]
Exploiting Elastic Locality • Assumptions • Replicate Code Pages, Place Data Pages (vs DSM) • We need an adaptive page clustering algorithm • LRU, NSWAP • Us (Initial): Pull pages, then Jump
Replicate Code, Distribute Data [Diagram labels: CODE, CODE, CODE, Data 1, Data 2; Accessing Data 1, Accessing Data 2, Accessing Data 1] • Unique copy of data (vs DSM) • Execution context follows data (vs Process Migration)
Benefits • OS handles elasticity – Apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism
Benefits (delete) • OS handles elasticity • Application ideally runs unmodified • Application is naturally partitioned … • By Page Access locality • By seamlessly exploiting multithreaded parallelism • By intelligent page placement
Execution Context Jumping: A single thread example [Diagram labels: Process, Address Space (Node 1), Address Space (Node 2), TIME]
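The single-thread picture can be approximated by replaying a page-access trace and moving the execution context whenever the next page lives on another node. This is a sketch of the pure "execution follows data" case only; the `run_trace` helper, the page names, and the ownership map are hypothetical.

```python
def run_trace(accesses, owner, start_node):
    """Replay a page-access trace where the execution context follows the data.

    Returns the timeline of (page, node executing the access) and the
    number of context jumps that were needed.
    """
    current = start_node
    timeline = []
    jumps = 0
    for page in accesses:
        if owner[page] != current:
            current = owner[page]  # jump to the node that holds the page
            jumps += 1
        timeline.append((page, current))
    return timeline, jumps


# Hypothetical layout: pages A and B on node1, pages C and D on node2.
owner = {"A": "node1", "B": "node1", "C": "node2", "D": "node2"}
timeline, jumps = run_trace(["A", "B", "C", "D", "A"], owner, "node1")
# jumps == 2: one jump to node2 for C, one jump back to node1 for A
```

Reading the timeline top to bottom corresponds to the TIME axis in the diagram: the thread runs on Node 1, moves to Node 2 when it touches data there, and later returns.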
“Stretch” a Process: Unified Address Space [Diagram labels: Process, Address Space (Node 1), Address Space (Node 2), Page Table, IP Addr]
Operating Systems Today • Resource Limit = 1 Node [Diagram labels: Mem, Process, OS, Disks, CPUs]
Cloud Applications at Scale • More Queries? • More Resources? [Diagram labels: Cloud Manager, Load Balancer, Process (x3), Partitioned Data (x3), Framework (e.g., MapReduce)]
Our findings • Important Tradeoff • Data Page Pulls vs. Execution Context Jumps • Latency cost is realistic • Our Algorithm: Worst case scenario • “always pull” == NSWAP • marginal improvements
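The pull-versus-jump tradeoff can be made concrete with a toy cost model. Everything below is an assumption for illustration: the unit costs (`pull_cost=1`, `jump_cost=2`), the traces, and the simplification that pulled pages are never evicted. The numbers are not measurements from the ElasticOS prototype.

```python
def always_pull_cost(accesses, owner, start_node, pull_cost):
    """Cost when execution stays put and every remote page is pulled once."""
    local = {page for page, node in owner.items() if node == start_node}
    cost = 0
    for page in accesses:
        if page not in local:
            cost += pull_cost
            local.add(page)  # assumes no eviction after the pull
    return cost


def always_jump_cost(accesses, owner, start_node, jump_cost):
    """Cost when the execution context jumps to wherever the page lives."""
    current, cost = start_node, 0
    for page in accesses:
        if owner[page] != current:
            current = owner[page]
            cost += jump_cost
    return cost


owner = {"A": "node1", "C": "node2", "D": "node2", "E": "node2", "F": "node2"}
clustered = ["C", "D", "E", "F"]   # a run of pages that all live on node2
ping_pong = ["A", "C", "A", "C"]   # alternating between the two nodes

clustered_costs = (always_pull_cost(clustered, owner, "node1", 1),
                   always_jump_cost(clustered, owner, "node1", 2))
ping_pong_costs = (always_pull_cost(ping_pong, owner, "node1", 1),
                   always_jump_cost(ping_pong, owner, "node1", 2))
# clustered_costs == (4, 2): one jump beats four pulls
# ping_pong_costs == (1, 6): one pull beats three jumps
```

Under these assumed costs neither pure policy wins on both traces, which is the motivation for a hybrid that pulls first and jumps only when remote accesses cluster on one node.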
Advantages • Natural Groupings: Threads & Pages • Align resources with inherent parallelism • Leverage existing mechanisms for synchronization
“Stretch” a Process: Unified Address Space • A “Stretched” Process = Collection of Pages + Other Resources {Across Several Machines} [Diagram labels: Page Table, IP Addr; two nodes, each with Mem, Swap, CPUs]
(delete) Exec. context follows Data • Replicate Code Pages • Read-Only => No Consistency burden • Smartly distribute Data Pages • Execution context can jump • Moves towards data • *Converse also allowed*
Elasticity in Cloud Apps Today [Diagram labels: Input Data, D1, D2, Dx, Mem, …., Disk, CPUs, Output Data]
[Diagram labels: Input Queries, Load Balancer, D1, D2, Dy, Dx, Mem, …., Disk, CPUs, Output Data]
(delete) Goals: Elasticity dimensions • Extend Elasticity to • Memory • CPU • I/O • Network • Storage
Open Topics • Fault tolerance • Stack handling • Dynamic Linked Libraries • Locking
Elastic Page Table • Local Mem • Swap space • Remote Mem • Remote Swap
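One way to picture an elastic page table is as an ordinary page-to-location map whose entries may also name a remote node. The sketch below assumes that each entry records one of the four locations listed on this slide plus an optional node address (an IP address, as other slides label it). The class names, method names, and the example address are hypothetical, not the prototype's data structures.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Location(Enum):
    LOCAL_MEM = "local mem"
    LOCAL_SWAP = "swap space"
    REMOTE_MEM = "remote mem"
    REMOTE_SWAP = "remote swap"


@dataclass
class Entry:
    location: Location
    node_addr: Optional[str] = None  # set only for remote locations


class ElasticPageTable:
    """Maps a virtual page number to where that page currently lives."""

    def __init__(self):
        self.entries = {}

    def map_page(self, vpn, location, node_addr=None):
        remote = location in (Location.REMOTE_MEM, Location.REMOTE_SWAP)
        if remote and node_addr is None:
            raise ValueError("remote pages need a node address")
        self.entries[vpn] = Entry(location, node_addr if remote else None)

    def is_remote(self, vpn):
        return self.entries[vpn].node_addr is not None


table = ElasticPageTable()
table.map_page(0, Location.LOCAL_MEM)
table.map_page(1, Location.REMOTE_MEM, node_addr="10.0.0.2")  # example address
```

A fault handler built on such a table would consult `is_remote` to choose between an ordinary local fault path and a remote pull or context jump.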
“Stretch” a Process • Move beyond resource boundaries of ONE machine • CPU • Memory • Network, I/O
[Diagram labels: Input Data, D1, D2, …., Output Data; two nodes, each with CPUs, Mem, Disk]
[Diagram labels: Data, D1, D2; two nodes, each with CPUs, Mem, Disk]