Presented by: Sagnik Bhattacharya

Cellular Disco Kinshuk Govil, Dan Teodosiu, Yongqiang Huang, Mendel Rosenblum Presented by: Sagnik Bhattacharya

Overview • Problems of current shared memory multiprocessors and our requirements • Cellular Disco as a solution • architecture • prototype • hardware-fault containment • CPU management • Memory management • statistics • Cellular Disco and ubiquitous environments • Conclusion

Problem • Extending modern Operating systems to run efficiently on shared memory multiprocessors. • Software development has not kept pace with hardware development. • Common operating systems fail beyond 12 processors.

What we need…. • the system should be reliable • it should be scalable • it should be fault-tolerant • it should not take too much development time or effort.

Traditional approaches • Hardware partitioning - lacks resource sharing, makes physical clusters. • Software-centric approaches : (significant development time and cost) • modify existing OS • develop new OS

A scenario…. • Smart Space with a control unit and four processors (Proc) • No rebooting necessary

Solution : Cellular Disco • Extension of previous work - Disco • Uses the concept of Virtual machine monitors • Partitions the multiprocessor system into virtual clusters.

Virtual Machine Monitor • VM1 runs Win NT on µP’s 1,2,3 • VM2 runs IRIX 6.2 on µP’s 1,3,5,8 • The Virtual Machine Monitor sits between the virtual machines and the hardware

Virtual Machine Monitor : I/O handling • A guest OS issues an I/O request • The Virtual Machine Monitor traps the I/O request • The monitor performs the I/O and sends an interrupt
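
The trap / perform I/O / interrupt sequence can be sketched as a toy model. This is only an illustration under assumptions: the class names, the `trap_io` method and the `device_log` list are hypothetical and are not part of Cellular Disco.

```python
from collections import deque


class VirtualMachine:
    """Toy guest: only records the virtual interrupts it receives."""

    def __init__(self, name):
        self.name = name
        self.interrupts = deque()


class VirtualMachineMonitor:
    """Toy monitor: every guest I/O request ends up in trap_io()."""

    def __init__(self):
        self.device_log = []  # stands in for the real device (assumption)

    def trap_io(self, vm, request):
        # 1. The guest's I/O request traps into the monitor.
        # 2. The monitor performs the I/O on the guest's behalf.
        self.device_log.append((vm.name, request))
        # 3. The monitor signals completion with a virtual interrupt.
        vm.interrupts.append(("io_done", request))


vmm = VirtualMachineMonitor()
vm1 = VirtualMachine("VM1")
vm2 = VirtualMachine("VM2")
vmm.trap_io(vm1, "read block 7")
```

Only VM1 receives the completion interrupt; VM2 never sees the request.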

Issues it addresses • Address scalability • NUMA awareness • Hardware fault-containment • Resource management

Basic Cellular Disco Architecture

Prototype • Runs on a 32-processor SGI Origin 2000 • Supports shared memory systems based on the MIPS R10000 architecture. • The prototype runs piggybacked on IRIX 6.4 • The host OS is made dormant and is only used to invoke some device drivers.

Hardware Virtualization • Physical Resources - visible to a virtual machine • Machine Resources - actual resources; allocated by Cellular Disco • CD operates in the kernel mode of the MIPS processor • CD intercepts all system calls.
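
The physical / machine distinction can be illustrated with a toy mapping table. This is a minimal sketch, not Cellular Disco's actual data structure: the `Monitor` class, the page size and the first-free allocation policy are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class Monitor:
    """Toy monitor that backs each VM's physical pages with machine pages."""

    def __init__(self, machine_pages):
        self.free = list(range(machine_pages))  # unallocated machine page numbers
        self.pmap = {}  # (vm_id, physical_page) -> machine_page

    def translate(self, vm_id, physical_addr):
        """Map a VM's physical address to a machine address, allocating on first use."""
        page, offset = divmod(physical_addr, PAGE_SIZE)
        key = (vm_id, page)
        if key not in self.pmap:
            if not self.free:
                raise MemoryError("no free machine pages")
            self.pmap[key] = self.free.pop(0)
        return self.pmap[key] * PAGE_SIZE + offset


monitor = Monitor(machine_pages=8)
a = monitor.translate("VM1", 0x10)  # VM1, physical page 0
b = monitor.translate("VM2", 0x10)  # VM2, physical page 0, different machine page
```

Both VMs use the same physical address, but the monitor backs them with different machine pages.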

Resource Management • CPU management - Each processor maintains its own run queue • Memory Management - Memory borrowing mechanism • Each OS instance is only given as many resources as it can handle. Large applications are split, and communication between the parts is established through shared-memory regions.

CPU Management • VCPU migration : - Intra node (37 µsec) - Inter node (520 µsec) - Inter Cell (1520 µsec)

VCPU migration (diagram) • Three cells, each containing nodes and CPUs, are connected by the Cellular Disco interconnect • Intra node - the VCPU moves to another CPU of the same node • Inter node - the VCPU moves to a CPU on another node of the same cell • Inter cell - the VCPU moves to a CPU in another cell

CPU Management (contd.) • CPU balancing : - Idle balancer - Periodic balancer • Load balancing scenario

Idle balancer • Scenario : CPU0 to CPU3 with VCPU’s VC A0, VC A1, VC B0 and VC B1 ; CPU3 is idle • The idle CPU3 asks its neighbors for work • Does VC B1 have enough cache affinity to CPU2? NO!! • VC B1 migrates to CPU3
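
The idle balancer decision can be sketched as follows. This is a minimal sketch under assumptions: the numeric `cache_affinity` estimate, the `AFFINITY_THRESHOLD` cut-off and the class layout are hypothetical, and the real balancer may apply further checks.

```python
from dataclasses import dataclass, field
from typing import List, Optional

AFFINITY_THRESHOLD = 0.5  # hypothetical cut-off for "enough cache affinity"


@dataclass
class VCPU:
    name: str
    cache_affinity: float  # hypothetical 0..1 estimate of warm cache state on its CPU


@dataclass
class CPU:
    name: str
    running: Optional[VCPU] = None
    waiting: List[VCPU] = field(default_factory=list)


def idle_balance(idle_cpu, neighbors):
    """Let an idle CPU take one waiting VCPU that has little cache affinity."""
    for cpu in neighbors:
        for vcpu in cpu.waiting:
            if vcpu.cache_affinity < AFFINITY_THRESHOLD:
                cpu.waiting.remove(vcpu)
                idle_cpu.running = vcpu
                vcpu.cache_affinity = 0.0  # caches are cold on the new CPU
                return vcpu
    return None  # nothing worth migrating


cpu2 = CPU("CPU2", running=VCPU("VC B0", 0.9), waiting=[VCPU("VC B1", 0.2)])
cpu3 = CPU("CPU3")  # idle
moved = idle_balance(cpu3, [cpu2])
```

Here VC B1 has little affinity to CPU2, so it moves to the idle CPU3; a VCPU with high affinity would be left where it is.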

Periodic Balancer • Does depth-first traversal of the load tree • Checks difference of 2 siblings, ignores if diff<2 • If diff>=2 does load balancing if benefit>cost • Example load tree : root 4 ; children 1 and 3 ; leaves 1, 0, 2, 1 • Leaf pairs (1,0) and (2,1) : Diff=1, ignored • Children (1,3) : Diff=2, load balancing is considered
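
The traversal can be sketched on the example tree (root 4, children 1 and 3, leaves 1, 0, 2, 1). This is a minimal sketch under assumptions: the `Node` class and the `benefit_exceeds_cost` callback are hypothetical stand-ins, since the slides do not say how benefit and cost are computed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    load: int = 0  # for a leaf: load on that CPU
    children: list = field(default_factory=list)

    def total(self):
        """Load of a leaf, or the summed load of the subtree."""
        if not self.children:
            return self.load
        return sum(child.total() for child in self.children)


def find_imbalances(node, benefit_exceeds_cost=lambda heavy, light: True):
    """Depth-first walk; return (heavy, light) sibling pairs worth balancing."""
    found = []
    if len(node.children) == 2:
        a, b = node.children
        if abs(a.total() - b.total()) >= 2:  # differences below 2 are ignored
            heavy, light = (a, b) if a.total() > b.total() else (b, a)
            if benefit_exceeds_cost(heavy, light):
                found.append((heavy.name, light.name))
    for child in node.children:
        found.extend(find_imbalances(child, benefit_exceeds_cost))
    return found


left = Node("left", children=[Node("CPU0", 1), Node("CPU1", 0)])
right = Node("right", children=[Node("CPU2", 2), Node("CPU3", 1)])
root = Node("root", children=[left, right])
pairs = find_imbalances(root)
```

Only the two children of the root differ by 2, so that is the only pair reported; both leaf pairs differ by 1 and are skipped.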

Gang Scheduling • For each physical CPU, we select the VCPU that is to run on it. • The VCPU selected is the highest-priority gang-runnable VCPU. • A VCPU is gang-runnable if all non-idle VCPU’s of that VM are either • running or, • waiting on run queues of processors running lower-priority VM’s.

Example • VM1 VC’s - 1,3,8(idle) • VM2 VC’s - 2,4,6(idle),7 • VM3 VC’s - 5,9 • Queues before scheduling (diagram labels : Currently Executing VCPU, Wait Queue, Priority) : - µP1 : VC1 VC7 VC5 - µP2 : VC2 VC1 VC9 - µP3 : VC5 VC3 VC4 • Gang Runnable VC’s are identified • Queues after scheduling (diagram labels : New Executing VCPU, New Wait Queue) : - µP1 : VC5 VC7 VC1 - µP2 : VC9 VC1 VC2 - µP3 : VC5 VC3 VC4
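
The gang-runnable test can be sketched as a small predicate. This is a minimal sketch under assumptions: the data layout, the numeric priorities (larger means higher) and the placement of VCPU's on processors are hypothetical and only loosely follow the VM layout of the example; a VCPU waiting on an idle processor is assumed to be acceptable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VCPU:
    name: str
    vm: str
    state: str                 # "idle", "running" or "waiting"
    cpu: Optional[str] = None  # processor it runs on or waits on


def gang_runnable(vm, vcpus, vm_priority, running_vm_on_cpu):
    """True if every non-idle VCPU of `vm` is running, or waiting on a
    processor that currently runs a lower-priority VM."""
    for v in vcpus:
        if v.vm != vm or v.state in ("idle", "running"):
            continue
        current = running_vm_on_cpu.get(v.cpu)  # None: processor is idle (assumption)
        if current is not None and vm_priority[current] >= vm_priority[vm]:
            return False
    return True


vm_priority = {"VM1": 1, "VM2": 2, "VM3": 3}                 # hypothetical priorities
running_vm_on_cpu = {"uP1": "VM1", "uP2": "VM2", "uP3": "VM2"}
vcpus = [
    VCPU("VC1", "VM1", "running", "uP1"),
    VCPU("VC3", "VM1", "waiting", "uP3"),
    VCPU("VC8", "VM1", "idle"),
    VCPU("VC2", "VM2", "running", "uP2"),
    VCPU("VC4", "VM2", "running", "uP3"),
    VCPU("VC5", "VM3", "waiting", "uP1"),
    VCPU("VC9", "VM3", "waiting", "uP2"),
]
vm3_ok = gang_runnable("VM3", vcpus, vm_priority, running_vm_on_cpu)
vm1_ok = gang_runnable("VM1", vcpus, vm_priority, running_vm_on_cpu)
```

In this made-up placement VM3 is gang-runnable, because both of its VCPU's wait behind lower-priority VM's. VM1 is not, because VC3 waits on a processor that is running the higher-priority VM2.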

Memory Management • Each cell maintains its own freelist, and allocates memory to other cells in its allocation preference list on request (RPC). • Speed - 758 µsec for 4 MB. • A threshold is set for the min. amount of local free memory • Paging is avoided as far as possible.

Memory Borrowing • freelist - list of free pages in the cell • allocation preference list - list of cells from which borrowing memory is more beneficial than paging.

Memory Borrowing (example) • Cells 1 to 5, each with its own freelist size • Lending threshold - 32 MB • Borrowing threshold - 16 MB • Steps shown : a cell asks another cell and is refused ; one cell cannot be asked ; it asks another cell, which gives 4 MB
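
The threshold logic can be sketched as follows. This is a minimal sketch under assumptions: the `Cell` class and its method names are hypothetical, the lender checks its free memory before lending rather than after, and a direct method call stands in for the RPC.

```python
LENDING_THRESHOLD_MB = 32    # a cell lends only while it has more free memory than this
BORROWING_THRESHOLD_MB = 16  # a cell starts borrowing when it has less free memory than this
CHUNK_MB = 4                 # amount given per request, as in the example


class Cell:
    def __init__(self, name, free_mb, preference=()):
        self.name = name
        self.free_mb = free_mb
        self.preference = list(preference)  # allocation preference list

    def lend(self, amount_mb):
        """Answer a borrow request; return the amount granted (0 means refused)."""
        if self.free_mb <= LENDING_THRESHOLD_MB:
            return 0
        self.free_mb -= amount_mb
        return amount_mb

    def borrow_if_needed(self):
        """Ask cells in preference order; return the lender's name, or None."""
        if self.free_mb >= BORROWING_THRESHOLD_MB:
            return None
        for lender in self.preference:
            granted = lender.lend(CHUNK_MB)
            if granted:
                self.free_mb += granted
                return lender.name
        return None


cell2 = Cell("Cell 2", free_mb=20)  # below the lending threshold: refuses
cell4 = Cell("Cell 4", free_mb=40)  # above the lending threshold: gives 4 MB
cell1 = Cell("Cell 1", free_mb=10, preference=[cell2, cell4])
lender = cell1.borrow_if_needed()
```

Cell 1 is refused by Cell 2 and then receives 4 MB from Cell 4.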

Memory Management (contd.) • Paging : Algo - Second-chance FIFO • Page sharing information is kept in a control data structure • Cellular Disco traps all read and write requests made by the operating systems

Second-chance FIFO • A reference bit is added to each page in the FIFO scheme • Every time the page is accessed the bit is set to 1 • If the page is selected by FIFO, and the reference bit is 1, then it is set to 0 and another page is looked for. • A page is the target page if it is selected by FIFO and the reference bit is 0

Example • Page fault : FIFO selects the oldest page, whose reference bit (RB) is 1 • Second-chance FIFO sets its RB to 0 and moves on to the second oldest page, whose RB is 0 • The second oldest page is the target page ; the oldest page stays in the page table with RB 0
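
The rules above can be sketched as a small simulator. This is a minimal sketch under assumptions: the class name is hypothetical, a newly loaded page starts with reference bit 0, and a spared page keeps its place in the queue as in the example (textbook variants often move it to the tail instead).

```python
from collections import deque


class SecondChanceFIFO:
    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.queue = deque()  # page ids, oldest on the left
        self.ref = {}         # page id -> reference bit

    def access(self, page):
        """Touch a page; return the evicted page id on a fault, else None."""
        if page in self.ref:
            self.ref[page] = 1  # every access sets the reference bit
            return None
        victim = self._evict() if len(self.queue) >= self.capacity else None
        self.queue.append(page)
        self.ref[page] = 0      # assumption: a fresh page starts with bit 0
        return victim

    def _evict(self):
        while True:  # a second pass is needed only if every bit was 1
            for page in list(self.queue):
                if self.ref[page]:
                    self.ref[page] = 0  # second chance: clear the bit, look further
                else:
                    self.queue.remove(page)
                    del self.ref[page]
                    return page


pages = SecondChanceFIFO(capacity=2)
pages.access("A")
pages.access("B")
pages.access("A")            # A is referenced again, its bit becomes 1
victim = pages.access("C")   # page fault: A is spared, B is the target page
```

After the fault, A is still the oldest page but its reference bit is back to 0, so it is the target on the next fault unless it is accessed again.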

Hardware fault-containment • Failure rate increases with the number of processors. • Cellular Disco is internally structured as a set of semi-independent cells. • Failure in one cell does not impact VM’s running in other cells (localization of faults) • Assumption - CD is a trusted software layer

Cellular Structure • Fault in one cell does not affect others

Hardware fault-containment (contd.) • Communication modes - Fast inter-processor RPC - Message • Side benefit - Software fault containment, i.e., individual OS crashes do not impact the system.
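
The containment property can be illustrated with a toy model. This is a minimal sketch under assumptions: the function name and the VM-to-cell assignment are hypothetical, and a VM is assumed to be affected exactly when it uses resources of the failed cell.

```python
def surviving_vms(vm_cells, failed_cell):
    """Return the VMs that use no resources of the failed cell."""
    return sorted(vm for vm, cells in vm_cells.items() if failed_cell not in cells)


# Hypothetical assignment of VMs to the cells whose resources they use.
vm_cells = {
    "VM1": {"cell0"},
    "VM2": {"cell0", "cell1"},
    "VM3": {"cell2"},
}
alive = surviving_vms(vm_cells, failed_cell="cell1")
```

Only VM2 uses cell1, so VM1 and VM3 keep running after the fault.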
