OS support for Teraflux: A Prototype
Avi Mendelson, Doron Shamia
System and Execution Models: Data Flow Based
• The system is made up of clusters.
• Each cluster contains 16 cores (may change).
• Each cluster is controlled by a single "OS kernel", e.g., Linux or L4.
• Execution is made up of tasks; each task:
  • Has no side effects
  • Is scheduled with its data (may use pointers)
  • May return results
  • If it fails to complete, can be rescheduled on the same core or on another core
• Tasks can be executed on any (service) cluster and have a unified view of system memory.
• All resource allocation/management is done at two levels, a local one and a global one.
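The task model above (no side effects, scheduled with its data, rescheduled on failure) can be illustrated with a minimal Python sketch. Everything named here (`Task`, `CoreFailure`, `run_on_core`, `schedule`, the `failing_cores` set) is hypothetical and exists only for illustration; it is not the Teraflux scheduler.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class CoreFailure(Exception):
    """Hypothetical signal that a core could not complete a task."""


@dataclass(frozen=True)
class Task:
    func: Callable   # assumed to be a pure function (no side effects)
    data: tuple      # the task is scheduled together with its input data


def run_on_core(core_id, task, failing_cores):
    # Simulated execution: cores listed in failing_cores always fail.
    if core_id in failing_cores:
        raise CoreFailure(f"core {core_id} failed")
    return task.func(*task.data)


def schedule(task, cores: Sequence[int], failing_cores=frozenset(), max_attempts=3):
    """Try the task on successive cores; because it has no side effects,
    a failed attempt can simply be repeated elsewhere."""
    for attempt in range(max_attempts):
        core = cores[attempt % len(cores)]
        try:
            return core, run_on_core(core, task, failing_cores)
        except CoreFailure:
            continue  # reschedule on the next core
    raise RuntimeError("task could not be completed on any core")


def sum_range(lo, hi):
    return sum(range(lo, hi))
```

Because a task has no side effects, retrying it needs no rollback: `schedule(Task(sum_range, (1, 5)), [0, 1, 2], failing_cores={0})` simply moves on to core 1.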
System Overview: Target Prototyped System
[Figure: a cores view and a memory view. In the cores view each CPU box stands for a cluster, running either Linux or L4. The memory view shows a configuration page and message buffers.]
Target System: OS Requirements
[Figure: a single-chip multi-core system whose clusters run Linux (full OS) or L4 (uKernel).]
Full OS (FOS), Linux:
• Manages jobs on the uKernel (uK) cores
• Proxies the uKs' I/O requests
• Remotely debugs the uKs and itself
• Runs high-level (system) fault tolerance (FT), managing uK faults and its own
uKernel (uK), L4:
• Each uK runs a job
• Jobs are sent by the full OS (FOS)
• Jobs have no side effects
• Failed jobs are simply restarted
• Runs low-level FT, reporting to the FOS
Communications (1)
[Figure: L4 and Linux communicate through shared memory holding a configuration page and message buffers.]
Each buffer contains:
• Ownership (L4/Linux)
• Ready flag
• Type
• Length (bytes)
• Data
• Fixups (optional)
Communications (2)
[Figure: L4 and Linux communicate through shared memory holding a configuration page and message buffers.]
Buffer fields:
• Ownership (L4/Linux): who currently uses the buffer
• Ready flag: signals that the buffer is ready to be transferred to the other side (the inverse of the current owner)
• Type: the message type
• Length (bytes)
• Data: simply the raw data (interpreted according to type)
• Fixups (optional): a list of fixups in case pointers are passed
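To make the buffer fields concrete, here is a minimal Python sketch of one possible encoding. The slides do not give field widths or the fixup format, so the header layout (`<BBHII`), the trailing fixup table of 32-bit offsets, the 8-byte pointer size, and all names (`Message`, `pack_message`, `unpack_message`, `apply_fixups`) are assumptions made for illustration only.

```python
import struct
from dataclasses import dataclass, field

OWNER_LINUX, OWNER_L4 = 0, 1

# Assumed header layout: owner (1 byte), ready (1 byte), type (2 bytes),
# data length in bytes (4 bytes), number of fixups (4 bytes).
HEADER = struct.Struct("<BBHII")


@dataclass
class Message:
    owner: int
    ready: bool
    msg_type: int
    data: bytes
    # Assumed fixup format: byte offsets into `data` that hold 8-byte pointers.
    fixups: list = field(default_factory=list)


def pack_message(msg):
    header = HEADER.pack(msg.owner, int(msg.ready), msg.msg_type,
                         len(msg.data), len(msg.fixups))
    fixup_table = struct.pack(f"<{len(msg.fixups)}I", *msg.fixups)
    return header + msg.data + fixup_table


def unpack_message(raw):
    owner, ready, msg_type, length, n_fixups = HEADER.unpack_from(raw, 0)
    start = HEADER.size
    data = bytes(raw[start:start + length])
    fixups = list(struct.unpack_from(f"<{n_fixups}I", raw, start + length))
    return Message(owner, bool(ready), msg_type, data, fixups)


def apply_fixups(msg, delta):
    """Rebase every pointer named in the fixup list by `delta`, e.g. when
    the receiver maps the shared region at a different base address."""
    data = bytearray(msg.data)
    for offset in msg.fixups:
        (pointer,) = struct.unpack_from("<Q", data, offset)
        struct.pack_into("<Q", data, offset, pointer + delta)
    return Message(msg.owner, msg.ready, msg.msg_type, bytes(data), msg.fixups)
```

The fixup list is what lets the data section carry pointers: the receiver does not need to understand the payload, only to patch the listed offsets before use.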
Current Prototype
• Goal: quick development of OS support and applications (to be moved later to the full COTSon prototype)
• Quick prototyping via VMs
• Linux on both ends (Fedora 13)
  • Main node = Linux (host)
  • Service nodes = Linux (VMs)
• Using shared memory
  • Between host and VMs
  • Between VMs
• Shared memory uses a kernel driver (ivshmem)
Prototype Architecture
[Figure: a Linux F13 host running an app, and four Linux F13 guests under QEMU. Labels: user space, kernel space, IVSHMEM.]
IV Shared Memory Architecture
• mmap to user level
• Exposed as a PCI BAR
• QEMU maps the shared memory into RAM
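The "mmap to user level" step can be sketched in Python. This is not the ivshmem device path: a temporary file stands in for the shared region, and `open_region`, `demo`, and `REGION_SIZE` are hypothetical names. The sketch only shows that two independent shared mappings of the same backing object see each other's writes, which is the property the prototype relies on.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # assumed size of the shared region, for illustration


def open_region(path, size=REGION_SIZE):
    """Map `size` bytes of the backing file into this process (shared mapping)."""
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


def demo():
    # A temporary file stands in for the memory shared between host and VMs.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(REGION_SIZE)
        path = f.name
    try:
        host_view = open_region(path)
        guest_view = open_region(path)
        host_view[0:5] = b"hello"      # written through one mapping...
        seen = bytes(guest_view[0:5])  # ...and read through the other
        host_view.close()
        guest_view.close()
        return seen
    finally:
        os.remove(path)
```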
Communications
[Figure: the host app logic (Linux F13 host) and a data flow app (Linux F13 guest under QEMU) each sit on a message queue API in user space. Messages (Msg) are exchanged through shared RAM, which is also reachable from the other QEMU guests.]
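A message queue API layered on shared RAM could look roughly like the Python sketch below. It is an assumption-laden illustration, not the prototype's API: the slot size, the 4-byte slot header, and the names `MessageQueue`, `send`, `receive`, `HOST`, and `NODE` are all hypothetical, and a plain `bytearray` stands in for the shared region. Real shared memory would also need memory barriers or atomic flag updates, which are omitted here.

```python
import struct

SLOT_SIZE = 64                       # assumed fixed slot size
SLOT_HEADER = struct.Struct("<BBH")  # owner, ready flag, payload length
HOST, NODE = 0, 1


class MessageQueue:
    def __init__(self, shared, n_slots):
        self.shared = shared    # bytearray standing in for shared RAM
        self.n_slots = n_slots  # zeroed memory means: all slots owned by HOST

    def send(self, sender, payload):
        """Fill the first free slot owned by `sender` and mark it ready.
        Returns the slot index, or None if the sender owns no free slot."""
        if len(payload) > SLOT_SIZE - SLOT_HEADER.size:
            raise ValueError("payload too large for one slot")
        for i in range(self.n_slots):
            base = i * SLOT_SIZE
            owner, ready, _ = SLOT_HEADER.unpack_from(self.shared, base)
            if owner == sender and not ready:
                start = base + SLOT_HEADER.size
                self.shared[start:start + len(payload)] = payload
                # Write the header last so the ready flag covers complete data.
                SLOT_HEADER.pack_into(self.shared, base, sender, 1, len(payload))
                return i
        return None

    def receive(self, receiver):
        """Take the first ready slot owned by the other side. Ownership moves
        to `receiver`, who may then reuse the slot to send a reply."""
        for i in range(self.n_slots):
            base = i * SLOT_SIZE
            owner, ready, length = SLOT_HEADER.unpack_from(self.shared, base)
            if owner != receiver and ready:
                start = base + SLOT_HEADER.size
                payload = bytes(self.shared[start:start + length])
                SLOT_HEADER.pack_into(self.shared, base, receiver, 0, 0)
                return payload
        return None
```

In this sketch the ownership flag doubles as flow control: a service node can only reply in a slot it has previously received, so the host decides how much work is in flight.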
Demo (toy) Apps
• Distributed sum app
  • Single work dispatcher (host)
  • Multiple sum engines (VMs)
• Distributed Mandelbrot
  • Single work dispatcher on the host, handing out image lines
  • Multiple compute engines (VMs), each computing the pixels of a line
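The line-based split of the Mandelbrot demo can be sketched as follows. This is an illustrative stand-in, not the demo's code: a thread pool plays the role of the VM compute engines, and `escape_count`, `compute_line`, `render`, the viewport bounds, and the iteration limit are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_count(c, max_iter):
    """Number of iterations before z = z*z + c leaves the radius-2 disc."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def compute_line(y, width, height, max_iter):
    """Compute engine: all pixels of one image line.
    Assumed viewport: real axis [-2, 1], imaginary axis [-1, 1]."""
    im = -1.0 + 2.0 * y / (height - 1)
    return [escape_count(complex(-2.0 + 3.0 * x / (width - 1), im), max_iter)
            for x in range(width)]


def render(width, height, max_iter=50, engines=4):
    """Dispatcher: hands out one line per work item and collects the
    results in line order."""
    with ThreadPoolExecutor(max_workers=engines) as pool:
        return list(pool.map(
            lambda y: compute_line(y, width, height, max_iter),
            range(height)))
```

Each line depends only on its own coordinates, so the work items match the task model of the earlier slides: no side effects, and a failed line can simply be recomputed.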
Futures
• Single Boot
  • A TeraFlux chip boots a FOS
  • The FOS boots the uKs on the other cores
  • Looks like a single boot process
• Distributed Fault Tolerance
  • Allow the uKs and the FOS to test each other's health
  • One step beyond FOS-centric FT
• Cores Repurposing
  • If the FOS cores fail, uK cores reboot as FOS
  • The new FOS takes over using the last valid data snapshot
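Since these items are future work, the slides give no mechanism. Purely as a thought experiment, the health check and core repurposing ideas could be sketched with heartbeat timestamps, as below. The heartbeat scheme, the timeout, and the names `Node`, `is_healthy`, and `repurpose` are all assumptions and do not describe a planned design.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str              # "FOS" or "uK"
    last_heartbeat: float  # time of the last heartbeat seen from this node
    state: dict = field(default_factory=dict)


def is_healthy(node, now, timeout):
    """Assumed health test: a heartbeat was seen within `timeout`."""
    return now - node.last_heartbeat <= timeout


def repurpose(nodes, snapshot, now, timeout):
    """Return the acting FOS. If no FOS is healthy, promote the first
    healthy uK and hand it a copy of the last valid snapshot."""
    for node in nodes:
        if node.role == "FOS" and is_healthy(node, now, timeout):
            return node
    for node in nodes:
        if node.role == "uK" and is_healthy(node, now, timeout):
            node.role = "FOS"
            node.state = dict(snapshot)
            return node
    return None  # no healthy node is left to take over
```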
References
• Inter-VM Shared memory