6.894: Distributed Operating System Engineering

Lecturers: Frans Kaashoek (kaashoek@mit.edu), Robert Morris (rtm@lcs.mit.edu) • TA: Jinyang Li (jinyang@lcs.mit.edu) • www.pdos.lcs.mit.edu/6.894

Operating System • Software that turns silicon into something useful • Provides applications with a programming interface • Manages hardware resources on behalf of applications

Distributed Operating System • The holy grail: transparency • Provide applications with a virtual machine consisting of many processors distributed around the network • Distributed OS engineering is difficult: • Failures • High degree of concurrency • Long latencies • New classes of security attacks

Client/Server Architecture • A modular architecture to structure distributed systems • Clients request services from servers • Client and servers communicate with messages • Servers are typically trusted • Other architectures • Peer-to-peer (decentralized) • Single address space

6.894 topics • Client-server components • Remote procedure call, threads, address spaces, etc. • Storage • File systems, transactions • Security • Confidentiality, authentication, etc. • Scalable servers

6.894 is an advanced 6.033 • Perform actual systems research • Perform a research project • Study recent research papers • Design systems for real workloads • New abstractions, protocols, data structures, algorithms, etc. • Build a real system (lab) • Real enough that you can use it

Internet video-on-demand server • Running example for studying the issues and previewing 6.894 • Requirements: • Low and high-quality video • Many users, spread around the Internet • Last mile bandwidth may be low • Access control

Client and server structure

Client() {
    fd = connect("server");
    write(fd, "video.mpg");      /* request the video by name */
    while (!eof(fd)) {
        read(fd, buf);
        display(buf);
    }
}

Server() {
    while (1) {
        cfd = accept();          /* one client at a time */
        read(cfd, name);
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);     /* blocks on the disk */
            write(cfd, block);   /* blocks on the network */
        }
        close(cfd);
        close(fd);
    }
}

Performance “analysis” • Server capacity: • Network (100 Mbit/s) • Disk (20 Mbyte/s) • Obtained performance: one client stream • Server is limited by software structure • If a video is 200 Kbit/s, server should be able to support more than one client.
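A back-of-the-envelope check of the capacity numbers above, as a minimal C sketch. It assumes decimal units (1 Mbit = 1000 Kbit, 1 Mbyte = 8 Mbit) and ignores protocol overhead and disk seeks; the function names are illustrative and not from the course.

```c
#include <assert.h>

/* Whole streams a resource can carry: capacity / per-stream rate.
   Both rates are in Kbit/s. */
long max_streams(long capacity_kbps, long stream_kbps)
{
    assert(stream_kbps > 0);
    return capacity_kbps / stream_kbps;
}

/* Streams the server hardware could sustain: the slower of network and
   disk is the bottleneck. Network is in Mbit/s, disk in Mbyte/s. */
long server_stream_limit(long net_mbps, long disk_mbytes_per_s,
                         long stream_kbps)
{
    long net  = max_streams(net_mbps * 1000, stream_kbps);
    long disk = max_streams(disk_mbytes_per_s * 8 * 1000, stream_kbps);
    return net < disk ? net : disk;
}
```

With the slide's numbers, server_stream_limit(100, 20, 200) gives 500: the network allows 500 streams and the disk 800, so the hardware limit is far above the single stream the simple server delivers.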

Better single-server performance • Goal: run at server’s hardware speed • Disk or network should be bottleneck • Method: • Pipeline blocks of each request • Multiplex requests from multiple clients • Two implementation approaches: • Multithreaded server • Asynchronous I/O

Multithreaded server

server() {
    while (1) {
        cfd = accept();
        read(cfd, name);
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);
            write(cfd, block);
        }
        close(cfd);
        close(fd);
    }
}

for (i = 0; i < 10; i++)
    fork(server);                /* start 10 server threads */

• When waiting for I/O, thread scheduler runs another thread • All shared data must be protected by locks • Release locks when blocking

Asynchronous I/O

struct callback {
    bool (*is_ready)();          /* true once the awaited operation can proceed */
    void (*cb)(void *arg);       /* handler to run when ready */
    void *arg;
};

main() {
    while (1) {
        for (c = each callback) {
            if (c->is_ready())
                c->cb(c->arg);
        }
    }
}

• Code is structured as a collection of handlers • Handlers are nonblocking • Create new handlers for blocking operations • When operation completes, call handler
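The callback loop above can be made concrete on a POSIX system. The following is a minimal sketch, not the course's library: it assumes poll() as the readiness test, one-shot handlers, and a fixed-size table, and the names on_readable, run_once, read_cb, and struct buffer are chosen here for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_CALLBACKS 16

/* One registered handler: run cb(fd, arg) once fd becomes readable. */
struct callback {
    int fd;
    void (*cb)(int fd, void *arg);
    void *arg;
};

static struct callback callbacks[MAX_CALLBACKS];
static size_t ncallbacks;

/* Register a one-shot handler. Returns 0, or -1 if the table is full. */
int on_readable(int fd, void (*cb)(int, void *), void *arg)
{
    assert(cb != NULL);
    if (ncallbacks == MAX_CALLBACKS)
        return -1;
    callbacks[ncallbacks++] = (struct callback){ fd, cb, arg };
    return 0;
}

/* One pass of the event loop: wait up to timeout_ms, then run and
   unregister every handler whose fd is readable (or at end of file).
   Returns the number of handlers run, or -1 if poll() fails. */
int run_once(int timeout_ms)
{
    struct pollfd pfds[MAX_CALLBACKS];
    struct callback ready[MAX_CALLBACKS];
    size_t n = ncallbacks, nready = 0, kept = 0;

    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = callbacks[i].fd;
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)n, timeout_ms) < 0)
        return -1;

    /* Split the table first so handlers may register new callbacks. */
    for (size_t i = 0; i < n; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP))
            ready[nready++] = callbacks[i];
        else
            callbacks[kept++] = callbacks[i];
    }
    ncallbacks = kept;

    for (size_t i = 0; i < nready; i++)
        ready[i].cb(ready[i].fd, ready[i].arg);
    return (int)nready;
}

/* Example handler: read whatever is available into the caller's buffer. */
struct buffer { char data[64]; ssize_t len; };

void read_cb(int fd, void *arg)
{
    struct buffer *b = arg;
    b->len = read(fd, b->data, sizeof b->data);
}
```

A server built this way would call run_once() in its main loop and have each handler re-register itself, or the next stage, before returning. Handlers never block because they run only after poll() reports the descriptor ready.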

Asynchronous server

init() {
    on_accept(accept_cb);
}

accept_cb() {
    cfd = accept();
    on_readable(cfd, name_cb);
}

on_readable(fd, fn) {
    c = new callback(test_readable, fn, fd);
    add c to callback list;
}

name_cb(cfd) {
    read(cfd, name);
    fd = open(name);
    on_readable(fd, read_cb);
}

read_cb(cfd, fd) {
    read(fd, block);
    on_writable(cfd, write_cb);  /* wait until the client socket can take the block */
}

write_cb(cfd, fd) {
    write(cfd, block);
    on_readable(fd, read_cb);    /* then fetch the next block from disk */
}

Multithreaded vs. Async
Multithreaded • Hard to program • Locking code • Need to know what blocks • Coordination explicit • State stored on thread’s stack • Memory allocation implicit • Context switch may be expensive • Multiprocessors
Async • Hard to program • Callback code • Need to know what blocks • Coordination implicit • State passed around explicitly • Memory allocation explicit • Lightweight context switch • Uniprocessors

Coordination example
Threaded server • Thread for network interface • Interrupt wakes up network thread • Buffer shared between server threads and network thread, protected by locks and condition variables
Asynchronous I/O • Poll for packets • How often to poll? • Or, interrupt generates an event • Be careful: disable interrupts when manipulating callback queue

Scheduling: polling vs. interrupts • Maintain peak performance under heavy load • Interrupt model can lead to livelock • Solution: • Use interrupts under low load (good latency) • Use polling under heavy load (good throughput) • Polling is typically more efficient than interrupts • Fits naturally into asynchronous I/O model
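One way to picture the interrupts-under-low-load, polling-under-heavy-load rule is as a small mode switch driven by the receive backlog. This is only a sketch: the two thresholds and the use of hysteresis (separate enter and exit levels, so the mode does not flap) are assumptions added here, not something the slides specify.

```c
#include <assert.h>

enum rx_mode { RX_INTERRUPT, RX_POLL };

/* Hypothetical thresholds, in packets waiting in the receive queue. */
#define POLL_ENTER 32   /* switch to polling at or above this backlog */
#define POLL_EXIT   4   /* return to interrupts at or below this backlog */

/* Decide the next receive mode from the current mode and backlog. */
enum rx_mode next_mode(enum rx_mode cur, int backlog)
{
    assert(backlog >= 0);
    if (cur == RX_INTERRUPT && backlog >= POLL_ENTER)
        return RX_POLL;          /* heavy load: stop taking interrupts */
    if (cur == RX_POLL && backlog <= POLL_EXIT)
        return RX_INTERRUPT;     /* load has drained: favor latency again */
    return cur;
}
```

A driver would call next_mode() after each batch of packets and enable or disable the device interrupt whenever the returned mode differs from the current one.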

Other design issues • Disk scheduling • Elevator algorithm • Memory management • File system buffer cache • Address spaces (VM management) • Fault-isolate different servers • Efficient local communication? • Efficient transfers between disk and networks • Avoid copies
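The elevator algorithm mentioned under disk scheduling can be sketched as a reordering of pending block numbers. This version assumes the head is currently moving upward and sweeps once in each direction (the SCAN variant); the function name and in-place interface are illustrative choices.

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static void reverse(int *a, size_t n)
{
    for (size_t i = 0; i < n / 2; i++) {
        int t = a[i];
        a[i] = a[n - 1 - i];
        a[n - 1 - i] = t;
    }
}

/* Reorder n pending block numbers in place into elevator service order:
   every request at or above head in ascending order (upward sweep),
   then the remaining requests in descending order (downward sweep). */
void elevator_order(int *req, size_t n, int head)
{
    assert(req != NULL || n == 0);
    if (n == 0)
        return;
    qsort(req, n, sizeof *req, cmp_int);

    size_t split = 0;            /* index of the first request >= head */
    while (split < n && req[split] < head)
        split++;

    reverse(req, n);             /* now: upper part descending, lower part descending */
    reverse(req, n - split);     /* upper part back to ascending, kept in front */
}
```

For pending requests 98, 183, 37, 122, 14, 124, 65, 67 with the head at 53, the resulting order is 65, 67, 98, 122, 124, 183, 37, 14: one pass up, one pass down, instead of seeking back and forth in arrival order.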

More than one processor • Problem: single machine may not scale to enough clients • Solutions: • Multiprocessors • Helps when CPU is bottleneck • Server clusters • Helps when bandwidth between server and backbone is high • Distributed server clusters • Helps when bandwidth between client and distant server is low

Clusters • Naming transparency • Server cluster transparent to client? • Server selection • Metrics: CPU load, presence of data • Consistency • Partition data • Availability • More processors can decrease reliability • Replicate data (makes consistency more difficult)
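Server selection using the two metrics listed above might look like the following sketch. The policy (prefer a server that already holds the data, then break ties by lowest CPU load) is an assumption made for illustration; the slides name the metrics but not how to combine them.

```c
#include <assert.h>
#include <stddef.h>

struct server {
    double cpu_load;   /* fraction of CPU in use, 0.0 to 1.0 */
    int has_data;      /* nonzero if the requested video is stored locally */
};

/* Return the index of the chosen server, or -1 if there are none.
   Servers holding the data win; among equals, the least loaded wins. */
int select_server(const struct server *s, size_t n)
{
    assert(s != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0) {
            best = (int)i;
            continue;
        }
        const struct server *b = &s[best];
        int better_data = s[i].has_data && !b->has_data;
        int same_data = (!s[i].has_data) == (!b->has_data);
        if (better_data || (same_data && s[i].cpu_load < b->cpu_load))
            best = (int)i;
    }
    return best;
}
```

A front end that keeps the cluster transparent to clients could run this per request. A real system would also have to cope with stale load reports, which ties into the accuracy versus network load tradeoff.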

Distributed clusters • Replication policies • Data distribution • Consistency • Network monitoring and modeling • Global load balancing • Tradeoff between accuracy, latency, and network load

Making it secure: access control • Redo design: don’t add on • Firewalls: insecure and break many things • CPU cycles are an issue • A secure HTTP server can do about 10-20 connections a second • Pulls in other global issues • Name to key binding • Key management infrastructure

Example summary • Pipelining of disk and network requests • Need a lot of sophisticated software infrastructure • Replication for reliability and performance • Need sophisticated protocols • Difficult: We did it for one application • What if data changes rapidly? • Lack of abstractions!

6.894 lab: real systems • Multi-finger (due next week) • Asynchronous I/O • HTTP proxy • High-performance proxy • Cache, consistency, etc. • Open-ended file system project • Research
