Lecture 3 – Networks and Distributed Systems

CSE 490h – Introduction to Distributed Computing, Spring 2007. Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License.

Outline • Networking • Remote Procedure Calls (RPC) • Activity • Transaction Processing Systems • Discussion • RPC, Web 2.0, MapReduce, Reliability…

Fundamentals of Networking

Sockets: The Internet = tubes? • A socket is the basic network interface • Provides a two-way “pipe” abstraction between two applications • The client creates a socket and connects to the server, which receives a socket representing the other side

Ports • Within an IP address, a port is a sub-address identifying a listening program • Allows several programs on one machine to use the network at once, and multiple clients to connect to a server at once

Example: Web Server (1/3) The server creates a listener socket attached to a specific port. 80 is the agreed-upon port number for web traffic.

Example: Web Server (2/3) • The client-side socket is still bound to a port, but the OS chooses a random unused port number • When the client requests a URL on a host (e.g., “www.google.com”), its OS uses a system called DNS to find that host’s IP address
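
The DNS step can be tried directly from a program. The sketch below asks the OS resolver for the addresses of a hostname using Python’s standard socket.getaddrinfo call. The helper name resolve is made up for illustration, and the addresses returned depend on the machine and network it runs on.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the OS resolver reports for hostname.

    `resolve` is a hypothetical helper name, not part of any library.
    """
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4, (ip, port, ...) for IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # "localhost" resolves without touching the network on most systems.
    print(resolve("localhost"))
```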

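Example: Web Server (3/3) • The server accepts the connection and receives a new socket dedicated to this particular client; that socket still uses local port 80, and the OS tells connections apart by the client’s IP address and port • The listener is ready for more incoming connections while we process the current connection in parallel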
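
The three web-server slides can be sketched as a tiny listener and client. This is a minimal sketch, not a web server: it binds to port 0 so the OS picks a free port (binding to port 80 usually needs elevated privileges), handles a single client, and replies with the message in upper case. The function names serve_once, start_server and request are assumptions made for this example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one client and send its message back in upper case."""
    conn, _client_addr = listener.accept()  # a new socket for this one client
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def start_server() -> tuple[socket.socket, int]:
    """Create a listener on an OS-chosen port and serve one client in a thread."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    threading.Thread(target=serve_once, args=(listener,), daemon=True).start()
    return listener, port


def request(port: int, message: bytes) -> bytes:
    """Connect as a client, send one message, and return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        return client.recv(1024)


if __name__ == "__main__":
    listener, port = start_server()
    print(request(port, b"hello"))
    listener.close()
```

The client never names its own port; the OS assigns one when create_connection runs, which matches the second slide of the example.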

What makes this work? • Underneath the socket layer are several more protocols • Most important are TCP and IP (which are used hand-in-hand so often that they are usually spoken of as one protocol: TCP/IP) • Even lower-level protocols handle how data is sent over Ethernet wires, or how bits are sent through the air using 802.11 wireless…

IP: The Internet Protocol • Defines the addressing scheme for computers • Encapsulates internal data in a “packet” • Does not provide reliability • Includes just enough header information for routers to know where to forward the packet

TCP: Transmission Control Protocol • Built on top of IP • Introduces concept of “connection” • Provides reliability and ordering
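
To make “reliability and ordering” concrete, here is a simplified sketch of how a receiver can rebuild an ordered byte stream from segments that arrive out of order or more than once. It is an illustration of the idea only, not TCP itself: real TCP also handles acknowledgements, retransmission, windows and much more. The function name reassemble and the (offset, payload) representation are assumptions.

```python
def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a byte stream from (byte_offset, payload) pairs.

    Segments may arrive out of order or duplicated. A gap means data is
    still missing, so only the contiguous prefix is delivered.
    """
    received: dict[int, bytes] = {}
    for offset, payload in segments:
        if payload:  # skip empty segments
            received.setdefault(offset, payload)  # ignore duplicates
    stream = b""
    expected = 0  # next byte offset the application is waiting for
    while expected in received:
        payload = received[expected]
        stream += payload
        expected += len(payload)
    return stream
```

For instance, segments arriving as offsets 5, 0, 0, 10 are delivered in offset order with the duplicate dropped, while a missing middle segment holds back everything after it.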

Why is This Necessary? • Not actually tube-like “under the hood” • Unlike the phone system (circuit switched), the packet-switched Internet uses many routes at once

Networking Issues • If a party to a socket disconnects, how much data did they receive? • … Did they crash? Or did a machine in the middle? • Can someone in the middle intercept/modify our data? • Traffic congestion makes switch/router topology important for efficient throughput

Remote Procedure Calls (RPC)

How RPC Doesn’t Work • Regular client-server protocols involve sending data back and forth according to a shared state
Client: GET index.html HTTP/1.0
Server: 200 OK, Length: 2400, (file data)
Client: GET hello.gif HTTP/1.0
Server: 200 OK, Length: 81494, …

Remote Procedure Call • An RPC server calls functions named by the client (for example, in a DLL or EXE), with arguments passed over the network and return values sent back over the network
Client: foo.dll,bar(4, 10, “hello”)
Server: “returned_string”
Client: foo.dll,baz(42)
Server: err: no such function
…
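
The exchange on this slide can be sketched as a small dispatcher. This is a minimal sketch under stated assumptions: requests are encoded as JSON (the slide does not specify a wire format), the network is left out, and the server only calls functions listed in an explicit registry rather than anything a client names. The body of bar and the names REGISTRY, make_request and handle_request are invented for illustration.

```python
import json
from typing import Any


def bar(count: int, width: int, text: str) -> str:
    """Hypothetical remote function; its behaviour is made up for this sketch."""
    return f"{text}:{count * width}"


# Only functions listed here can be called by clients.
REGISTRY = {"bar": bar}


def make_request(function: str, *args: Any) -> bytes:
    """Client side: encode the function name and arguments for the wire."""
    return json.dumps({"function": function, "args": list(args)}).encode()


def handle_request(raw: bytes) -> bytes:
    """Server side: decode the request, call the function, encode the reply."""
    request = json.loads(raw)
    func = REGISTRY.get(request["function"])
    if func is None:
        return json.dumps({"error": "no such function"}).encode()
    return json.dumps({"result": func(*request["args"])}).encode()
```

Calling handle_request(make_request("baz", 42)) yields the “no such function” error from the slide, because baz is not in the registry.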

Possible Interfaces • RPC can be used with two basic interfaces: synchronous and asynchronous • Synchronous RPC is a “remote function call” – client blocks and waits for return val • Asynchronous RPC is a “remote thread spawn”
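
The two interfaces differ mainly in what the caller gets back. In the sketch below a local function with a short sleep stands in for the remote call, and a thread pool plays the role of the “remote thread spawn”; the names remote_square, call_sync and call_async are assumptions for this example.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def remote_square(x: int) -> int:
    """Stand-in for a remote function; the sleep imitates network latency."""
    time.sleep(0.05)
    return x * x


_pool = ThreadPoolExecutor(max_workers=4)


def call_sync(x: int) -> int:
    """Synchronous style: the caller blocks until the return value arrives."""
    return remote_square(x)


def call_async(x: int) -> Future:
    """Asynchronous style: return at once with a handle to the pending result."""
    return _pool.submit(remote_square, x)


if __name__ == "__main__":
    print(call_sync(3))          # blocks, then prints the value
    pending = call_async(4)      # returns immediately
    print(pending.result())      # the caller chooses when to wait
```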

Synchronous RPC

Asynchronous RPC

Asynchronous RPC 2: Callbacks
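
With callbacks, the caller does not wait at all; it registers a function to run when the result arrives. A minimal sketch, again using a local function in place of a real remote call (remote_length and call_with_callback are hypothetical names):

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


def remote_length(text: str) -> int:
    """Stand-in for a remote function."""
    return len(text)


def call_with_callback(
    pool: ThreadPoolExecutor, text: str, on_done: Callable[[int], None]
) -> Future:
    """Start the call and arrange for on_done(result) to run when it finishes."""
    future = pool.submit(remote_length, text)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        call_with_callback(pool, "hello", lambda n: print("length is", n))
```

Note that the callback normally runs on the worker thread, so it should be short and thread-safe.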

Wrapper Functions • Writing rpc_call(foo.dll, bar, arg0, arg1..) everywhere is poor form: the code is confusing and it breaks abstraction • A wrapper function (“stub”) makes the calling code cleaner: bar(arg0, arg1); // just write this; the stub makes the RPC call
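
A stub can be as small as a function with the same signature as the remote one. In this sketch rpc_call is a hypothetical generic transport that simply looks the function up in a local table, and the behaviour of bar (adding its arguments) is invented for illustration.

```python
from typing import Any, Callable

# Stand-in for the remote side: maps (module, function) to an implementation.
_FAKE_SERVER: dict[tuple[str, str], Callable[..., Any]] = {
    ("foo.dll", "bar"): lambda a, b: a + b,
}


def rpc_call(module: str, function: str, *args: Any) -> Any:
    """Hypothetical generic RPC entry point; a real one would use the network."""
    return _FAKE_SERVER[(module, function)](*args)


def bar(arg0: int, arg1: int) -> int:
    """Client-side stub: callers write bar(x, y) and never see rpc_call."""
    return rpc_call("foo.dll", "bar", arg0, arg1)


if __name__ == "__main__":
    print(bar(2, 3))
```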

More Design Considerations • Who can call RPC functions? Anybody? • How do you handle multiple versions of a function? • Need to marshal objects • How do you handle error conditions? • Numerous protocols: DCOM, CORBA, Java RMI…
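
Marshaling and versioning can be illustrated together. The sketch below turns an object into bytes and back, tagging the message with a type name and a version number so the receiver can reject formats it does not understand. JSON, the Point class and the message layout are all assumptions for this example, not part of any of the protocols listed above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Point:
    x: float
    y: float


def marshal(point: Point) -> bytes:
    """Flatten the object into bytes that can cross the network."""
    message = {"type": "Point", "version": 1, "fields": asdict(point)}
    return json.dumps(message).encode()


def unmarshal(raw: bytes) -> Point:
    """Rebuild the object, refusing types or versions we do not know."""
    message = json.loads(raw)
    if message["type"] != "Point" or message["version"] != 1:
        raise ValueError("unsupported type or version")
    return Point(**message["fields"])
```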

RPC Activity

Transaction Processing Systems (We’re using the blue cover sheets on the TPS reports now…)

TPS: Definition • A system that handles transactions coming from several sources concurrently • Transactions are “events that generate and modify data stored in an information system for later retrieval” (source: http://en.wikipedia.org/wiki/Transaction_Processing_System)

Key Features of TPS: ACID • “ACID” is the acronym for the features a TPS must support: • Atomicity – A set of changes must all succeed or all fail • Consistency – Changes to data must leave the data in a valid state when the full change set is applied • Isolation – The effects of a transaction must not be visible until the entire transaction is complete • Durability – After a transaction has been committed successfully, the state change must be permanent.
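
Atomicity is easy to observe with a small database. The sketch below uses Python’s built-in sqlite3 module and a made-up accounts table: a transfer credits one account and then debits another, and if the debit would make a balance negative, the whole transfer is rolled back, including the credit that had already been applied. The table layout and function names are assumptions for illustration.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


def balances(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```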

Atomicity & Durability What happens if we write half of a transaction to disk and the power goes out?

Logging: The Undo Buffer • (1) Database writes to the log the current values of all cells it is going to overwrite • (2) Database overwrites the cells with new values • (3) Database marks the log entry as committed • If the database crashes during step (2), we use the log to roll back the tables to their prior state
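
The undo-log procedure can be simulated in memory. This is a toy sketch: a real database keeps the log on disk and flushes it before overwriting anything, whereas here both the cells and the log are plain Python objects and the crash is a raised exception. The class TinyDB, the CrashError exception and the crash_after parameter are invented for the simulation, and updates are assumed to touch only existing cells.

```python
from typing import Optional


class CrashError(Exception):
    """Simulated power failure."""


class TinyDB:
    def __init__(self, cells: dict):
        self.cells = dict(cells)
        self.log: Optional[dict] = None  # undo record for the write in flight

    def write(self, updates: dict, crash_after: Optional[int] = None) -> None:
        # Log the current values of every cell about to be overwritten.
        self.log = {"old": {k: self.cells[k] for k in updates}, "committed": False}
        # Overwrite the cells, optionally "crashing" after crash_after cells.
        for i, (key, value) in enumerate(updates.items()):
            if crash_after is not None and i == crash_after:
                raise CrashError("power lost mid-write")
            self.cells[key] = value
        # Mark the log entry as committed.
        self.log["committed"] = True

    def recover(self) -> None:
        """On restart, undo any write whose log entry was never committed."""
        if self.log is not None and not self.log["committed"]:
            self.cells.update(self.log["old"])
        self.log = None
```

After a simulated crash the cells hold a mix of old and new values; calling recover() restores the logged values, so the half-finished write disappears.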

Consistency: Data Types • Data entered in databases have rigorous data types associated with them, and explicit ranges • Does not protect against all errors (entering a date in the past is still a valid date, etc), but eliminates tedious programmer concerns

Consistency: Foreign Keys • Database designers declare that fields are indices into the keys of another table • Database ensures that target key exists before allowing value in source field
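
Both consistency mechanisms, value ranges and foreign keys, can be tried with Python’s built-in sqlite3 module. The customers and orders tables below are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create two hypothetical tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(id), "
        "quantity INTEGER NOT NULL CHECK (quantity BETWEEN 1 AND 1000))"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.commit()
    return conn


def try_insert_order(conn: sqlite3.Connection, customer_id: int, quantity: int) -> bool:
    """Return True if the database accepted the order, False if it rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
                (customer_id, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

An order for customer 1 is accepted, an order for a customer id that does not exist is rejected by the foreign key, and a quantity of 0 is rejected by the range check.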

Isolation • Using mutual-exclusion locks, we can prevent other processes from reading data we are in the process of writing • When a database is prepared to commit a set of changes, it locks any records it is going to update before making the changes
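
A mutual-exclusion lock around a record can be sketched with threads standing in for database processes. In this illustrative Record class (a made-up name), an update changes two fields; because readers take the same lock, they never observe one field updated and the other not.

```python
import threading


class Record:
    """A two-field record whose fields must always change together."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._first = "old"
        self._second = "old"

    def update(self, value: str) -> None:
        with self._lock:  # readers are kept out until both fields are written
            self._first = value
            self._second = value

    def read(self) -> tuple[str, str]:
        with self._lock:  # wait for any in-progress update to finish
            return self._first, self._second
```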

Faulty Locking • Locking alone does not ensure isolation! • If a transaction releases its lock on table A before it has locked and updated table B, changes to table A are visible before changes to table B – this is not an isolated transaction

Two-Phase Locking • After a transaction has released any locks, it may not acquire any new locks • Effect: The lock set owned by a transaction has a “growing” phase and a “shrinking” phase
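
The two-phase rule can be enforced mechanically. The sketch below wraps lock handling in a small helper that refuses to acquire any lock once the transaction has released one. The class name TwoPhaseTransaction is an assumption, and a real lock manager would also deal with deadlocks and with shared versus exclusive locks.

```python
import threading


class TwoPhaseTransaction:
    """Tracks a transaction's locks and enforces grow-then-shrink ordering."""

    def __init__(self) -> None:
        self._held: list[threading.Lock] = []
        self._shrinking = False  # becomes True at the first release

    def acquire(self, lock: threading.Lock) -> None:
        if self._shrinking:
            raise RuntimeError("two-phase locking violated: acquire after release")
        lock.acquire()
        self._held.append(lock)

    def release(self, lock: threading.Lock) -> None:
        self._shrinking = True  # the growing phase is over for good
        self._held.remove(lock)
        lock.release()

    def release_all(self) -> None:
        """Release every lock still held, most recently acquired first."""
        for lock in list(reversed(self._held)):
            self.release(lock)
```

A transaction that needs tables A and B therefore takes both locks before releasing either, which rules out the faulty interleaving where A’s changes become visible before B’s.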

Relationship to Distributed Computing • At the heart of a TPS is usually a large database server • Several distributed clients may connect to this server at different points in time • Database may be spread across multiple servers, but must still maintain ACID

Conclusions • We’ve seen three layers that make up a distributed system: networking, RPC, and transaction processing • Designing a large distributed system involves engineering tradeoffs at each of these levels • Appreciating subtle concerns at each level requires diving past the abstractions, but abstractions are still useful in general

Discussion Distributed System Design

Next Time… • Guest speakers! • Mike Cafarella, on Nutch • Jon Nowitz, on Google Maps • New homework posted, due next Monday
