Computer Systems Iterative server Computer Systems – Iterative server
Web Servers
[Figure: a Web client (browser) sends an HTTP request to a Web server, which returns an HTTP response (content).]
• Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
  • Client and server establish TCP connection
  • Client requests content
  • Server responds with requested content
  • Client and server close connection (usually)
Web Content
• Web servers return content to clients
  • content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
• Example MIME types
  • text/html               HTML document
  • text/plain              Unformatted text
  • application/postscript  PostScript document
  • image/gif               Binary image encoded in GIF format
  • image/jpeg              Binary image encoded in JPEG format
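A server has to choose the MIME type it reports for each file. The following is a minimal sketch of one common approach, a lookup by file-name extension. The function name mime_type_for, the extension table, and the text/plain fallback are assumptions made for illustration; they are not taken from the slides.

```c
#include <string.h>

/* Hypothetical helper: map a file name to a MIME type by its extension.
 * The table covers only the example types listed on the slide. */
const char *mime_type_for(const char *filename)
{
    static const struct { const char *ext; const char *type; } table[] = {
        { ".html", "text/html" },
        { ".txt",  "text/plain" },
        { ".ps",   "application/postscript" },
        { ".gif",  "image/gif" },
        { ".jpg",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
    };
    const char *dot = strrchr(filename, '.');  /* last '.' starts the extension */

    if (dot != NULL) {
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
            if (strcmp(dot, table[i].ext) == 0)
                return table[i].type;
        }
    }
    return "text/plain";  /* assumption: unknown files are served as plain text */
}
```

The returned string would go into the Content-Type response header.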
Static and Dynamic Content
• The content returned in HTTP responses can be either static or dynamic.
  • Static content: content stored in files and retrieved in response to an HTTP request
    • Examples: HTML files, images, audio clips.
  • Dynamic content: content produced on-the-fly in response to an HTTP request
    • Example: content produced by a program executed by the server on behalf of the client.
• Bottom line: All Web content is associated with a file that is managed by the server.
URLs
• Each file managed by a server has a unique name called a URL (Uniform Resource Locator)
• URLs for static content:
  • http://www.cs.cmu.edu:80/index.html
  • http://www.cs.cmu.edu/index.html
  • http://www.cs.cmu.edu
  • Each of these identifies a file called index.html, managed by a Web server at www.cs.cmu.edu that is listening on port 80.
• URLs for dynamic content:
  • http://brooks.science.uva.nl:8008/cgi-bin/adder?33&9
  • Identifies an executable file called adder, managed by a Web server running on brooks that is listening on port 8008, that should be called with two argument strings: 33 and 9.
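The decomposition of a URL into host, port, and file part can be sketched in a few lines of C. This is an illustrative sketch only: the struct url_parts and the function split_http_url are hypothetical names, only the http:// scheme is handled, and the default port 80 and default path "/" follow the examples on the slide.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pieces of an http:// URL. */
struct url_parts {
    char host[256];
    int  port;
    char path[1024];
};

/* Split "http://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 if the URL is malformed or a field is too long. */
int split_http_url(const char *url, struct url_parts *out)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);

    if (strncmp(url, prefix, plen) != 0)
        return -1;

    const char *host = url + plen;
    size_t hostlen = strcspn(host, ":/");   /* host ends at ':', '/' or end */
    if (hostlen == 0 || hostlen >= sizeof(out->host))
        return -1;
    memcpy(out->host, host, hostlen);
    out->host[hostlen] = '\0';

    const char *rest = host + hostlen;
    out->port = 80;                          /* default Web server port */
    if (*rest == ':') {
        char *end;
        long p = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || p <= 0 || p > 65535)
            return -1;
        out->port = (int)p;
        rest = end;
    }

    if (*rest == '\0')
        rest = "/";                          /* minimal suffix */
    if (*rest != '/' || strlen(rest) >= sizeof(out->path))
        return -1;
    strcpy(out->path, rest);
    return 0;
}
```

For http://brooks.science.uva.nl:8008/cgi-bin/adder?33&9 this yields the host brooks.science.uva.nl, the port 8008, and the path /cgi-bin/adder?33&9.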
How Clients and Servers Use URLs
• Example URL: http://www.aol.com:80/index.html
• Clients use the prefix (http://www.aol.com:80) to infer:
  • What kind of server to contact (Web server)
  • Where the server is (www.aol.com)
  • What port it is listening on (80)
• Servers use the suffix (/index.html) to:
  • Determine if the request is for static or dynamic content.
    • No hard and fast rules for this.
    • Convention: executables reside in the cgi-bin directory
  • Find the file on the file system.
    • Initial “/” in suffix denotes home directory for requested content.
    • Minimal suffix is “/”, which all servers expand to some default home page (e.g., index.html).
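How a server might apply these conventions to the suffix can be sketched as follows. The sketch is loosely modeled on what a small teaching server does, but the function name classify_uri, the use of the current directory "." as the home directory, and index.html as the default page are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decide whether a URI suffix names static or dynamic
 * content and derive the file name (and CGI argument string) from it.
 * filename and cgiargs must each point to at least cap (> 0) bytes.
 * Returns 1 for static content, 0 for dynamic content. */
int classify_uri(const char *uri, char *filename, char *cgiargs, size_t cap)
{
    if (strstr(uri, "cgi-bin") == NULL) {
        /* Static content: map "/x" to "./x"; expand a trailing "/" */
        cgiargs[0] = '\0';
        snprintf(filename, cap, ".%s", uri);
        size_t len = strlen(filename);
        if (len > 0 && filename[len - 1] == '/')
            snprintf(filename + len, cap - len, "index.html");
        return 1;
    }

    /* Dynamic content: everything after '?' is the argument string */
    const char *q = strchr(uri, '?');
    if (q != NULL) {
        snprintf(cgiargs, cap, "%s", q + 1);
        snprintf(filename, cap, ".%.*s", (int)(q - uri), uri);
    } else {
        cgiargs[0] = '\0';
        snprintf(filename, cap, ".%s", uri);
    }
    return 0;
}
```

With this sketch, "/" maps to ./index.html, and "/cgi-bin/adder?33&9" maps to the executable ./cgi-bin/adder with the argument string 33&9.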
Clients
• Examples of client programs
  • Web browsers, ftp, telnet, ssh
• How does a client find the server?
  • The IP address in the server socket address identifies the host (more precisely, an adapter on the host)
  • The (well-known) port in the server socket address identifies the service, and thus implicitly identifies the server process that performs that service.
  • Examples of well-known ports
    • Port 7: Echo server
    • Port 23: Telnet server
    • Port 25: Mail server
    • Port 80: Web server
Using Ports to Identify Services
[Figure: a client host sends a service request for 128.2.194.242:80 to server host 128.2.194.242, and the kernel delivers it to the Web server (port 80) rather than to the echo server (port 7). A service request for 128.2.194.242:7 is delivered to the echo server instead.]
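In the sockets interface, the (IP address, port) pair that identifies a service is stored in a struct sockaddr_in. A minimal sketch of filling one in for IPv4 follows; the helper name make_server_addr is hypothetical, while inet_pton and htons are the standard conversion functions.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: build the socket address of a service from a
 * dotted-decimal IPv4 address and a port number.
 * Returns 0 on success, -1 if the address string is not valid. */
int make_server_addr(const char *dotted_ip, unsigned short port,
                     struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);              /* port in network byte order */
    if (inet_pton(AF_INET, dotted_ip, &addr->sin_addr) != 1)
        return -1;                             /* not a valid IPv4 address */
    return 0;
}
```

Calling make_server_addr("128.2.194.242", 80, &addr) describes the Web server on that host; passing port 7 instead would describe its echo server.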
Sockets Interface
• Created in the early 80’s as part of the original Berkeley distribution of Unix that contained an early version of the Internet protocols.
• Provides a user-level interface to the network.
• Underlying basis for all Internet applications.
• Based on client/server programming model.
Sockets
[Figure: on both the Internet client host and the Internet server host, user code (Client, Server) calls the sockets interface (system calls); below it, the kernel's TCP/IP code uses the hardware interface (interrupts) to drive the network adapter. The two adapters are connected through the global IP Internet.]
• What is a socket?
  • To the kernel, a socket is an endpoint of IP communication.
Sockets
• What is a socket?
  • To an application, a socket is a file descriptor that lets the application read/write from/to the network.
    • Remember: All Unix I/O devices, including networks, are modeled as files.
• Clients and servers communicate with each other by reading from and writing to socket descriptors.
• The main distinction between regular file I/O and socket I/O is how the application “opens” the socket descriptors.
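That a socket is just a file descriptor can be seen without any network at all. The sketch below assumes a POSIX system and uses socketpair to create two connected descriptors in one process, then moves data with the ordinary write and read calls. The function name socket_roundtrip is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: send msg into one end of a connected socket pair and
 * read it back from the other end using plain Unix I/O calls.
 * Returns the number of bytes received, or -1 on error. */
ssize_t socket_roundtrip(const char *msg, char *out, size_t cap)
{
    int sv[2];
    size_t len = strlen(msg);
    ssize_t got = -1;

    if (len == 0)
        return 0;                       /* nothing to send, nothing to read */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* The same write/read calls used for files work on socket descriptors */
    if (write(sv[0], msg, len) == (ssize_t)len)
        got = read(sv[1], out, cap);

    close(sv[0]);
    close(sv[1]);
    return got;
}
```

Only the way the descriptors are obtained differs from file I/O: socketpair here, socket plus connect or accept for a network connection, open for a file.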
A server listens to a port
1. Server blocks in accept, waiting for connection request on listening descriptor listenfd.
   [Figure: Client with clientfd; Server with listenfd(3)]
2. Client makes connection request by calling and blocking in connect.
   [Figure: connection request travels from clientfd to listenfd(3)]
3. Server returns connfd from accept. Client returns from connect. Connection is now established between clientfd and connfd.
   [Figure: Client with clientfd; Server with listenfd(3) and connfd(4)]
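System calls of the Sockets Interface
[Figure: timeline of the calls made by the client and by the server.]
• Client: socket -> connect (both inside open_clientfd) -> rio_writen -> rio_readlineb -> close
• Server: socket -> bind -> listen (all inside open_listenfd) -> accept -> rio_readlineb -> rio_writen -> rio_readlineb (EOF) -> close -> await connection request from next client
• The client's connect issues the connection request that the server's accept is waiting for; the client's close produces the EOF that the server's rio_readlineb detects.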
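What open_listenfd and open_clientfd do internally can be sketched with the raw system calls. This is a simplified illustration, not the csapp implementation: the names listen_on and connect_to are hypothetical, only IPv4 is handled, the host is given as a dotted-decimal address instead of a host name, and the backlog of 16 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Server side: socket -> bind -> listen. Returns a listening descriptor
 * or -1 on error. Port 0 lets the kernel pick a free port. */
int listen_on(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* Allow quick restarts of the server on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local adapter */
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: socket -> connect. Returns a connected descriptor or -1. */
int connect_to(const char *dotted_ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, dotted_ip, &addr.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would follow listen_on with a loop around accept; a client would read and write the descriptor returned by connect_to and close it when done.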
Echo Client Main Routine
    #include "csapp.h"

    /* usage: ./echoclient host port */
    int main(int argc, char **argv)
    {
        int clientfd, port;
        char *host, buf[MAXLINE];
        rio_t rio;

        if (argc != 3) {
            fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
            exit(0);
        }
        host = argv[1];
        port = atoi(argv[2]);

        clientfd = Open_clientfd(host, port);
        Rio_readinitb(&rio, clientfd);

        /* Send each line typed on stdin and print the line echoed back */
        while (Fgets(buf, MAXLINE, stdin) != NULL) {
            Rio_writen(clientfd, buf, strlen(buf));
            Rio_readlineb(&rio, buf, MAXLINE);
            Fputs(buf, stdout);
        }
        Close(clientfd);  /* the server sees EOF on its connected descriptor */
        exit(0);
    }
Echo Server: Main Routine
    #include "csapp.h"

    void echo(int connfd);

    int main(int argc, char **argv)
    {
        int listenfd, connfd, port;
        socklen_t clientlen;
        struct sockaddr_in clientaddr;
        struct hostent *hp;
        char *haddrp;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(0);
        }
        port = atoi(argv[1]); /* the server listens on a port passed on the command line */
        listenfd = Open_listenfd(port);

        /* Iterative server: service one client at a time */
        while (1) {
            clientlen = sizeof(clientaddr);
            connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
            hp = Gethostbyaddr((const char *)&clientaddr.sin_addr.s_addr,
                               sizeof(clientaddr.sin_addr.s_addr), AF_INET);
            haddrp = inet_ntoa(clientaddr.sin_addr);
            printf("server connected to %s (%s)\n", hp->h_name, haddrp);
            echo(connfd);
            Close(connfd);
        }
    }
Echo Server: echo
    void echo(int connfd)
    {
        size_t n;
        char buf[MAXLINE];
        rio_t rio;

        Rio_readinitb(&rio, connfd);
        /* Rio_readlineb returns 0 when it encounters EOF */
        while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {
            printf("server received %zu bytes\n", n);  /* %zu matches size_t */
            Rio_writen(connfd, buf, n);
        }
    }
• The server uses RIO to read and echo text lines until EOF (end-of-file) is encountered.
  • EOF notification caused by client calling close(clientfd).
  • IMPORTANT: EOF is a condition, not a particular data byte.
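That EOF is a condition rather than a data byte shows up directly in the Unix I/O interface: read returns 0 once the peer has stopped sending. The following sketch, which assumes a POSIX system, echoes raw bytes (not text lines, unlike the RIO version above) until that happens. The names echo_until_eof and finish_sending are hypothetical, and finish_sending uses shutdown as an alternative to close that lets the sender keep reading.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Echo everything received on connfd back to the sender until read()
 * reports EOF by returning 0. Returns the number of bytes echoed, or -1. */
long echo_until_eof(int connfd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(connfd, buf, sizeof(buf));
        if (n == 0)
            return total;               /* EOF: a condition, no byte arrived */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(connfd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;                   /* a short count: write the rest */
        }
        total += n;
    }
}

/* Client side: signal EOF to the peer but keep the descriptor readable. */
int finish_sending(int fd)
{
    return shutdown(fd, SHUT_WR);
}
```

No special end-of-input character is ever transmitted; the loop ends only because the kernel reports that the other side has stopped writing.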
Unix I/O vs. Standard I/O vs. RIO
• Standard I/O and RIO are implemented using low-level Unix I/O.
[Figure: a C application program calls the standard I/O functions and the RIO functions, both of which are built on the Unix I/O functions (accessed via system calls).]
  • Standard I/O functions: fopen fdopen fread fwrite fscanf fprintf sscanf sprintf fgets fputs fflush fseek fclose
  • RIO functions: rio_readn rio_writen rio_readinitb rio_readlineb rio_readnb
  • Unix I/O functions: open read write lseek stat close
• Which ones should you use in your programs?
Pros and Cons of Standard I/O
• Pros:
  • Buffering increases efficiency by decreasing the number of read and write system calls.
  • Short counts are handled automatically.
• Cons:
  • Provides no function for accessing file metadata
  • Standard I/O is not appropriate for input and output on network sockets
  • There are poorly documented restrictions on streams that interact badly with restrictions on sockets
Pros and Cons of Standard I/O
• Restrictions on (full-duplex) streams:
  • Restriction 1: an input function cannot follow an output function without an intervening call to fflush, fseek, fsetpos, or rewind.
    • The latter three functions all use lseek to change the file position.
  • Restriction 2: an output function cannot follow an input function without an intervening call to fseek, fsetpos, or rewind.
• Restriction on sockets:
  • You are not allowed to change the file position of a socket.
Choosing I/O Functions
• General rule: Use the highest-level I/O functions you can.
  • Many C programmers are able to do all of their work using the standard I/O functions.
• When to use standard I/O?
  • When working with disk or terminal files.
• When to use raw Unix I/O?
  • When you need to fetch file metadata.
  • In rare cases when you need absolute highest performance.
• When to use RIO?
  • When you are reading and writing network sockets or pipes.
  • Never use standard I/O or raw Unix I/O on sockets or pipes.
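The reason raw Unix I/O is discouraged on sockets and pipes is the short count: read and write may transfer fewer bytes than requested. The sketch below shows the kind of retry loop that a robust wrapper layers on top of read and write. It is written in the spirit of rio_readn and rio_writen but is not the csapp code; the names read_all and write_all are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Read up to n bytes, retrying after short counts and interrupts.
 * Returns the number of bytes read (less than n only at EOF), or -1. */
ssize_t read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: call read again */
            return -1;
        }
        if (r == 0)
            break;               /* EOF */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Write exactly n bytes, retrying after short counts and interrupts.
 * Returns n on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;        /* interrupted by a signal: call write again */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

Application code then calls the wrappers and never has to deal with partial transfers itself, which is the service RIO provides.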
Testing the Echo Server With telnet
Server session (bass):
    bass> echoserver 5000
    server established connection with KITTYHAWK.CMCL (128.2.194.242)
    server received 5 bytes: 123
    server established connection with KITTYHAWK.CMCL (128.2.194.242)
    server received 8 bytes: 456789
Client session (kittyhawk):
    kittyhawk> telnet bass 5000
    Trying 128.2.222.85...
    Connected to BASS.CMCL.CS.CMU.EDU.
    Escape character is '^]'.
    123
    123
    Connection closed by foreign host.
    kittyhawk> telnet bass 5000
    Trying 128.2.222.85...
    Connected to BASS.CMCL.CS.CMU.EDU.
    Escape character is '^]'.
    456789
    456789
    Connection closed by foreign host.
    kittyhawk>
Running the Echo Client and Server
Server session (bass):
    bass> echoserver 5000
    server established connection with KITTYHAWK.CMCL (128.2.194.242)
    server received 4 bytes: 123
    server established connection with KITTYHAWK.CMCL (128.2.194.242)
    server received 7 bytes: 456789
    ...
Client session (kittyhawk):
    kittyhawk> echoclient bass 5000
    Please enter msg: 123
    Echo from server: 123
    kittyhawk> echoclient bass 5000
    Please enter msg: 456789
    Echo from server: 456789
    kittyhawk>
Connected vs. Listening Descriptors
• Listening descriptor
  • End point for client connection requests.
  • Created once and exists for lifetime of the server.
• Connected descriptor
  • End point of the connection between client and server.
  • A new descriptor is created each time the server accepts a connection request from a client.
  • Exists only as long as it takes to service client.
• Why the distinction?
  • Allows for concurrent servers that can communicate over many client connections simultaneously.
    • E.g., each time we receive a new request, we fork a child to handle the request.
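The fork-per-request idea mentioned above can be sketched as follows. This is an illustration under stated assumptions, not code from the slides: serve_connection is a hypothetical stand-in for echo, handle_in_child and serve_n_connections are hypothetical names, and the loop is bounded so that it terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-connection service: echo bytes until EOF. */
static void serve_connection(int connfd)
{
    char buf[1024];
    ssize_t n;

    while ((n = read(connfd, buf, sizeof(buf))) > 0) {
        if (write(connfd, buf, (size_t)n) != n)
            break;
    }
}

/* Fork a child that services connfd. The parent closes its copy of the
 * connected descriptor and returns the child's pid (or -1 if fork failed). */
pid_t handle_in_child(int listenfd, int connfd)
{
    pid_t pid = fork();

    if (pid == 0) {                 /* child */
        if (listenfd >= 0)
            close(listenfd);        /* the child never accepts connections */
        serve_connection(connfd);
        close(connfd);
        _exit(0);
    }
    close(connfd);                  /* parent: the child holds its own copy */
    return pid;
}

/* Accept n connections on the listening descriptor, one child per client. */
int serve_n_connections(int listenfd, int n)
{
    for (int i = 0; i < n; i++) {
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd < 0 || handle_in_child(listenfd, connfd) < 0)
            return -1;
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;                       /* reap children that already finished */
    }
    return 0;
}
```

The listening descriptor stays open in the parent for the whole run, while each connected descriptor lives only in the child that services it. That split is what lets the parent return to accept immediately.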
Anatomy of an HTTP Transaction
    unix> telnet www.aol.com 80              Client: open connection to server
    Trying 205.188.146.23...                 Telnet prints 3 lines to the terminal
    Connected to aol.com.
    Escape character is '^]'.
    GET / HTTP/1.1                           Client: request line
    host: www.aol.com                        Client: required HTTP/1.1 HOST header
                                             Client: empty line terminates headers.
    HTTP/1.0 200 OK                          Server: response line
    MIME-Version: 1.0                        Server: followed by five response headers
    Date: Mon, 08 Jan 2001 04:59:42 GMT
    Server: NaviServer/2.0 AOLserver/2.3.3
    Content-Type: text/html                  Server: expect HTML in the response body
    Content-Length: 42092                    Server: expect 42,092 bytes in the resp body
                                             Server: empty line (“\r\n”) terminates hdrs
    <html>                                   Server: first HTML line in response body
    ...                                      Server: 766 lines of HTML not shown.
    </html>                                  Server: last HTML line in response body
    Connection closed by foreign host.       Server: closes connection
    unix>                                    Client: closes connection and terminates
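The two ends of this exchange can be sketched in C: composing the request that was typed into telnet, and picking apart the server's response line. The function names build_get_request and parse_status_line are hypothetical; the request format (request line, Host header, empty line, each terminated by "\r\n") is the one shown in the transcript.

```c
#include <stdio.h>

/* Write a minimal HTTP/1.1 GET request for path on host into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"   /* request line */
                     "Host: %s\r\n"          /* required HTTP/1.1 header */
                     "\r\n",                 /* empty line ends the headers */
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Parse a response line such as "HTTP/1.0 200 OK".
 * Returns 0 on success, -1 if the line does not have that shape. */
int parse_status_line(const char *line, int *major, int *minor, int *code)
{
    return sscanf(line, "HTTP/%d.%d %d", major, minor, code) == 3 ? 0 : -1;
}
```

A client would send the buffer produced by build_get_request over its connected descriptor and hand the first line it reads back to parse_status_line.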
The add.com Experience
[Figure: the input URL, annotated with its host, port, CGI program, and args, and the resulting output page.]
Serving Dynamic Content With GET
• Question: How does the client pass arguments to the server?
• Answer: The arguments are appended to the URI
• Can be encoded directly in a URL typed to a browser or a URL in an HTML link
  • http://add.com/cgi-bin/adder?1&2
  • adder is the CGI program on the server that will do the addition.
  • argument list starts with “?”
  • arguments separated by “&”
  • spaces represented by “+” or “%20”
• Can also be generated by an HTML form
  <form method=get action="http://add.com/cgi-bin/postadder">
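On the server side, a CGI program such as adder has to split the argument string that followed the “?”. A minimal sketch for the two-integer case is shown below. The function name parse_adder_args is hypothetical, and the sketch assumes exactly two decimal arguments separated by “&”, with no decoding of “+” or “%20”.

```c
#include <stdlib.h>
#include <string.h>

/* Parse an argument string of the form "A&B" (e.g. "33&9") into two
 * integers. Returns 0 on success, -1 if the string has another shape. */
int parse_adder_args(const char *query, long *a, long *b)
{
    const char *amp = strchr(query, '&');   /* arguments are separated by '&' */
    char *end;

    if (amp == NULL)
        return -1;

    *a = strtol(query, &end, 10);
    if (end == query || end != amp)         /* first number must end at '&' */
        return -1;

    *b = strtol(amp + 1, &end, 10);
    if (end == amp + 1 || *end != '\0')     /* second number must end the string */
        return -1;

    return 0;
}
```

For the URL http://add.com/cgi-bin/adder?1&2 the argument string is 1&2, so the program would receive 1 and 2 and report their sum.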
Assignment
• Adder.com: make and start your own tiny webserver
  http://carol.science.uva.nl/~arnoud/onderwijs/CS/conc/tiny.tar.gz
• Problem 2.10: access the adder code with a form
  See http://carol.science.uva.nl/~arnoud/onderwijs/CS/Evaluation.html
  <form method=get action="http://add.com/cgi-bin/postadder">