Web Client/Server Communication
A290/A590, Fall 2014
09/09/2014
Fixing permissions
• Execute these commands on Silo:
    $ acl_open -r ~/a290 dnikolov
    $ acl_open -r ~/a290 vsupe
    $ acl_open -r ~/a290 tdshah
• This gives access to your a290 folder only to me and the AIs
Structure of a URL
• URL = Uniform Resource Locator
• General form: protocol://domain:port/path/to/file
• How? The protocol part names the application-layer protocol. For example, browsers usually use HTTP.
• Where on the Internet? The domain is the named address of the server; it is translated to an IP address for Internet travel.
• Where on the server? The port number is used to find the Web server on the target machine, which then locates the virtual space (folder) it serves
• What? The path to a specific file in the virtual space
• For example: http://homes.soic.indiana.edu/classes/fall2014/csci/a290-web-dnikolov/index.html
• The port is 80 by default for HTTP and is usually omitted
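To make the parts of the general form concrete, here is a minimal sketch that splits the example URL with Python's standard urllib.parse module. The helper name describe_url and the DEFAULT_PORTS table are assumptions made for this illustration, not part of the course material.

```python
from urllib.parse import urlsplit

# Default ports assumed for this sketch; HTTP's default is 80.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Split a URL into the protocol, domain, port and path parts."""
    parts = urlsplit(url)
    # When the port is omitted, fall back to the protocol's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": port,
        "path": parts.path,
    }


if __name__ == "__main__":
    example = ("http://homes.soic.indiana.edu/classes/fall2014/csci/"
               "a290-web-dnikolov/index.html")
    for key, value in describe_url(example).items():
        print(f"{key}: {value}")
```

The port comes back as 80 for the example URL even though the URL never mentions it, which is the "usually omitted" case from the slide.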
HTTP Transactions: The Big Picture
• HTTP = HyperText Transfer Protocol
• The rules clients and servers on the Web follow to communicate with each other
• What does a simple HTTP transaction involve?
  • Let's say we are loading the Web page for Lab 1
  • The browser resolves the host name homes.soic.indiana.edu to an IP address
  • An HTTP request is sent to the IP address corresponding to homes.soic.indiana.edu for the file /classes/fall2014/csci/a290-web-dnikolov/lab1.html
  • A copy of the HTML document is sent back in the HTTP response and stored on the client
  • The HTML file is parsed and further requests are sent to the server to retrieve images, CSS, JavaScript, etc.
  • The additional files retrieved are also stored on the client
• We can observe an HTTP transaction using the browser's Web Developer tool
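The "parse the HTML and send further requests" step can be sketched with Python's standard html.parser module. The sample page below is made up for illustration (only style.css appears in the course folder; the script and image names are hypothetical), and a real browser looks at many more tags than these three.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs a browser would request after parsing a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Images and scripts name their file in "src", stylesheets in "href".
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


def find_resources(page_url, html_text):
    """Return absolute URLs of the extra files referenced by html_text."""
    collector = ResourceCollector(page_url)
    collector.feed(html_text)
    return collector.resources


if __name__ == "__main__":
    page_url = ("http://homes.soic.indiana.edu/classes/fall2014/csci/"
                "a290-web-dnikolov/lab1.html")
    # Hypothetical page contents, not the real lab1.html.
    sample = ('<html><head><link rel="stylesheet" href="style.css">'
              '<script src="js/lab1.js"></script></head>'
              '<body><img src="/images/logo.png"></body></html>')
    for url in find_resources(page_url, sample):
        print("would request:", url)
```

Relative references such as style.css are resolved against the page's own URL, which is why each follow-up request goes back to the same server and folder unless the reference says otherwise.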
Domain Name System (DNS)
• DNS = Domain Name System
• Provides a mapping between host names and IP addresses
• Let's take a step back… How do web sites come into existence?
  • A web site needs hosting: a machine with a static IP address on which to install the web server, Python, etc.
  • A web site also needs a domain: a unique name to be mapped to its IP address, e.g. indiana.edu
  • The hosting provider and domain provider need not be the same
• IP addresses and domains
  • ICANN (Internet Corporation for Assigned Names and Numbers) assigns ranges of IP addresses to RIRs (Regional Internet Registries), which assign IP addresses to ISPs (Internet Service Providers), e.g. IU, GoDaddy, etc.
  • ICANN also manages top-level domains such as .com, .net, etc. and provides a service ensuring domain name uniqueness
  • A registrar (usually an ISP) can assign domain names under specific top-level domains to its customers, but has to register them with ICANN first
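The host-name-to-IP-address mapping can be observed from Python with the standard socket module. This is a minimal sketch: the helper name resolve is made up for this example, and it uses localhost so that it runs without Internet access. Substituting a public host name such as homes.soic.indiana.edu would query DNS proper.

```python
import socket


def resolve(host_name, port=80):
    """Return the sorted, de-duplicated IP addresses for host_name."""
    infos = socket.getaddrinfo(host_name, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print("localhost ->", resolve("localhost"))
```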
Domain Name System (DNS)
• Registering a domain name
  • The ISP manages its own DNS server(s), which are updated once you register a host name
  • DNS is a hierarchy: the ISP's DNS server propagates the mapping to DNS servers higher up in the hierarchy
  • It usually takes a few hours after you register a domain for it to become available
• An example of a DNS SOA (Start of Authority) record that may get propagated
SOA Record example

; $Id: knownspace,v 1.10 2012/01/05 18:25:20 root Exp $
;
$TTL 86400
dimitarnikolov.org. IN SOA moose.cs.indiana.edu. rawlins.cs.indiana.edu. (
                2012010501 ; serial
                10800      ; refresh (3 hours)
                3600       ; retry (1 hour)
                604800     ; expire (7 days)
                86400 )    ; minimum (1 day)
                IN NS    dns1
                IN NS    dns2
                IN TXT   "Dimitar Nikolov"
                IN TXT   "Indiana University"
                IN MX    0 mail.cs.indiana.edu.
dns1            IN A     129.79.247.191
dns2            IN A     129.79.247.195
www             IN CNAME www.cs.indiana.edu.
developer       IN CNAME www.cs.indiana.edu.
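The timer values in the SOA record are given in seconds. As a quick arithmetic check of the comments next to them, the sketch below converts each value to hours or days. The dictionary simply copies the numbers from the record; the helper name humanize is made up for this example.

```python
# Timer values copied from the SOA record above, in seconds.
SOA_TIMERS = {
    "refresh": 10800,
    "retry": 3600,
    "expire": 604800,
    "minimum": 86400,
}


def humanize(seconds):
    """Express a whole number of seconds in the largest exact unit."""
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size and seconds % size == 0:
            count = seconds // size
            return f"{count} {unit}" + ("" if count == 1 else "s")
    return f"{seconds} seconds"


if __name__ == "__main__":
    for name, value in SOA_TIMERS.items():
        print(f"{name}: {value} s = {humanize(value)}")
```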
HTTP Requests & Responses
• The request consists of a request line, headers, and an optional body (the headers and the body are separated by a blank line)
• The response consists of a start line (the status line), headers, and a body
• Let's look at this with telnet…
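What gets typed into telnet, and what comes back, is plain text. The sketch below builds a request and takes apart a response without touching the network. The helper names and the sample response are assumptions made for illustration, and real responses can be more involved (for example, chunked bodies).

```python
def build_request(method, path, host):
    """Build the bytes of a minimal HTTP/1.1 request with no body."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line
        f"Host: {host}",              # required header in HTTP/1.1
        "Connection: close",
    ]
    # A blank line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response(raw):
    """Split raw response bytes into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


if __name__ == "__main__":
    print(build_request("GET", "/test/hi-there.txt", "localhost").decode("ascii"))
    # Hypothetical response, written out by hand for the example.
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Type: text/plain\r\n"
              b"Content-Length: 9\r\n"
              b"\r\n"
              b"hi there\n")
    print(parse_response(sample))
```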
HTTP Requests
• Remember the first line of the HTTP request (the request line): GET /test/hi-there.txt
• Possible HTTP methods
  • GET: Send a resource from the server to the client
  • POST: Send client data to a (CGI) application on the server
  • HEAD: Send just the HTTP headers from the response for a given resource
  • DELETE: Delete the resource from the server
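The difference between GET and HEAD can be seen with a throwaway server from Python's standard library. This is a sketch under stated assumptions: the handler, the file contents, and the port choice are made up for the demo, and it says nothing about how Apache is configured on Silo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hi there\n"  # hypothetical contents of /test/hi-there.txt


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        # GET: headers followed by the resource itself.
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # HEAD: the same headers, but no body.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(method, port, path="/test/hi-there.txt"):
    """Send one request and return (status, Content-Length, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, path)
    response = conn.getresponse()
    result = (response.status, response.getheader("Content-Length"), response.read())
    conn.close()
    return result


def demo():
    """Run GET and HEAD against a local server on a free port."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("GET", port), fetch("HEAD", port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for label, result in zip(("GET", "HEAD"), demo()):
        print(label, result)
```

Both responses carry the same status and Content-Length header; only the GET response includes the nine bytes of the file.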
File paths on the server
• The request for a directory loads a default file, e.g. index.html.
• If there isn't one, the directory contents may be listed.
  • Usually a security hazard
• The request for a file (usually) sends the file back to the client

Web Server (e.g. Apache)
├── index.html
├── …
├── classes
│   ├── fall2014
│   │   ├── ...
│   │   ├── csci
│   │   │   ├── ...
│   │   │   ├── a290-web-dnikolov
│   │   │   │   ├── homepageinfo.html
│   │   │   │   ├── index.html
│   │   │   │   ├── index.html~
│   │   │   │   ├── lab1.html
│   │   │   │   ├── lab1.html~
│   │   │   │   ├── lab2.html
│   │   │   │   ├── lab2.html~
│   │   │   │   ├── lab2.temp.html~
│   │   │   │   ├── style.css
│   │   │   │   ├── style.css~
│   │   │   │   └── syllabus.pdf
├── …

1. http://homes.soic.indiana.edu
2. http://homes.soic.indiana.edu/classes/fall2014/csci/a290-web-dnikolov
3. http://homes.soic.indiana.edu/classes/fall2014/csci/a290-web-dnikolov/lab2.html
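The directory-versus-file rule can be sketched as a small function that maps a request path onto a folder on disk. This is an illustration only, not how Apache is implemented: the function name resolve_request and the check that rejects paths escaping the virtual space are assumptions added for this example.

```python
import tempfile
from pathlib import Path


def resolve_request(doc_root, url_path, default_file="index.html"):
    """Map a request path to a file under doc_root, or None if not found."""
    root = Path(doc_root).resolve()
    target = (root / url_path.lstrip("/")).resolve()
    # Refuse anything that escapes the virtual space (e.g. "/../secret").
    if target != root and root not in target.parents:
        return None
    # A request for a directory is answered with its default file.
    if target.is_dir():
        target = target / default_file
    return target if target.is_file() else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny stand-in for the tree shown above.
        course = Path(tmp) / "classes" / "fall2014" / "csci" / "a290-web-dnikolov"
        course.mkdir(parents=True)
        (Path(tmp) / "index.html").write_text("root page")
        (course / "index.html").write_text("course page")
        (course / "lab2.html").write_text("lab 2")
        for path in ("/",
                     "/classes/fall2014/csci/a290-web-dnikolov",
                     "/classes/fall2014/csci/a290-web-dnikolov/lab2.html",
                     "/missing.html"):
            print(path, "->", resolve_request(tmp, path))
```

The first three paths correspond to the three example URLs; the last one is the case where a real server would answer 404.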
Notes About Apache
• ~/local/conf/httpd.conf is where you can specify a lot of relevant options, such as
  • where the Apache virtual space is
    • AKA htdocs, AKA wwwroot
  • what port number to listen for requests on
  • the default HTML file to serve if a directory is requested, e.g. index.html
  • the default pages to serve when an error occurs, e.g. 404 (File Not Found), 500 (Internal Server Error), etc.
• Important: Every time you change httpd.conf, you need to restart Apache for the changes to take effect!
CGI
• CGI = Common Gateway Interface
• A way for the server to execute a program (a script or a binary) instead of simply delivering a static HTML file.
• Executable files are placed in a special directory in the Web server's virtual space, usually cgi-bin
• When the server receives a request for a file in the CGI directory, instead of delivering the file to the client, it executes it and delivers the output
• Before the script is executed, the web server puts input data received from the client in environment variables (e.g. QUERY_STRING for a GET request); the body of a POST request is passed to the script on standard input
• The HTTP method for executing a CGI script can be either GET or POST
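To show the environment-variable mechanism directly, here is a minimal sketch of a CGI-style script that reads QUERY_STRING itself with the standard urllib.parse module. The function name build_response and the fallback name "World" are assumptions for this example. It handles only GET-style input and leaves out POST handling.

```python
import html
import os
from urllib.parse import parse_qs


def build_response(environ):
    """Build the CGI output (header, blank line, HTML) from the environment."""
    # For a GET request the server stores "name=Ada&x=1" in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["World"])[0]
    # Escape user input before placing it in HTML.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # When run by the web server, os.environ holds the request data.
    print(build_response(os.environ))
```

Keeping the logic in a function that takes the environment as an argument makes it possible to try the script from the command line, e.g. by setting QUERY_STRING by hand, before placing it in cgi-bin.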
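What a CGI Program Looks Like

import cgi  # form parsing; deprecated since Python 3.11, removed in 3.13

# HTTP header, then a blank line, then the HTML body.
# %(name)s is a placeholder filled in at the bottom of the script.
html = """Content-Type: text/html\n
<html>
<head>
<title>A First CGI Program</title>
</head>
<body>
<h1>Hello, %(name)s!</h1>
</body>
</html>
"""

form = cgi.FieldStorage()              # parses the GET or POST form data
name = form.getfirst('name', 'World')  # fall back if no name was submitted
print(html % {'name': name})           # %(name)s needs a mapping, not a bare string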
Next Time • Processing HTML forms with Python and CGI