WWW and Security CS587x Lecture Department of Computer Science Iowa State University
What to Cover • WWW • HTTP/1.0 • Protocol highlights • Problems • HTTP/1.1 • Highlights of improvement • Security • Internet security issues • Introduction to cryptography • Secured Socket Layer (SSL)
World Wide Web (WWW) • Core Components • Servers • Store files and execute remote commands • Browsers (i.e., clients) • Retrieve and display “pages” of content linked by hypertext • Networks • Send information back and forth upon request • Problems • How to identify an object • How to retrieve an object • How to interpret an object
Semantic Parts of WWW • URI (Uniform Resource Identifier) • protocol://hostname:port/directory/object • http://www.cs.iastate.edu/index.html • ftp://popeye.cs.iastate.edu/welcome.txt • https://finance.yahoo.com/q/cq?s=ibm&d=v1 • Implementation: extend the hierarchical namespace to include • anything in a file system • server-side processing • HTTP (Hypertext Transfer Protocol) • An application protocol for sending and receiving information • HTML (Hypertext Markup Language) • A language specification used to interpret the information received from the server
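The URI layout above (protocol, hostname, port, path) can be illustrated with the JDK's `java.net.URI` class. This is only a sketch: the class name `UriParts` and the output format are made up for illustration, and the slides do not prescribe any particular parser.

```java
import java.net.URI;

final class UriParts {
    // Splits a URI into the parts named on the slide: protocol, hostname, port, path, query.
    static String describe(String text) {
        URI uri = URI.create(text);
        int port = uri.getPort(); // -1 when the URI does not name a port explicitly
        return "protocol=" + uri.getScheme()
             + " host=" + uri.getHost()
             + " port=" + (port == -1 ? "default" : String.valueOf(port))
             + " path=" + uri.getPath()
             + " query=" + uri.getQuery();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.cs.iastate.edu/index.html"));
        System.out.println(describe("https://finance.yahoo.com/q/cq?s=ibm&d=v1"));
    }
}
```

When no port is given, the client falls back to the protocol's well-known port (80 for HTTP).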
HTTP Properties • Request-response exchange • Server runs over TCP, port 80 • Client sends HTTP requests and gets responses from the server • Synchronous request/reply protocol • Stateless • No state is maintained by clients or servers across requests and responses • Each request/response pair is treated as an independent message exchange • Resource metadata • Information about resources is often included in web transfers and can be used in several ways
HTTP Commands • GET • Transfer resource from given URL • HEAD • Get resource metadata (headers) only • PUT • Store/modify resource under a given URL • DELETE • Remove resource • POST • Provide input for a process identified by the given URL (usually used to post CGI parameters)
Response Codes of HTTP 1.0 • 2xx success • 3xx redirection • 4xx client error in request • 5xx server error; can’t satisfy the request
Steps of Processing an HTTP Request: http://www.cs.iastate.edu/index.html • The client • Contacts its local DNS server to find the IP address of www.cs.iastate.edu • Initiates a TCP connection on port 80 • Sends the GET request via the established socket: GET /index.html HTTP/1.0 • The server • Sends its response containing the requested file • Tells TCP to terminate the connection • The browser • Parses the file and displays it accordingly • Repeats the same steps for every embedded object
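The client-side steps above can be sketched in a few lines of Java. The class name `SimpleHttpClient` and its methods are hypothetical; the sketch assumes Java 9+ (for `readAllBytes`) and a server that still answers plain HTTP/1.0 on port 80, which many modern sites do not.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class SimpleHttpClient {
    // Builds the request line plus the blank line that ends an HTTP/1.0 request.
    static String buildGet(String path) {
        return "GET " + path + " HTTP/1.0\r\n\r\n";
    }

    // DNS lookup, TCP connect on port 80, send GET, read until the server closes.
    static String fetch(String host, String path) throws IOException {
        InetAddress address = InetAddress.getByName(host); // DNS resolution
        try (Socket socket = new Socket(address, 80)) {    // TCP connection
            OutputStream out = socket.getOutputStream();
            out.write(buildGet(path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // In HTTP/1.0 the server closes the connection after the response.
            byte[] reply = socket.getInputStream().readAllBytes();
            return new String(reply, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            System.out.println(fetch(args[0], args[1])); // e.g. www.cs.iastate.edu /index.html
        } else {
            System.out.print(buildGet("/index.html"));
        }
    }
}
```

A browser would then parse the returned HTML and call something like `fetch` again for each embedded object, which is exactly the one-connection-per-object cost discussed later.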
Server Response HTTP/1.0 200 OK Content-Type: text/html Content-Length: 1234 Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT <HTML> <HEAD> <TITLE>CS Home Page</TITLE> </HEAD> … </BODY> </HTML>
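As a rough illustration of how a client might read the status line and headers of a response like the one above, and map the code to its class (2xx, 3xx, 4xx, 5xx), consider the sketch below. `ResponseHead` is a hypothetical helper, it assumes CRLF line endings, and it ignores details such as folded or repeated headers.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ResponseHead {
    final int status;
    final Map<String, String> headers = new LinkedHashMap<>();

    ResponseHead(String response) {
        String[] lines = response.split("\r\n");
        // Status line: HTTP-version SP status-code SP reason-phrase
        status = Integer.parseInt(lines[0].split(" ", 3)[1]);
        // Header lines run until the first empty line; the body follows it.
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) continue; // skip malformed lines
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(name, lines[i].substring(colon + 1).trim());
        }
    }

    // Maps the status code to the response classes of HTTP/1.0.
    String statusClass() {
        switch (status / 100) {
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        ResponseHead head = new ResponseHead(
            "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 1234\r\n\r\n<HTML></HTML>");
        System.out.println(head.status + " " + head.statusClass() + " " + head.headers);
    }
}
```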
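HTTP/1.0 Example (message sequence between client and server) • Client: Request file 1 • Server: Transfer file 1 • Client: Request file 2 • Server: Transfer file 2 • … • Client: Request file n • Server: Transfer file n • Client: Finish displaying the page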
HTTP Server Implementation
import java.io.*;
import java.net.*;

public class WebServerDemo {
    public static void main(String[] args) throws IOException {
        ServerSocket ss = new ServerSocket(80);
        for (;;) {
            // accept connection
            Socket accept = ss.accept();
            // Start a thread to process the request
            new Handler(accept).start();
        }
    }
}
HTTP Server Implementation
class Handler extends Thread { // Handler for an HTTP request
    Socket socket;
    BufferedReader br;
    PrintWriter pw;

    public Handler(Socket _socket) {
        socket = _socket;
    }

    public void run() {
        try {
            br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            pw = new PrintWriter(new OutputStreamWriter(socket.getOutputStream()));
            String line = br.readLine(); // Read HTTP request from user
            if (line != null && line.toUpperCase().startsWith("GET")) {
                // parse the string to find the file name
                // locate the file and send it back
                // :::::
            }
            // other commands: post, delete, put, etc.
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
HTTP/1.0 Caching • CLIENT • GET request header: • If-Modified-Since – the server returns a “not modified” (304) response if the resource has not been modified since the specified time • Request header: • Pragma: no-cache – ignore all caches and get the resource directly from the server • SERVER • Response header: • Expires – tells the client until when it is safe to cache the resource
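The If-Modified-Since check can be sketched as a small server-side decision: compare the date sent by the client with the resource's last modification time and answer 304 or 200. `ConditionalGet` is a hypothetical helper name, and the sketch assumes the client sends dates in the RFC 1123 format shown in the earlier server response.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

final class ConditionalGet {
    // Parses an HTTP date such as "Mon, 19 Nov 2001 15:31:20 GMT".
    static ZonedDateTime parse(String httpDate) {
        return ZonedDateTime.parse(httpDate, DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    // Returns 304 if the client's copy is still current, otherwise 200 (send the resource).
    static int statusFor(String ifModifiedSince, ZonedDateTime lastModified) {
        if (ifModifiedSince == null) {
            return 200; // unconditional GET
        }
        return lastModified.isAfter(parse(ifModifiedSince)) ? 200 : 304;
    }

    public static void main(String[] args) {
        ZonedDateTime lastModified = parse("Mon, 19 Nov 2001 15:31:20 GMT");
        System.out.println(statusFor("Sun, 18 Nov 2001 15:31:20 GMT", lastModified));
        System.out.println(statusFor("Mon, 19 Nov 2001 15:31:20 GMT", lastModified));
    }
}
```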
Issues with HTTP/1.0 • Each resource requires a new connection • Large number of embedded objects in a web page • Many short-lived connections • Serial vs. parallel connections • A serial connection downloads one object at a time (e.g., MOSAIC), causing long latency before a whole page is displayed • Parallel connections (e.g., NETSCAPE) open several connections (typically 4), contributing to network congestion • HTTP uses TCP as the transport protocol • TCP is not optimized for the typical short-lived connections • Most Internet transfers fit in 10 packets (overhead: 7 out of 17) • Too slow for small objects • May never exit the slow-start phase
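The "7 out of 17" figure can be reproduced with simple arithmetic if one assumes (this reading is not stated on the slide) that a 10-packet transfer carries 7 extra control packets, e.g. 3 for the TCP handshake and 4 for teardown. The second method estimates how many round trips slow start needs under an idealized model: initial window of 1 segment, doubling every round trip, no loss. `ShortConnectionCost` is a made-up name.

```java
final class ShortConnectionCost {
    // Fraction of packets on the connection that carry no data.
    static double overheadFraction(int dataPackets, int controlPackets) {
        return (double) controlPackets / (dataPackets + controlPackets);
    }

    // Round trips needed to send the segments when the window starts at 1 and doubles each round.
    static int slowStartRounds(int segments) {
        int window = 1;
        int sent = 0;
        int rounds = 0;
        while (sent < segments) {
            sent += window;
            window *= 2;
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        System.out.println(overheadFraction(10, 7)); // 7 control packets out of 17 total
        System.out.println(slowStartRounds(10));     // rounds under the idealized model
    }
}
```

Under these assumptions a 10-segment object needs 4 round trips and finishes while the window is still growing, which is the sense in which a short transfer "may never exit slow start".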
Highlights of HTTP/1.1 • Persistent connections • Pipelined requests/responses • Support for virtual hosting • More explicit support for caching • Internet Cache Protocol (ICP) • Content negotiation/adaptation • Range requests
Persistent Connections • The basic idea is • reducing the number of TCP connections opened and closed • reducing TCP connection costs • reducing latency by avoiding multiple TCP slow-starts • avoiding bandwidth wastage and reducing overall congestion • A longer-lived TCP connection learns more about network conditions (Why?) • Proposed new GET methods (not part of the final HTTP/1.1 standard) • GETALL • GETLIST
Pipelined Requests/Responses • Buffer requests and responses to reduce the number of packets • Multiple requests can be contained in one TCP segment • Note: the order of responses has to be maintained • (Message sequence) Client: Request 1, Request 2, Request 3 sent back to back • Server: Transfer 1, Transfer 2, Transfer 3 returned in the same order
Support for Virtual Hosting • Problem – outsourcing web content to some company • http://www.hostmany.com/A → http://www.A.com • http://www.hostmany.com/B → http://www.B.com • In HTTP/1.0, a request for http://www.A.com/index.html has in its header only: • GET /index.html HTTP/1.0 • It is not possible to host two web sites at the same IP address, because the GET is ambiguous • HTTP/1.1 addresses this by adding the “Host” header: GET /index.html HTTP/1.1 Host: www.A.com
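On the server side, the Host header lets one process at one IP address pick the right document root. A minimal sketch follows; the class `VirtualHosts`, the directory names, and the choice to ignore an explicit port are all assumptions for illustration (IPv6 literal hosts are not handled).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class VirtualHosts {
    private final Map<String, String> roots = new HashMap<>();

    // Registers the document root that serves a given host name.
    void add(String host, String docroot) {
        roots.put(host.toLowerCase(Locale.ROOT), docroot);
    }

    // Returns the document root for a Host header value, or null if the host is unknown.
    String resolve(String hostHeader) {
        if (hostHeader == null) {
            return null; // an HTTP/1.0 request without Host stays ambiguous
        }
        String host = hostHeader.trim().toLowerCase(Locale.ROOT);
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon); // drop an explicit ":port"
        }
        return roots.get(host);
    }

    public static void main(String[] args) {
        VirtualHosts hosts = new VirtualHosts();
        hosts.add("www.A.com", "/sites/A"); // hypothetical directories
        hosts.add("www.B.com", "/sites/B");
        System.out.println(hosts.resolve("www.A.com"));
        System.out.println(hosts.resolve("www.B.com:80"));
    }
}
```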
Content Negotiation/Adaptation • A resource may have more than one representation • Different languages • Different sizes of images, etc. Example: GET /index.html HTTP/1.1 Host: www.getbelix.com Accept-Language: en-us, fr-BE • Two approaches • Agent-driven: the client receives a set of alternative representations of the response, chooses the best one, and indicates its choice in a second request • Server-driven: the server chooses the representation based on what is available at the server, the headers in the request message, or information about the client, such as its IP address
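Range Request • A user may want to load only some portion of the content • E.g., retrieve only the newly appended portion • E.g., load some pages of a PDF file • Example: GET /bigfile.html HTTP/1.1 Host: www.justwhatiwant.com, followed by one of • Range: bytes=2000-3999 (bytes 2000 through 3999) • Range: bytes=-1000 (the last 1000 bytes) • Range: bytes=2000- (from byte 2000 to the end)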
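The three range forms on this slide (first-last, a suffix of the last N bytes, and an open-ended start) can be resolved against a known resource length as sketched below. `ByteRange` is a hypothetical helper; it accepts the spec with or without a leading `bytes=` unit and handles a single range only.

```java
final class ByteRange {
    // Returns {first, last} (inclusive byte offsets), or null if the range cannot be satisfied.
    static long[] resolve(String spec, long length) {
        String range = spec.startsWith("bytes=") ? spec.substring(6) : spec;
        int dash = range.indexOf('-');
        if (dash < 0) {
            return null; // not a range at all
        }
        String first = range.substring(0, dash).trim();
        String last = range.substring(dash + 1).trim();
        long start;
        long end;
        if (first.isEmpty()) {
            // "-1000": the final 1000 bytes of the resource
            start = Math.max(0, length - Long.parseLong(last));
            end = length - 1;
        } else {
            start = Long.parseLong(first);
            // "2000-": everything from byte 2000 on; otherwise clamp to the resource end
            end = last.isEmpty() ? length - 1 : Math.min(Long.parseLong(last), length - 1);
        }
        return (start > end || start >= length) ? null : new long[] {start, end};
    }

    public static void main(String[] args) {
        for (String spec : new String[] {"2000-3999", "-1000", "2000-"}) {
            long[] r = resolve(spec, 10000);
            System.out.println(spec + " -> " + r[0] + ".." + r[1]);
        }
    }
}
```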
Cache-Control Request Directives • no-cache: force revalidation with the origin server • only-if-cached: obtain the resource only from cache • no-store: don’t allow caches to store the request/response • max-age: the response’s age should be no greater than this value • max-stale: an expired response is OK, but not one that has been stale for longer than the stated value • min-fresh: the response should remain fresh for at least the stated value • no-transform: proxies should not change the media type
Cache-Control Response Directives • public: OK to cache the response anywhere • private: response is for a specific user only • no-cache: do not serve from cache without prior revalidation • Must revalidate regardless of when the response becomes stale • no-store: caches are not permitted to store the response or the request • no-transform: proxies should not change the media type • must-revalidate: can be cached, but must be revalidated once stale • A file may be associated with an age (expiration) • proxy-revalidate: force shared caches to revalidate a stale cached response • max-age: the response’s age should be no greater than this value • s-maxage: shared caches use this value as the response’s maximum age (overrides max-age)
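How a cache might turn these response directives into a freshness lifetime can be sketched as follows. This is a deliberately simplified, hypothetical helper (`CacheControl.freshnessSeconds`): it treats no-store and no-cache alike as "not reusable without contacting the server", and it ignores must-revalidate, proxy-revalidate, and quoted directive arguments.

```java
import java.util.Locale;

final class CacheControl {
    // Returns the freshness lifetime in seconds, 0 if the response may not be reused
    // without contacting the server, or -1 if the header gives no lifetime.
    static long freshnessSeconds(String header, boolean sharedCache) {
        long maxAge = -1;
        long sMaxAge = -1;
        for (String part : header.split(",")) {
            String directive = part.trim().toLowerCase(Locale.ROOT);
            if (directive.equals("no-store") || directive.equals("no-cache")) {
                return 0;
            }
            if (directive.equals("private") && sharedCache) {
                return 0; // only the user's own cache may keep it
            }
            if (directive.startsWith("max-age=")) {
                maxAge = Long.parseLong(directive.substring("max-age=".length()));
            }
            if (directive.startsWith("s-maxage=")) {
                sMaxAge = Long.parseLong(directive.substring("s-maxage=".length()));
            }
        }
        // s-maxage overrides max-age, but only for shared caches.
        return (sharedCache && sMaxAge >= 0) ? sMaxAge : maxAge;
    }

    public static void main(String[] args) {
        String header = "public, max-age=600, s-maxage=60";
        System.out.println(freshnessSeconds(header, true));
        System.out.println(freshnessSeconds(header, false));
    }
}
```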
Factors to Consider for Cache Replacement • Cost of storing the resource (size) • Cost of fetching the resource (size + distance) • The time since the last modification of the resource • The number of accesses to the resource in the past • The probability of the resource being accessed in the near future • May be known a priori or estimated from the past access pattern • The heuristic expiration time • If there is no server-specified expiration time, the cache decides on a heuristic expiration time • If no expired resources are available as candidates, then resources that are close to their expiration time are prioritized as candidates for replacement
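The last rule (evict expired resources first, otherwise those closest to expiring) reduces to picking the entry with the earliest expiration time. The sketch below shows only that one factor; `ReplacementPolicy`, `Entry`, and the millisecond timestamps are illustrative assumptions, and a real cache would also weigh size, fetch cost, and access history.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ReplacementPolicy {
    static final class Entry {
        final String url;
        final long expiresAtMillis; // server-specified or heuristic expiration time

        Entry(String url, long expiresAtMillis) {
            this.url = url;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Picks the entry with the earliest expiration: already expired entries sort first,
    // followed by the ones closest to expiring. Returns null for an empty cache.
    static String victimUrl(List<Entry> entries) {
        return entries.stream()
                .min(Comparator.comparingLong((Entry e) -> e.expiresAtMillis))
                .map(e -> e.url)
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Entry> cache = Arrays.asList(
                new Entry("/index.html", 5_000L),
                new Entry("/logo.gif", 1_000L),
                new Entry("/news.html", 9_000L));
        System.out.println(victimUrl(cache));
    }
}
```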
Summary • HTTP 1.0 • HTTP 1.1
What We Have Covered So Far (protocol stack) • Application: DNS, HTTP • Transport: TCP, UDP • Network: IP • Link: Ethernet, FDDI, Token Ring, etc.
HTTP Server (1)
import java.io.*;
import java.net.*;
import java.util.*;

public class WebServerDemo {
    protected String docroot;   // Directory of HTML pages and other files
    protected int port;         // Port number of web server
    protected ServerSocket ss;  // Socket for the web server

    class Handler extends Thread { // Handler for an HTTP request
        protected Socket socket;
        protected PrintWriter pw;
        protected BufferedOutputStream bos;
        protected BufferedReader br;
        protected File docroot;

        public Handler(Socket _socket, String _docroot) throws Exception {
            socket = _socket;
            docroot = new File(_docroot).getCanonicalFile(); // Absolute dir of the filepath
        }
HTTP Server (2)
        public void run() {
            try {
                // Prepare our readers and writers
                br = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                bos = new BufferedOutputStream(socket.getOutputStream());
                pw = new PrintWriter(new OutputStreamWriter(bos));
                String line = br.readLine(); // Read HTTP request from user
                socket.shutdownInput();      // Shutdown any further input
                if (line == null) {
                    socket.close();
                    return;
                }
                if (line.toUpperCase().startsWith("GET")) {
                    // Eliminate any trailing ? data, such as for a CGI GET request
                    StringTokenizer tokens = new StringTokenizer(line, " ?");
                    tokens.nextToken();
                    String req = tokens.nextToken();
                    String name; // ... form a full filename
                    if (req.startsWith("/") || req.startsWith("\\"))
                        name = this.docroot + req;
                    else
                        name = this.docroot + File.separator + req;
                    File file = new File(name).getCanonicalFile(); // Get absolute file path
                    // Check to see if request doesn't start with our document root ....
                    if (!file.getAbsolutePath().startsWith(this.docroot.getAbsolutePath())) {
                        pw.println("HTTP/1.0 403 Forbidden");
                        pw.println();
                    }
HTTP Server (3)
                    // run() continued
                    else if (!file.canRead()) { // No access
                        pw.println("HTTP/1.0 403 Forbidden");
                        pw.println();
                    } else if (file.isDirectory()) { // Directory, not file
                        sendDir(bos, pw, file, req);
                    } else {
                        sendFile(bos, pw, file.getAbsolutePath());
                    }
                } else { // Unsupported command
                    pw.println("HTTP/1.0 501 Not Implemented");
                    pw.println();
                }
                pw.flush();
                bos.flush();
            } catch (Exception e) {
                e.printStackTrace();
            }
            try {
                socket.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        } // run()

        protected void sendFile(BufferedOutputStream bos, PrintWriter pw, String filename) throws Exception {
            try {
                BufferedInputStream bis = new BufferedInputStream(new FileInputStream(filename));
                byte[] data = new byte[10 * 1024];
                int read = bis.read(data);
                pw.println("HTTP/1.0 200 Okay");
                pw.println();
                pw.flush();
                bos.flush();
                while (read != -1) {
                    bos.write(data, 0, read);
                    read = bis.read(data);
                }
                bos.flush();
            } catch (Exception e) {
                pw.flush();
                bos.flush();
            }
        }
HTTP Server (4)
        protected void sendDir(BufferedOutputStream bos, PrintWriter pw, File dir, String req) throws Exception {
            try {
                pw.println("HTTP/1.0 200 Okay");
                pw.println();
                pw.flush();
                pw.print("<html><head><title>Directory of " + req + "</title></head><body><h1>Directory of " + req + "</h1><table border=\"0\">");
                File[] contents = dir.listFiles();
                for (int i = 0; i < contents.length; i++) {
                    pw.print("<tr><td><a href=\"" + req + contents[i].getName());
                    if (contents[i].isDirectory()) pw.print("/");
                    pw.print("\">");
                    if (contents[i].isDirectory()) pw.print("Dir -> ");
                    pw.println(contents[i].getName() + "</a></td></tr>");
                }
                pw.println("</table></body></html>");
                pw.flush();
            } catch (Exception e) {
                pw.flush();
                bos.flush();
            }
        }
    }

    protected void parseParams(String[] args) throws Exception {
        switch (args.length) { // Check that a filepath has been specified and a port number
            case 1:
            case 0:
                System.err.println("Syntax: <jvm> " + this.getClass().getName() + " docroot port");
                System.exit(0);
            default:
                this.docroot = args[0];
                this.port = Integer.parseInt(args[1]);
                break;
        }
    }
HTTP Server (5)
    public WebServerDemo(String[] args) throws Exception {
        System.out.println("Checking for parameters");
        parseParams(args); // Check for command line parameters
        System.out.print("Starting web server...... ");
        this.ss = new ServerSocket(this.port); // Create a new server socket
        System.out.println("OK");
        for (;;) { // Forever
            Socket accept = ss.accept(); // Accept connection via server socket
            // Start a new handler instance to process the request
            new Handler(accept, docroot).start();
        }
    }

    // Start an instance of the web server
    public static void main(String[] args) throws Exception {
        WebServerDemo webServerDemo = new WebServerDemo(args);
    }
}
Internet Security Issues • A TCP/IP packet could go through many intermediate computers and separate networks • Possible ways for communication interference • Eavesdropping • Information remains intact, but its privacy is compromised. For example, someone could learn your credit card number, etc. • Tampering • Information in transit is changed or replaced and then sent on to the recipient. For example, someone could alter an order of goods • Impersonation • Information passes to a person who poses as the intended recipient. For example, a person can pretend to have the email address jdoe@mozilla.com or a computer can identify itself as www.mozilla.com while it is not
Public-Key Cryptography • The goals of developing this standard • Encryption and decryption • Allows two communicating parties to disguise the information they send to each other • Tamper detection • Allows the recipient of information to verify that it has not been modified in transit • Authentication • Allows the recipient of information to determine its origin, i.e., confirm the sender’s identity • Nonrepudiation • Prevents the sender of information from claiming at a later date that the information was never sent
Encryption and Decryption • Encryption is the process of transforming information so that it is unintelligible to anyone but the intended recipient • Decryption is the process of transforming encrypted information so that it is intelligible again • A cryptographic algorithm (also called a cipher) is a mathematical function used for encryption or decryption • In most cases, two related functions are employed, one for encryption and the other for decryption • Cryptographic algorithms are widely known • The ability to keep encrypted information secret is based not on the secrecy of the algorithm, but on a number called a key • The key is used with the algorithm to produce an encrypted result or to decrypt previously encrypted information
Symmetric-Key Encryption • With symmetric-key encryption, the encryption key can be calculated from the decryption key and vice versa • With most symmetric-key encryption, the same key is used for both encryption and decryption
Symmetric-Key Encryption • Advantages • Highly efficient implementation • fast encryption and decryption • Provides some degree of authentication • information encrypted with one symmetric key cannot be decrypted with any other symmetric key. • Disadvantages • Effective only if the key is kept secret by the two parties involved • If anyone else discovers the key, it affects both confidentiality and authentication • The person not only can decrypt messages sent with that key, but can encrypt new messages and send them as if they came from one of the two parties who were originally using the key
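A symmetric round trip can be sketched with the JDK's `javax.crypto` API. The choice of AES in GCM mode with a 128-bit key is an assumption made for this example (the slides do not name a cipher), and `SymmetricDemo` is a hypothetical class name. Decrypting with a different key fails, which illustrates the limited form of authentication mentioned above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class SymmetricDemo {
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // A fresh 12-byte nonce must be used for every message encrypted under the same key.
    static byte[] newNonce() {
        byte[] nonce = new byte[12];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    static byte[] encrypt(SecretKey key, byte[] nonce, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, nonce, plain);
    }

    // Throws IllegalStateException if the key is wrong or the ciphertext was altered.
    static byte[] decrypt(SecretKey key, byte[] nonce, byte[] cipherText) {
        return run(Cipher.DECRYPT_MODE, key, nonce, cipherText);
    }

    private static byte[] run(int mode, SecretKey key, byte[] nonce, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, key, new GCMParameterSpec(128, nonce));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] nonce = newNonce();
        byte[] secret = encrypt(key, nonce, "card number".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, nonce, secret), StandardCharsets.UTF_8));
    }
}
```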
Public-Key Encryption • Public-key encryption (also called asymmetric encryption) involves a pair of keys – a public key and a private key • The public key is published and may be widely known • The private key is kept secret by the entity that needs to authenticate its identity electronically or to sign or decrypt data • Data encrypted with a public key can be decrypted only with the corresponding private key • To send data to someone, you encrypt the data with their public key, and the person receiving the encrypted data decrypts it with the corresponding private key • Data encrypted with a private key can be decrypted only with the corresponding public key (more details later)
Public-Key Encryption • Advantages • The public key can be freely distributed to senders • The private key can be kept secret • Disadvantage • Compared with symmetric-key encryption, public-key encryption requires more computation and is therefore not always appropriate for large amounts of data • The way to leverage the advantages and minimize the disadvantage • Use public-key encryption to send a symmetric key, which can then be used to encrypt additional data. This is the approach used by the SSL protocol
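The hybrid idea (send a symmetric key under public-key encryption, then switch to the symmetric key for bulk data) can be sketched as follows. RSA with OAEP padding and a 2048-bit key are choices made for this example only; `HybridDemo`, `wrap`, and `unwrap` are hypothetical names, and this is not a description of the actual SSL handshake.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;

final class HybridDemo {
    private static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    static KeyPair newRsaPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sender side: encrypt the short symmetric key with the recipient's public key.
    static byte[] wrap(PublicKey recipient, byte[] symmetricKey) {
        try {
            Cipher cipher = Cipher.getInstance(RSA);
            cipher.init(Cipher.ENCRYPT_MODE, recipient);
            return cipher.doFinal(symmetricKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recipient side: only the holder of the private key can recover the symmetric key.
    static byte[] unwrap(PrivateKey own, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance(RSA);
            cipher.init(Cipher.DECRYPT_MODE, own);
            return cipher.doFinal(wrapped);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair server = newRsaPair();
        byte[] sessionKey = new byte[16]; // symmetric key chosen by the client
        new SecureRandom().nextBytes(sessionKey);
        byte[] wrapped = wrap(server.getPublic(), sessionKey);
        System.out.println(Arrays.equals(sessionKey, unwrap(server.getPrivate(), wrapped)));
    }
}
```

Only the 16-byte session key goes through the slow public-key operation; all later traffic would use the fast symmetric cipher.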
Tamper Detection • Encryption and decryption solve only the problem of eavesdropping • The problems of tampering and impersonation remain • Tamper detection is done by using public-key encryption for digital signatures • Impersonation can be addressed by certification and authentication
Digital Signature • Tamper detection relies on a mathematical function called a one-way hash (also called a message digest) • A one-way hash is a number of fixed length with the following characteristics • Ideally, the value of the hash is unique for the hashed data. Any change in the data, even deleting or altering a single character, results in a different value • The content of the hashed data cannot, for all practical purposes, be deduced from the hash – which is why it is called “one-way”
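Both properties of a one-way hash (fixed length, and a different value after any change) are easy to observe with the JDK's `MessageDigest`. SHA-256 is chosen here only as an example of a hash function; `DigestDemo` is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestDemo {
    // Returns the SHA-256 digest of the text as 64 lowercase hex characters.
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Changing a single character yields an unrelated digest of the same length.
        System.out.println(sha256Hex("order 10 units"));
        System.out.println(sha256Hex("order 90 units"));
    }
}
```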
Digital Signature • Public-key encryption allows you to use your private key for encryption and your public key for decryption • This feature can be used to digitally sign any data • The signing software creates a one-way hash of the data, then uses your private key to encrypt the hash • The encrypted hash, along with other information, such as the hashing algorithm, is known as a digital signature
Digital Signature • The source sends data as follows • One-way hash the original data • Encrypt the hash with the sender’s private key to form the digital signature • Send both the original data and the digital signature to the recipient • The recipient validates the data integrity as follows • Decrypt the digital signature using the sender’s public key to recover the original hash • Use the same hash algorithm to one-way hash the received data • The data has not been tampered with if the two hashes are the same
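The sign-then-verify procedure can be tried with the JDK's `Signature` class, which performs the hashing and the private-key operation in one step. The algorithm name `SHA256withRSA` and the 2048-bit key are choices for this sketch, and `SignatureDemo` is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class SignatureDemo {
    static KeyPair newPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sender: hash the data and sign the hash with the private key.
    static byte[] sign(PrivateKey key, byte[] data) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recipient: recompute the hash of the received data and check it against the signature.
    static boolean verify(PublicKey key, byte[] data, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair sender = newPair();
        byte[] order = "order 10 units".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(sender.getPrivate(), order);
        byte[] altered = "order 90 units".getBytes(StandardCharsets.UTF_8);
        System.out.println(verify(sender.getPublic(), order, signature));
        System.out.println(verify(sender.getPublic(), altered, signature));
    }
}
```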
A Certificate Identifies an Entity • What is a certificate? • A certificate is an electronic document used to identify an individual, a server, a company, or some other entity • Just like a driver’s license identifies a person • Who issues certificates? • Certificate Authorities (CAs) • can be either independent third parties or organizations running their own certificate-issuing server software • Before issuing a certificate, a CA must go through certain verification procedures, depending on the CA’s policies
Certificate Content • Each certificate always • binds a particular public key to the certified entity • Only the public key certified by the certificate will work with the corresponding private key possessed by the owner of the certificate • includes the digital signature of the issuing CA • For tamper detection: a certificate cannot be changed without invalidating the signature • The signature allows the certificate to function as a “letter of introduction” for users who know and trust the CA but don’t know the entity identified by the certificate • Of course, a certificate also includes the name of the entity it identifies, an expiration date, and the name of the CA that issued the certificate
Sample Certificate Content
openssl x509 -noout -text -in thawte.cer
Certificate:
  Data:
    Version: 3 (0x2)
    Serial Number: 0 (0x0)
    Signature Algorithm: md5WithRSAEncryption
    Issuer: C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting, OU=Certification Services Division, CN=Thawte Personal Basic CA/Email=personal-basic@thawte.com
    Validity
      Not Before: Jan 1 00:00:00 1996 GMT
      Not After : Dec 31 23:59:59 2020 GMT
    Subject: C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting, OU=Certification Services Division, CN=Thawte Personal Basic CA/Email=personal-basic@thawte.com
    Subject Public Key Info:
      Public Key Algorithm: rsaEncryption
      RSA Public Key: (1024 bit)
        Modulus (1024 bit):
          00:bc:bc:93:53:6d:c0:50:4f:82:15:e6:48:94:
          a6:5a:be:6f:42:fa:0f:47:ee:77:75:72:dd:8d:49:
          9b:96:57:a0:78:d4:ca:3f:51:b3:69:0b:91:76:17:
          22:07:97:6a:c4:51:93:4b:e0:8d:ef:37:95:a1:0c:
          4d:da:34:90:1d:17:89:97:e0:35:38:57:4a:c0:f4:
          08:70:e9:3c:44:7b:50:7e:61:9a:90:e3:23:d3:88:
          11:46:27:f5:0b:07:0e:bb:dd:d1:7f:20:0a:88:b9:
          56:0b:2e:1c:80:da:f1:e3:9e:29:ef:14:bd:0a:44:
          fb:1b:5b:18:d1:bf:23:93:21
        Exponent: 65537 (0x10001)
    X509v3 extensions:
      X509v3 Basic Constraints: critical
        CA:TRUE
  Signature Algorithm: md5WithRSAEncryption
    2d:e2:99:6b:b0:3d:7a:89:d7:59:a2:94:01:1f:2b:dd:12:4b:
    53:c2:ad:7f:aa:a7:00:5c:91:40:57:25:4a:38:aa:84:70:b9:
    d9:80:0f:a5:7b:5c:fb:73:c6:bd:d7:8a:61:5c:03:e3:2d:27:
    a8:17:e0:84:85:42:dc:5e:9b:c6:b7:b2:6d:bb:74:af:e4:3f:
    cb:a7:b7:b0:e0:5d:be:78:83:25:94:d2:db:81:0f:79:07:6d:
    4f:f4:39:15:5a:52:01:7b:de:32:d6:4d:38:f6:12:5c:06:50:
    df:05:5b:bd:14:4b:a1:df:29:ba:3b:41:8d:f7:63:56:a1:df:
    22:b1
Authentication Confirms an Identity • Password-based authentication • A client submits a user name and password • The server checks its database to see if the name and password match • Certificate-based authentication • The client digitally signs a piece of data that is randomly generated from input supplied by both server and client • Both client and server must know exactly which data is to be signed • The client sends both the certificate and the signed data to the server • The server uses the public key in the certificate to validate the signature on the data • The signed data is “evidence” that the client owns the private key corresponding to the public key stored in its certificate
Types of Certificates • Client/server certificates • Used to authenticate client/server via SSL • S/MIME certificates • Used for signed and encrypted email • Object certificates • Used to identify signers of Java code or other signed files • CA certificates • Used to identify Certificate Authorities that can be trusted
Establishing Trust through CA Certificates • Any client/server software that supports certificates maintains a collection of trusted CA certificates • It is possible to delegate certificate-issuing responsibility to subordinate CAs, thus creating CA hierarchies • The root CA’s certificate is a self-signed certificate, i.e., it is digitally signed by the same entity it identifies • The CAs that are directly subordinate to the root CA have CA certificates signed by the root CA • CAs further down the hierarchy have their CA certificates signed by the higher-level subordinate CAs
CA Hierarchies Note: each certificate is signed with the private key of its issuer so that its authenticity can be verified through its public key