CSCI 233 Class 11
Agenda • World Wide Web • SNMP • Your Career
World Wide Web • Early Internet data transfers largely used FTP • By 1995, Web traffic had overtaken FTP, and it has been the leader ever since • For many users, the Web and the Internet are one and the same • The Web is transforming society in many ways
What Is the Web? • Web pages—a large set of documents, accessible to Internet users • Each is called a hypermedia document • Hyper—document can contain links to other documents • Media—document can contain media other than text • Web browser and web server are building blocks • Browser—application program that user invokes to display a web page • Server—application program that delivers web pages on request to browsers
Web Protocols and Languages • HTTP—protocol used to transfer Web pages between requesting host and Web server • HTML—markup language used to represent documents that can include text, images, sound, video, other media • HTML tags are written in < > brackets, for example <B> and </B>
URL • Uniform Resource Locator (URL)—each Web page has a URL <scheme>://<scheme-specific part> <scheme> is generally a type of access method such as http or ftp
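A minimal sketch of splitting a URL into its scheme and scheme-specific parts, using Python's standard `urllib.parse`. The path `/index.html` and the helper name `describe_url` are assumptions made for illustration; only the host `www.gwu.edu` appears in these slides.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a browser needs before contacting a server."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # access method, such as http or ftp
        "host": parts.hostname,      # name later resolved through DNS
        "port": parts.port,          # None when the URL does not give one
        "path": parts.path or "/",   # an empty path means the root document
    }


info = describe_url("http://www.gwu.edu/index.html")
print(info)
```

The scheme tells the browser which protocol to speak; everything after `://` is interpreted according to that scheme.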
An Example This is an HTML document: <HTML> This course is Internet Protocols, at <A HREF="http://www.gwu.edu"> George Washington University.</A> </HTML> It is displayed as: This course is Internet Protocols, at George Washington University.
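To show how the "hyper" part of a hypermedia document can be read by a program, the sketch below pulls link targets out of the example page with Python's standard `html.parser`. The class name `LinkCollector` is a hypothetical helper, not something a browser actually exposes.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the HREF value of every anchor tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


page = (
    '<HTML> This course is Internet Protocols, at '
    '<A HREF="http://www.gwu.edu"> George Washington University.</A> </HTML>'
)
collector = LinkCollector()
collector.feed(page)
print(collector.links)
```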
Hypertext Transfer Protocol (HTTP) • Application level—assumes reliable, connection-oriented transport service • Request/Response—once session is established, one side must send HTTP request to which the other side responds • Stateless—each HTTP request is self-contained • Bi-Directional Transfer—browser, server can transfer in both directions • Capability negotiation—browser, server negotiate • Caching—browser caches each page it receives; if requested again, browser can ask server if there have been changes • Intermediaries—allows machines along the path to act as proxy server that caches Web pages
HTTP Message An HTTP message is either a client request or a server response: HTTP-message = Request | Response
HTTP GET Request • Browser starts with URL • Browser extracts hostname, uses DNS to map name into IP address • Browser uses IP address to form TCP connection to server • Browser and Web server use HTTP to communicate • Browser sends GET for page • Server sends copy of requested page
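The steps above can be sketched with plain sockets. This is an illustrative outline, not how any particular browser is implemented: `build_get_request` and `fetch` are hypothetical names, port 80 is assumed when the URL gives none, and `Connection: close` is sent so the server ends the transfer by closing the connection.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(url: str):
    """Return (host, port, request_bytes) for a simple HTTP/1.1 GET."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # assumption: default HTTP port
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return host, port, request


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Resolve the host, open a TCP connection, send GET, read the reply."""
    host, port, request = build_get_request(url)
    # create_connection performs the DNS lookup and the TCP handshake.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:             # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


host, port, request = build_get_request("http://www.gwu.edu/")
print(request.decode("ascii"))
```

Calling `fetch("http://www.gwu.edu/")` would perform the network steps; it is left uncalled here so the sketch runs without Internet access.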
Errors • Server presents error messages to browser in valid HTML • Browsers render error message for user • User can read the error message
Persistent Connections • Early versions of HTTP used a separate session for each transfer—separate TCP connection • HTTP 1.1 introduced a persistent connection—TCP connection is reused • HTTP sends a length, followed by data, to mark ends of objects during connection • If server doesn’t know length, then it informs the browser that it will close the connection after the transfer What are the benefits of a persistent connection?
Length Encoding, Headers • HTTP borrows the RFC 822 header format and MIME extensions from email • Each header line has a keyword, a colon, and a value
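Example of Transfer
Headers come first, then a blank line, then the body:
Content-length: 34
Content-language: en
Content-encoding: ascii

<HTML> A Trivial example. </HTML>

When the server does not know the length in advance, it sends this header instead and closes the connection after the transfer:
Connection: close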
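A rough sketch of how a receiver might use these headers: split the "keyword: value" lines, then read exactly `Content-Length` bytes so that anything after them can belong to the next object on a persistent connection. The function name `parse_message` is an assumption, and the status line is omitted to mirror the slide.

```python
def parse_message(raw: bytes):
    """Split a header block from its body and honor Content-Length."""
    head, _, rest = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    headers = {}
    for line in head.decode("ascii").split("\r\n"):
        keyword, _, value = line.partition(":")
        headers[keyword.strip().lower()] = value.strip()
    if "content-length" in headers:
        length = int(headers["content-length"])
        body = rest[:length]                     # stop at the stated length
    else:
        body = rest                              # ends when the connection closes
    return headers, body


body_text = b"<HTML> A Trivial example. </HTML>\n"
raw = b"".join([
    b"Content-Length: " + str(len(body_text)).encode("ascii") + b"\r\n",
    b"Content-Language: en\r\n",
    b"\r\n",
    body_text,
    b"bytes that belong to the next object",
])
headers, body = parse_message(raw)
print(headers)
print(body)
```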
Negotiation • Server-driven • Begins with browser request • Specifies list of preferences, and URL • Server selects a representation that meets browser preferences • Agent-driven • Browser asks server what is available • Sends second request to obtain the item • Requires extra interaction, but keeps the browser in complete control
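Server-driven negotiation can be sketched as follows, assuming the browser lists its preferences in an `Accept-Language`-style header with optional `q=` weights. The helper names and the sample header are hypothetical, and a real server would apply more matching rules than this.

```python
def parse_preferences(header: str):
    """Return (name, weight) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        weight = 1.0                      # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            weight = float(params[2:])
        prefs.append((name.strip().lower(), weight))
    # sorted() is stable, so equal weights keep the browser's original order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def select_representation(header: str, available):
    """Pick the first available representation the browser accepts."""
    for name, weight in parse_preferences(header):
        if weight > 0 and name in available:
            return name
    return None


choice = select_representation("fr;q=0.5, en;q=0.9, de", ["en", "fr"])
print(choice)
```

In the agent-driven style, the server would instead return the list of available representations and the browser would run a selection like this itself before sending a second request.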
Conditional Requests • Browser can make a request conditional • Request is honored only if the condition can be met • Example: the header If-Modified-Since: Sun, 10 Nov 2002 08:00:01 GMT can be used along with a GET request for a page
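A minimal sketch of the server-side decision for `If-Modified-Since`, assuming the server knows the page's last modification time as an HTTP date string. The function name is hypothetical; 200 and 304 are the standard "OK" and "Not Modified" status codes.

```python
from email.utils import parsedate_to_datetime


def respond_to_conditional_get(if_modified_since: str, last_modified: str) -> int:
    """Return 200 if the page changed after the given date, else 304."""
    since = parsedate_to_datetime(if_modified_since)
    modified = parsedate_to_datetime(last_modified)
    return 200 if modified > since else 304


condition = "Sun, 10 Nov 2002 08:00:01 GMT"
print(respond_to_conditional_get(condition, "Sat, 09 Nov 2002 12:00:00 GMT"))
print(respond_to_conditional_get(condition, "Mon, 11 Nov 2002 09:30:00 GMT"))
```

A 304 reply carries no body, so the browser reuses its cached copy and the page is not transferred again.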
Request Methods • HEAD—Asks for response like GET, without response body. • GET—Requests a representation of the specified resource. • POST—Submits data to be processed to the identified resource. • PUT—Uploads a representation of the specified resource. • DELETE—Deletes the specified resource. • TRACE—Echoes back the received request. • OPTIONS—Returns HTTP methods supported for a URL. • CONNECT—Converts the request connection to a transparent TCP/IP tunnel. • PATCH—Applies partial modifications to a resource. Note: GET, HEAD, PUT, DELETE, OPTIONS and TRACE are idempotent; POST, PATCH and CONNECT are not
Date Formats Supported date formats: • Sun, 06 Nov 1994 08:49:37 GMT (RFC 822, updated by RFC 1123) • Sunday, 06-Nov-94 08:49:37 GMT (RFC 850, obsoleted by RFC 1036) • Sun Nov 6 08:49:37 1994 (ANSI C's asctime() format) • Recognition of all three is required; the first must be used when dates are generated • All HTTP date/time stamps are required to be in GMT
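One way to accept all three formats while generating only the first is sketched below. The sample strings are the ones used in the HTTP specification; the function names are assumptions, and the weekday and month names are parsed under the default C locale.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Tried in order: RFC 1123, RFC 850, then asctime().
FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",
    "%A, %d-%b-%y %H:%M:%S GMT",
    "%a %b %d %H:%M:%S %Y",
)


def parse_http_date(text: str) -> datetime:
    """Accept any of the three formats and return a UTC datetime."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized HTTP date: {text!r}")


def format_http_date(moment: datetime) -> str:
    """Always generate the RFC 1123 form."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


samples = [
    "Sun, 06 Nov 1994 08:49:37 GMT",
    "Sunday, 06-Nov-94 08:49:37 GMT",
    "Sun Nov  6 08:49:37 1994",
]
for sample in samples:
    print(format_http_date(parse_http_date(sample)))
```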
Content Coding Content coding is used to enable compression. Some types: • Gzip—produced by GNU Zip • Compress—produced by UNIX compress • Deflate—zlib format wrapping data compressed with the deflate algorithm • Identity—default, no compression
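The codings can be tried out with Python's standard library, as a sketch: `gzip` for the gzip coding and `zlib` for deflate. The `compress` coding is left out because the standard library has no LZW implementation, and the function names here are hypothetical.

```python
import gzip
import zlib


def encode_body(body: bytes, coding: str) -> bytes:
    """Apply a content coding to a message body."""
    if coding == "gzip":
        return gzip.compress(body)
    if coding == "deflate":
        return zlib.compress(body)       # zlib container around deflate data
    if coding == "identity":
        return body                      # no transformation
    raise ValueError(f"unsupported content coding: {coding}")


def decode_body(data: bytes, coding: str) -> bytes:
    """Reverse the coding named in Content-Encoding."""
    if coding == "gzip":
        return gzip.decompress(data)
    if coding == "deflate":
        return zlib.decompress(data)
    if coding == "identity":
        return data
    raise ValueError(f"unsupported content coding: {coding}")


page = b"<HTML> " + b"A Trivial example. " * 50 + b"</HTML>"
for coding in ("gzip", "deflate", "identity"):
    encoded = encode_body(page, coding)
    print(coding, len(page), "->", len(encoded))
```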
Proxy Servers • Browser can be configured to contact proxy server instead of original source • Proxy must be configured to cache Web pages • Proxies can reduce traffic to the Internet • HTTP includes explicit support for proxies, variety of control commands
Caching • Caching reduces Internet traffic by saving a page when retrieved • Subsequent requests for a page can be fulfilled by delivering a file from the cache • How long should a page be kept? • Too long: it gets stale • Too short: inefficiency • Server can specify caching details • Browser can specify zero age for retrieved page • Caching should be semantically transparent
Cache Control Directives • Cache-Control: public—browser and proxies may cache the page • Cache-Control: private—proxies may not cache, browsers may • Cache-Control: no-cache—browser must revalidate with the server before serving the page from the cache • Cache-Control: no-store—browser may not cache, may not store • Cache-Control: max-age=<seconds>—browser may cache, but must revalidate with the server if the max-age is exceeded • Cache-Control: must-revalidate—browser must revalidate the page against the server before serving it from cache • Cache-Control: proxy-revalidate—proxy servers must revalidate, but the user’s browser need not revalidate
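How a cache might act on these directives can be sketched as below. This is a simplified model under stated assumptions: the function names are hypothetical, only a few directives are handled, and a response with no `max-age` is conservatively treated as needing revalidation.

```python
def parse_cache_control(value: str) -> dict:
    """Turn 'private, max-age=60' into {'private': None, 'max-age': '60'}."""
    directives = {}
    for item in value.split(","):
        name, _, argument = item.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"') or None
    return directives


def may_store(directives: dict, shared_cache: bool) -> bool:
    """Decide whether a browser (private) or proxy (shared) cache may keep the page."""
    if "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True


def can_serve_without_revalidation(directives: dict, age_seconds: int) -> bool:
    """Decide whether a stored page may be served without asking the server."""
    if "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None:
        return False                      # assumption: be conservative
    return age_seconds < int(max_age)


policy = parse_cache_control("private, max-age=60")
print(policy)
print(may_store(policy, shared_cache=True), may_store(policy, shared_cache=False))
```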
WebDAV WebDAV stands for "Web-based Distributed Authoring and Versioning". It is a set of extensions to the HTTP protocol which allows users to collaboratively edit and manage files on remote web servers.
WebDAV is • The HTTP extensions necessary to enable distributed web authoring tools to be broadly interoperable • A network file system that works on entire files at a time, with good performance in high-latency environments • A protocol for manipulating the contents of a document management system via the Web
WebDAV Goals • To support virtual enterprises, being the primary protocol supporting a wide range of collaborative applications. • The support of remote software development teams. • To leverage the success of HTTP as a standard access layer for a wide range of storage repositories -- HTTP gave them read access, while DAV gives them write access.
WebDAV Features • Locking: long-duration exclusive and shared write locks prevent the overwrite problem. The duration of DAV locks is independent of any individual network connection. • Properties: XML properties provide storage for arbitrary metadata, such as a list of authors on Web resources. These properties can be efficiently set, deleted, and retrieved using the DAV protocol. DASL, the DAV Searching and Locating protocol, provides searches based on property values. • Namespace manipulation: Since resources may need to be copied or moved as a Web site evolves, DAV supports copy and move operations. Collections, similar to file system directories, may be created and listed.
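The locking rules can be illustrated with a toy in-memory model. This is not the WebDAV wire protocol: `LockTable` and its methods are hypothetical, and lock timeouts are ignored. Locks are keyed by a token rather than by a network connection, which reflects the point that a lock outlives any single connection.

```python
class LockTable:
    """Toy model of exclusive and shared write locks on resources."""

    def __init__(self):
        self._locks = {}   # resource path -> list of (lock_token, scope)

    def lock(self, resource: str, lock_token: str, scope: str) -> bool:
        if scope not in ("exclusive", "shared"):
            raise ValueError("scope must be 'exclusive' or 'shared'")
        held = self._locks.setdefault(resource, [])
        # An exclusive lock conflicts with any lock; shared locks only
        # conflict with an existing exclusive lock.
        if held and (scope == "exclusive" or any(s == "exclusive" for _, s in held)):
            return False
        held.append((lock_token, scope))
        return True

    def may_write(self, resource: str, lock_token: str) -> bool:
        held = self._locks.get(resource, [])
        return not held or any(token == lock_token for token, _ in held)

    def unlock(self, resource: str, lock_token: str) -> None:
        held = self._locks.get(resource, [])
        self._locks[resource] = [(t, s) for t, s in held if t != lock_token]


table = LockTable()
print(table.lock("/docs/report.html", "token-alice", "exclusive"))
print(table.may_write("/docs/report.html", "token-bob"))
```

While the first author holds the exclusive lock, a second author's write is refused, which is how the overwrite problem is avoided.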
Summary • WWW consists of hypermedia documents stored on Web servers, accessed by browsers • HTML allows a document to contain text, formatting commands, graphics and links to other documents • HTTP is an application-level protocol that supports negotiation, proxy servers, caching and persistent connections • WebDAV adds write support to HTTP, along with distributed authoring and versioning
Network Management • Simple approach for a single network—use link level protocol for network management • Switches can be instructed to send control packets • Control packets cause receiver to act under control of manager, suspending normal operation • But! The Internet is multiple IP networks interconnected by routers, has no single link level protocol • Single manager can control only homogeneous devices • Controlled entities may not have common link level protocol • Machines under control may be at arbitrary points in the Internet
Internet Network Management • Network management protocols operate at the application layer • Network management uses TCP/IP transport level protocols • Advantage—one set of protocols for all devices • Disadvantage—may be impossible to reach a device not operating properly, can’t do anything unusual with link level protocols
Network Management Example Management client, management agents
SNMP • Simple Network Management Protocol • Defines set of operations and meaning of each • Management Information Base—MIB—required information to be maintained
Structure of Management Information • SMI standard covers MIB variables • Restrictions on types of variables • Rules for naming MIB variables • Rules for defining variable types • Abstract Syntax Notation 1 (ASN.1) is required • Notation readable by humans • Compact representation used by communication protocols
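To make the two ASN.1 representations concrete, the sketch below converts a human-readable object identifier into the compact Basic Encoding Rules (BER) form sent on the wire. The example name `1.3.6.1.2.1.1.1.0` is the standard MIB-II `sysDescr.0` variable, which is not mentioned in these slides; the function names are assumptions, and only short-form lengths are handled.

```python
def encode_base128(value: int) -> bytes:
    """Encode an integer in 7-bit groups, high bit set on all but the last."""
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def encode_oid(dotted: str) -> bytes:
    """Encode a dotted OID as a BER OBJECT IDENTIFIER (tag 0x06)."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs share one value: 40 * first + second.
    content = encode_base128(40 * arcs[0] + arcs[1])
    for arc in arcs[2:]:
        content += encode_base128(arc)
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(content)]) + content


encoded = encode_oid("1.3.6.1.2.1.1.1.0")
print(encoded.hex())
```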
SNMP • Defining SNMP with explicit commands could be very complex • Instead, fetch-store paradigm is used • Operations take place as a side effect of fetch and store operations
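The fetch-store idea can be illustrated with a toy agent in which a reboot is requested by storing a value, not by a dedicated command. The class `ManagedDevice` and the variable names `sysName` and `rebootRequested` are hypothetical stand-ins, not real MIB definitions.

```python
class ManagedDevice:
    """Toy agent: every operation is a fetch or a store on a named variable."""

    def __init__(self):
        self.variables = {"sysName": "router-1", "rebootRequested": 0}
        self.log = []

    def fetch(self, name: str):
        return self.variables[name]

    def store(self, name: str, value) -> None:
        self.variables[name] = value
        # The action happens as a side effect of the store.
        if name == "rebootRequested" and value == 1:
            self._reboot()

    def _reboot(self) -> None:
        self.log.append("rebooting")
        self.variables["rebootRequested"] = 0   # clear the request afterwards


device = ManagedDevice()
device.store("rebootRequested", 1)
print(device.fetch("sysName"), device.log)
```

Because every action is expressed as a store to some variable, new capabilities can be added by defining new variables without changing the protocol's small set of operations.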
SNMP Messages • No fixed formats • Integer that specifies protocol version • Additional header data • Security parameters • Data area • Data areas can be encrypted
Summary • Network management protocols are used to control routers and hosts • Devices are managed by a protocol that runs at the application layer. • SNMP is based on fetch-store paradigm • MIB defines all variables maintained by a managed entity
Some Internet Design Principles • Good Citizen Principle • Conserve Router Time • Make control messages human-readable • Soft state • Dumb Internet layer • Simplicity • Violate layers when needed
Careers One Path
Importance of the Web • “Metcalfe’s Law”—value of a network is proportional to the square of the number of connections (more or less) • Computers are valuable, but they add that value only when they can access the needed information • What if every computer in the world had access to all relevant information? What happens to the value of computing?
Changes Due to the Web • In business—disintermediation • Businesses that sell information from other businesses are disappearing • Travel agents, bookstores, CD stores • In politics—democracy • Rising importance of small campaign donors • Obama campaign • Inability of governments to control information flow • Social networks and insurrections