Introduction to Web Application Introduction to Internet and Web
Topics • Internet Overview • Internet based on TCP/IP • World Wide Web and Browsers • URL/URN/URI • MIME • HTTP
The Internet Has Arrived • BBS: bbs.fudan.edu.cn • E-MAIL: mail.fudan.edu.cn • Instant Messenger: QQ, ICQ, MSN, Yahoo Messenger • E-Commerce: BookStore: www.china-pub.com • Streaming Media: Audio, Video, Text: Shanghai Online Cinema • Network Education: Network School of Fudan Univ. • Traveling Assistant: Ctrip • e-News: news.sina.com.cn • Net Game: cs, sc, Legend of Mir The Internet has arrived; are you ready for it?
How to Access the Internet • By computer • By PDA • By mobile phone • By television • … • By any electronic equipment
Internet Overview • DARPA (Defense Advanced Research Projects Agency) continued its research for an internetworking protocol suite, from the early NCP (Network Control Program) host-to-host protocol to the TCP/IP protocol suite, which took its current form around 1978. • The first real implementations of the Internet were found around 1980 (BITNET and CSNET), when DARPA started converting the machines of its ARPANET to use the new TCP/IP protocols. • In 1983, the transition was completed and DARPA demanded that all computers willing to connect to its ARPANET use TCP/IP.
Internet Overview (cont.) • DARPA also contracted Bolt, Beranek, and Newman (BBN) to develop an implementation of the TCP/IP protocols for Berkeley UNIX on the VAX and funded the University of California at Berkeley to distribute that code free of charge with their UNIX operating system. The first release of the Berkeley Software Distribution to include the TCP/IP protocol set was made available in 1983 (4.2 BSD). • From that point on, TCP/IP spread rapidly among universities and research centers and has become the standard communications subsystem for all UNIX connectivity. • A new national network was created in 1986: NSFnet. • In 1995, a small part of NSFnet returned to being a research network. The rest became known as the Internet.
Internet Definition: Federal Networking Council (FNC) • Refers to the global information system that • Is logically linked together by a globally unique address space based on the Internet Protocol (IP) or its subsequent extensions/follow-ons • Is able to support communications using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite or its subsequent extensions/follow-ons, and/or other IP-compatible protocols • Provides, uses or makes accessible, either publicly or privately, high level services layered on the communications and related infrastructure described herein
The Internet is a huge collection of computers (or other devices: printers, plotters) connected in a communications network • The Internet is primarily a network of networks rather than a network of computers • Every device or machine has a unique address on the Internet • Every device uses TCP/IP or its subsequent extensions/follow-ons to communicate
IP address • IP: Internet Protocol • An IP (version 4) address is a unique 32-bit number, usually written as four decimal octets: 10.13.31.99 • IPv6 was introduced to resolve the shortage of IP addresses
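To make the "32-bit number" on the slide above concrete, the following minimal sketch packs the four octets of the slide's example address into a single integer and cross-checks the result against Python's standard `ipaddress` module. The helper name `dotted_quad_to_int` is an assumption made for illustration only.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack the four decimal octets of an IPv4 address into one 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift in one byte (8 bits) at a time
    return value


addr = "10.13.31.99"  # the example address from the slide
as_int = dotted_quad_to_int(addr)
print(as_int)            # the address as a plain integer
print(f"{as_int:032b}")  # the same value written out as 32 bits

# The standard library performs the same conversion.
assert as_int == int(ipaddress.IPv4Address(addr))

# An IPv6 address is 128 bits wide, which is why it relieves the shortage.
print(ipaddress.ip_address("::1").max_prefixlen)
```

Four octets of 8 bits each give 2^32 (about 4.3 billion) possible addresses, which is the limit IPv6 lifts.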
Domain Name • www.software.fudan.edu.cn • It is a textual name • Like IP addresses, fully qualified domain names must be unique • There is a mapping between IP addresses and domain names • A domain name is translated to an IP address by a DNS (Domain Name System) server
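The name-to-address translation described above can be sketched as a small lookup function. By default it would ask the system resolver through `socket.gethostbyname`; to keep the demonstration independent of any network, the example run uses a made-up table instead. `HYPOTHETICAL_RECORDS`, `table_lookup`, and the name `www.example.edu` are assumptions for illustration, not real DNS data.

```python
import socket

# Made-up records standing in for a DNS server's data (illustration only).
HYPOTHETICAL_RECORDS = {"www.example.edu": "192.0.2.10"}


def table_lookup(name):
    """Stand-in resolver: look the name up in the hypothetical table."""
    key = name.lower().rstrip(".")  # domain names are case-insensitive
    if key not in HYPOTHETICAL_RECORDS:
        raise OSError(f"name not found: {name}")
    return HYPOTHETICAL_RECORDS[key]


def resolve(name, lookup=socket.gethostbyname):
    """Translate a domain name to an IPv4 address string, or None on failure."""
    try:
        return lookup(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


print(resolve("WWW.Example.EDU", lookup=table_lookup))  # found in the table
print(resolve("no-such-host.invalid", lookup=table_lookup))  # not found
```

Calling `resolve("www.software.fudan.edu.cn")` without the `lookup` argument would query the real resolver; whether that name still resolves is not something this sketch assumes.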
Internet and Telephone Network • Like the telephone system, the Internet provides communication • Adoption: • If no one else has the service, it is useless; if everyone else has the service, it is a necessity. • Cost: • It is not a viable business until the cost of service becomes low enough for an average family to have a phone/Internet connection installed • Access: • From limited access to ubiquitous access
Standards for TCP/IP and the Internet • The Internet Society (ISOC) serves as the standardizing body for the Internet community. It is organized and managed by the Internet Architecture Board (IAB). • The IAB itself relies on the Internet Engineering Task Force (IETF) for issuing new standards, and on the Internet Assigned Numbers Authority (IANA) for coordinating values shared among multiple protocols. • The RFC (Request For Comments) Editor is responsible for reviewing and publishing new standards documents. • The IETF itself is governed by the Internet Engineering Steering Group (IESG) and is further organized in the form of Areas and Working Groups where new specifications are discussed and new standards are proposed. • The Internet Standards Process, described in RFC 2026 (The Internet Standards Process, Revision 3), is concerned with all protocols, procedures, and conventions that are used in or by the Internet, whether or not they are part of the TCP/IP protocol suite.
TCP (Transmission Control Protocol) • A connection-based protocol that provides a reliable flow of data between two computers. • The Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), and Telnet are all examples of applications that require a reliable communication channel. • Connection-oriented: setup required between client and server • Reliable transport: between sending and receiving process • Flow control: sender won't overwhelm receiver • Congestion control: throttle sender when network overloaded • Does not provide: timing, minimum bandwidth guarantees
UDP (User Datagram Protocol) • A protocol that sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP. • Sending datagrams is much like sending a letter through the postal service: the order of delivery is not important and is not guaranteed, and each message is independent of any other. • Many firewalls and routers have been configured not to allow UDP packets. If you're having trouble connecting to a service outside your firewall, or if clients are having trouble connecting to your service, ask your system administrator whether UDP is permitted. • As an example, consider a clock server that sends the current time to its clients when requested to do so. If the client misses a packet, it doesn't really make sense to resend it because the time will be incorrect when the client receives it on the second try. If the client makes two requests and receives packets from the server out of order, it doesn't really matter because the client can figure out that the packets are out of order and make another request. • Does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantees.
Application-Layer Protocols • Define messages exchanged by apps and actions taken • Implement services by using the services provided by lower layers.
Ports • Data transmitted over the Internet is accompanied by addressing information that identifies the computer and the port for which it is destined. The computer is identified by its 32-bit IP address, which IP uses to deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and UDP use to deliver the data to the right application. • Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port numbers ranging from 0 to 1023 are restricted; they are reserved for use by well-known services such as HTTP and FTP and other system services. These ports are called well-known ports.
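The connection-oriented, reliable behaviour listed above can be observed on a single machine. The following is a minimal sketch, assuming the loopback address 127.0.0.1 is usable: a listening socket accepts one connection, and every byte the client sends is echoed back in order. The function names `echo_once` and `tcp_round_trip` are hypothetical.

```python
import socket
import threading


def echo_once(listener):
    """Accept one connection and echo bytes back until the peer stops sending."""
    conn, _peer = listener.accept()  # connection setup completes here
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty read: the client has finished sending
                break
            conn.sendall(data)


def tcp_round_trip(message: bytes) -> bytes:
    """Send message over a loopback TCP connection and return the echoed bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        listener.settimeout(5)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(listener,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server no more data follows
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)


print(tcp_round_trip(b"hello over TCP"))
```

The application code never deals with retransmission or ordering; TCP provides both, which is why the loop only has to read until the connection reports end of data.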
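For contrast with TCP, here is a minimal sketch of a single datagram sent over the loopback interface. There is no connection setup: the sender simply addresses a datagram to the receiver's (address, port) pair. The function name `udp_send_receive` and the two-second timeout are assumptions; on a real network the datagram could be lost, which is why the receiver must be prepared to time out.

```python
import socket


def udp_send_receive(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(2)           # UDP gives no delivery guarantee
        # No connect/accept step: the datagram carries its own destination.
        sender.sendto(payload, receiver.getsockname())
        data, _source = receiver.recvfrom(2048)
        return data


print(udp_send_receive(b"12:00:00"))
```

Each call to `sendto` produces one independent message, matching the letter analogy on the slide.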
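The sizes quoted above (a 32-bit address plus a 16-bit port) can be checked with a short sketch. `port_category` and `pack_endpoint` are hypothetical helpers; the 6-byte layout they produce is only an illustration of the field widths, not an actual TCP or UDP header format.

```python
import struct

WELL_KNOWN_MAX = 1023  # upper end of the restricted range named on the slide


def port_category(port: int) -> str:
    """Classify a port number as 'well-known' (0 to 1023) or 'other'."""
    if not 0 <= port <= 0xFFFF:  # 16 bits allow values 0..65535
        raise ValueError(f"port out of range: {port}")
    return "well-known" if port <= WELL_KNOWN_MAX else "other"


def pack_endpoint(ip_octets, port: int) -> bytes:
    """Pack 4 address bytes and a 2-byte port in network (big-endian) order."""
    return struct.pack("!4BH", *ip_octets, port)


print(port_category(80))    # HTTP's port is in the reserved range
print(port_category(8080))  # outside the reserved range
print(pack_endpoint((10, 13, 31, 99), 80).hex())  # 4 + 2 = 6 bytes
```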
Socket • The socket is the software abstraction used to represent the “terminals” of a connection between machines • Sockets provide an abstraction that hides the complexities of getting the bits and bytes on the wire for transmission.
Server-Client Communication through Socket • A socket can be considered as a connection point
What Transport Service Does an Application Need? • Data loss • Some apps (e.g., audio) can tolerate some packet losses. • Other apps (e.g., file transfer, telnet) require 100% reliable data transfer • Bandwidth • Some apps (e.g., multimedia) require a minimum amount of bandwidth to be "effective" • Other apps ("elastic apps") make use of whatever bandwidth they get • Timing • Some apps (e.g., Internet telephony, interactive games) require low delay to be "effective"
Intranet and Internet • An intranet is an internet within an enterprise, while the Internet is global.
Some Important Addresses • W3C, World Wide Web Consortium • http://www.w3c.org • ISO, International Organization for Standardization • http://www.iso.ch • IEC, International Engineering Consortium • http://www.iec.org • Unicode • http://www.unicode.org • IETF, Internet Engineering Task Force • http://www.ietf.org • IANA, Internet Assigned Numbers Authority • http://www.iana.org • OASIS, Organization for the Advancement of Structured Information Standards • http://www.oasis-open.org
WWW • World Wide Web, Web • In 1989, a small group of people led by Tim Berners-Lee at CERN (European Laboratory for Particle Physics) proposed a new protocol for the Internet and a system of document access to use it • It allows a user anywhere on the Internet to search for and retrieve documents in databases on any number of different document servers • For the form of its documents, the system used hypertext: text with embedded links to text in other locations, to allow nonsequential browsing of textual material • Document and page • Hypermedia: the document contains nontextual information
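How to understand Hyper-Text
    void f1(int i) {
        int j;
        j = getJ();
        if (decideEnter(j))   /* like following a link: control jumps into f2 ... */
            f2();
        j = getJ() + i;       /* ... and then comes back here, as when returning to the page */
        …
    }
    void f2() {
        …
    }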
Differences between WWW and Internet • The Internet is a collection of computers and other devices connected by equipment that allows them to communicate with each other. • The Web is a collection of software and protocols that have been installed on most, if not all, of the computers on the Internet • In an abstract sense, the Web is merely a vast collection of documents, some of which are connected by links. These documents are accessed by Web browsers
Web browser • Documents provided by servers on the Web are accessed through browsers, which are programs. In early 1993, NCSA released Mosaic, the first widely used GUI browser. • Browsers are clients on the Web, because they initiate the conversation with the server, which waits for a message from a client before doing anything. • Although browsers support a variety of protocols, the most common one is the Hypertext Transfer Protocol (HTTP) • The most commonly used browsers are Microsoft Internet Explorer and Netscape Navigator.
Web Server • Web servers are programs that provide documents to browsers. • Servers are slave programs: they act only when a browser sends a request • Static Web Server and Dynamic Web Server • Server • Apache • IIS: Internet Information Services
Why WWW • Graphical: text, graphics and other media can coexist on a Web page • Easy to use: hypertext and good WWW browsers are intuitive tools to use • Cross-platform: one big advantage is that the WWW can run on almost any computer • Distributed: information and resources are shared globally • Dynamic: information on the WWW can be constantly updated (unlike a book or CD-ROM). Live information can be assimilated. • Interactive: through forms and other tools (e.g. Java) the WWW can be interactive.
URL - General form: scheme:object-address - The scheme is often a communications protocol, such as telnet or ftp - For the http protocol, the object-address is: //fully-qualified-domain-name/doc-path - For the file protocol, only the doc path is needed - The host name may include a port number, as in zeppo:80 (80 is the default for http, so specifying it is redundant) - URLs cannot include unencoded spaces or any of a collection of other special characters (semicolons, colons, ...); such characters must be percent-encoded - The doc path may be abbreviated as a partial path - The rest is furnished by the server configuration - If the doc path ends with a slash, it refers to a directory
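The parts of a URL named above can be pulled apart with Python's standard `urllib.parse` module. This is a minimal sketch; the helper `describe_url` and the paths `/docs/` and `/home/user/notes.html` are made up for illustration, while the host `zeppo:80` is the slide's own example.

```python
from urllib.parse import quote, urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the pieces discussed on the slide."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # e.g. http, ftp, file
        "host": parts.hostname,      # fully qualified domain name, if any
        "port": parts.port,          # None when the URL does not give one
        "path": parts.path,          # the document path
        "is_directory": parts.path.endswith("/"),
    }


print(describe_url("http://zeppo:80/docs/"))
print(describe_url("file:///home/user/notes.html"))

# Spaces and other special characters must be percent-encoded in a path.
print(quote("my notes;v2.html"))
```

Note that for the `file` scheme the host is empty and only the document path carries information, as the slide states.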
URI, URL and URN • URI: Uniform Resource Identifier. A universal syntax that allows access to objects available using existing protocols, and that may be extended as technology evolves. The specification of the URI syntax does not imply anything about the properties of names and addresses in the various name spaces which are mapped onto the set of URI strings. The properties follow from the specifications of the protocols and the associated usage conventions for each scheme (RFC 2396) • URL: For existing Internet access protocols, it is necessary in most cases to define the encoding of the access algorithm into something concise enough to be termed an address. URIs which refer to objects accessed with existing protocols are known as "Uniform Resource Locators" (URLs) and are listed here as used in the WWW (RFC 1738) • URN: There is currently a drive to define a space of more persistent names than any URLs. These "Uniform Resource Names" are the subject of an IETF working group's discussions (RFC 2141) • URI = URL + URN
Multipurpose Internet Mail Extensions (MIME) - Originally developed for email - Used to specify to the browser the form of a file returned by the server (attached by the server to the beginning of the document) - Type specifications - Form: type/subtype - Examples: text/plain, text/html, image/gif, image/jpeg - The server gets the type from the requested file name's suffix (.html implies text/html) - The browser gets the type explicitly from the server - Experimental types - Subtype begins with x-, e.g., video/x-msvideo - Experimental types require a helper application or plug-in so the browser can deal with the file
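The suffix-to-type mapping described above is what Python's standard `mimetypes` module implements, so it serves as a convenient sketch of what a server does when choosing a type for a file. The helper names `content_type_for` and `split_mime` are hypothetical, and the `application/octet-stream` fallback for unknown suffixes is an assumption of this sketch (a common convention, not something the slide specifies).

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a MIME type from the file name's suffix, as a server might."""
    mime, _encoding = mimetypes.guess_type(filename)
    # Assumed fallback for suffixes the table does not know.
    return mime or "application/octet-stream"


def split_mime(mime: str):
    """Split 'type/subtype' into its two parts."""
    main_type, _slash, subtype = mime.partition("/")
    return main_type, subtype


for name in ("index.html", "logo.gif", "photo.jpg"):
    print(name, "->", content_type_for(name))

main_type, subtype = split_mime("video/x-msvideo")
print(main_type, subtype, "experimental" if subtype.startswith("x-") else "standard")
```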
HTTP: HyperText Transfer Protocol • The basis for WWW • HTTP/1.0: RFC 1945, HTTP/1.1: RFC 2616 • A TCP/IP based client-server system • How does HTTP work? • The client establishes a TCP connection to the server, issues a request, and reads back the server's response • The server denotes the end of its response by closing the connection (the HTTP/1.0 behaviour; HTTP/1.1 keeps the connection open by default) • If the response is an HTML file, usually the file contains pointers (hypertext links) to other files that can reside on other servers • The client needs to re-establish TCP connections to obtain the other required resources (one connection for each resource)
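The connect, request, read-until-close cycle described above can be sketched on one machine. This is a toy illustration under stated assumptions, not a real Web server: a thread sends one canned HTTP/1.0 response and closes the connection, and the client reads until the connection ends. `CANNED_RESPONSE`, `serve_one`, `fetch`, and the path `/index.html` are all made up for the example.

```python
import socket
import threading

# A fixed response the toy server sends to any request (illustration only).
CANNED_RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body>hello</body></html>"
)


def serve_one(listener):
    """Accept one connection, read the request, send the response, then close."""
    conn, _peer = listener.accept()
    with conn:
        conn.recv(4096)  # the request is tiny, so one read is enough here
        conn.sendall(CANNED_RESPONSE)
    # Leaving the with-block closes the connection: the end-of-response signal.


def fetch(path: str = "/index.html") -> bytes:
    """Open a TCP connection, issue a request, and read until the server closes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        listener.settimeout(5)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_one, args=(listener,))
        server.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
            chunks = []
            while True:
                chunk = client.recv(4096)
                if not chunk:  # empty read: the server closed the connection
                    break
                chunks.append(chunk)
        server.join()
    return b"".join(chunks)


head, _separator, body = fetch().partition(b"\r\n\r\n")
print(head.decode("ascii"))
print(body.decode("ascii"))
```

Fetching a second resource with this client would mean calling `fetch` again and opening a new connection, which is the per-resource cost the last bullet on the slide refers to.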
[Figure: request/response exchange between client and server]
HyperText Transfer Protocol
- Request Phase
- Form:
    HTTP method   domain part of URL   HTTP ver.
    Header fields
    blank line
    Message body
- An example of the first line of a request: GET /cs.uccp.edu/degrees.html HTTP/1.1
- Most commonly used methods:
    GET - Fetch a document
    POST - Execute the document, using the data in body
    HEAD - Fetch just the header of the document
    PUT - Store a new document on the server
    DELETE - Remove a document from the server
- Header fields for requests
    - Host: www.fudan.edu.cn
    - Accept: image/jpeg (the subtype can be an *, as in image/*)
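The request layout above (first line, header fields, blank line, body) can be assembled as bytes with a short sketch. `build_request` is a hypothetical helper and `/degrees.html` is a made-up path. The sketch assumes the usual HTTP/1.1 arrangement in which the first line carries the document path and the `Host` header carries the domain name; lines end with CRLF.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends with CRLF; an empty line separates headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/degrees.html",              # hypothetical document path
    "www.fudan.edu.cn",           # host from the slide's Host example
    headers={"Accept": "image/*"},
)
print(request.decode("ascii"))
```

A GET request normally has an empty body, so the message ends right after the blank line.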
HyperText Transfer Protocol
- Response Phase
- Form:
    Status line
    Response header fields
    blank line
    Response body
- Status line format: HTTP version   status code   explanation
- Example: HTTP/1.1 200 OK (Current version is 1.1)
- Status code is a three-digit number; first digit specifies the general status
    1 => Informational
    2 => Success
    3 => Redirection
    4 => Client error
    5 => Server error
- The header field, Content-type, is required
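The status line format and the first-digit classification above translate directly into a small parser. This is a minimal sketch; `parse_status_line` and `STATUS_CLASSES` are names chosen for illustration.

```python
# General status by first digit, as listed on the slide.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line: str):
    """Split 'HTTP-version status-code explanation' and classify the code."""
    fields = line.strip().split(" ", 2)  # the explanation may contain spaces
    if len(fields) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    version, code_text = fields[0], fields[1]
    explanation = fields[2] if len(fields) == 3 else ""
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"status code must be three digits: {code_text!r}")
    code = int(code_text)
    category = STATUS_CLASSES.get(code // 100, "Unknown")
    return version, code, explanation, category


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```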
Summary • IP address and Domain Name • What is WWW • What is Internet • URL • What is HTTP