World-Wide Web • Introduction to the World-Wide Web • Setting up a Web server • Authoring for the Web ... • Gateways and forms • Access control and security
History ... • CERN, Geneva, 1989 • Def HTTP client/server protocol • Sample server • Prog. lib. wwwlib • 1992: callable interface in public domain • ==> Development • Web browsers • New features • Now browsers & servers for all major architectures
Original goals • Link distinct documents • Success • Facilitate collaborative authoring • Not so much, yet
Introduction to the World-Wide Web • Components of Web architecture ... • What's the Web good for? ... • Basic Web Concepts ... • WWW Servers and Browsers ... • Searching on the Web …
Components of Web architecture … • HTML: Hyper Text Markup Language • HTTP: Hyper Text Transfer Protocol • CGI: Common Gateway Interface • Diagram labels: Web clients, Web servers, TCP/IP-based network, CGI, Databases, Contents, Software applications
Powerful linking abilities ... • Highlight words/pictures • ==> Link/point to other • Documents • Sound files • Movie clips • From any point to any point
Most graphical Internet service ... • Need browser w/GUI • Click on link to follow
What's the Web good for? ... • Museums • Newspapers, magazines • Business • Investment information • Libraries, universities • Governments • Individuals
Basic Web Concepts ... • Hyperlinking ... • The HTML Tagging Language ... • The URL Concept ... • What's a WWW Browser? ... • What's a WWW Server? ... • HTTP ...
Hyperlinking ... • Home page: central doc. in Web server • Hyperlinks (underlined) • Navigate: follow hyperlinks (surfing) • Anywhere on Web servers in the world! ...
The HTML Tagging Language ... • Hypertext Markup Language • HTML Documents • Describe structure of doc & hyperlinking info • No exact formatting • Example ...
Example ...
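A minimal sketch of what an HTML document looks like and how a program can read its structure. The sample page, the `LinkCollector` class, and the `extract_links` helper are hypothetical names made up for illustration; only Python's standard `html.parser` module is used. Note that the tags mark structure (title, heading, paragraph, hyperlink) and say nothing about exact fonts or layout.

```python
from html.parser import HTMLParser

# Hypothetical sample document: tags describe structure, not exact formatting.
SAMPLE = """<html>
<head><title>Sample page</title></head>
<body>
<h1>Welcome</h1>
<p>See the <a href="http://www.w3.org">W3C home page</a> for details.</p>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collects (anchor text, URL) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # URL of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text), self._href))
            self._href = None


def extract_links(document):
    """Return every hyperlink in the document as (anchor text, URL)."""
    parser = LinkCollector()
    parser.feed(document)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for text, url in extract_links(SAMPLE):
        print(text, "->", url)
```

Each pair returned holds the two hyperlink components: the anchor text the user clicks and the URL it points to.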
The URL Concept ... • Each hyperlink, 2 components: • Anchor text/graphics • Trigger hyperlink when clicked • Uniform Resource Locator (URL) ...
Uniform Resource Locator (URL) ... • What to do when hyperlink activated • Protocol to reach • Target server • Host system (server name) where doc is • Directory path • Filename • E.g., master list of all public WWW servers in the world • http://info.cern.ch/hypertext/DataSources/WWW/Geographical.html (absolute)
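The URL components listed above can be pulled apart with Python's standard `urllib.parse` module. This is only an illustrative sketch: the helper name `url_parts` and the relative link `Servers.html` are assumptions, not part of the original slides.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

URL = "http://info.cern.ch/hypertext/DataSources/WWW/Geographical.html"


def url_parts(url):
    """Split an absolute URL into protocol, host, directory path, and filename."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.netloc,
        "directory": directory,
        "filename": filename,
    }


if __name__ == "__main__":
    print(url_parts(URL))
    # A relative link (hypothetical name) is resolved against the
    # absolute URL of the document that contains it.
    print(urljoin(URL, "Servers.html"))
```

A relative URL leaves out the protocol and host and is completed from the current document's absolute URL, which is what `urljoin` does here.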
What's a WWW Browser? ... • Client/server environment • Browser • Use URL • Retrieve Web document from Web server • Interpret HTML • Present document to user • Bitmap graphics ==> more than char. based • E.g., hyperlink: color/underline vs. reverse video • Example ...
Example ... • Protocol depends on server • Some browsers can access WAIS servers • Can start up telnet sessions • Can save, email, print, search, see HTML source
What's a WWW Server? ... • Software that serves Web documents • Client asks for page; server fetches & returns it • Special scripts • Gateways to other info resources • E.g., input from forms • Custom scripts to process
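A rough sketch of a custom script that processes form input in the CGI style: the server passes the form data in the `QUERY_STRING` environment variable, and the script writes a header block, a blank line, and an HTML body. The function name `handle_form` and the form field `name` are hypothetical; a real gateway would depend on the server's configuration.

```python
import html
import os
from urllib.parse import parse_qs


def handle_form(environ):
    """Build a CGI-style response (header, blank line, body) from form input."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # parse_qs maps each field name to a list of values; take the first one.
    name = fields.get("name", ["stranger"])[0]
    # Escape user input before placing it inside HTML.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a server as a CGI script, the environment carries the request.
    print(handle_form(os.environ), end="")
```

The `Content-Type` header tells the browser how to interpret the data that follows, here `text/html`.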
HTTP ... • HyperText Transfer Protocol • Request browser --> server ==> new connection • Open connection • Transfer document • Close connection • Data types between server and browser • Text (text/html, text/plain, ...) • Image (image/gif, image/tiff, ...) • E.g., start xv to display (Preferences: Helper App's)
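The open, transfer, close cycle can be sketched with Python's standard library. Everything named here (`DemoHandler`, `fetch`, `demo`, the page content, and the path `/index.html`) is a made-up illustration; the toy server runs on the local machine only and speaks HTTP/1.0, so it closes the connection after each document.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: returns one small HTML page for any GET."""

    def do_GET(self):
        body = b"<html><body>Hello, Web</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")  # data type of the body
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    """Open a connection, send one request, read until the server closes."""
    with socket.create_connection((host, port)) as conn:       # open connection
        request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:                                            # transfer document
            data = conn.recv(4096)
            if not data:                                       # server closed connection
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def demo():
    """Start the toy server on a free local port, fetch one page, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, headers, body = demo()
    print(status)
    print(headers["content-type"])
    print(body.decode("ascii"))
```

The browser uses the `Content-Type` value to decide whether to render the data itself or hand it to a helper application.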
WWW Servers and Browsers ... • Browsers ... • Servers ...
Browsers ... • Netscape Navigator/Communicator • Internet Explorer • Mosaic (historic importance) • Lynx (dumb terminal) • Amaya (free) • Browser/editor (http://www.w3.org/Amaya) • Opera (not free)
Mosaic ... • The first widely available graphical browser • Made it happen!
Netscape Navigator ... • Netscape Communications Corp. (former Mosaic Comm.)
Lynx ... • Full-screen, character-based • VT100 terms, emulators • Arrow keys to navigate among HTML links (reverse video) • Bookmarks • Forms • Interactive mode: post articles to newsgroups • Non-interactive mode: filter HTML to formatted ASCII • ftp://ftp2.cc.ukans.edu/pub/lynx/
Many others • Came and went • Maybe 10% of today’s users
Servers ... • NCSA • CERN • Apache (48% of all servers) • Netscape Enterprise • O’Reilly’s WebStar • Other
NCSA ... • Public domain, written in C, small, fast • No licensing restrictions • Compatible with most HTTP browsers • Directory aliasing: doc's can be served from any physical directory structure • Searches, HTML forms, clickable image maps, control user access. • http://hoohoo.ncsa.uiuc.edu/docs/setup/PreCompiled.html
CERN ... • Public domain, written in C • No licensing restrictions • Compatible with most HTTP browsers • Directory aliasing: doc's can be served from any physical directory structure • Searches, clickable image maps, control user access • Proxy & caching • On firewall machines access to outside world from inside
Work in progress ... • See http://www.w3.org • User Interface • Technology & Society • Architecture • Web Accessibility Initiative • Other
User Interface • HTML Enhancements • http://www.w3.org/MarkUp.html • Style Sheets • Document Object Model • Mathematics • Graphics • Internationalization • Fonts • Amaya
Technology & Society • Digital Signature Initiative • Metadata • PICS • Privacy • Security • Electronic Commerce
Architecture • HTTP • Synchronized Multimedia • XML • Jigsaw • Libwww
Web Accessibility Initiative • Accessibility of the Web • Through five primary areas of work: • Technology • Guidelines • Tools • Education & outreach • Research & development
Secure Transactions ... • Credit card numbers • Signatures (electronic) • Legally binding time stamps • Secure HTTP development • Authority of transactions • Confidentiality of info exchanged • http://www.commerce.net/information/standards/drafts/shttp.txt • email: shttp-info@eit.com
Uniform Naming ... • URL: does not uniquely identify a doc • Only an instance of the doc • No account for mirrored doc's, versions out of date • URN: Uniform Resource Name • Unique doc identifier • Like ISBN for books • Return URL of closest copy? • URC: Uniform Resource Citation (just a diff name)
Commercialization ... • Bundle browser w/systems • License browsers + sell • ==> Web onto more desktops • Info providers + publishers • Secure web ==> electronic commerce
Searching on the Web ... • Catalogs, directories • Manual • E.g., Yahoo • Indices, search engines • Automatic • E.g., AltaVista
Search Engines ... • "Robots," "worms," "spiders," "crawlers," meta searchers • No ultimate search tool on the Web • Different search strategies • ==> Different results • Partial list ... • More info ...
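A toy sketch of how a robot might follow links and build an automatic index. The three pages, their text, and the `example` hosts are invented for illustration, and real search engines use far more elaborate strategies; this only shows the basic idea of crawling breadth-first and looking words up in an inverted index.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://a.example/": ("web servers and browsers",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("search engines index the web",
                          ["http://a.example/"]),
    "http://c.example/": ("robots and spiders crawl links", []),
}


def crawl(start, pages):
    """Follow links breadth-first from start; return URLs in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # dead link: nothing to fetch
        order.append(url)
        for link in pages[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, pages):
    """Map each word to the set of URLs whose text contains it."""
    index = {}
    for url in urls:
        for word in re.findall(r"[a-z]+", pages[url][0].lower()):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    visited = crawl("http://a.example/", PAGES)
    index = build_index(visited, PAGES)
    print(visited)
    print(sorted(search(index, "web")))
```

Changing the starting page, the traversal order, or the matching rule changes what gets found, which is one reason different engines return different results.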
Partial list ... • Metacrawler ... • Alta Vista ... • WebCrawler ... • Lycos • InfoSeek • Copernic
Comparison ... • IEEE Internet Computing, July/August 1998, pp. 78-83, http://computer.org/internet/v2n4/w4arach.htm • http://www.cnet.com/Contents/Reviews/Compare/Search/ (old)
More info ... • "New Spiders Roam the Web" • http://www.rpi.edu/~decemj/cmc/mag/1994/sep/spiders.html • "WWW Robots, Wanderers, and Spiders" • http://web.nexor.co.uk/mak/doc/robots/robots.html • Netscape Web page • http://home.mcom.com/home/internet-search.html • Click Net Search button in Netscape
Internet Research Tool Kit • Links to most search & research tools • http://www.cs.uml.edu/~haim/NetResearch_links.html