The Latest Web Developments. Email B.Kelly@ukoln.ac.uk URL http://www.ukoln.ac.uk/. Brian Kelly UK Web Focus UKOLN University of Bath Bath, BA2 7AY. UKOLN is supported by:. About Me. Brian Kelly: UK Web Focus – a JISC-funded post to advise HE and FE communities on Web developments
The Latest Web Developments • Brian Kelly, UK Web Focus, UKOLN, University of Bath, Bath, BA2 7AY • Email: B.Kelly@ukoln.ac.uk • URL: http://www.ukoln.ac.uk/ • UKOLN is supported by:
About Me • Brian Kelly: • UK Web Focus – a JISC-funded post to advise HE and FE communities on Web developments • Based in UKOLN (UK Office for Library and Information Networking) – a small applied research organisation at the University of Bath • Involved in the Web since 1993, while working in the Computing Service at the University of Leeds • Close links with Computing Service and Library communities
About You • What is your involvement with the Web? • What topics would you like covered today?
Possible Interests • Web applications • Web Services • Web Architectures • Web Standards • Web browsers • Netscape or Microsoft? • XML • RDF • File formats • Content Management Systems • Hyperlinking • Legal issues • Technologies • Open source vs licensed apps • When is it going to stabilise? • What’s happening to HTML?
Contents • Standards and the Web • The Original Web Architecture • The Problems • Architectural Developments • Metadata • New Developments • Deployment Issues • Discussion
Standards, Architectures, Applications, Resources • This talk touches on several areas: • Standards: concerned with protocols and file formats (open standards vs. proprietary; HTML / XML vs. PDF; CSS / XSL vs. HTML) • Architectures: models for implementing systems (which standards are applicable; NT / Unix; file system / database application; HTML tools / content management) • Applications: software products used to implement systems (Apache / IIS; FrontPage / Dreamweaver; Oracle / SQLServer; ColdFusion vs ASP) • Resources: financial and staff costs needed to implement systems (development vs. migration costs; use of in-house expertise; in-house vs. out-sourced; licensed vs. open source)
Standards • Need for standards to provide: • Platform independence • Application independence • Avoidance of patented technologies • Flexibility ("evolvability" - Tim Berners-Lee) • Architectural integrity • Long-term access to data • Ideally look at standards first, then find applications which support the standards • Difficult to achieve this ideal!
Deployment Issues • What part of the spectrum are you closest to? • At one end: "Must support standards" • At the other end: "Go with the marketplace"
I Support Standards • But: • You probably use PowerPoint, don't you? • Software vendors will subtly suck you into use of proprietary features • Home-grown solutions can be expensive (where are all the good Perl / C programmers willing to work on short-term contracts for a pittance in Universities?) • Standards may not take off – remember Coloured Book network protocols? • Proprietary solutions may become standardised • Standards may not yet be available (or finalised) • Do users want standards? Will "We support standards" conflict with "Our services are based on user requirements"?
I Follow The Marketplace • Good New Labour philosophy, but: • Can you trust your software vendor? • Will your software vendor be around in a few years' time? ("I only buy Rover") • Will your system be interoperable? • What happens when you want to interwork with partners or your organisation merges / is taken over? • What happens when you want to extend your system beyond the limits set by your software vendor? • IBM was the market leader in the 1970s, but lost out in the PC revolution • What will happen if Microsoft is split in two?
Some Difficulties • We should acknowledge some difficulties in a standards-based approach: • Keeping up-to-date (look at the number of documents at http://www.w3c.org/TR/ and the size of http://www.diffuse.org/standards.html) • Spotting the winning standards • Implementing the standard in a timely way • Dealing with the problems of the software vendor • Resources!
Standardisation • W3C: • Produces W3C Recommendations on Web protocols • Managed approach to developments • Protocols initially developed by W3C members • Decisions made by W3C, influenced by member and public review • IETF: • Produces Internet Drafts on Internet protocols • Bottom-up approach to developments • Protocols developed by interested individuals • "Rough consensus and working code" • ISO: • Produces ISO Standards • Can be slow moving and bureaucratic • Produces robust standards • Proprietary: • De facto standards • Often initially appealing (cf PowerPoint) • May emerge as standards • Other: • Standards bodies such as ECMA • Community groups which can agree on, say, profiles • Example labels on the diagram: HTML extensions; PDF and Java?; PNG, HTML, Z39.50, Java?; HTTP, URN, whois++; PNG, HTML, HTTP
World Wide Web Consortium • Much of the development of Web standards is being coordinated by the W3C: • W3C (World Wide Web Consortium): • International consortium, with headquarters at MIT, INRIA and Keio University (Japan) • Coordinates development of web protocols • Four domains: • Architecture • Technology & Society • User Interface • Web Accessibility
The Web Vision • Tim Berners-Lee's vision for the Web: • Automation of information management: If a decision can be made by machine, it should • All structured data formats should be based on XML • Migrate HTML to XML • All logical assertions to map onto RDF model • All metadata to use RDF A useful overview of Tim Berners-Lee's vision for the Web is given in his book Weaving The Web.
How Does The Web Work? • The Web has 3 fundamental concepts: • URLs: addresses of resources • HTTP: dialogue between client and server • HTML: format of resources • 1 User clicks on the link "The Netsoft home page", which points to the address (URL) http://www.netsoft.com/hello.html • 2 Browser converts link to HTTP command (METHOD): connect to computer at www.netsoft.com and send GET /hello.html • 3 Remote computer (Web server) sends file: <HTML> <TITLE>Welcome</TITLE>.. <P>Welcome to <B>Netsoft</B> • 4 Local computer (Web browser, the client) displays HTML file: Welcome to Netsoft
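Step 2 of the sequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `target_host` and `build_get_request` are invented here, the request is written as HTTP/1.0, and nothing is sent over the network.

```python
from urllib.parse import urlsplit


def target_host(url):
    """Return the (host, port) a browser would connect to for this URL."""
    parts = urlsplit(url)
    # Port 80 is the default for the http scheme when none is given.
    return parts.hostname, parts.port or 80


def build_get_request(url):
    """Return the text of a simple HTTP/1.0 GET request for this URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Request line, an optional Host header, then a blank line to end the request.
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, parts.hostname)


url = "http://www.netsoft.com/hello.html"
host, port = target_host(url)
request = build_get_request(url)
print(host, port)
print(request)
```

The URL supplies both pieces of information: the host name says which computer to connect to, and the path says which resource to ask for once connected.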
Web Protocols • Web initially based on three simple protocols: • Data Formats: HTML (HyperText Markup Language) provides the data format for native documents • Addressing: URLs (Uniform Resource Locators) provide an addressing mechanism for web resources • Transport: HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server
HTML 4.0, CSS 2.0 & DOM 1.0 • HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and DOM 1.0 provides an architecturally pure, yet functionally rich environment • HTML 4.0 • Improved forms • Hooks for stylesheets • Hooks for scripting languages • Table enhancements • Better printing • CSS 2.0 • Support for all HTML formatting • Positioning of HTML elements • Multiple media support • DOM 1.0 • Document Object Model • Hooks for scripting languages • Permits changes to HTML & CSS properties and content • CSS Problems • Changes during CSS development • Netscape & IE incompatibilities • Continued use of browsers with known bugs
CSS http://www.w3c.org/Style/CSS/ • CSS: • Cascading Style Sheets • An open standard developed by W3C • Separates document structure (defined in HTML/XML) from the appearance • Makes maintenance of resources much easier • Stylesheet (sty.css): body {background: blue;} h1 {font-family: arial;} p {font-family: times; text-align: justify;} • HTML file: <link rel="stylesheet" href="sty.css"> <h1>Heading</h1> <p>…</p> • Imagine 10,000 HTML files .. with 1 CSS file
Limitations • HTML 4.0 / CSS 2.0 have limitations: • Difficulties in introducing new elements • Time-consuming standardisation process (<ABBREV>) • Dictated by browser vendor (<BLINK>, <MARQUEE>) • Area may be inappropriate for standardisation: • Covers specialist area (maths, music, ...) • Application-specific (<STUD-NUM>) • HTML is a display (output) format • HTML's lack of arbitrary structure limits functionality: • Find all memos copied to John Smith • How many unique tracks are there on Spice Girls CDs?
XML • XML: • Extensible Markup Language • A lightweight SGML designed for network use • Addresses HTML's lack of evolvability • Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc) • Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998 • Support from industry (SGML vendors, Microsoft, etc.) • Support in Netscape 6 (?) and IE 5
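A query such as "find all memos copied to John Smith" becomes simple once a document carries its own structure. The sketch below uses Python's standard `xml.etree.ElementTree` module. The memo markup (`<memos>`, `<memo>`, `<to>`, `<cc>`, `<subject>`) and the sample data are invented for illustration and are not taken from any real DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup: the element names are made up for this example.
MEMOS = """
<memos>
  <memo id="1"><to>Ann Jones</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo id="2"><to>John Smith</to><subject>Timetable</subject></memo>
  <memo id="3"><to>Raj Patel</to><cc>John Smith</cc><cc>Ann Jones</cc><subject>Web site</subject></memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the ids of memos that have a <cc> element naming this person."""
    root = ET.fromstring(xml_text)
    return [
        memo.get("id")
        for memo in root.iter("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


copied = memos_copied_to(MEMOS, "John Smith")
print(copied)
```

Memo 2 is addressed to John Smith but not copied to him, so it is excluded. In HTML the same distinction would have to be guessed from the displayed text.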
XML Concepts • Well-formed XML resources: • Make end-tags explicit: <li>...</li> • Make empty elements explicit: <img .../> • Quote attributes: <img src="logo.gif" height="20"/> • Use consistent upper/lower case • Valid XML resources: • Need DTD • XML Namespaces: • Mechanism for ensuring unique XML elements: <p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>
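The well-formedness rules can be checked mechanically, because an XML parser must reject anything that breaks them. The following sketch uses Python's standard `xml.etree.ElementTree` module. The helper name `is_well_formed` is invented here, and the namespace declaration uses the `xmlns:i` attribute form from the final Namespaces in XML Recommendation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML habits that XML does not allow: omitted end-tags and unquoted attributes.
print(is_well_formed("<ul><li>One<li>Two</ul>"))
print(is_well_formed("<img src=logo.gif height=20/>"))

# The same content written to the rules listed above.
print(is_well_formed("<ul><li>One</li><li>Two</li></ul>"))
print(is_well_formed('<img src="logo.gif" height="20"/>'))

# Namespaces: the prefix "i" is shorthand for a URI, which makes PART unique.
doc = '<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>'
part = ET.fromstring(doc).find("{http://foo.org/1998-001}PART")
print(part.tag, part.text)
```

Note that a well-formedness check says nothing about validity. Checking a document against a DTD needs a validating parser, which this sketch does not use.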
More XML Developments • Momentum behind XML is driving additional standardisation developments: • XPath (XML Path Language): a language for addressing parts of an XML document, designed to be used by XSLT and XPointer • XML Schemas: defining the nature of XML schemas and their component parts • XSLT: a language for transforming XML documents into other XML documents • …
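As a rough illustration of path-based addressing, Python's standard `xml.etree.ElementTree` module supports a limited subset of XPath. The catalogue markup and the placeholder track titles below are invented for this sketch. It answers the earlier question of how many unique tracks appear on one artist's CDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue; element names and track titles are placeholders.
CATALOGUE = """
<catalogue>
  <cd artist="Spice Girls" title="Album one">
    <track>Song A</track><track>Song B</track>
  </cd>
  <cd artist="Spice Girls" title="Album two">
    <track>Song B</track><track>Song C</track>
  </cd>
  <cd artist="Another Band" title="Other album">
    <track>Song D</track>
  </cd>
</catalogue>
"""

root = ET.fromstring(CATALOGUE)

# Path expression: every <track> inside a <cd> whose artist attribute matches.
tracks = root.findall(".//cd[@artist='Spice Girls']/track")

# A set removes titles that appear on more than one CD.
unique_titles = sorted({track.text for track in tracks})
print(len(tracks), unique_titles)
```

A full XPath or XSLT processor offers much more than this subset, but the principle is the same: the path selects nodes by structure rather than by position on the page.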
XHTML • XHTML: • Extensible Hypertext Markup Language • HTML represented in XML • Some small changes to HTML: • Elements in lowercase (<p> not <P>) • Attributes must be quoted (<img src="logo" height="50" />) • Elements must be closed (<p>..</p>) • Empty elements must be closed (<img src="logo" . />) • Gain benefits from XML • Tools available (e.g. HTML-Kit from http://www.chami.com/html-kit/) • See <http://www.webreference.com/xml/column6/>, <http://groups.yahoo.com/group/XHTML-L/> and <http://www.ariadne.ac.uk/issue27/web-focus/>
Transport • HTTP/0.9 and HTTP/1.0: • Design flaws and implementation problems • HTTP/1.1: • Addresses some of these problems • 60% server support • Performance benefits! (60% packet traffic reduction) • Is acting as fire-fighter • Not sufficiently flexible or extensible • HTTP/NG: • Radical redesign using object-oriented technologies • Undergoing trials • Gradual transition (using proxies) • Moving slowly
Addressing • URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/) have limitations: • Lack of long-term persistency • Organisation changes name • Department shut down or merged • Directory structure reorganised • Inability to support multiple versions of resources (mirroring) • URNs (Uniform Resource Names): • Proposed as solution • Difficult to implement (no W3C activity in this area)
Addressing - Solutions • PURLs (Persistent URLs): • Provide single level of redirection • DOIs (Digital Object Identifiers): • Proposed by publishing industry as a solution • Aimed at supporting rights ownership • Business model needed • OpenURLs • Address mirroring issues • Pragmatic Solution: • URLs don't break - people break them • Design URLs to have long life-span • Further information: <URL: http://www.ukoln.ac.uk/metadata/resources/urn/> <URL: http://www.w3.org/Provider/Style/URI>
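The "single level of redirection" behind a PURL can be sketched as a lookup table. This is a simplified model, not the software a real PURL server runs: the table, the path `/net/music`, the function name `resolve` and the replacement address are all hypothetical, and the redirect is modelled as an HTTP 302 response.

```python
# Hypothetical PURL table: persistent path -> current location of the resource.
PURLS = {
    "/net/music": "http://www.bristol-poly.ac.uk/depts/music/",
}


def resolve(path, table):
    """Return (status, location) as a redirecting server might answer."""
    target = table.get(path)
    if target is None:
        return 404, None   # unknown persistent name
    return 302, target     # redirect the client to the current location


first = resolve("/net/music", PURLS)

# The organisation changes its name: only the table entry is updated.
# The new address is a made-up placeholder.
PURLS["/net/music"] = "http://www.renamed-university.example/music/"
second = resolve("/net/music", PURLS)

print(first)
print(second)
```

Links that cite the persistent path keep working after the move, because only the table changes. The cost is that someone must keep the table up to date.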
Metadata • [Diagram: Addressing: URL, extended by URNs, DOIs; Data format: HTML, extended by HTML 4.0, CSS, XML; Transport: HTTP, extended by HTTP/1.1, HTTP/NG; Metadata: RDF (PICS, TCN, MCF, DSig, DC, ...)] • Metadata - the missing architectural component from the initial implementation of the Web • Metadata Needs: • Resource discovery • Content filtering • Authentication • Improved navigation • Multiple format support • New devices • Rights management
Metadata Examples • DSig (Digital Signatures initiative): • Key component for providing trust on the web • DSig 2.0 will be based on RDF and will support signed assertions: • This page is from the University of Bath • This page is a legally-binding list of courses provided by the University • P3P (Platform for Privacy Preferences): • Developing methods for exchanging the privacy practices of Web sites and users • Note that discussions about additional rights management metadata are currently taking place
RDF • RDF (Resource Description Framework): • Highlight of WWW 7 conference • Provides a metadata framework ("machine understandable metadata for the web") • Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF) • Based on a formal data model (directed labelled graphs) • Applications include: • cataloguing resources – resource discovery • electronic commerce – intelligent agents • intellectual property rights – privacy • See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>
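The RDF data model can be pictured as a set of statements, each an arc in a graph running from a subject, via a labelled property, to a value. The sketch below stores such statements as plain Python tuples. The page address, the staff address and the property values are invented sample data. Only the Dublin Core namespace URI is real, and no RDF library is used.

```python
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"          # Dublin Core element namespace
PAGE = "http://example.org/courses.html"          # hypothetical resource
OFFICE = "http://example.org/staff/registrar"     # hypothetical resource

# Each statement is (subject, property, value): one labelled arc in the graph.
STATEMENTS = [
    (PAGE, DC + "title", "List of courses"),
    (PAGE, DC + "publisher", "University of Bath"),
    (PAGE, DC + "creator", OFFICE),
    (OFFICE, DC + "title", "Registrar's office"),
]


def build_graph(statements):
    """Index the statements by subject so arcs can be followed."""
    graph = defaultdict(list)
    for subject, prop, value in statements:
        graph[subject].append((prop, value))
    return graph


def values(graph, subject, prop):
    """Return every value reached from subject along arcs labelled prop."""
    return [v for p, v in graph.get(subject, []) if p == prop]


graph = build_graph(STATEMENTS)
creator = values(graph, PAGE, DC + "creator")[0]
print(values(graph, PAGE, DC + "publisher"))
print(values(graph, creator, DC + "title"))
```

Because the value of one statement can be the subject of another, software can follow arcs from resource to resource. This is what lets independently developed vocabularies be combined.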
RSS – An RDF Application • RSS (Rich Site Summary): • Now an RDF application • Used for news feeds • Of interest to JISC (DNER architecture) • Lightweight approach that we should be investigating See example of an RSS authoring tool and parser at <http://rssxpress.ukoln.ac.uk/>. Note this service uses CGI – a JavaScript solution is also being developed.
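Reading a news feed of this kind takes little code. The sketch below parses a small RSS 1.0 document with Python's standard `xml.etree.ElementTree` module. The feed content and the example.org addresses are invented, and the code is not related to the RSSxpress service mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RSS 1.0 documents.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# Hypothetical feed with two news items.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/news.rss">
    <title>Example news</title>
    <link>http://example.org/</link>
    <description>Sample channel for illustration</description>
  </channel>
  <item rdf:about="http://example.org/a">
    <title>First story</title>
    <link>http://example.org/a</link>
  </item>
  <item rdf:about="http://example.org/b">
    <title>Second story</title>
    <link>http://example.org/b</link>
  </item>
</rdf:RDF>"""


def headlines(feed_text):
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_text)
    return [
        (item.findtext("rss:title", namespaces=NS),
         item.findtext("rss:link", namespaces=NS))
        for item in root.findall("rss:item", NS)
    ]


items = headlines(FEED)
for title, link in items:
    print(title, link)
```

A site can embed the resulting headlines in its own pages, which is the kind of lightweight news sharing the slide has in mind.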
RDF Conclusion • RDF is a general-purpose framework • RDF provides structured, machine-understandable metadata for the Web • Metadata vocabularies can be developed without central coordination • RDF Schemas describe the meaning of each property name • Signed RDF is the basis for trust • But: • Is RDF too complex? • Will it gain acceptance in the market place? • The jury is still out
Other Web Developments • Many Web standards developments are taking place outside W3C: • UDDI (Universal Description, Discovery, and Integration) – a way of describing Web services in a machine readable way to facilitate location of services by agents. See <http://www.uddi.org/> • Biztalk – a framework for developing XML schemas for B2B applications. See <http://www.biztalk.org/> • SOAP (Simple Object Access Protocol) - an XML protocol for exchange of information. See <http://www.w3.org/TR/SOAP>
New Web Areas • Initially the Web provided: • An open environment for sharing information • And aimed to: • provide a rich publishing and collaborative environment • The Web is now: • Widely used in closed environments (Intranets and Extranets, for ecommerce, etc.) • Addressing the missing components from the original architecture • Addressing universality by providing the infrastructure for support of new devices
E-commerce Example 1 http://www.w3.org/Signature/ • E-commerce: • Requires trust • Requires security • Is there a viable business model? • Developments: • Digital signatures • Public Key Infrastructure • Athens and Sparta in UK HE
The Mobile Web Example 2 • The Mobile Web: • Much hype at present • Have you used it? • Is it usable on such a small screen with slow network times? • What about the resources needed to build a WAP site and a Web site?
The Mobile Web Comments • Store resources in neutral format (XML) and generate WAP and Web • XML: open storage format • XSLT: transform XML • [Diagram: XML and XSLT rules feed an XSLT engine, which outputs a WML file for WAP, XHTML for Web, and an Ebook format] • 3G promises multimedia and faster networks
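The single-source idea can be sketched without an XSLT engine. In the example below, plain Python functions stand in for the XSLT rules: one neutral XML record is turned into an XHTML page and a WML card. The `<notice>` markup, the sample text and the function names are invented for illustration, and a production system would more likely use real XSLT stylesheets.

```python
import xml.etree.ElementTree as ET

# Hypothetical neutral source record.
SOURCE = ("<notice><title>Library opening hours</title>"
          "<body>Open 9am to 9pm on weekdays.</body></notice>")


def to_xhtml(notice):
    """Build an XHTML page for desktop Web browsers."""
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = notice.findtext("title")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = notice.findtext("title")
    ET.SubElement(body, "p").text = notice.findtext("body")
    return ET.tostring(html, encoding="unicode")


def to_wml(notice):
    """Build a single-card WML deck for WAP phones."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", {"id": "notice",
                                       "title": notice.findtext("title")})
    ET.SubElement(card, "p").text = notice.findtext("body")
    return ET.tostring(wml, encoding="unicode")


notice = ET.fromstring(SOURCE)
xhtml = to_xhtml(notice)
wml = to_wml(notice)
print(xhtml)
print(wml)
```

The content is written once. Supporting another device means adding another output function (or stylesheet) rather than re-authoring the information.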
Is It Worth It? • Has the Web stabilised? • Are you thinking about WAP services? • Will you want to (be forced to) make your Web service accessible? • Will you want to deploy personalised interfaces (e.g. My.Oxford.ac.uk)? • Will your web service move from information provision to e-business? • Do you want your University web site to use business-to-business (B2B) protocols to automate transfer of link and news items to HERO?
What Should I Do? • How can I best exploit new developments? • Storing information in a structured format makes subsequent redevelopment easier • Be driven initially by standards and architectural considerations, not by applications • Consider use of more sophisticated web management tools, rather than HTML authoring tools • An organisational standards guidelines document (part of a Web Strategy document) may be useful • Don't work in isolation: • Monitor standards development (e.g. W3C) • Listen to others in your community • Talk and discuss issues within your community
Authoring • Authoring Web pages: • Was easy • Becoming more difficult as Web becomes more complex • More difficult to maintain • For large Web sites there is a need for: • More sophisticated tools e.g. content management systems • Tailoring content for devices?
Architectural Models • There is a need for more intelligent software which can process structured resources or reformat unstructured ones • [Diagram, first model: Web server simply sends file (HTML resource) to browser; file contains redundant information (for old browsers) plus client interrogation support] • [Diagram, second model: intelligent Web server delivers HTML / XML / database resource to browser, via a server proxy and a client proxy] • Intermediaries can provide functionality not available at client: • DOI support • XML support • Format conversion
Architectural Models – e.g. XML Deployment • Ariadne issue 15 has an article on "What Is XML?" • Describes how XML support can be provided: • Natively by new browsers • Back end conversion of XML to HTML • Client-side conversion of XML to HTML / CSS • Java rendering of XML • Examples of intermediaries • See http://www.ariadne.ac.uk/issue15/what-is/
Conclusions • To conclude: • The Web will continue to develop • Standards are important • Proprietary solutions are often tempting because: • They are available • They are often well-marketed and well-supported • They may become standardised • Solutions based on standards may not be properly supported by applications • Metadata is a big growth area • Intermediaries may have a role to play in deploying standards-based solutions • There is a continual need to keep informed
Questions • Any questions?