Standards In A Digital World: Z39.50, HTML, Java: Do They Really Work? Brian Kelly UK Web Focus UKOLN University of Bath B.Kelly@ukoln.ac.uk http://www.ukoln.ac.uk/
Contents • Introduction • HTML • Initial Roadmap / The Diversion / Back on Course • W3C Standardisation Process • Rivals to HTML • PDF • Viewers • Scripting • Client-side Scripting Languages • Server side Scripting • Distributed Searching • Z39.50 • Other Protocols • Conclusions
UK Web Focus UK Web Focus: • National web coordination post for UK HE community • Based at UKOLN, University of Bath • Responsibilities include: • Technology watch • Information dissemination in variety of ways: • Workshops (national, regional) • Presentations at conferences and seminars • Online • Coordination activities • Representing JISC on W3C • Brian Kelly appointed on 1st November 1996 • Involved with web since January 1993 • Previously worked at University of Newcastle, Leeds, Liverpool, and Loughborough
The Question Where do you stand? The success of the Web is based on building on open, non-proprietary standards. Use of proprietary systems has increased costs for the user, and resulted in flawed systems. The success of the Web is based on competition in the marketplace. Just look at the benefits provided by competition between Netscape and Microsoft.
HTML Roadmap • HTML 1.0: Gets things started • HTML 2.0: CERN / NCSA partnership introduces NCSA Mosaic with support for forms and inline images • HTML+: Proposal for enhancements including improved layout control (e.g. tables), maths, etc. • Style Sheets: Mechanism for defining appearance. Structure separate from appearance. Various proposals (DSSSL, CSS, …)
HTML History • HTML 1.0: Unpublished specification. DTD developed by Tim Berners-Lee (CERN). • HTML 2.0: Spec. based on innovations from NCSA (forms and inline images!) • HTML 3.0: Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation process. Little implementation experience. • HTML 3.2: Spec. based on description of mainstream innovations in marketplace • HTML 4.0: Current proposal.
HTML Wars • October 1994: Netscape released (Mosaic Communications Corporation). Quality browser, but supported proprietary tags (<BLINK>, <FONT>, etc.) • 1995: New versions of Netscape released, supporting additional proprietary tags (<SPACER>, <LAYER>, etc.) • 1996: Microsoft respond to competition with their own proprietary tags (<MARQUEE>, etc.)
HTML Wars - The Problems Device Dependency • Resources are dependent on a particular browser • Platform dependency Costs • Costs in supporting authoring tool • Potential costs in re-engineering Architecture • Proprietary innovations have been flawed: • Merging content and appearance • Maintenance of resources • Accessibility problems: • Poor support for access by disabled (e.g. speaking browsers for visually impaired)
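The browser dependency described above can be checked for mechanically. The following is a minimal sketch in Python, assuming that the tags to flag are just the ones named on the "HTML Wars" slide; the names `PROPRIETARY_TAGS`, `ProprietaryTagFinder` and `find_proprietary_tags` and the sample page are hypothetical, and the list is illustrative rather than exhaustive.

```python
from html.parser import HTMLParser

# Tags the slides name as vendor extensions (illustrative, not exhaustive).
PROPRIETARY_TAGS = {"blink", "font", "spacer", "layer", "marquee"}


class ProprietaryTagFinder(HTMLParser):
    """Collects (tag, line) pairs for every start tag in PROPRIETARY_TAGS."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <BLINK> and <blink> both match.
        if tag in PROPRIETARY_TAGS:
            line, _column = self.getpos()
            self.found.append((tag, line))


def find_proprietary_tags(html_text):
    """Return a list of (tag, line) pairs for flagged tags, in document order."""
    finder = ProprietaryTagFinder()
    finder.feed(html_text)
    finder.close()
    return finder.found


# Hypothetical sample page mixing standard and vendor-specific markup.
sample = """<HTML><BODY>
<H1>Prospectus</H1>
<P><FONT SIZE="+2">Welcome</FONT> to the <BLINK>new</BLINK> site.</P>
<MARQUEE>Open day on Friday</MARQUEE>
</BODY></HTML>"""

for tag, line in find_proprietary_tags(sample):
    print(f"line {line}: <{tag.upper()}>")
```

A report like this gives a rough measure of the re-engineering cost a site would face when moving to standard HTML and style sheets.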
End of the Wars? Thursday, August 21 1996 Microsoft Pledge on HTML Standards "HTML is the most basic and fundamental data format of the Web. Support for HTML standards ensures that content can be viewed by any browser as the creator intended. …. agreement on the most basic data format is critical to interoperability and the continued growth of the industry." See http://www.microsoft.com/internet/html.htm
Microsoft Pledge (Cont.) "Previous proprietary HTML extensions from Microsoft and other vendors have confused the market, hampered interoperability and been ill-conceived with respect to [HTML] design principles ... Microsoft will agree to: • Not ship extensions to HTML without first submitting them to W3C. • Implement all W3C approved HTML standards. • Clearly identify any not-yet-approved HTML tags we support as such. • Publish a Document Type Definition (DTD) for its browser as mandated by SGML. • Follow the architecture principles of HTML and its parent, SGML, when proposing new extensions. Microsoft agrees to hold itself to these standards. Will all the other Web browser vendors, including Netscape, also agree to this conduct of behavior?"
HTML 4.0 and CSS HTML 4.0 and CSS will provide an architecturally pure, yet functionally rich environment • HTML 4.0 • Improved forms • Hooks for stylesheets • Hooks for scripting languages • Table enhancements • Better printing • CSS • Support for all HTML formatting • Positioning of HTML elements • Support for multiple media • Problems • Some problems with CSS are being experienced following: • Use of CSS features which changed during CSS development • Browser supported features which changed
W3C Process W3C: • A consortium of subscribing member organisations • Areas of work agreed by members • Working group set up: • Charter • WG membership (restricted) • Initial recommendations produced by WG • Recommendation made public • Feedback on open mailing lists and to editor • Recommendation updated • Members vote • User Interface: • HTML • Style Sheets • Document Object Model • Maths • Graphics • Fonts
W3C Process Pros • Work can be well-focussed • Avoids "flaming" • Battle can take place in private • Implementation and development of spec closely linked Cons • Discussions are closed • Process undemocratic • Only rich companies can afford to take part • Difficult for non-members to contribute their expertise • Non-members may be developing systems in isolation
HTML - The Competition What are the alternatives to HTML? • HTML: An SGML DTD. Describes document structure. Used in conjunction with emerging style sheet proposal. Agreements on standards emerging. • PDF: Adobe's Portable Document Format. Provides control over appearance. Proprietary. • Native file format: Store document in native format, and provide user with reader on client machine. • SGML / XML: Richer DTDs.
PDF PDF Pros • Control over appearance not (yet) easily available in HTML • Functionality of PDF Reader can be controlled (e.g. prevent copying, printing with watermarks) PDF Cons • Does not store document structure • Proprietary • How would we feel about it if it were owned by Microsoft? • Remember GIF patent problems! • Printing problems
Use of Native File Format Files can be stored in their native file format (Word, Powerpoint, LaTeX, DVI, etc.) Files may then be viewed using the application or a viewer which understands the format Pros: • No conversion needed Cons: • Viewing software needed • Format version issues • Indexing issues • Viruses • Proprietary
XML XML: • Extensible Markup Language • A lightweight SGML designed for network use • Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc) • Eliminates problems encountered in extending HTML: • Extension by fiat e.g. <FONT> • Public experiments e.g. the <BLINK> tag • The standards process e.g. Maths • Agreement achieved quickly • Support from industry (SGML vendors, Microsoft, etc.)
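To illustrate what "arbitrary elements can be defined" means in practice, here is a minimal sketch in Python using the standard `xml.etree.ElementTree` module. Only `<STUDENT-NUMBER>` comes from the slide; the enclosing `<STUDENT>` record, the other element names, the values and the function `parse_record` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical record using application-defined element names.
record = """<STUDENT>
  <STUDENT-NUMBER>9701234</STUDENT-NUMBER>
  <NAME>A. N. Other</NAME>
  <COURSE>Chemistry</COURSE>
</STUDENT>"""


def parse_record(xml_text):
    """Map each child element's name to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


fields = parse_record(record)
print(fields["STUDENT-NUMBER"])

# Unlike tolerant HTML browsers, an XML parser rejects ill-formed input.
try:
    parse_record("<STUDENT><NAME>unclosed</STUDENT>")
except ET.ParseError as err:
    print("rejected:", err)
```

No browser vendor had to add `<STUDENT-NUMBER>` to a fixed tag set: any generic XML parser can read the record, and the application decides what the element means.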
XML Support Microsoft have expressed support for XML: "Internet Explorer version 4.0 will support a few XML applications (such as CDF). Microsoft will be supporting XML in future versions of Internet Explorer" See http://www.microsoft.com/standards/xml-f.htm Note how they will be supporting an ISO standard!
Metadata Metadata - the missing architectural component from the initial implementation of the web • Addressing: URL (also URNs, DOIs) • Transport: HTTP (also HTTP/1.1, HTTP/NG) • Data format: HTML (also CSS, Cougar, XML) • Metadata: PICS, TCN, MCF, DSig, DC, ...
Metadata Requirements Imagine a university prospectus on the web
Metadata Standards • PICS: Agreement within industry (US Communications Decency Act perceived as threat). Format moving to XML in PICS/NG. • Dublin Core: Pressure from library community results in changes to HTML 4. Format likely to move to XML. • Digital Signatures: Based on PICS/NG. • W3C to set up a Metadata Coordination Group.
Other XML Developments XML seems to be gaining momentum: • PICS: Moving from rating system to key part of metadata architecture • CDF: Channel Definition Format. Microsoft proposal for push technology • OPS: Open Profiling Specification. Microsoft proposal • XML Web Collections: Microsoft proposal for defining relationships between resources • MCF using XML: Netscape proposal for describing metadata for collections of resources using XML • CML: Chemical Markup Language • MML: Math Markup Language
Scripting Background: • Netscape's Javascript (renamed from Livescript) was the first widely-deployed scripting language • Problems with inter-working between different versions • Problems with inter-working across browsers (Microsoft and Jscript) • Problems with use of multiple scripting languages in a document
Scripting Developments: • Javascript handed to standards body (ECMA). See http://www.ecma.ch/memento/tc39.htm • W3C developing standards for integrating scripting languages with HTML. See http://www.w3.org/TR/WD-script • W3C working on Document Object Model (DOM) " .. a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents." See http://www.w3.org/MarkUp/DOM/
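The idea of a language-neutral interface for accessing and updating a document can be sketched in Python with the standard `xml.dom.minidom` module, which implements a subset of the W3C DOM interfaces. The sample page and the function `update_document` are hypothetical, and a browser script would call the same kind of methods from Javascript rather than Python.

```python
from xml.dom import minidom


def update_document(markup):
    """Change content, style hooks and structure through DOM calls only."""
    doc = minidom.parseString(markup)

    # Access: locate the first paragraph element.
    first_para = doc.getElementsByTagName("p")[0]

    # Update content: replace the text held in the paragraph's text node.
    first_para.firstChild.data = "Hello, DOM"

    # Update style hook: add a class attribute for a style sheet to use.
    first_para.setAttribute("class", "lead")

    # Update structure: append a new paragraph to the same parent.
    new_para = doc.createElement("p")
    new_para.appendChild(doc.createTextNode("Added by script"))
    first_para.parentNode.appendChild(new_para)

    return doc.documentElement.toxml()


page = "<html><body><p>Hello</p></body></html>"
print(update_document(page))
```

Because the method names (`getElementsByTagName`, `setAttribute`, `createElement`, `appendChild`) come from the DOM rather than from one vendor's browser, the same logic can be ported between scripting languages with little change.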
Java Java: • Development begun by Sun in the early 1990s (then known as Oak) • Moved to Web and released in 1995 • Programming language and virtual machine environment (provides portability and security) • See http://java.sun.com/
Java Applications http://www.mini.co.uk/ Java is gaining momentum: • Interactive applications • Enhanced user interfaces • Replacing conventional desktop applications • Extending browsers
Java Standardisation Java developments: • Sun submitting Java to standards body (ISO/IEC JTC1) • Concerns over process ("Microsoft believes that .. that Sun wishes to retain full ownership and control over its Java specifications ..") • See http://java.sun.com/aboutJava/standardization/index.html
Distributed Searching - The Problem End users face difficulties due to the wide variety of search interfaces available
Possible Solutions Agree to use the same software • Unlikely to happen • Undesirable Agree to implement similar interfaces • Probably not feasible Have a centralised database • Scaling problems Use software which implements a protocol designed to provide a common search interface across diverse services • e.g. Z39.50
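The last option, a common search interface across diverse services, can be sketched as follows. This is a toy illustration in Python, not Z39.50 itself: the two backends are hypothetical in-memory stand-ins for real services, and the names `Hit`, `search_catalogue`, `search_archive`, `BACKENDS` and `search_all` are invented for the example.

```python
from typing import Callable, Dict, List, NamedTuple


class Hit(NamedTuple):
    source: str
    title: str


# Hypothetical data held by two independent services.
CATALOGUE = ["Introduction to XML", "Z39.50 for Beginners"]
ARCHIVE = ["XML browser for chemists", "Java applet gallery"]


def search_catalogue(query: str) -> List[str]:
    """Stand-in for a library catalogue with its own search rules."""
    return [t for t in CATALOGUE if query.lower() in t.lower()]


def search_archive(query: str) -> List[str]:
    """Stand-in for a software archive with its own search rules."""
    return [t for t in ARCHIVE if query.lower() in t.lower()]


# Every service is reached through the same call signature.
BACKENDS: Dict[str, Callable[[str], List[str]]] = {
    "catalogue": search_catalogue,
    "archive": search_archive,
}


def search_all(query: str, backends: Dict[str, Callable[[str], List[str]]]) -> List[Hit]:
    """Send one query to every backend and merge the results."""
    hits: List[Hit] = []
    for name, search in backends.items():
        try:
            titles = search(query)
        except Exception:
            # One failing service should not break the whole search.
            continue
        hits.extend(Hit(name, title) for title in titles)
    return hits


for hit in search_all("xml", BACKENDS):
    print(f"{hit.source}: {hit.title}")
```

The user sees one interface and one result list; each service keeps its own software and data, which is the property the first three options on the slide cannot offer.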
An Applications Solution Metacrawler can be used to search several large search engines. Problems: • Breaks if APIs change • Centralised system http://www.metacrawler.com/
Z39.50 - What Is It? Z39.50: • A protocol which specifies data structures and interchange rules that allow a client machine to search databases on a server machine and retrieve records that are identified as a result of the search • Maintained by Library of Congress • Developed by ZIG Why is it important? • Powerful searching • Local, familiar interface • Retrieves structured data
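The "search, then retrieve identified records" model can be sketched as a toy in Python. Z39.50 defines separate Search and Present services, with the server holding a named result set between the two calls; everything else here is an assumption made for illustration. The class `ToyServer`, its in-memory records and its substring matching are hypothetical, and the real protocol exchanges ASN.1/BER-encoded messages over a network connection, which this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyServer:
    """In-memory stand-in for a database server (not a real Z39.50 target)."""

    records: List[dict]
    result_sets: Dict[str, List[int]] = field(default_factory=dict)

    def search(self, result_set_name: str, attribute: str, term: str) -> int:
        """Run a query, keep the matches server-side, return only the hit count."""
        matches = [
            i for i, rec in enumerate(self.records)
            if term.lower() in rec.get(attribute, "").lower()
        ]
        self.result_sets[result_set_name] = matches
        return len(matches)

    def present(self, result_set_name: str, start: int, count: int) -> List[dict]:
        """Return `count` records from a named result set, from 1-based position `start`."""
        positions = self.result_sets[result_set_name]
        return [self.records[i] for i in positions[start - 1:start - 1 + count]]


server = ToyServer(records=[
    {"title": "Chemical Markup Language", "author": "Example, A."},
    {"title": "Searching with Z39.50", "author": "Example, B."},
    {"title": "Markup for Mathematics", "author": "Example, C."},
])

# Step 1: search. The client learns how many records matched, nothing more.
hit_count = server.search("default", "title", "markup")
print("hits:", hit_count)

# Step 2: present. The client asks for structured records from the result set.
for rec in server.present("default", start=1, count=hit_count):
    print(rec["title"], "/", rec["author"])
```

Separating the two steps lets a client inspect the hit count before deciding how many records to fetch, and the records come back as structured fields rather than as a formatted web page.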
Z39.50 History Z39.50 (1988) • NISO work with roots in OSI work • "an unimplementable abomination which should never have been adopted" • "Inspired" WAIS (which was not interoperable) Z39.50 (1992) • Implementation experience • OSI now regarded as failure Z39.50 (version 3) • Accepted as ISO standard in 1996 (ISO 23950) • Implemented using TCP/IP • Toolkits, profiles, etc. now available Taken from Clifford Lynch's article at http://hosted.ukoln.ac.uk/mirrored/lis-journals/dlib/dlib/dlib/april97/04contents.html
Z39.50 Pilot UKOLN is piloting Z39.50 across a number of services (UKOLN web site, BUBL, eLib project database, ...) • Imagine searching across JISC services (and institutions): • Find the chemical XML browser, and relevant reviews & papers. • Search HENSA software archive, Mailbase lists, a Chemistry gateway and Imperial College web site
Related Protocols • LDAP: Lightweight Directory Access Protocol. Derived from X.500 directory service. See "Lightweight Directory Access Protocol" http://ds.internic.net/rfc/rfc1777.txt See also http://www.novell.com/products/nds/ldap.html and http://www.critical-angle.com/ldapworld/Welcome.html • whois++: Derived from whois protocol for finding people (IETF). See "Architecture of the Whois++ Index Service" at the URL http://ds.internic.net/rfc/rfc1913.txt
What The Software Companies Say Netscape (see http://search.netscape.com/newsref/std/standards_qa.html) • [We will] aggressively support open standards wherever they exist • Work within the open standards process to innovate valuable new functionality in ways that promote openness and interoperability. • All current Netscape products implement and support the existing open standards appropriate to their functionality. Microsoft (see http://premium.microsoft.com/msdn/library/sdkdoc/inetcsdk_2htc.htm) • Microsoft is fully committed to the HTML standards articulated by the World Wide Web Consortium (W3C) and the international Internet community.
Caveat Emptor! Beware of free software - it can be expensive! Remember Your Music Collection? 7" single Your favourite single 12" LP The album containing the hit 12" LP Greatest hits CD When you bought your CD Record companies are happy to sell you the same information in several formats! • Is The Same True Of Your Information Systems? • Home-grown • Gopher The hit of 1992 • WWW The HTML 2 version • WWW (2) Revamped, based on Netscapeisms • WWW (3) Revamped, based on HTML 4 and CSS • WWW (4) ?? • Microsoft and Netscape will be happy to sell you tools to manipulate the same information!
Conclusions • Without standards, costs are liable to escalate • Software companies are happy to take our money • OSI networking standard gave standardisation process a bad name • Current IETF / W3C process of developing standards and gaining implementation experience is valuable • Standards are not frozen • The difficult choice may be "What standard?"
Further Information List of Standards Bodies http://www.yahoo.com/Reference/Standards/ http://www.iso.ch/VL/Standards.html http://www.cmpcmm.com/cc/standards.html World Wide Web Consortium http://www.w3.org/ IETF http://www.ietf.cnri.reston.va.us/home.html http://info.isoc.org/home.html ISO http://www.iso.ch/welcome.html ECMA http://www.ecma.ch/ ISO-HTML ftp://ftp.cs.tcd.ie/isohtml/ Microsoft and Standards http://www.microsoft.com/standards/ Netscape and Standards http://search.netscape.com/newsref/std/standards_qa.html
On Julius Caesar, Queen Eanfleda, and the lessons from time past 1 Dual standards rather than a single standard cause trouble. 2 If you must have dual standards, specify mandatory conversions or interfaces between them. 3 Never leave anything implementation-dependent. 4 If irregularities are unavoidable in a standard (e.g. because of external constraints), put them where they will do the least damage. 5 Never alter standards to please the rich and powerful, unless the changes can be justified on firm technical grounds. 6 Even the most rich and powerful can be persuaded that they will benefit from changing from their local standard to a general one. 7 The most effective standards are those you take so for granted you don't have to think about them. 8 If provisions of standards are based on external assumptions or constraints unrelated to the purpose of the standard, they are likely to appear irrational. http://www.kcl.ac.uk/kis/support/cc/staff/brian/caesar.html