Understand the principles of XML, its advantages, how to define tags, and related technologies like SGML and CSS. Learn about well-formed and valid XML documents, and the basics of XML syntax. This session covers topics ranging from shortcomings of HTML, descriptive markup, to the extensible nature of XML. Discover how XML provides machine-independent data that can be used for various purposes. Dive into the world of extensible systems and the potential longevity of XML files. Take advantage of this opportunity to enhance your knowledge of XML and its applications in modern information systems and the web.
LIS1510 Library and Archives Automation Issues: XML and extensible systems. Andy Dawson, School of Library, Archive & Information Studies, UCL (University of Malta 2008)
What we will be covering today • Shortcomings of HTML • Generalised markup languages • How XML works • XML document types • Other related extensible technologies
Limitations of (X)HTML • Fixed tag set (specifications determined by W3C) • Intended for display of documents on the Web • Doesn’t do everything everyone wants • Not easy to use for other purposes • searching in documents • analysis of documents
Principles of Generalized Markup • Descriptive markup – encodes features within a document • Say what those features are - not what to do with them • Need to define your own tags • Creates machine-independent data • Data can then be used for different purposes
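As a small illustration of descriptive markup, the sketch below marks up catalogue records by what their parts are, then reuses the same data for two purposes: searching and display. The element names (`catalogue`, `record`, `title`, `author`, `year`) and the sample records are invented for illustration; Python's standard `xml.etree.ElementTree` module is used only as a convenient way to read the markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue data: the tags say what each feature IS,
# not how it should look on screen.
CATALOGUE = """<catalogue>
  <record>
    <title>Markup in Practice</title>
    <author>A. Writer</author>
    <year>2001</year>
  </record>
  <record>
    <title>Library Systems</title>
    <author>B. Reader</author>
    <year>2005</year>
  </record>
</catalogue>"""

root = ET.fromstring(CATALOGUE)


def titles_after(root, year):
    """Purpose 1 (searching): titles published after the given year."""
    return [
        record.findtext("title")
        for record in root.findall("record")
        if int(record.findtext("year")) > year
    ]


def display_lines(root):
    """Purpose 2 (display): one formatted line per record."""
    return [
        f"{r.findtext('author')}: {r.findtext('title')} ({r.findtext('year')})"
        for r in root.findall("record")
    ]


print(titles_after(root, 2003))
print("\n".join(display_lines(root)))
```

Neither function needed the markup to say anything about fonts or layout; the same file serves both uses.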
SGML • SGML – Standard Generalized Markup Language • International standard in 1986 • Metalanguage (syntactic framework) for defining markup tags • Parts of SGML are rather complex • Used by large projects • Not particularly easy to get started
XML • XML (Extensible Markup Language) • Adopted by World Wide Web Consortium in 1998 • Cut-down version of SGML • Based on same principles • Designed to implement easily on the Web
Advantages of XML • Machine-independent plain text files (Unicode, of which ASCII is a subset) • Potential longevity • Multi-purpose use • Ability to analyse/manipulate content • BUT need to define tag set! • Not a replacement for HTML unless analysis/manipulation of data is required • However, XHTML has become a ‘reliable’ alternative option for simple web publishing
Defining Your Own Tags • Need to undertake document analysis • Identify key features in document • Identify structure of document • Choose names for tags • Only then can we apply the tag scheme
Example of a Newspaper • Name of newspaper • Issue • Article • Headline • Author • Paragraphs • Pictures
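One possible tag scheme for the newspaper features listed above is sketched here. The tag names (`newspaper`, `issue`, `article`, `headline`, `author`, `para`, `picture`), the attribute names, and all sample content are assumptions chosen for illustration, not a scheme given in the lecture. The short `outline` helper is also hypothetical; it simply prints the structure that the document analysis produced.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup of one issue of an invented newspaper.
NEWSPAPER = """<newspaper name="The Daily Example">
  <issue date="2008-05-02">
    <article>
      <headline>Library opens new reading room</headline>
      <author>J. Smith</author>
      <para>The reading room opened on Friday.</para>
      <para>It seats sixty readers.</para>
      <picture file="room.jpg"/>
    </article>
  </issue>
</newspaper>"""

paper = ET.fromstring(NEWSPAPER)


def outline(element, depth=0):
    """Return the element structure as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(paper)))
```

Other schemes are equally defensible, for example making the author an attribute of `article`; the choice depends on what the data will be used for.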
Basics of XML Syntax • Documents are composed of elements • Start and end tags for every element - unlike HTML, end tags must always be present • “Empty elements” have no content and can be written as a single tag ending in “/>” • Attributes modify an element and have a name and a value • Value must be enclosed in matching quotes (single or double) • An element may have several attributes • Documents can be “Well-formed” or “Valid”
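The syntax rules above can be seen in a short fragment. This is a sketch with invented element and attribute names (`article`, `headline`, `picture`, `id`, `section`, `file`), read with Python's standard `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one element with two attributes (one value in
# double quotes, one in single quotes), a child element with start and
# end tags, and an empty element written as a single tag.
FRAGMENT = """<article id="a1" section='news'>
  <headline>Start and end tags enclose the content</headline>
  <picture file="front.jpg"/>
</article>"""

article = ET.fromstring(FRAGMENT)

# Each attribute has a name and a value; an element may have several.
print(article.attrib)

# The empty element has no text and no child elements, only an attribute.
picture = article.find("picture")
print(picture.text, len(picture), picture.get("file"))
```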
Well-formed Documents • Well-formed documents follow XML syntax, i.e. • start and end tags • attributes in quotes • nested structure • But they have no pre-defined structure! • Therefore: • Can only check the syntax • Cannot validate the structure of well-formed documents • Prepares documents for potential use/conversion
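A well-formedness check amounts to asking whether a parser accepts the text. The sketch below uses Python's standard `xml.etree.ElementTree`, which raises `ParseError` on a syntax violation; the helper name `is_well_formed` and the sample `note` documents are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, i.e. it follows the syntax rules."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule on the slide.
good = '<note type="memo"><to>Staff</to><body>Meeting at ten</body></note>'
missing_end_tag = '<note><to>Staff<body>Meeting at ten</body></note>'
unquoted_attribute = '<note type=memo><to>Staff</to></note>'
bad_nesting = '<note><to>Staff<body></to>Meeting</body></note>'

# Syntactically fine, but the structure is odd: two bodies and no recipient.
# A well-formedness check cannot object, because no structure was defined.
odd_structure = '<note><body>One</body><body>Two</body></note>'

for sample in (good, missing_end_tag, unquoted_attribute,
               bad_nesting, odd_structure):
    print(is_well_formed(sample), sample)
```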
Valid Documents • A Valid XML document contains (or refers to) a Document Type Definition (DTD) and conforms to it • The DTD is a specification of the document structure, identifying • which elements are allowed • where they are allowed • which attributes they may take
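To make the idea of a DTD concrete, here is a sketch of a document that contains its own DTD (an "internal subset"). The DTD, the element names, and the sample content are all invented for illustration. Note one important limitation: the parsers in Python's standard library read the DTD but do not validate against it, so the code below only shows where the DTD sits and what it declares; checking validity requires a validating parser, which is an assumption outside this sketch.

```python
from xml.dom import minidom

# Hypothetical document with an internal DTD. The DTD states:
#   which elements are allowed      (<!ELEMENT ...> declarations)
#   where they are allowed          (the content models in brackets)
#   which attributes they may take  (<!ATTLIST ...> declarations)
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE newspaper [
  <!ELEMENT newspaper (article+)>
  <!ELEMENT article (headline, author?, para+)>
  <!ELEMENT headline (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!ATTLIST newspaper name CDATA #REQUIRED>
]>
<newspaper name="The Daily Example">
  <article>
    <headline>Library opens new reading room</headline>
    <para>The reading room opened on Friday.</para>
  </article>
</newspaper>"""

doc = minidom.parseString(DOCUMENT)

# The DOCTYPE names the root element the document must start with.
print(doc.doctype.name)
print(doc.documentElement.tagName)
```

In the content model for `article`, the comma means "in this order", `?` means optional, and `+` means one or more, so this sample article (a headline, no author, one paragraph) would satisfy the DTD.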
Related technologies • CSS – Cascading Style Sheets • As used with HTML • Concentrate only on appearance • XHTML • Version of HTML conformant with XML syntax • XSL - eXtensible Stylesheet Language • XML language for style sheets • Controls the appearance of the elements within the document & defines templates for processing elements • XML Schemas • Another way of defining document structure (an alternative to DTDs, itself written in XML)
That’s all folks… • Any questions? • Optional XML exercise is available…anyone? • Otherwise – carry on with your coursework • Next Tuesday: Website management and last chance to finish off your website! …and have a nice weekend