
Intro to XML


julie


  1. Intro to XML • Lecture overview: • What is XML? • Mark-up languages • XML vs. HTML • Style Sheets (CSS and XSL) • Introduction to XML • See text Chapters 7 and 3

  2. What is XML? • EXtensible Markup Language • http://www.w3.org/XML/ • http://en.wikipedia.org/wiki/XML • Metalanguage: language and tools for creating new markup languages • Designed to transport and store data • Showing data in formatted way is possible with XML, but not the purpose of the language

  3. What is XML? • Consists of tags (just like HTML) • But now we define the tags ourselves • Thus, technically speaking, documents that we claim to be “XML” are actually “documents in XML-generated languages” • With XML, we can therefore define markup elements based on a particular domain • Ex: MathML, the Mathematical Markup Language • http://www.w3.org/Math/

  4. What is XML? • Ex: XHTML – a version of HTML generated using XML • Ex: RSS – formatting for "really simple syndication" • XML is an open standard and can be edited with a plain text editor • Being text-based enables it to be portable across platforms • Includes • syntax - the rules of the language • structure - organizing and storing information

  5. Markup Languages • A set of rules that define the layout, format, or structure of text within a document • Markup elements are added to the document, then processed by a program that can interpret the elements • See: http://en.wikipedia.org/wiki/Markup_language

  6. Markup Languages • “Marking up” has been used by typesetters for hundreds of years • “Markup Languages” were first proposed in the late 1960’s • Ex: LaTeX is a markup language with elements for describing the format of documents • Used a lot in Math and CS research papers • Based on TeX, developed by Donald Knuth • Leslie Lamport added some features to TeX (thus the La)

  7. Markup Languages • See: http://www.latex-project.org/ • Side note: Knuth and Lamport are both very famous CS researchers • Text formatting was done on the side! • Ex: Microsoft Word provides an interface for “marking up” elements in a document such as bolding a string • In Word 2007+ it actually uses XML • http://en.wikipedia.org/wiki/Office_Open_XML • Let’s look at two simple examples • hello.docx, wordXML.docx

  8. Markup Languages • SGML - Standard Generalized Markup Language • Established a standard for markup • International standard for large document projects • But extremely complex • Ex: parsing is difficult • http://en.wikipedia.org/wiki/SGML • HTML is derived from SGML (mostly)

  9. HTML • Advantages: • Fairly small, easy to learn language • It’s an open standard, widely supported • Fast interpretation - fast web browsers • Portability vitally important to adoption as the standard for web markup • Disadvantages: • Limited capabilities, fixed specification • It’s not extensible for new domains

  10. Motivation for XML • Motivation for XML: • HTML elements are primarily for defining presentation and formatting • Show the data in a browser • Allow interaction with the user • HTML does not provide semantic information about the data itself • What does the data mean? • How is the data on one page different / similar to data on another page?

  11. XML • Official release 1.0 in 1998 • Fifth edition recommendation in November 2008 • Also Version 1.1 released in 2004 • Second edition in 2006 • Version 1.0 is still most widely used • Has simplicity of HTML and extensibility of SGML • It’s a subset of SGML • Easier to parse than SGML

  12. XML Features • Allows data to be self-describing • Tag names allow information about the content to be inferred • This has been debated • Some believe it is not a valid feature • Tags can be ambiguous • Meaning is to humans, not computers • Google "XML self-describing"

  13. XML Features • Provides rules for XML elements to limit type of data in an element • This can be done via Document Type Definitions or Schema • Allows custom data structures • Tags can be nested to form arbitrary tree configurations for data representation

  14. XML Features • Can be used for data storage and interchange • As long as the data specification is known, any party retrieving / receiving the data can parse it correctly • Separates the data from its format (presentation) • Allows different presentation styles for the same data

  15. XML Features • Can create custom elements / tags • Tags can describe data <smart-phone-type> Android </smart-phone-type> • Elements do not map to formatting styles • Unlike HTML • Style sheets allow data to be formatted in different ways

  16. Example: HTML
      <html>
        <head><title>Job Posting: Web master</title></head>
        <body>
          <h1>Job Posting</h1>
          <h2>Job title: <i>Web master</i></h2>
          <p><b>Job Description:</b> We are looking for a Web master to create and oversee our company’s web pages.</p>
          <p><b>Skills needed:</b> Basic writing skills, good communication, HTML.</p>
        </body>
      </html>

  17. Example: HTML • In this example, the tags tell how the data is to be formatted • However, they tell us nothing about the type of information that is being presented • Just looking at the tags this could contain anything • We must try to infer that it is a job posting by reading the document

  18. Example: XML
      <?xml version = "1.0"?>
      <job-posting>
        <title> Job Title: <emphasis> Web Master </emphasis> </title>
        <description> We are looking for a Web master to create and oversee our company&apos;s web pages. </description>
        <skill-list>
          <skill> Basic writing skills </skill>
          <skill> Good communication skills </skill>
          <skill> Programming experience in web languages </skill>
        </skill-list>
      </job-posting>

  19. Example: XML • In this example, the tags tell us information about the data that is stored • By looking at the tags (without even seeing the values) we can infer a lot about the nature of the data • Even if a computer is "looking at the tags" we can still program specific behaviors to specific tags
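The point about programming specific behaviors for specific tags can be sketched in a few lines of Python. This is only an illustration using the standard library's xml.etree.ElementTree; the names JOB_XML and summarize are made up for this sketch and are not part of the lecture materials.

```python
import xml.etree.ElementTree as ET

# Job posting modeled on the slide's example (hypothetical constant name).
JOB_XML = """<?xml version="1.0"?>
<job-posting>
  <title>Job Title: <emphasis>Web Master</emphasis></title>
  <description>We are looking for a Web master to create and oversee
  our company&apos;s web pages.</description>
  <skill-list>
    <skill>Basic writing skills</skill>
    <skill>Good communication skills</skill>
    <skill>Programming experience in web languages</skill>
  </skill-list>
</job-posting>"""


def summarize(xml_text):
    """Build a small summary by reacting to specific tag names."""
    root = ET.fromstring(xml_text)
    summary = {"title": None, "skills": []}
    for elem in root.iter():  # visits every element in document order
        if elem.tag == "title":
            # itertext() also collects the text inside the nested <emphasis>
            summary["title"] = " ".join("".join(elem.itertext()).split())
        elif elem.tag == "skill":
            summary["skills"].append(elem.text.strip())
    return summary


summary = summarize(JOB_XML)
print(summary["title"])
print(summary["skills"])
```

The program never inspects formatting; it decides what to do purely from the tag names, which is what the HTML version of the posting cannot offer.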

  20. Example XML • However, the tags tell us nothing about how the data will be formatted • In some cases we may not even care about this • Data may not need to be presented visually • If needed we can use style sheets for this

  21. XML Data Hierarchy • Hierarchy of data in XML - defined by function and relationship to other elements • Root: Element encompasses all other elements • In effect defines what the document is • Children: Elements in other elements • Parent: The containing element

  22. Hierarchy (tree diagram) • job-posting (root) • Children of job-posting: title, description, skill-list • Child of title: emphasis • Children of skill-list: skill, skill, skill
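The root / parent / child relationships can be made visible by walking the parsed document. The following is a minimal sketch with Python's standard xml.etree.ElementTree; the helper name outline is an assumption of this example, and the document is a shortened copy of the job posting.

```python
import xml.etree.ElementTree as ET

# Shortened job posting; only the structure matters here.
JOB_XML = """<job-posting>
  <title>Job Title: <emphasis>Web Master</emphasis></title>
  <description>Create and oversee web pages.</description>
  <skill-list>
    <skill>Basic writing skills</skill>
    <skill>Good communication skills</skill>
    <skill>Programming experience in web languages</skill>
  </skill-list>
</job-posting>"""


def outline(elem, depth=0):
    """Return one indented line per element, children below their parent."""
    lines = ["  " * depth + elem.tag]
    for child in elem:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(JOB_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one step down from parent to child, with job-posting as the single root.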

  23. Displaying XML files in browser • Relating XML document and style sheet: • We can use either a cascading style sheet (.css) or an XSLT style sheet (.xsl) • http://www.w3.org/Style/CSS/ • http://www.w3.org/Style/CSS/learning • <?xml-stylesheet type = "text/css" href = "job.css"?>

  24. Style Sheets • Cascading style sheets (CSS) • A means for presenting a document • We locate the style sheet in a file • Ex: job.css • Has rules and declarations to tell the browser how to display the document • In the XML document add a line to show where the style sheet is located • <?xml-stylesheet type="text/css" href="job.css"?>

  25. Style Sheets • Two parts in style sheet • Element selector (a comma-separated list of elements) • Property declarations (property and value pairs separated by semicolons) • Ex: address { font-size:12pt; font-family:arial }

  26. Selected Formatting Properties • A wide variety of properties:
      Property      Description         Example value
      font          Font properties     font: italic small-caps bold 12px arial
      font-family   Typeface            font-family: arial
      font-size     Size of font        font-size: small
      font-style    Style of font       font-style: italic
      text-align    Alignment of text   text-align: center
      text-indent   Indent first line   text-indent: 10px (length in pixels)
      color         Text color          color: red
      • And many, many more; see: http://www.w3schools.com/css

  27. Cascading Style Sheets
      to, from { font-weight:bold; text-align:left; border-style:solid }
      subject { text-decoration: underline; background-color: green; color: yellow }
      * { color:green }
      • to and from elements are bold, left aligned, with a solid border • subject element is underlined, with a green background color (yuck) and yellow text • Default property to use: text color is green • See job.css

  28. CSS Inheritance • Hierarchy of elements in XML docs • Hierarchy is applied to style sheet with property inheritance • Properties defined for parents are passed to child elements • E.g., parent is 18pt -> child is 18pt unless property is redefined

  29. Example
      <?xml version = "1.0"?>
      <?xml-stylesheet type = "text/css" href = "job.css"?>
      <job-posting>
        <title> Job Title: <emphasis> Web Master </emphasis> </title>
        <description> We are looking for a Web master to create and oversee our company&apos;s web pages. </description>
        <skill-list>
          <skill> Basic writing skills </skill>
          <skill> Good oral skills </skill>
          <skill> Programming experience in web languages </skill>
        </skill-list>
      </job-posting>

  30. Style sheet (.css)
      title { font-size: 28pt; color: red; }
      emphasis { font-weight: bold; }
      description { display: block; margin-top: 15px; font-size: 18pt; }
      skill-list { background-color: yellow; color: green; }
      skill { display: block; margin-left: 30px; margin-top: 5px; font-size: 14pt; font-family: 'Comic Sans MS'; }

  31. CSS Inheritance • Consider job.xml and job.css example • <emphasis> tag is within both the <title> tag and the <skill> tag • In both cases it changes the font to bold, but does not affect any other formatting • The other properties are inherited from the parent tag • See also my home page and CS 1520 page
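To make the inheritance rule concrete, here is a toy model in Python. It is not how a browser actually computes style: it only handles properties that CSS does inherit (font and color properties), and it ignores the cascade, specificity, and non-inherited properties such as background-color, margin, and display. The RULES table copies the inherited properties from the style sheet slide; the DOC string and the function computed_styles are hypothetical, written so that <emphasis> appears under both <title> and <skill>.

```python
import xml.etree.ElementTree as ET

# Inherited properties only, taken from the style sheet slide.
RULES = {
    "title": {"font-size": "28pt", "color": "red"},
    "emphasis": {"font-weight": "bold"},
    "description": {"font-size": "18pt"},
    "skill-list": {"color": "green"},
    "skill": {"font-size": "14pt", "font-family": "Comic Sans MS"},
}

# Hypothetical document: <emphasis> occurs under <title> and under <skill>.
DOC = """<job-posting>
  <title>Job Title: <emphasis>Web Master</emphasis></title>
  <skill-list>
    <skill>Good <emphasis>oral</emphasis> skills</skill>
  </skill-list>
</job-posting>"""


def computed_styles(elem, inherited=None, path=""):
    """Map each element path to its effective (inherited + own) properties."""
    style = dict(inherited or {})
    style.update(RULES.get(elem.tag, {}))  # own rules override inherited values
    path = f"{path}/{elem.tag}"
    result = {path: style}
    for child in elem:
        result.update(computed_styles(child, style, path))
    return result


styles = computed_styles(ET.fromstring(DOC))
for element_path, style in styles.items():
    print(element_path, style)
```

In this model both <emphasis> elements gain only font-weight: bold of their own; size, color, and typeface come from whichever parent they sit in.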

  32. CSS with HTML • CSS can also be used effectively with HTML files • We can define style classes to be used with our documents • We can define style for given tags • Syntax for linking is different with HTML than XML • Use the <link> tag • See CDpoll-style.php and CDstyle.css

  33. CSS Limitations • CSS is not a general way of expressing presentation; it provides a “static” formatting • E.g., we can’t make on-the-fly decisions about whether to include a header or footer, whether to color something green when it has two children, etc. • Formatting is based on the tags / attributes, not on the organization

  34. CSS Limitations • We can get around these limitations using JavaScript / DOM • Allows dynamic updating of the style through events • See CS 1520 Home page • See CDpoll-style.php and CDstyle.css • We can also use XSL for style • XSL is more flexible than CSS – we will briefly look at XSL

  35. Displaying XML files in browser • Without the style sheet, the document will appear with the tags (elements) intact • In Firefox or IE, at least • However, we can elide elements by clicking on the "–" that appears before them • Similar to opening subfolders in a folder hierarchy • See job-no-css.xml

  36. XSL • XSL (eXtensible Stylesheet Language) is a very powerful combination of 3 different languages: • XSLT – XSL Transformation • XML language used to transform an XML document in various ways (perhaps into a different type of document) • Ex: Transform XML into XHTML for display • Ex: Transform from one XML language into a different one

  37. XSL • XPath – XML Path language • Used to access / parse / traverse an XML document • Enables a user to query / access parts of the document tree in a regular way • Not itself an XML language • XSL-FO – XSL Formatting Objects • XML language designed to format / present documents • Ex: Can be used to generate PDFs from XML documents
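To give a feel for path-style queries, the sketch below uses Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath (child steps, //, and simple predicates), not the full language. The EMAILS document is invented for this example and is not the course's emails2.xml.

```python
import xml.etree.ElementTree as ET

# Invented sample data; element and attribute names are assumptions.
EMAILS = """<emails>
  <email id="1"><from>alice</from><subject>Hello</subject></email>
  <email id="2"><from>bob</from><subject>Lunch?</subject></email>
  <email id="3"><from>alice</from><subject>Re: Lunch?</subject></email>
</emails>"""

root = ET.fromstring(EMAILS)

# Child steps: every <subject> that is a child of an <email> child of the root.
all_subjects = [s.text for s in root.findall("./email/subject")]

# Descendant step: every <subject> anywhere below the root.
anywhere = [s.text for s in root.findall(".//subject")]

# Attribute predicate: the email whose id attribute equals 2.
second = root.find("./email[@id='2']/subject").text

# Child-text predicate: emails whose <from> child has the text "alice".
from_alice = [e.get("id") for e in root.findall("./email[from='alice']")]

print(all_subjects, second, from_alice)
```

Each query names a route through the tree rather than looping over it by hand, which is the "regular way" of accessing parts of the document that the slide refers to.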

  38. XSL • Big Picture Example: • We have an XML document which we would like to format for display in a browser • Perhaps we would like it to be displayed in an HTML table, or within some other HTML elements • We can use XPath to select the XML elements that we want to present, and XSLT to transform them into XHTML
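The select-then-transform idea can be imitated in plain Python. To be clear, this is a stand-in written with xml.etree.ElementTree, not XSLT itself (the Python standard library has no XSLT processor); the sample data and the function name to_xhtml_table are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Invented sample data; not the course's emails2.xml.
EMAILS = """<emails>
  <email><from>alice</from><subject>Hello</subject></email>
  <email><from>bob</from><subject>Lunch?</subject></email>
</emails>"""


def to_xhtml_table(xml_text):
    """Select <email> elements and emit them as rows of an XHTML table."""
    source = ET.fromstring(xml_text)  # the source text itself is never modified
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for label in ("From", "Subject"):
        ET.SubElement(header, "th").text = label
    for email in source.findall("./email"):  # path-style selection
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = email.findtext("from")
        ET.SubElement(row, "td").text = email.findtext("subject")
    return ET.tostring(table, encoding="unicode")


html = to_xhtml_table(EMAILS)
print(html)
```

An XSLT style sheet expresses the same mapping declaratively, as templates matched against paths, and the browser or processor applies it to the XML document.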

  39. XSL • If we want to add style to our newly generated XHTML document we can easily add some CSS as it is generated • Alternatively, we could transform the original document into XSL-FO and impart the style there • This is actually very powerful but in many cases using XSLT + CSS will be sufficient • Result: We still use CSS for style, but XSLT gives us much more flexibility than CSS alone

  40. XSL • Ex: Consider a document that is storing XML emails • We would like to display this document in a nicely formatted way in the browser • If we use CSS alone, we can add style to the XML elements, but that is basically it • We are seeing an XML document, with style applied to the elements • See emails2.xml and emails2.css

  41. XSL • If we use XSLT + CSS, we can create a new XHTML document that includes XHTML tags, our XML data AND CSS • Now we are seeing an XHTML document which has data from our XML document within it • Note that we are not changing our original XML document • The XHTML that we see is dynamically generated via XSLT and XPath • See emails2-xsl.xml, emails2.xsl and emails2-xsl.css

  42. XML Syntax • Components of XML documents • Declaration - says it’s an XML document • Elements - describe data in document • Attributes - info. clarifying element • Entities - placeholders for content • Comments - useful notes & documentation • These components can be specified and regulated using DTDs (Document Type Definitions) or Schema

  43. XML Declaration • Indicates document is an XML document
      <?xml version="1.0"?>
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml version="1.0" encoding="UTF-16"?>
      • Encoding attribute deals with the character set that will be used • Ex: if non-ASCII characters will be used
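The effect of the encoding attribute can be seen by storing the same document as two different byte sequences. This is a small sketch with Python's standard xml.etree.ElementTree; the <word> element and the variable names are invented for the illustration.

```python
import xml.etree.ElementTree as ET

text = "caf\u00e9"  # "café": contains one non-ASCII character

# The same logical document, encoded two ways. The declaration tells the
# parser how to turn the bytes back into characters.
utf8_doc = (
    f'<?xml version="1.0" encoding="UTF-8"?><word>{text}</word>'
).encode("utf-8")
utf16_doc = (
    f'<?xml version="1.0" encoding="UTF-16"?><word>{text}</word>'
).encode("utf-16")  # Python prepends a byte-order mark for "utf-16"

word_from_utf8 = ET.fromstring(utf8_doc).text
word_from_utf16 = ET.fromstring(utf16_doc).text

print(len(utf8_doc), len(utf16_doc))
print(word_from_utf8 == word_from_utf16)
```

The bytes on disk differ, but after parsing both documents carry the same characters.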

  44. XML Elements • Core components of XML document • Consist of • Start tag: <element> • Content: data or other elements or both • End tag: </element> • Elements are like English nouns – definable objects

  45. Element Examples • Start tag: <book> • Content: Here is Edward Bear, coming downstairs now, bump, bump, bump on the back of his head, behind Christopher Robin. It is, as far as he knows, the only way of coming downstairs, but sometimes he feels that there really is another way, if only he could stop bumping for a moment and think of it. … • End tag: </book>

  46. Element Examples <email_message> Dear CS 1520: Web programming is sure fun! </email_message> <plane> F117 Nighthawk </plane>

  47. Root element • All XML documents contain • Outermost element – root element • All other elements and data within document further describe root <book> <title> Programming the World Wide Web </title> <author> Robert Sebesta </author> <publisher> Addison-Wesley </publisher> </book>

  48. Elements are containers • Elements contain • Elements and contents (data) • Elements can nest within each other • May contain child elements, e.g., <title> contained within <book> • Empty elements • In HTML: <br> <hr> • In XML (or XHTML): <br/> <hr/>
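The difference between HTML-style and XML-style empty elements shows up as soon as an XML parser is involved. The following sketch uses Python's standard xml.etree.ElementTree; the helper names is_empty and is_well_formed are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_empty(elem):
    """True if the element has no child elements and no text content."""
    return len(elem) == 0 and not elem.text


def is_well_formed(xml_text):
    """True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Both spellings of an empty element parse to the same thing.
short_form = ET.fromstring("<br/>")
long_form = ET.fromstring("<br></br>")

# An unclosed HTML-style <br> is rejected by an XML parser.
html_style = is_well_formed("<p>line one<br>line two</p>")
xml_style = is_well_formed("<p>line one<br/>line two</p>")

print(is_empty(short_form), is_empty(long_form), html_style, xml_style)
```

Because every start tag needs a matching end tag in XML, the self-closing form is simply shorthand for a start tag followed immediately by its end tag.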

  49. XML Attributes • Information that describes elements • Similar to an adjective - adding more to the definition • Defined in the start tag of elements • Attributes are name-value pairs • Value must be in quotes • Same idea as with HTML attributes

  50. XML Attributes • Examples: <movie source="http://marvel.com/ironman3">Iron Man 3</movie> <band genre="Post-punk">Joy Division</band> • We can either use elements or attributes to modify tags – up to programmer and situation • See p. 279-280 of Sebesta (7th edition) • One approach is to use attributes only for items that are not content-related • Ex: an id for an element
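The choice between an attribute and a child element can be compared side by side. This is a minimal sketch using Python's standard xml.etree.ElementTree; the second document (with <name> and <genre> child elements) is a hypothetical alternative layout, not something from the slides.

```python
import xml.etree.ElementTree as ET

# Attribute form, as on the slide: genre is a name-value pair in the start tag.
band = ET.fromstring('<band genre="Post-punk">Joy Division</band>')
genre_from_attribute = band.get("genre")

# Hypothetical element form: the same information as child elements.
band_alt = ET.fromstring(
    "<band><name>Joy Division</name><genre>Post-punk</genre></band>"
)
genre_from_element = band_alt.findtext("genre")

# Attribute values must be quoted; an unquoted value is not well-formed XML.
try:
    ET.fromstring("<band genre=Post-punk>Joy Division</band>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(genre_from_attribute, genre_from_element, unquoted_accepted)
```

Both layouts carry the same data; the attribute form keeps the element's content as plain text, while the element form lets the genre have structure of its own later.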
