XML Study-Session: Part I. Writing an XML Document
Objectives: By completing this study-session, you should be able to: • Recognize well-formed XML documents. • Understand basic XML syntax. • Create simple XML documents of your own.
XML Elements: The basic component in XML is the element, a piece of text bounded by matching tags. • E.g. <item> A Simple Element </item> • Tags must be balanced. • Overlapping elements are prohibited.
Example: The following may appear in HTML, but not in XML: <B> This is bold. <I> This is bold italic. </B> This is italic. </I> The correct way to do this in XML: <B> This is bold. <I> This is bold italic. </I></B><I> This is italic. </I>
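The nesting rule can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the wrapping <doc> root are illustrative additions, not part of the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Each sample is wrapped in a hypothetical <doc> root so that it forms
# a single document with one root element.
overlapping = (
    "<doc><B>This is bold. <I>This is bold italic. </B>"
    "This is italic. </I></doc>"
)
nested = (
    "<doc><B>This is bold. <I>This is bold italic. </I></B>"
    "<I>This is italic. </I></doc>"
)

print(is_well_formed(overlapping))  # the overlapping form is rejected
print(is_well_formed(nested))       # the properly nested form is accepted
```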
XML Elements (contd.): • An element may contain one or more subelements. • The structures between tags are referred to as the content. • The data - i.e. the part not enclosed within brackets <…> - is taken to be the text of the document and is referred to as PCDATA (Parsed Character Data).
Example: • <person> <name> Joe Black </name> <age> 35 </age> <e-mail> jb@whitehall.com </e-mail> </person> • XML has an abbreviation for empty elements. The following: <married></married> can be abbreviated to <married/>.
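To see subelements, PCDATA, and the empty-element abbreviation from a program's point of view, here is a small sketch using Python's standard xml.etree.ElementTree. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person><name> Joe Black </name><age> 35 </age>"
    "<e-mail> jb@whitehall.com </e-mail></person>"
)

# The subelements of <person>, in document order.
child_tags = [child.tag for child in person]

# The PCDATA of <name>; surrounding whitespace is part of the text,
# so it is stripped here for display.
name_text = person.find("name").text.strip()

# The long and abbreviated forms of an empty element parse the same way:
# same tag, no children, no text.
long_form = ET.fromstring("<married></married>")
short_form = ET.fromstring("<married/>")
same_shape = (
    long_form.tag == short_form.tag
    and len(long_form) == len(short_form) == 0
    and (long_form.text or "") == (short_form.text or "")
)

print(child_tags, name_text, same_shape)
```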
XML Attributes: XML allows us to associate attributes with elements. In XML, attributes are defined as (name, value) pairs. • E.g. <price currency="Euro"> 420.12 </price> • E.g. <sound filename="hotboogie.mp3"/> • A given attribute may only occur once within a tag, while subelements with the same tag may be repeated.
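The attribute rules can be tried out as follows. This is a sketch assuming Python's standard xml.etree.ElementTree; the <order> and <item> elements and the second currency value are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Attributes are exposed as (name, value) pairs on the element.
price = ET.fromstring('<price currency="Euro"> 420.12 </price>')
currency = price.get("currency")
amount = float(price.text)  # float() tolerates the surrounding spaces

# Repeating an attribute within one tag is not well-formed.
try:
    ET.fromstring('<price currency="Euro" currency="USD"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

# Repeating subelements with the same tag is allowed.
order = ET.fromstring("<order><item/><item/></order>")
item_count = len(order.findall("item"))

print(currency, amount, duplicate_accepted, item_count)
```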
Well-formed XML: A document is considered well-formed if: • Tags nest properly. • Attributes are unique. This ensures that the XML data will parse into a labeled tree.
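Example: the person document parses into the following labeled tree: the root node person has three children, name (text: Joe Black), age (text: 35), and e-mail (text: jb@whitehall.com).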
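One way to make the labeled tree visible is to print it as an indented outline. The sketch below assumes Python's standard xml.etree.ElementTree; the outline function is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return the tree under `element` as a list of indented lines."""
    lines = ["  " * depth + element.tag]
    text = (element.text or "").strip()
    if text:
        # Leaf text is shown one level below the tag that contains it.
        lines.append("  " * (depth + 1) + repr(text))
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


person = ET.fromstring(
    "<person><name>Joe Black</name><age>35</age>"
    "<e-mail>jb@whitehall.com</e-mail></person>"
)
tree_lines = outline(person)
print("\n".join(tree_lines))
```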
Rules of thumb: • An XML document must contain one or more elements. • There is exactly one element, called the root or document element, which encapsulates all content. • Tag names are case sensitive. • Tag names must begin with a letter or underscore, and may contain letters, digits, hyphens, underscores, colons (reserved for namespaces), or full stops. • Names beginning with the string “XML” (in any case) are reserved.
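Several of these rules of thumb can be confirmed with a parser. The following sketch assumes Python's standard xml.etree.ElementTree; the sample documents and the accepts helper are invented for the illustration.

```python
import xml.etree.ElementTree as ET


def accepts(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


results = {
    "no element at all": accepts(""),
    "two root elements": accepts("<a/><b/>"),
    "case mismatch in tags": accepts("<Item>text</item>"),
    "name starting with a digit": accepts("<1item/>"),
    "single root, legal name": accepts("<item-1.a_b>text</item-1.a_b>"),
}

for rule, ok in results.items():
    print(f"{rule}: {'accepted' if ok else 'rejected'}")
```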
More on attributes: • Attributes and their values can only appear in the start-tag. • Attribute names must begin with a letter or underscore, and can consist of letters, digits, hyphens, underscores, and full stops. Colons are reserved for namespaces. • Values must be enclosed in single or double quotes. • The '<' and '&' characters may not appear literally in attribute values; write them as &lt; and &amp; instead.
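The quoting and character rules for attributes can be checked the same way. This is a sketch assuming Python's standard xml.etree.ElementTree; the element name a and attribute name b are placeholders.

```python
import xml.etree.ElementTree as ET


def parse_or_none(text: str):
    """Return the parsed root element, or None if `text` is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


# Single and double quotes are interchangeable delimiters.
single = parse_or_none("<a b='1'/>")
double = parse_or_none('<a b="1"/>')
same_value = single.get("b") == double.get("b")

# A literal '<' or a bare '&' in a value is rejected ...
literal_lt = parse_or_none('<a b="x<y"/>')
bare_amp = parse_or_none('<a b="R&D"/>')

# ... while the escaped forms are accepted and decoded by the parser.
escaped = parse_or_none('<a b="x&lt;y &amp; z"/>')
decoded = escaped.get("b")

# Attributes are not allowed in an end-tag.
in_end_tag = parse_or_none('<a></a b="1">')

print(same_value, literal_lt, bare_amp, decoded, in_end_tag)
```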
Anatomy of an XML document: 1. Prolog • this section is optional. • contains: • XML declaration (version, encoding, standalone) • Comments E.g. <!-- Hello, I am a comment --> • Processing instructions (PIs) E.g. <?realaudio version="5.0" bitrate="16kps"?> • Document type declaration (<!DOCTYPE ...>), which references or contains the DTD (Document Type Definition)
Anatomy of an XML document: (contd.) 2. Root element (document element) • this section is required. • contains: • Elements and Attributes • Comments • Subelements 3. Epilog • this section is optional. • may contain comments and PIs.
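The three parts can be seen by listing the top-level nodes of a parsed document. The sketch below assumes Python's standard xml.dom.minidom; the <catalog> document is a made-up example that reuses the comment and processing instruction shown earlier.

```python
from xml.dom import minidom

# Prolog (declaration, comment, PI), root element, then an epilog comment.
doc = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Hello, I am a comment -->
<?realaudio version="5.0" bitrate="16kps"?>
<catalog><item>Clock</item></catalog>
<!-- epilog comment -->
"""

dom = minidom.parseString(doc)

# The XML declaration is not a node; comments, PIs, and the root element are.
top_level = [node.nodeName for node in dom.childNodes]
root_name = dom.documentElement.tagName

print(top_level)
print(root_name)
```

In this listing a comment shows up as #comment and a processing instruction under its target name, so the prolog, root element, and epilog appear in document order.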
Example: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <CATITEM CATEGORY="Clock"> <ITEMNAME>Jimbo's Super Clock</ITEMNAME> <DESCRIPTION><STORY>Ever wake up in the morning to discover that your alarm clock didn't go off because the power failed? Or that your roof leaked, it rained, and the stupid thing just plain shorted out when it got wet? Now you don't have to worry about waking up two hours after you were supposed to be at work.</STORY> <FEATURES>Our latest, greatest Super Clock is a dream come true. It plugs into the wall but has its own set of batteries and protection from short circuits. The batteries even warn you when they're starting to fade - and they come with a twenty-five year guarantee! This clock is completely watertight, a sealed sphere of time in a stainless steel case. The clock face is large enough to read from a distance, and lights up with a touch for those nights when you're stumbling in the dark. The alarm starts off quiet, but gets louder and louder when you don't turn it off - guaranteed to wake even the soundest sleepers. Snooze features let you sleep just a little bit more, but it won't let you sleep in for more than an hour past the alarm. This clock is ready to adorn your bedroom, and even includes connections for lamp controls to brighten your morning, and electroshock clips for those who can't wake up any other way.</FEATURES></DESCRIPTION>
Example: (contd.) <PICTURE SRC="supclock.gif"/> <ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME>: <PART>SC45-A</PART> <PRICE>$199.95</PRICE> (<AIRF>$19.95</AIRF> freight/air, <GROUNDF>$7.95</GROUNDF> ground) <WARRANTY>Twenty-five year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM> <ITEM><PRODNAME>Lamp Controller</PRODNAME>: <PART>LC45-X</PART> <PRICE>$25.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>Ten year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM> <ITEM><PRODNAME>Electroshock Clips</PRODNAME>: <PART>ES45-L</PART> <PRICE>$59.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>One-year</WARRANTY> warranty. Made in <ORIGIN>USA</ORIGIN></ITEM> </CATITEM>
Built-in and Character Entities: The following built-in entities can be used in character data and also in attribute values when necessary: • &lt; for < • &gt; for > • &amp; for & • &apos; for ' • &quot; for " • Note: The &apos; and &quot; entities are only needed in attributes.
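In practice the built-in entities are usually produced by an escaping helper rather than typed by hand. The sketch below assumes Python's standard xml.sax.saxutils and xml.etree.ElementTree; the <note> element and the sample strings are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

text = "1 < 2 & 3 > 2"
title = 'Jimbo\'s "Super" Clock'

# escape() handles character data; quoteattr() escapes an attribute value
# and adds the surrounding quotes.
escaped_text = escape(text)
doc = f"<note title={quoteattr(title)}>{escaped_text}</note>"

# Parsing turns the entities back into the original characters.
note = ET.fromstring(doc)

# All five built-in entities are understood without any declaration.
all_five = ET.fromstring("<t>&lt;&gt;&amp;&apos;&quot;</t>").text

print(escaped_text)
print(note.text, note.get("title"), all_five)
```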
Creating general Entities: You can define your own entities in the DTD, just as you define elements. • The syntax for defining an entity is: <!ENTITY EntityName "EntityDefinition"> • The name must be composed of letters, digits, periods, dashes, underscores, or colons, and begin with a letter or underscore. • The syntax to use an entity in the markup is: &EntityName; • E.g. <!ENTITY ProdName "Ginger"> &ProdName; is a remarkable advance, guaranteeing users happier days.
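A general entity has to be declared before it can be used. The sketch below places the declaration in an internal DTD subset and lets Python's standard xml.etree.ElementTree expand it; the <ad> root element is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# The entity is declared inside the document type declaration,
# then referenced in the content as &ProdName;.
doc = """<!DOCTYPE ad [
  <!ENTITY ProdName "Ginger">
]>
<ad>&ProdName; is a remarkable advance, guaranteeing users happier days.</ad>"""

ad = ET.fromstring(doc)
expanded = ad.text

# Referencing an entity that was never declared is an error.
try:
    ET.fromstring("<ad>&Unknown; is not declared.</ad>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False

print(expanded)
print(undeclared_accepted)
```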
CDATA marked sections: CDATA sections make it possible to include content that uses markup characters for other purposes. • The syntax is: <![CDATA[content]]> • E.g. XML uses tags, which look like <![CDATA[<OPENINGTAG>, <ENDTAG>, and <EMPTYTAG/>]]> to mark up content. • The only string that may not appear inside a CDATA section is the closing delimiter ]]>.
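A parser hands the content of a CDATA section back as plain text, with no tags recognized inside it. The sketch below assumes Python's standard xml.etree.ElementTree; the <p> wrapper and the to_cdata helper are hypothetical, and splitting ]]> across two sections is a common workaround rather than something the slides prescribe.

```python
import xml.etree.ElementTree as ET

doc = (
    "<p>XML uses tags, which look like "
    "<![CDATA[<OPENINGTAG>, <ENDTAG>, and <EMPTYTAG/>]]>"
    " to mark up content.</p>"
)
p = ET.fromstring(doc)
cdata_text = p.text      # the tag-like strings arrive as ordinary text
child_count = len(p)     # no child elements were created


def to_cdata(text: str) -> str:
    """Wrap `text` in CDATA, splitting any ]]> so it cannot end the section."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


tricky = "a ]]> b"
round_trip = ET.fromstring(f"<p>{to_cdata(tricky)}</p>").text

print(cdata_text)
print(child_count, round_trip)
```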
Next session: Validating XML Documents • Creating your own DTDs • Using XML-Schema