Learn the basics of XML, including elements, attributes, entities, and declaration. Practice creating XML documents and understand the rules of well-formed XML. Enhance your knowledge in processing, structuring, and organizing XML data effectively.
Review of XML IST 421 Spring 2004 Lecture 5
XML • eXtensible Markup Language • Used for digital representation of documents • Store, process, search, transmit, display and print documents • www.w3c.org/xml - current information about XML
XML • Basic building block is the element, defined by tags • Root element contains all of the other elements • Attributes describe properties of elements • XML uses delimiters to differentiate markup from character data • Text enclosed between less than < and greater than > is called a tag
XML Elements • Name the contents of the element. • Typically in pairs with a start and end tag. • Some elements take attributes. • The structure describes the relationship between the elements. Example: <order_no>101</order_no>
XML Elements • May start with a letter or an underscore • May consist of: • Letters • Digits • Underscore character • Dot • Hyphen • Cannot start with the string “xml”
XML Elements • XML names are case sensitive, unlike HTML tags, which are not • Must have one root element • Similar to <HTML> </HTML> in an HTML document • Programmer defines a root name • First line should be the XML declaration (if present, nothing may come before it) • <?xml version="1.0"?> (Note: ? means information is passed)
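A quick way to check the element rules is to hand a small document to a parser. The sketch below assumes Python's standard-library xml.etree.ElementTree module, which is not part of the lecture; the root name order is a made-up wrapper around the order_no example.

```python
import xml.etree.ElementTree as ET

# Declaration first, then exactly one root element (hypothetical name: order).
doc = '<?xml version="1.0"?><order><order_no>101</order_no></order>'
root = ET.fromstring(doc)
root_tag = root.tag                     # name of the root element
order_no = root.find("order_no").text   # character data inside <order_no>

# Names are case sensitive, so </order_no> does not close <Order_no>.
try:
    ET.fromstring("<order><Order_no>101</order_no></order>")
    case_mismatch_ok = True
except ET.ParseError:
    case_mismatch_ok = False

print(root_tag, order_no, case_mismatch_ok)
```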
XML Comments • <!-- ………--> • Any text except a double hyphen (--) can be placed within the <!-- --> Example: <!-- Updated Apr 3 -->
XML Processing Instructions • Enables the passing of information to another application • Format: <?… ?> • Specifies version. Example: <?xml version="1.0"?>
Let’s Practice • Open Microsoft Notepad (not XML Notepad!) • Declare the XML version as “1.0” • Create a root element called class_listing • Note that every element must have both a beginning and ending tag • Save as: class_listing.xml • Add some other elements! • Create data within each element! • Test this in the browser.
Review the Practice • XML data is hierarchical • Elements contained within other elements are called children • Elements that contain children are called parent elements – nesting • Each XML document contains a root element • Element names describe the data
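The parent/child hierarchy can be seen by walking a document like the one from the practice. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree; the student, name, and major elements and their values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# class_listing is the root; each student is a child; name and major
# are children of student (hypothetical element names and data).
doc = """<?xml version="1.0"?>
<class_listing>
  <student>
    <name>Pat</name>
    <major>IST</major>
  </student>
  <student>
    <name>Lee</name>
    <major>IST</major>
  </student>
</class_listing>"""

root = ET.fromstring(doc)
child_tags = [child.tag for child in root]   # direct children of the root
names = [s.find("name").text for s in root.findall("student")]

print(root.tag, child_tags, names)
```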
Components of XML Documents • XML Declaration • First line of document • Declaration begins with <?xml, for example: <?xml version="1.0" encoding="UTF-8" standalone="no"?> • May contain 3 attributes: • version="1.0" • encoding="UTF-8" (default if not given) • standalone="yes" or standalone="no" (default) UTF-8 = 8-bit Unicode character-encoding scheme. Others are UTF-16, UTF-32, and ISO-10646-UCS-2.
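The declaration rules can be tried out with a parser. The sketch below assumes Python's standard-library xml.etree.ElementTree; the city element is a made-up example, chosen only because it contains a non-ASCII character.

```python
import xml.etree.ElementTree as ET

# Declaration on the very first line, with all three attributes.
doc = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
    '<city>Zürich</city>'
).encode("utf-8")
city = ET.fromstring(doc).text

# Anything before the declaration, even a blank line, is an error.
try:
    ET.fromstring(b'\n<?xml version="1.0"?><city/>')
    late_declaration_ok = True
except ET.ParseError:
    late_declaration_ok = False

print(city, late_declaration_ok)
```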
Attributes • Attributes may be attached to elements • Attributes have: • Names • Values • Name is separated from value by the = sign • Value must be enclosed in quotation marks (double " " or single ' ')
Attributes • Creates additional information • It is often information about the ELEMENT content • Nesting an ELEMENT within others may accomplish the same purpose
Let’s Practice • Add an attribute to categorize student status • <student status="sr"> • Add this attribute to all students • Remember to save with .xml extension.
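Once the status attribute is in place, a program can read it back. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree; the student names and the "jr" value are invented for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<class_listing>
  <student status="sr"><name>Pat</name></student>
  <student status="jr"><name>Lee</name></student>
</class_listing>"""

root = ET.fromstring(doc)
# Map each (hypothetical) student name to the value of its status attribute.
statuses = {s.find("name").text: s.get("status") for s in root.findall("student")}

# An attribute value without quotes is rejected by the parser.
try:
    ET.fromstring("<student status=sr/>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

print(statuses, unquoted_ok)
```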
XML Entities • Entities are used as placeholders for content • Two types of entities: • General • Parameter
XML General Entities • Placeholders for any information contained in the root element • Three types: • Character – used in place of special characters • Content – used to mark the place of a common block of content that you type often • Unparsed – used for binary or nontext data like images or video clips
Character Entities • Some tag delimiter characters have special meaning in XML Example: <?xml version="1.0"?> <equation> 50 < 100 </equation> causes a syntax error
Character Entities • Solve the problem by using character entities: • &gt; for > • &lt; for < • &quot; for " • &apos; for '
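The equation example can be checked both ways. This sketch assumes Python's standard library (xml.etree.ElementTree and xml.sax.saxutils.escape), which the lecture itself does not use.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# A raw < inside character data is read as the start of a tag and fails.
try:
    ET.fromstring("<equation> 50 < 100 </equation>")
    raw_ok = True
except ET.ParseError:
    raw_ok = False

# With the character entity, the parser hands back the original text.
parsed = ET.fromstring("<equation> 50 &lt; 100 </equation>")
text = parsed.text.strip()

# escape() substitutes the entities for &, < and > automatically.
escaped = escape("50 < 100")

print(raw_ok, text, escaped)
```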
Content Entities • Used to mark the place of a common block of content that you type often or that may change • Internal entities – defined as part of the DTD within the XML document • Example: <!DOCTYPE class_listing [ <!ENTITY campus "Harrisburg"> ]> <class_campus>Penn State &campus;</class_campus> • External entities – information saved in an external file with a .xml extension
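The internal entity example can be run through a parser to see the substitution happen. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree; the class_listing root around class_campus is added here so that the DOCTYPE name matches the root element.

```python
import xml.etree.ElementTree as ET

# The DTD inside the document defines &campus; once; every use is expanded.
doc = """<?xml version="1.0"?>
<!DOCTYPE class_listing [
  <!ENTITY campus "Harrisburg">
]>
<class_listing>
  <class_campus>Penn State &campus;</class_campus>
</class_listing>"""

root = ET.fromstring(doc)
campus_text = root.find("class_campus").text

print(campus_text)
```

Changing the entity definition in the DTD changes every place where &campus; is used, which is the point of a content entity.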
Unparsed Entities • Used for binary or nontext data like images or video clips <!ENTITY picture SYSTEM "sunset.gif" NDATA GIF> NDATA = notation data The unparsed entity declaration tells the processing system not to parse the data but rather to pass it through as is.
Well-Formed XML • A document that adheres to the XML syntax rules is well formed • Rules: • Must contain only one root element • All elements must have a start and end tag • Elements must be nested properly and cannot overlap • Incorrect (overlapping): <book><chapter> ….</book></chapter>
Well-Formed XML • Rules (cont.) • All attributes must have a value and must be enclosed in quotes <student status="sr"> • Attributes must be placed in the start tag of an element and may appear only once • Element names are case-sensitive <STUDENT> vs. </student>
Well-Formed XML • Rules (cont.) • Certain markup characters are reserved such as < and >. Must use a character entity instead • Element names may start with letters or an underscore; names may contain only letters, numbers, hyphens, periods, and underscore • Element names may not start with xml
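Several of the well-formedness rules can be tested in one pass. The sketch below assumes Python's standard-library xml.etree.ElementTree; is_well_formed is a hypothetical helper name, and the sample documents are built from the examples on the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "good": "<book><chapter>One</chapter></book>",
    "two roots": "<book/><book/>",
    "missing end tag": "<book><chapter>One</book>",
    "overlap": "<book><chapter>One</book></chapter>",
    "unquoted attribute": "<student status=sr></student>",
    "duplicate attribute": '<student status="sr" status="jr"/>',
    "case mismatch": "<STUDENT></student>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", "well formed" if ok else "NOT well formed")
```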
XML Parser • XML Parser is a program that checks an XML document to ensure it follows the rules and is well formed. • Nonvalidating parser – looks for syntax errors according to the language rules • Validating parser – checks your document against a DTD or schema • Like compilers, one error may cause many messages
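A nonvalidating parser also reports where a syntax error was found. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree as the parser; unlike the compiler-style behavior described above, this particular parser stops at the first error. The student data is invented.

```python
import xml.etree.ElementTree as ET

# The end tag on line 3 uses a capital S, so it does not match <student>.
doc = """<class_listing>
  <student>Pat</student>
  <student>Lee</Student>
</class_listing>"""

try:
    ET.fromstring(doc)
    error_line = None
except ET.ParseError as err:
    # position is a (line, column) pair pointing at the problem.
    error_line, error_column = err.position
    print("Not well formed:", err)

print(error_line)
```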
Homework • Create an XML document for the following: Camping Trip Gear List. The following is a list of items that are essential on any camping trip: • Flashlight • Hiking boots • Sleeping bag • Pocket knife • Bug spray • Compass • Hatchet • Lantern • Shovel • Tent • Bucket • Ground cloth