Explore the basics of XML in this introductory chapter. Learn about markup language fundamentals, SGML, HTML, and the benefits of XML. Understand the distinctions between XML and HTML, and discover how XHTML merges the best of both worlds. See examples of XML and HTML documents, and grasp the importance of data separation from presentation.
Chapter 1 - Introduction • Learning XML by Erik T. Ray • Slides were developed by Jack Davis, College of Information Science and Technology, Radford University
What is XML? • XML stands for “Extensible Markup Language” • XML is a “metalanguage” that can be used to create markup languages • XML languages can be created to describe specific data • XML is an open standard, meaning that it is not tied to any specific technologies • XML files can be created and edited with a text editor • XML is a general-purpose information storage system
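As a rough illustration of using XML to describe specific data, the sketch below builds a tiny document in Python with the standard library's `xml.etree.ElementTree`. The `recipe`, `name`, and `ingredient` element names and the `unit` attribute are invented for this example and do not come from any real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the element and attribute names are made up
# to fit the data, which is the point of a custom XML language.
recipe = ET.Element("recipe")
ET.SubElement(recipe, "name").text = "Pancakes"
flour = ET.SubElement(recipe, "ingredient", unit="cup")
flour.text = "flour"

# Serialize to plain text; the result can be opened in any text editor.
markup = ET.tostring(recipe, encoding="unicode")
print(markup)

# Any XML-aware program can read the same text back.
reparsed = ET.fromstring(markup)
print(reparsed.findtext("name"))
```

The output is ordinary text, so the same document could be written by hand in a text editor and still be read by the parser.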
Markup Language Fundamentals • A “markup language” is a set of rules that defines the structure of a document • Programs, or applications, are used to interpret documents containing markup • Some applications produce documents that can be interpreted only by that same application; this is known as a “proprietary” format • XML documents are “portable” because they can be interpreted by many different applications
The Beginning: SGML • SGML stands for “Standard Generalized Markup Language” • SGML grew out of GML, which was developed at IBM in the 1960s, and became an ISO standard (ISO 8879) in 1986 • SGML provides a framework for creating other markup languages • HTML was originally defined as an application of SGML, and XML is a simplified subset of SGML • SGML is used mainly for very large documentation projects
HTML • HTML was developed in the early 1990s as a lightweight language to be used for exchanging information over the World Wide Web • HTML is an open standard, meaning that it is free to use and not tied to any particular technology • HTML documents, like XML documents, are plain text documents and can be created using a text editor • HTML is limited in its scope and cannot be extended
HTML Document Example
<html>
  <head>
    <title>Job Posting: Webmaster</title>
  </head>
  <body>
    <h1>JOB POSTING</h1>
    <h2>Job Title: <i>Webmaster</i></h2>
    <p><b>Job Description:</b> We are looking for a Webmaster to oversee the management of our company website. The Webmaster will be responsible for working with other staff members to collect information for the website, and for creating and maintaining the web pages.</p>
    <p><b>Skills needed:</b> Basic writing skills, good communication skills, HTML</p>
  </body>
</html>
The Need for XML • XML was developed partly because of the limitations of HTML • The W3C (World Wide Web Consortium) released the official XML version 1.0 specification in 1998 • XML quickly gained popularity in the Web community • XML itself is NOT a single markup language, but rather a set of rules (a metalanguage) that can be used to create markup languages
Benefits of XML • XML: • Allows data to be self-describing • Allows an author to create rules for the content an element can contain • Allows languages to be developed for industry-specific or company-specific needs • Uses elements to describe the data, not the format • Provides extensive linking functionality • Can be used to interchange data between two proprietary formats • Can be used to define standard syntax for many different languages • Supports robust searching
Data vs. Presentation • XML elements describe data properties • HTML elements describe formatting properties • XML elements can be formatted by using “style sheets” • A style sheet is a set of instructions that describes how to format a document • Many style sheets can be created to provide different presentations of a single document (e.g., print vs. web page) • A single style sheet can be used to provide formatting instructions for many XML documents
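The idea of several presentations for one document can be sketched in Python. This is only an illustration under stated assumptions: the two functions below stand in for style sheets (real projects would more typically use a style sheet language such as CSS or XSLT), and the names `to_html` and `to_text`, as well as the shortened sample document, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened sample data; it carries no formatting information at all.
DOC = """<job-posting>
  <title>Webmaster</title>
  <skill-list>
    <skill>HTML</skill>
    <skill>Basic writing skills</skill>
  </skill-list>
</job-posting>"""


def to_html(root):
    """One 'presentation': render the data as an HTML fragment."""
    items = "".join(f"<li>{s.text}</li>" for s in root.iter("skill"))
    return f"<h1>{root.findtext('title')}</h1><ul>{items}</ul>"


def to_text(root):
    """Another 'presentation': render the same data as plain text."""
    lines = [root.findtext("title").upper()]
    lines += [f"- {s.text}" for s in root.iter("skill")]
    return "\n".join(lines)


root = ET.fromstring(DOC)
html_view = to_html(root)
text_view = to_text(root)
print(html_view)
print(text_view)
```

Because neither function is tied to this particular document, either one could also be applied to any other document that uses the same element names.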
Differences Between XML and HTML • XML is not dependent on a single document type • XML allows an author to create elements that best fit the data • XML separates data from presentation • XML is strict about syntax • XML tags are case-sensitive • XML documents can be used with many different clients, not just web browsers • XML documents require style sheets for their formatting information
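The strict-syntax and case-sensitivity points can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is hypothetical and introduced only for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start and end tags match exactly: accepted.
same_case = is_well_formed("<skill>HTML</skill>")
# Tags differ only in case, so XML treats them as two different names.
mixed_case = is_well_formed("<Skill>HTML</skill>")
# A missing end tag, which HTML browsers often tolerate, is rejected too.
unclosed = is_well_formed("<list><skill>HTML</list>")

print(same_case, mixed_case, unclosed)
```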
XHTML: The Best of Both Worlds • XHTML stands for “Extensible Hypertext Markup Language” • XHTML is a language that is meant to merge HTML and XML • XHTML contains the HTML element set, but adheres to XML’s syntax rules • XHTML is extensible • XHTML is accepted by many browsers
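One practical consequence of XHTML following XML's syntax rules is that a generic XML parser can read it. The sketch below assumes Python's standard `xml.etree.ElementTree`; the page content is a shortened, illustrative version of the job posting, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The namespace that identifies XHTML elements.
XHTML_NS = "http://www.w3.org/1999/xhtml"

xhtml_page = f"""<html xmlns="{XHTML_NS}">
  <head><title>Job Posting: Webmaster</title></head>
  <body>
    <h1>JOB POSTING</h1>
    <p>Skills needed:<br/>HTML</p>
  </body>
</html>"""

# XHTML uses the HTML element set with XML syntax, so this parses cleanly.
root = ET.fromstring(xhtml_page)
title = root.findtext(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
print(title)

# The same paragraph in loose HTML style (unclosed <br>) is not
# well-formed XML, so the XML parser refuses it.
loose_html = "<p>Skills needed:<br>HTML</p>"
try:
    ET.fromstring(loose_html)
    loose_ok = True
except ET.ParseError:
    loose_ok = False
print(loose_ok)
```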
XML Document Example
<?xml version="1.0"?>
<job-posting>
  <title>Job Title: <emphasis>Webmaster</emphasis></title>
  <description>We are looking for a Webmaster to oversee the management of our company website. The Webmaster will be responsible for working with other staff members to collect information for the website, and for creating and maintaining the web pages.</description>
  <skill-list>
    <skill>Basic writing skills</skill>
    <skill>good communication skills</skill>
    <skill>HTML</skill>
  </skill-list>
</job-posting>
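Because the elements in this document name the data they hold, a program can pull out individual pieces directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the description text is shortened here, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The job posting from the slide, with the description shortened.
JOB_POSTING = """<?xml version="1.0"?>
<job-posting>
  <title>Job Title: <emphasis>Webmaster</emphasis></title>
  <description>We are looking for a Webmaster to oversee the management of our company website.</description>
  <skill-list>
    <skill>Basic writing skills</skill>
    <skill>good communication skills</skill>
    <skill>HTML</skill>
  </skill-list>
</job-posting>"""

root = ET.fromstring(JOB_POSTING)

# Element names act as labels, so lookups read like descriptions of the data.
job_title = root.findtext("title/emphasis")
skills = [skill.text for skill in root.findall("skill-list/skill")]

print(job_title)
print(skills)
```

Getting the same list of skills out of the HTML version would mean splitting the text of a paragraph on commas, since HTML marks that text only as a paragraph with a bold label.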
XML Goals • Form should follow function: Markup languages need to fit their data snugly. Rather than invent a single, generic language to cover all document types (badly), let there be many languages, each specific to its data. • A document should be unambiguous: Markup should be written so that there is only one way to interpret the names, order, and hierarchy of the elements. • Separate markup from presentation: Documents should have style information stored externally (in style sheets), outside the body of the document. Documents that rely on stylistic markup are difficult to repurpose or convert into new forms. • Keep it simple: Widespread acceptance and use require simplicity.
XML Goals (cont.) • Enforce maximum error checking: Some markup languages are so lenient about syntax that errors go undiscovered. When errors build up in a file, it no longer behaves the way you want it to: its appearance in a browser is unpredictable, information may be lost, and programs may act strangely. The XML specification requires documents to be well-formed. • Culture agnostic: A markup language should not be limited by particular alphabets or symbols.
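Both goals on this slide can be observed with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `first_error_line` and the sample documents are hypothetical, chosen only to show a syntax error being reported rather than silently ignored, and text in several scripts being handled in one document.

```python
import xml.etree.ElementTree as ET


def first_error_line(text):
    """Return the line number of the first syntax error, or None if none."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, _column = err.position
        return line
    return None


good = "<note>\n  <to>Ana</to>\n</note>"
bad = "<note>\n  <to>Ana</note>"  # <to> is never closed

good_result = first_error_line(good)
bad_result = first_error_line(bad)
print(good_result, bad_result)

# Content is not limited to one alphabet: Japanese, Russian, and Arabic
# greetings sit side by side in the same document.
multilingual = "<greetings><g>こんにちは</g><g>Привет</g><g>مرحبا</g></greetings>"
texts = [g.text for g in ET.fromstring(multilingual)]
print(texts)
```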