Introduction to Markup. Slavic Digital Text Workshop University of Illinois at Urbana-Champaign 2005-07-06. David J. Birnbaum University of Pittsburgh djbpitt+@pitt.edu http://clover.slavic.pitt.edu/~djb/. How to Set up an XML Project. Examples (critical edition of Igor′ Tale)
Introduction to Markup Slavic Digital Text Workshop University of Illinois at Urbana-Champaign 2005-07-06 David J. Birnbaum University of Pittsburgh djbpitt+@pitt.edu http://clover.slavic.pitt.edu/~djb/
How to Set up an XML Project • Examples (critical edition of Igor′ Tale) • Document instance (main XML file) • XML Declaration • Doctype Declaration • Data with markup • Document Type Definition (DTD) • Elements • Attributes • Transformation Stylesheets (XSLT) • Target files (HTML) • Cascading Style Sheets (CSS) • Batch file
The XML Declaration <?xml version="1.0" encoding="UTF-8"?>
Doctype Declaration <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE text SYSTEM "igor.dtd">
Document Type Definition <!ELEMENT text (title, stanza+)> <!ELEMENT title (#PCDATA | variant | notegroup)*> <!ELEMENT notegroup (item, variant?, note)> <!ELEMENT item (#PCDATA | variant)*> <!ELEMENT note (p | stanza)*> <!ELEMENT p (#PCDATA)> <!ELEMENT stanza (line+)> <!ELEMENT line (#PCDATA | variant | notegroup)*> <!ELEMENT variant (rdg)+> <!ELEMENT rdg (#PCDATA)> <!ATTLIST rdg wit (likh | p | k) "likh">
What the Instance Looks Like <html> <head><title>Title goes here</title></head> <body> <h1>Heading goes here</h1> <p>Paragraph text goes here</p> </body> </html>
Non-Empty and Empty Elements • Element content <a><b>…</b></a> • Character data (#PCDATA) content <a>blah, blah, blah</a> • Mixed content <a>blah<b>…</b>blah</a> • Some elements are empty <a></a> <a/>
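The content types above can be inspected with a short sketch using Python's standard xml.etree.ElementTree module. The string "inner" is a placeholder standing in for the slide's "…"; it is not from the original.

```python
import xml.etree.ElementTree as ET

# Mixed content: character data interleaved with a child element.
# "inner" is a placeholder for the slide's elided content.
mixed = ET.fromstring("<a>blah<b>inner</b>blah</a>")
child = mixed[0]

print(repr(mixed.text))  # character data before the first child
print(child.tag)         # the child element itself
print(repr(child.tail))  # character data after the child's end tag

# The two spellings of an empty element parse to the same thing:
# no text and no children.
empty_long = ET.fromstring("<a></a>")
empty_short = ET.fromstring("<a/>")
print(empty_long.text, len(empty_long))
print(empty_short.text, len(empty_short))
```

ElementTree stores the text that follows a child in that child's tail, which is how it represents mixed content without separate text nodes.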
Element Declarations 1 • Examples <!ELEMENT text (title, stanza+)> <!ELEMENT line (#PCDATA | variant | notegroup)*> • Element name and content model • #PCDATA = parsed character data (plain text) <text> <title>blah blah blah</title> <stanza> … </stanza> <stanza> … </stanza> </text>
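Element Declarations 2 • Examples <!ELEMENT text (title, stanza+)> <!ELEMENT line (#PCDATA | variant | notegroup)*> • Connectors • Sequence (,): items in the order given • Choice (|): any one of the alternatives; a repeated choice such as (…)* allows them in any order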
Element Declarations 3 • Examples <!ELEMENT text (title, stanza+)> <!ELEMENT line (#PCDATA | variant | notegroup)*> • Repetition • Exactly one (no repetition indicator) • Zero or one (?) • One or more (+) • Zero or more (*)
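One way to get a feel for connectors and repetition indicators is to translate a content model by hand into a regular expression over child element names. The sketch below does this for three element-only models from the DTD; the helper name matches and the translation scheme are illustrative assumptions, and #PCDATA (mixed content) is deliberately left out. A validating parser does this work for real.

```python
import re


def matches(model_regex, child_names):
    """Check a list of child element names against a content model
    that has been hand-translated into a regular expression."""
    # Each name gets a trailing space so adjacent names cannot run together.
    sequence = "".join(name + " " for name in child_names)
    return re.fullmatch(model_regex, sequence) is not None


# (title, stanza+): sequence connector, then "one or more" stanza
TEXT_MODEL = r"title (?:stanza )+"
# (item, variant?, note): "zero or one" variant between item and note
NOTEGROUP_MODEL = r"item (?:variant )?note "
# (p | stanza)*: choice connector, repeated "zero or more" times
NOTE_MODEL = r"(?:p |stanza )*"

print(matches(TEXT_MODEL, ["title", "stanza", "stanza"]))   # valid
print(matches(TEXT_MODEL, ["title"]))                       # stanza missing
print(matches(NOTEGROUP_MODEL, ["item", "note"]))           # variant omitted
print(matches(NOTE_MODEL, ["stanza", "p", "p"]))            # any order
```

The DTD symbols ?, + and * carry the same meaning in the regular expressions, which is why the translation is nearly mechanical.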
Attribute Declarations • Example <!ELEMENT rdg (#PCDATA)> <!ATTLIST rdg wit (likh | p | k) "likh"> • Element name, attribute name, token list, default <rdg wit="likh">пѣснь</rdg>
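The effect of the default value can be observed with Python's standard xml.parsers.expat module. This sketch assumes the declarations are placed in an internal DTD subset rather than in the external igor.dtd, so the example is self-contained. Expat does not validate, so it will not reject a wit value outside the token list; it only supplies the default.

```python
from xml.parsers import expat

# Assumption: the declarations sit in an internal subset so that the
# example needs no external igor.dtd file.
DOC = """<!DOCTYPE variant [
  <!ELEMENT variant (rdg)+>
  <!ELEMENT rdg (#PCDATA)>
  <!ATTLIST rdg wit (likh | p | k) "likh">
]>
<variant><rdg>пѣснь</rdg><rdg wit="p">пѣснѣ</rdg></variant>"""

witnesses = []


def start_element(name, attrs):
    # attrs includes attributes defaulted from the ATTLIST declaration.
    if name == "rdg":
        witnesses.append(attrs.get("wit"))


parser = expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOC, True)

print(witnesses)
```

The first rdg carries no wit attribute in the instance, so the value reported for it comes from the declared default.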
Document Type Definition <!ELEMENT text (title, stanza+)> <!ELEMENT title (#PCDATA | variant | notegroup)*> <!ELEMENT notegroup (item, variant?, note)> <!ELEMENT item (#PCDATA | variant)*> <!ELEMENT note (p | stanza)*> <!ELEMENT p (#PCDATA)> <!ELEMENT stanza (line+)> <!ELEMENT line (#PCDATA | variant | notegroup)*> <!ELEMENT variant (rdg)+> <!ELEMENT rdg (#PCDATA)> <!ATTLIST rdg wit (likh | p | k) "likh">
The Document Instance <line>аще кому хотяше <variant> <rdg wit="likh">пѣснь</rdg> <rdg wit="p">пѣснѣ</rdg> <rdg wit="k">пѣснѣ</rdg> </variant>творити, </line>
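To show what this markup makes possible, here is a minimal sketch that rebuilds the line as each witness reads it, using Python's standard xml.etree.ElementTree. The function name text_for_witness is hypothetical, and the line breaks in the sample are an assumption about the original file's layout, which the slide does not preserve; whitespace is normalized on output.

```python
import xml.etree.ElementTree as ET

# Layout (line breaks, indentation) is assumed; the element and
# attribute names and the text come from the slide.
LINE = """<line>аще кому хотяше
  <variant>
    <rdg wit="likh">пѣснь</rdg>
    <rdg wit="p">пѣснѣ</rdg>
    <rdg wit="k">пѣснѣ</rdg>
  </variant>
  творити,
</line>"""


def text_for_witness(line, witness):
    """Rebuild the text of a line using only one witness's readings."""
    parts = [line.text or ""]
    for child in line:
        if child.tag == "variant":
            for rdg in child.findall("rdg"):
                if rdg.get("wit") == witness:
                    parts.append(rdg.text or "")
        # Text after the child's end tag belongs to every witness.
        parts.append(child.tail or "")
    # Collapse runs of whitespace introduced by the file layout.
    return " ".join("".join(parts).split())


line = ET.fromstring(LINE)
for witness in ("likh", "p", "k"):
    print(witness, text_for_witness(line, witness))
```

The sketch ignores notegroup elements, which the DTD also allows inside a line.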
XSLT • Extensible Stylesheet Language for Transformations • Can rearrange elements (unlike other stylesheet strategies) • Programming language for manipulating XML • Use XSLT transformation engine to generate HTML from XML
HTML • Hypertext Markup Language • Why not just create HTML in the first place? • XML easier to edit and maintain • XML supports multiple output formats from single source
CSS • Cascading Style Sheets • External stylesheet • Internal style commands • “Decorate the tree” • Cannot rearrange elements • Likh: трудныхъ повѣстий(4) о пълку Игоревѣ, P: трудныхъ повѣстий о полку Игоревѣ, K: трудныхъ повѣстий о полку Игоревѣ, • <span style="color: red">ПОЛКУ</span>
Batch File saxon -o igor.html igor.xml score.xsl saxon -o commentary.html igor.xml commentary.xsl saxon -o igor1.html igor.xml igor.xsl saxon -o variants.html igor.xml variants.xsl
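The same batch can be sketched as a small Python script, reading each command as saxon -o output source stylesheet. The names build_commands and run_all are hypothetical, and the script assumes a saxon launcher is available on the PATH, as the batch file itself does. The block only prints the commands; nothing is executed until run_all is called.

```python
import subprocess

SOURCE = "igor.xml"

# (output file, stylesheet) pairs, one per line of the batch file.
JOBS = [
    ("igor.html", "score.xsl"),
    ("commentary.html", "commentary.xsl"),
    ("igor1.html", "igor.xsl"),
    ("variants.html", "variants.xsl"),
]


def build_commands(source=SOURCE, jobs=JOBS):
    """Return one argument list per transformation."""
    return [["saxon", "-o", output, source, stylesheet]
            for output, stylesheet in jobs]


def run_all(commands):
    """Run each transformation in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


for command in build_commands():
    print(" ".join(command))
```

Calling run_all(build_commands()) would perform the four transformations. All four read the same igor.xml, which is the single-source, multiple-output arrangement the HTML slide describes.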