Leiden University. The university to discover. DMT Week 3 • Adriaan van der Weel and Peter Verhaar
Where do we stand?
Principles of markup • HTML: • Document instance (your CV) • Stylesheet (CSS) • Application • Document instance (your CV) • Stylesheet (CSS) • DTD/Schema • Add: Prolog (XML declaration; DTD)
Text and markup
Knowledge representation • Structure and content • Ontology • What knowable things exist • What are the relationships that hold between them • Tree diagram • The book has structure and content: chapters, paragraphs, footnotes, etc. • XML represents structure and content • Various ontologies - various DTDs
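The idea that a book's structure forms a tree can be sketched in a few lines of Python. This is only an illustration: the element names (`book`, `chapter`, `p`, `footnote`) and the sample text are assumptions made for the example, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical book markup: element names are invented for illustration.
BOOK = """<book>
  <chapter n="1">
    <p>First paragraph.<footnote>A note.</footnote></p>
    <p>Second paragraph.</p>
  </chapter>
  <chapter n="2">
    <p>Third paragraph.</p>
  </chapter>
</book>"""


def outline(element, depth=0):
    """Return the element tree as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(BOOK))
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the markup, which is the same information a tree diagram conveys.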
XML Basics 1 • Elements <p>...</p> • Attributes <title type="play">...</title> • Entities • Character: &egrave; = è • General entities, referencing: • Chunks of text defined elsewhere • Text or image files, etc. • E.g., <p>The &BTCP; aims to ... </p> • Well-formedness, validation • Prolog (XML declaration; DTD)
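A minimal sketch of well-formedness checking and general entity expansion, using Python's standard `xml.etree.ElementTree` parser. The replacement text given for `&BTCP;` is a placeholder assumed for this example; the slides do not say what the entity expands to. Note that this parser checks well-formedness only; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares the general entity BTCP.
# "Example Project" is a placeholder expansion, assumed for illustration.
DOC = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE p [\n'
    '  <!ENTITY BTCP "Example Project">\n'
    ']>\n'
    '<p>The &BTCP; aims to ... </p>'
)


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOC)
print(root.text)  # the entity reference has been replaced by its declared text

# Attribute values must be quoted in XML; unquoted values are not well-formed.
print(is_well_formed('<title type="play">...</title>'))
print(is_well_formed('<title type=play>...</title>'))
```

A document can be well-formed without being valid: validity additionally requires that it conforms to the DTD named in its prolog.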
XML Basics 2 • Open standard (cf. de facto standard): • Publicly available • Royalty-free • Fully and publicly documented • NB: ‘Who owns your data?’ • (Lower) ASCII and Unicode: • Platform independent • Software independent • Device independent
Open standards 1 • Open standards in a networking world • Why? • Which? E.g., Internet Protocol Suite: • Link layer (physical/data, e.g., Ethernet) • Internet layer, facilitating transport, e.g., IP • Transport layer, e.g., TCP • Application layer, e.g., HTTP, SMTP, FTP
Open standards 2 • E.g.: • File format: PDF, TXT • Programming language: PHP • Operating system: Linux • Style language: CSS, XSLT • Markup metalanguage: SGML, XML • Markup language: DocBook, HTML, EAD, TEI
TEI basics • Text Encoding Initiative, 1987 • Text exchange in the humanities • TEI is a DTD • TEI is a collection of DTD fragments or modules • Platform and software independent (ASCII); open standard; open source • Used in an XML application (diagram) • Document ‘instances’ should be validated against the TEI DTD
TEI DTD • The TEI DTD is modular. We use: • <!DOCTYPE TEI PUBLIC "-//TEI P5//DTD Main Document Type//EN" "http://www.tei-c.org/release/xml/tei/schema/dtd//tei.dtd" [ • <!ENTITY % TEI.header "INCLUDE"> • <!ENTITY % TEI.core "INCLUDE"> • <!ENTITY % TEI.textstructure "INCLUDE"> • <!ENTITY % TEI.transcr "INCLUDE"> • <!ENTITY % TEI.linking "INCLUDE"> • <!ENTITY % TEI.namesdates "INCLUDE"> • ]> • http://www.tei-c.org/release/xml/tei/schema/dtd/
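To make the modular structure visible, the following Python sketch reads the DOCTYPE declaration shown above as plain text and lists which module switches are set to INCLUDE. It is a simple text scan written for illustration, with a hypothetical helper name (`included_modules`); it is not how an XML parser processes a DTD, and it performs no validation.

```python
import re

# The DOCTYPE declaration from the slide, reproduced as text.
DOCTYPE = '''<!DOCTYPE TEI PUBLIC "-//TEI P5//DTD Main Document Type//EN" "http://www.tei-c.org/release/xml/tei/schema/dtd//tei.dtd" [
<!ENTITY % TEI.header "INCLUDE">
<!ENTITY % TEI.core "INCLUDE">
<!ENTITY % TEI.textstructure "INCLUDE">
<!ENTITY % TEI.transcr "INCLUDE">
<!ENTITY % TEI.linking "INCLUDE">
<!ENTITY % TEI.namesdates "INCLUDE">
]>'''

# Matches parameter entity declarations of the form
# <!ENTITY % TEI.name "INCLUDE"> or <!ENTITY % TEI.name "IGNORE">.
MODULE_SWITCH = re.compile(r'<!ENTITY\s+%\s+TEI\.(\w+)\s+"(INCLUDE|IGNORE)"\s*>')


def included_modules(doctype_text):
    """Return the names of the modules whose switch is set to INCLUDE."""
    return [
        name
        for name, switch in MODULE_SWITCH.findall(doctype_text)
        if switch == "INCLUDE"
    ]


modules = included_modules(DOCTYPE)
print(modules)
```

Each parameter entity acts as an on/off switch: setting it to "INCLUDE" in the internal subset activates the corresponding fragment of the TEI DTD.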
Why this rigmarole? • Print (‘Order of the Book’): • Author’s brain > Book > reader’s brain • Instrument: typography • Digital (‘Digital Order’?): • Author’s brain > Computer > reader’s brain • Instrument: markup • For both typography (= form) and content • So: Need to make text intelligent
Using the computer / UM • Author’s brain > Computer > reader’s brain • Vary output format (paper, pdf, html, mobile phone, etc.) • Exchange • Reuse • Search and select • Count • Change content (order) and form • Etcetera
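Three of the operations listed above (count, search and select, vary the output format) can be sketched with Python's standard library. The sample document, its element names, and the names in it are invented for this illustration and assume nothing about the course material.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical marked-up text; names and element types are made up.
LETTER = """<text>
  <p>Letter from <name type="person">Alice</name> to <name type="person">Bob</name>.</p>
  <p>Sent from <name type="place">Leiden</name>.</p>
</text>"""

root = ET.fromstring(LETTER)

# Count: how many paragraphs does the text contain?
paragraph_count = len(root.findall("p"))

# Search and select: only the names marked up as persons.
people = [n.text for n in root.findall(".//name[@type='person']")]

# Vary output format: the same content rendered as simple HTML paragraphs.
html_output = "".join(
    "<p>%s</p>" % escape("".join(p.itertext())) for p in root.findall("p")
)

print(paragraph_count)
print(people)
print(html_output)
```

None of these operations would be possible on the bare text alone: the selection of persons, for instance, relies entirely on the `type` attribute supplied by the markup.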
New research questions? • Chris Anderson (The Long Tail), in Wired: ‘The End of Theory’ • But: need for hypothesis remains • But: humanities data: • Quantity: not such a wealth of data. Bitty. Discontinuous. • Quality: narrative, evaluative, ambiguous, subjective, conceptual • Who decides the agenda? Need to lead, rather than follow.