The Orlando Project

The Orlando Project: An Integrated History of Women’s Writing in the British Isles. The University of Alberta and The University of Guelph. http://www.ualberta.ca/ORLANDO/

Presentation Transcript


  1. The Orlando Project: An Integrated History of Women’s Writing in the British Isles. The University of Alberta; The University of Guelph. http://www.ualberta.ca/ORLANDO/

  2. The Orlando Project
     • is producing the first scholarly history of women’s writing in the British Isles
     • is in the 4th of 6 years as a Major Collaborative Research Initiative funded by SSHRC and the Universities of Alberta and Guelph
     • is using SGML to mark up its own research rather than tagging pre-existing texts

  3. SGML
     • 3 DTDs that blend structural markup with tags that foreground the interpretive nature of our research
     • 252 unique tags
     • 114 unique attributes
     • 640,000 element occurrences across all documents
       • e.g. 22,459 uses of the <name> tag
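
Counts like the ones on this slide can be gathered with a simple pass over the marked-up files. The following is a minimal sketch, not the project's own tooling: it assumes tags are written out in full (no SGML tag minimization) and attribute values are double-quoted, and the sample text and its `standard` and `value` attributes are hypothetical.

```python
import re
from collections import Counter

# Matches a start tag such as <name standard="Woolf, Virginia">. End tags,
# comments, and declarations are skipped because they begin with "/" or "!".
START_TAG = re.compile(r'<([A-Za-z][\w.-]*)((?:\s+[\w.-]+\s*=\s*"[^"]*")*)\s*>')
ATTRIBUTE = re.compile(r'([\w.-]+)\s*=\s*"[^"]*"')


def tag_statistics(text):
    """Return (tag counts, attribute counts) for one document's markup."""
    tags = Counter()
    attributes = Counter()
    for match in START_TAG.finditer(text):
        tags[match.group(1).lower()] += 1
        for attr in ATTRIBUTE.finditer(match.group(2)):
            attributes[attr.group(1).lower()] += 1
    return tags, attributes


# Hypothetical sample; not taken from the project's documents.
SAMPLE = (
    '<p><name standard="Woolf, Virginia">Virginia Woolf</name> published '
    '<title>Orlando</title> on <date value="1928-10-11">11 October 1928</date>; '
    '<name>Vita Sackville-West</name> inspired it.</p>'
)

tags, attributes = tag_statistics(SAMPLE)
print(len(tags), "unique tags;", sum(tags.values()), "element occurrences")
print(tags["name"], "uses of the <name> tag")
```

Summing the per-document counters over every file would give project-wide totals of the kind quoted above.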

  4. Why strive for consistency?
     • Delivery Needs/End-User needs
       • Consistency of presentation
         • titles, quotations, foreign words
       • Adequate search and retrieval
         • standard expressions for people, places, organizations, texts
       • Chronological sorting
         • standard expressions for dates
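
To illustrate why a standard expression for dates makes chronological sorting possible, here is a minimal sketch. It assumes the standard form is an ISO-style string (YYYY, YYYY-MM, or YYYY-MM-DD); the slides do not say which form the project uses, and the accepted input patterns and sample entries are hypothetical. The sketch does not check that the day exists in the given month.

```python
import re

MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

DAY_MONTH_YEAR = re.compile(r"^(\d{1,2}) ([A-Za-z]+) (\d{4})$")
MONTH_YEAR = re.compile(r"^([A-Za-z]+) (\d{4})$")
YEAR = re.compile(r"^(\d{4})$")


def standard_date(text):
    """Return YYYY, YYYY-MM, or YYYY-MM-DD, or None if the text is not a date."""
    text = text.strip()
    m = DAY_MONTH_YEAR.match(text)
    if m and m.group(2).lower() in MONTHS:
        month = MONTHS[m.group(2).lower()]
        return f"{m.group(3)}-{month:02d}-{int(m.group(1)):02d}"
    m = MONTH_YEAR.match(text)
    if m and m.group(1).lower() in MONTHS:
        return f"{m.group(2)}-{MONTHS[m.group(1).lower()]:02d}"
    m = YEAR.match(text)
    if m:
        return m.group(1)
    return None


# Hypothetical tag contents, including one that is prose rather than a date.
entries = ["11 October 1928", "1792", "October 1847", "early in the spring"]
standardized = [standard_date(entry) for entry in entries]

# Zero-padded strings with four-digit years sort in chronological order.
ordered = sorted(value for value in standardized if value is not None)
print(ordered)
```

Entries that return `None` are the ones a person would need to look at, since no pattern can turn free prose into a sortable value.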

  5. Tag Cleanup Pilot
     • Step 1: Analyze data to locate common inconsistencies
     • Step 2: Prioritize tag types for cleanup
     • Step 3: Establish workflow protocols
     • Step 4: Generate assignments
     • Step 5: Update user documentation

  6. Division of Labour
     • Batch changes: fix errors that regular expressions can easily find and replace
     • Undergraduates: fix problems that are predictable but not machine processable or that require minimal research
     • GRAs and PDFs: fix problems that require an experienced tagger or that need further research
     • Volume Authors: act as consultants on research and practice issues
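
The batch-change tier above can be sketched as an ordered list of find-and-replace rules. This is only an illustration: the slides do not list the project's actual rules, so the three rules and the sample sentence below are hypothetical.

```python
import re

# Hypothetical batch rules; the project's actual rules are not given in the slides.
BATCH_RULES = [
    # Remove whitespace directly after <name ...> and directly before </name>.
    (re.compile(r"(<name\b[^>]*>)\s+"), r"\1"),
    (re.compile(r"\s+(</name>)"), r"\1"),
    # Move a trailing comma from just inside </title> to just outside it.
    (re.compile(r",(</title>)"), r"\1,"),
]


def apply_batch_rules(text):
    """Apply every rule in order and return (new_text, number_of_changes)."""
    total = 0
    for pattern, replacement in BATCH_RULES:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total


before = "<name> Aphra Behn </name> wrote <title>Oroonoko,</title> a short novel."
after, changes = apply_batch_rules(before)
print(after)
print(changes, "changes")
```

Returning the number of changes alongside the text makes it easy to log what each batch run touched, which helps when the result is later checked by hand.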

  7. Sample Batch Changes

  8. Assignments and the tools that make them happen
     • SGML-aware fulltext search engine: helps find “prosey” tags in context; helps find incorrect or missing attributes
     • Document-wide statistics: reports odd sub-elements and content that varies from the norm
     • Tag cleanup reporting: organizes “index” tags to make cleanup easier
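
A report of "content that varies from the norm" could, for example, flag tag contents that are much longer than is typical for that tag. The sketch below is one possible approach and not a description of the project's tools: it assumes elements with no nested markup, uses the median length as the norm, and treats anything over three times the median as unusual. The threshold and sample text are hypothetical.

```python
import re
from collections import defaultdict
from statistics import median

# Matches an element with no nested markup, e.g. <date>1928</date>.
FLAT_ELEMENT = re.compile(r"<([A-Za-z][\w.-]*)\b[^>]*>([^<]*)</\1\s*>")


def unusually_long(text, factor=3.0):
    """Return (tag, content) pairs whose content is more than `factor`
    times the median content length for that tag."""
    contents = defaultdict(list)
    for match in FLAT_ELEMENT.finditer(text):
        contents[match.group(1).lower()].append(match.group(2).strip())
    flagged = []
    for tag, values in contents.items():
        typical = median(len(value) for value in values)
        for value in values:
            if typical > 0 and len(value) > factor * typical:
                flagged.append((tag, value))
    return flagged


# Hypothetical sample: three plausible dates and one "prosey" date tag.
SAMPLE = (
    "<date>1792</date> <date>11 October 1928</date> <date>1847</date> "
    "<date>some time after she had moved to London with her sister</date>"
)

for tag, content in unusually_long(SAMPLE):
    print(f"<{tag}> looks too long: {content!r}")
```

The median is used here rather than the mean because a few very long entries would otherwise raise the norm they are being measured against.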

  9. Conclusions
     • Establish priorities early on
     • Weigh priorities against needs, wants, and total available resources
     • Divide tasks according to expertise
     • Train “experts” to fix major tags
     • Develop a good checking system
     • Revisit workflow models regularly to make sure the workload is equal to the resources available

  10. Reflections on Consistency
      • Consistency -- is it possible? Is it possible across projects?
      • SGML -- does it help?
        • “that's not a date” ... “no, that's not a date, either” ... “nope, still not a date” ... “there you go: a date at last!”
        • “your text is getting pretty long” ... “the text for this tag is usually not this long” ... “you have exceeded the average length of text in this tag by 250%!” ... “congratulations, you have created the world's longest tag!”

