Where XML? Where does XML fit into your workflow? • Jabin White, Exec. Director, Electronic Production, Elsevier • November 8, 2005 • SSP Fall Seminar: Embracing Technology and Process Changes to Successfully Transform a Scholarly Publisher
Where does XML fit into your workflow? [Diagram labels: Beginning of workflow; Your workflow; End of workflow; "Right here"]
Bad News, Good News • Bad news • General statements are no more accurate (or funnier) than the previous slide • The specific inputs, outputs, and requirements of each workflow mean different XML solutions • Good news • The answers today are A LOT easier, and better, than 5-10 years ago • Availability of tools and ubiquity of knowledge are good things
Agenda • But first, an observation • Basics of an XML workflow • Where does XML fit? • Considerations for making decision • Where does XML fit at Elsevier? • Conclusions
But first, an observation • Talkin’ bout an evolution… • 1995: Should SGML be used in your workflow? • 2000: Should SGML or XML be used in your workflow? • 2005: Where does XML fit into your workflow? • 2010: ???
Basics of an XML Workflow • Ingredients • A Need • Tools • Some process changes • (unless you are starting from scratch) • Communication and patience
Basics: A need • Do you require this content to be used more than once? • Is there intelligence that can be embedded in the content? • If your content is “one off” print and you are sure it will never be used again, the answer to XML may be “NO!”
Basics: Tools • DTDs and related technologies • Schema is an option, but DTDs are still better for authoring • Author-submission tools, or conversion processes • XML-aware editor(s) • XMetaL: $623; XMLSpy: $847; or Google “free XML editor” • Don’t be fooled by “Save as XML” • CMS? • Used to be a “nice to have,” but today, not so much • Pay attention to Patti and Bob later
Basics: Process Changes • Depending on where XML is inserted, processes may change a lot or a little • These changes deserve careful consideration and planning • Don’t underestimate the human impact of these process changes • Impact on skill sets – people who didn’t do XML before may not want to
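To make the “Save as XML” warning concrete, here is a minimal Python sketch using only the standard library. It assumes two hypothetical fragments: one of the kind an export feature might produce, carrying only presentational markup, and one with structural tags. Both pass a well-formedness check, which is all `xml.etree.ElementTree` tests; it does not validate against a DTD, so a DTD-aware validator is still needed for that step. Apart from `Head` and `Para`, the element and attribute names are illustrative assumptions, not taken from any particular product.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: well-formed, but the tags only describe appearance.
PRESENTATIONAL = (
    '<doc><p font="Arial" size="14" bold="true">Methods</p>'
    '<p font="Arial" size="10">We sampled 40 patients.</p></doc>'
)

# Hypothetical structural markup: the tags say what each piece of content is.
STRUCTURAL = "<article><Head>Methods</Head><Para>We sampled 40 patients.</Para></article>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (well-formedness only, no DTD validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def headings(xml_text: str) -> list[str]:
    """Collect the text of every <Head> element, at any depth."""
    return [el.text for el in ET.fromstring(xml_text).iter("Head")]


# Both fragments are well-formed, but only one lets software find the headings.
print(is_well_formed(PRESENTATIONAL), headings(PRESENTATIONAL))
print(is_well_formed(STRUCTURAL), headings(STRUCTURAL))
```

The point of the sketch is that "it parses" is a much weaker guarantee than "it carries structure you can reuse".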
Basics: Patience, Young Skywalker • Rome was not built in a day, and they didn’t even use XML! • Or did they? <question>Et tu, Brute?</question> • Process changes require open lines of communication, and a clear mission statement from the top • XML should be a priority of the organization, not something it does because it feels it has to (this is getting better)
When to go to XML? • Repeat caveat against generalizing • But… • In general, human beings will do better at inserting intelligence (although Ontology vendors will argue that one); machines will be cheaper on structural, pattern-matching stuff • That being said…
The Case for Early XML • Getting angle brackets in data early allows you to take advantage of other XML-based technologies earlier in the workflow • Content and tags can be QC’d early, fewer changes toward the back end of the production cycle • More timely delivery of files to web or elsewhere
The Case for Late XML • Smaller impact on current workflows • Fewer process changes • Ability to focus XML tags on “E” only content delivery • Potentially cheaper (don’t have to buy XML tools for earlier in the workflow or “round trip” pagination), but this one is arguable
The Verdict • Haven’t you been paying attention??? • No global answers, but a “framework” for making your specific decision • The more costs absorbed during “traditional” editing process, the better • If your traditional editing process can absorb all the costs of XML, then send me a postcard, because you’ve reached Nirvana • Ideally, your decision will be a *business* decision, not a technology decision
Who tags? • Production has always driven, and always will drive, *format*-driven tagging • <Head>, <Para>, <List>, etc. • Recent trend is to outsource this • Semantic, rich data tagging is done by subject-matter experts • This can happen early (authors) or later (product-specific taggers), depending on your product requirements • This is expensive, so you must have solid business reasons for doing this
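The difference between format-driven and semantic tagging can be sketched in a few lines of Python. The sentence, and the `<Drug>` and `<Dose>` elements, are hypothetical and chosen only for illustration; `<Para>` is the structural tag named on the slide.

```python
import xml.etree.ElementTree as ET

# Format-driven tagging: structure only.
FORMAT_ONLY = "<Para>Patients received warfarin 5 mg daily.</Para>"

# Semantic tagging: hypothetical <Drug> and <Dose> elements, of the kind a
# subject-matter expert might add on top of the structural markup.
SEMANTIC = "<Para>Patients received <Drug>warfarin</Drug> <Dose>5 mg</Dose> daily.</Para>"


def drugs_mentioned(xml_text: str) -> list[str]:
    """Return the text of every <Drug> element in the fragment."""
    return [el.text for el in ET.fromstring(xml_text).iter("Drug")]


def plain_text(xml_text: str) -> str:
    """Strip all tags and return the reading text."""
    return "".join(ET.fromstring(xml_text).itertext())


print(drugs_mentioned(FORMAT_ONLY))   # nothing to find: the drug is just words
print(drugs_mentioned(SEMANTIC))      # the drug name is now addressable data
print(plain_text(FORMAT_ONLY) == plain_text(SEMANTIC))  # same words on the page
```

Both versions read identically in print; the semantic one costs more to produce but can feed an index, a search facet, or a separate product.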
This just in… you get what you pay for • Cheap and easy: little gain, little impact • Moderate and moderate: some gain • Difficult and expensive: short-term costs ($$ and pain), but huge long-term benefits
You get what you pay for [Chart axes: Investment (time and money); Gain (re-use, flexibility of data)]
Cheap and Easy • Post-print XML means no impact on current workflow • Options are limited to “do more” • Not taking full advantage of technology, may struggle to meet customer expectations • I call this the “Veruca Salt” effect – “But Daddy, I want multi-purposed content now!”
Difficult and Expensive • Up-front costs, in both time and money, can be daunting • Transition of skill sets, watch employee morale • Takes long-term view and patience in a short-term, sometimes ‘quarter by quarter’ world • Management wants you to press the “XML Button” (see V. Salt reference on last slide)
Journals vs. Books • Journals are a known, repeatable process • Costs to set up XML workflow can be amortized over the life of a journal • Book people have a more “one off” mindset, so XML is more difficult • Not impossible, but more difficult • Conversations more about ROI
Where does XML fit at Elsevier - Books • Near the end of the workflow, and outsourced • Post-print conversion still accounts for the bulk of book XML • Projects with XML at the front are still viewed as “special” • Bigger, non-title specific project to move to an XML-first workflow • The 80/20 rule
Where does XML fit at Elsevier - Journals • At the beginning of the workflow, but outsourced • Structural XML is done at suppliers via Elsevier XML DTD (public domain) • Checked via Validation Tools (vTools) and submitted to central repository • Repository feeds all outputs from that point forward (including print, all online versions, etc.)
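As a rough illustration of "check first, then feed every output from one source", here is a minimal Python sketch. It is not the Elsevier DTD or vTools: the article, the element names, and the `check` rule are all assumptions made for the example, and the check only tests well-formedness plus one hypothetical structural rule.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article; element names are illustrative, not the Elsevier DTD.
ARTICLE = """<article>
  <Head>Results</Head>
  <Para>Yield rose by 12% &amp; variance fell.</Para>
</article>"""


def check(xml_text: str) -> ET.Element:
    """Stand-in for a validation step: parse, and require at least one <Head>."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    if root.find("Head") is None:
        raise ValueError("article has no <Head>")
    return root


def to_print_text(root: ET.Element) -> str:
    """Render a plain-text version: headings in capitals, one element per line."""
    lines = []
    for el in root:
        text = (el.text or "").strip()
        lines.append(text.upper() if el.tag == "Head" else text)
    return "\n".join(lines)


def to_html(root: ET.Element) -> str:
    """Render an HTML fragment from the same parsed source."""
    tag_map = {"Head": "h2", "Para": "p"}
    parts = []
    for el in root:
        if el.tag in tag_map:
            html_tag = tag_map[el.tag]
            parts.append(f"<{html_tag}>{escape((el.text or '').strip())}</{html_tag}>")
    return "".join(parts)


root = check(ARTICLE)       # only checked content moves on
print(to_print_text(root))  # one output
print(to_html(root))        # another output, same source
```

Because both renderers read the same checked tree, a correction made once in the source shows up in every output.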
Components of the Elsevier journal workflow • 1. Capture and validate content (electronic submission, first validations on format, peer review, acceptance by editor) • 2. Core production processes (copyediting, tagging, image processing, validation, issue compilation) • 3. Prepare final e-product for distribution (e-product generation process: input, i-conversion, downsampling, assembly, o-conversion, output) • 4. Delivery of content and services via the web
Journal Workflow – Production [Flow diagram labels: AUTHOR, CORRECT., ISSUE COMP., LOGIN, PRODUCTS, MEDIA CONV., COPY EDIT, PRINT, SCANNING, ELECTRONIC PRODUCTS, S100, S200, S300, ELECTRONIC WAREHOUSE] • From this point forward, all data in XML
Common Element Pool (CEP) • CEP is really SEP – Shared Element Pool • Why have multiple content models for Author, List, Para, Reference??? [Diagram labels: Drug, JA, MRW, HS, Book, eSerial, CEP]
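The shared-pool idea can be sketched as "declare each common element once, then reuse the declaration in every product's DTD". The Python below generates DTD text for two of the product labels on the slide. The content models are hypothetical placeholders, since the slides do not give the real definitions.

```python
# Hypothetical content models; the real pool's definitions are not given in the slides.
SHARED_POOL = {
    "Author": "(#PCDATA)",
    "Para": "(#PCDATA)",
    "List": "(Para+)",
    "Reference": "(Author+, Para)",
}


def build_dtd(root: str, root_model: str) -> str:
    """Return DTD text: one product-specific root plus the shared declarations."""
    decls = [f"<!ELEMENT {root} {root_model}>"]
    decls += [f"<!ELEMENT {name} {model}>" for name, model in SHARED_POOL.items()]
    return "\n".join(decls)


def shared_lines(dtd: str) -> set[str]:
    """Everything after the first line comes from the shared pool."""
    return set(dtd.splitlines()[1:])


# Only the root element differs between products (root models are assumptions).
book_dtd = build_dtd("Book", "(Author+, (Para | List)+, Reference*)")
ja_dtd = build_dtd("JA", "(Author+, Para+, Reference*)")

print(book_dtd)
print(shared_lines(book_dtd) == shared_lines(ja_dtd))
```

With one definition of `Author`, `List`, `Para`, and `Reference`, a stylesheet or conversion written for one product type can handle those elements in all of them.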
Article in print presentation (PDF) • In-depth reading, note-taking • Portable • Serendipity • Print quality / resolution • Versatility in sizes
Article in presentation (ScienceDirect) • Searching, browsing, scanning, surfing, glancing, verifying • Speed • Restrictive screen dimensions
A sense of scale • More than 7,000,000 journal articles (all in XML) • More than 8 billion characters/bytes of storage (8 terabytes) • Books are just starting, mostly with eSerials and MRWs • 2004 addition • 210,000 articles • 500,000,000 characters/bytes • 2005 addition • 220,000 articles • 1,500,000,000 characters/bytes (FAT PDF and Books)
Conclusions • Asking where XML is right in your workflow is like asking which flavor of ice cream is best • However, general assumptions can be made, questions asked • Use the framework to ask and answer some basic questions about your organization’s needs, and its tolerance/patience for pain
Conclusions, Part Deux • Use answers to these questions to formulate your business case, ROI • Help is available • XML is a tool, to be applied where it is most appropriate for your organization and workflow • Editing tools, knowledge, etc. are more ubiquitous, but expectations are higher, than 10 years ago
Thank you • Questions? Contact info: Jabin White, Elsevier 1600 John F. Kennedy Blvd, Suite 1800 Philadelphia, PA 19103 215-239.3231 jabin.white@elsevier.com jabin@jabin.com Slides available at http://www.jabin.com/presentations.html