HarperCollins

Presentation Transcript


  1. HarperCollins

  2. Agenda • Content Creation Process • What is DITA? • What is DITA Open Toolkit? • What does RSuite do? • Demo • Manuscript to ICML: Word -> DITA -> ICML • Workflow Engine • InDesign • Code – Java, XSLT, XQuery, Java APIs • “Groundbreaking” Topic

  3. Current Book Composition Process • Step 1: Editorial (Manuscript, docx) • Step 2: Composition (InDesign, indd)

  4. New Book Composition Process • Step 1: Editorial (Manuscript, docx) • Transform 1 • Step 2: Generate DITA XML (xml) • Transform 2 • Step 3: Generate ICML (icml) • Download ICML • Step 4: Composition (InDesign, indd)

  5. What is DITA? • Darwin Information Typing Architecture • Is an XML Data Model for Authoring and Publishing • Topic Oriented • Each Topic is a separate XML file • DocBook is Book Oriented, more Complex, One Big XML file • DITA Initial Spec in 2001 • DocBook Initial Spec in 1991 • Core DITA Topic Types: • Concept • Task • Reference • Specialization: Subtyping – New Topics derived from existing

  6. What is DITA? • Topic must have at least: Id attribute in root, title, and body. • DITA MAP stitches topics together.
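A minimal sketch of the two ideas on this slide, using only Python's standard library: a topic with the three parts listed (root id, title, body) and a map that stitches topic files together. The ids, titles, and file names are made up for illustration, and the DOCTYPE declarations a real DITA toolchain expects are left out.

```python
import xml.etree.ElementTree as ET


def make_topic(topic_id, title_text, paragraph):
    """Build a topic with the parts the slide lists: root id, title, body."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "title").text = title_text
    body = ET.SubElement(topic, "body")
    ET.SubElement(body, "p").text = paragraph
    return topic


def make_map(map_title, topic_files):
    """Build a map that references each topic file in reading order."""
    ditamap = ET.Element("map")
    ET.SubElement(ditamap, "title").text = map_title
    for href in topic_files:
        ET.SubElement(ditamap, "topicref", {"href": href})
    return ditamap


# Hypothetical sample content, not taken from the presentation.
topic = make_topic("chapter-4", "Chapter 4", "Sample paragraph text.")
ditamap = make_map("Sample Book", ["chapter-3.dita", "chapter-4.dita"])

print(ET.tostring(topic, encoding="unicode"))
print(ET.tostring(ditamap, encoding="unicode"))
```

Because each topic lives in its own file, reordering or reusing a chapter only means changing the `topicref` entries in the map.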

  7. What is DITA? Eliot Kimber • http://www.ditausers.org/tutorials/basics/kimber/ • http://www.xiruss.org/tutorials/dita-specialization/ Norm Walsh Post from October 2005: • http://norman.walsh.name/2005/10/21/dita Four key technical differences where DITA may be “better” than DocBook: • A topic-oriented authoring paradigm. • A cross-referencing scheme that's more practical than XML's flat ID space. • SGML's conref, reinvented. • An extensibility model based on "specialization".

  8. What is DITA Open Toolkit? • Open-source publishing system for DITA • Provides multi-channel output • https://github.com/dita-ot/dita-ot/ • https://dita-ot.github.io/ • Uses Pipeline Processing Approach using: • Java • XSLT • Rendering Engine (FOP, RTF, etc.) • DITA 4 Publishers

  9. What does RSuite do? • Centralized Repository for “all” artifacts • Provides: • Workflow • DITA Transforms • Manuscript to DITA • DITA to ICML • Multi-channel Output – PDF, ePub3, InDesign • Role Based Security • Distribution: • FTP to Commercial Printer • E-Commerce Sites

  10. SAN Drives 500 GB – 100 GB / Disk • RSuite Tomcat Server • Non XML Disks 1-5 • Temp Directories: 1. XSLT Transforms 2. File Uploads • MySQL Disk • MarkLogic Nodes 1-3: 4 CPU - 2 Core / CPU each • Per node, Disk 1: one 600 GB forest • Per node, Disk 2: one 600 GB forest (Forests 1-6 across the cluster) • Per node, Disk 3: Backup, 300 GB

  11. SAN Drives 500 GB – 100 GB / Disk • Feature Request: Use XA Transaction across (1) File Copy, (2) MySQL Update, (3) Metadata Update • RSuite Tomcat Server • Non XML Disks 1-5 • Temp Directories: 1. XSLT Transforms 2. File Uploads • MySQL Disk • MarkLogic Nodes 1-3: 4 CPU - 2 Core / CPU each • Per node, Disk 1: one 600 GB forest • Per node, Disk 2: one 600 GB forest (Forests 1-6 across the cluster) • Per node, Disk 3: Backup, 300 GB

  12. RSuite Demo? • Upload • Transforms • PDF, ePub • ICML to InDesign • MarkLogic Config

  13. Code? • Java • jBPM – Biz Process Management Framework • Ivy – to manage plugin dependencies • Ember.js • XQuery • Groovy • DITA-OT XSLT • Plugins • RSuite API Docs

  14. Groundbreaking Opportunity • Unleash the Tombstones! • All Content can be reused for product development

  15. DITA to RDF Transform! • Semantically Linked DITA • Link to Internal and External Content • DBPedia: http://wiki.dbpedia.org/Downloads39 • NY Times • Dublin Core • US Census • http://dbpedia.org/page/Mark_Twain • Semantic Links create a network of Knowledge • Enables Inferencing (ML8) • Uses MarkLogic Triple Index
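A rough sketch of what a DITA to RDF transform could look like, assuming a very small mapping: the topic title becomes a Dublin Core `title` triple and each `xref` becomes a `references` triple. The base URI, the sample topic, and the mapping itself are assumptions for illustration; this is not the mapping used by any particular DITA RDF project.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/topic/"  # hypothetical base URI for topic subjects

# Hypothetical topic; the link target is the DBpedia URL shown on the slide.
SAMPLE_TOPIC = """<topic id="mark-twain-bio">
  <title>Mark Twain</title>
  <body>
    <p>See <xref href="http://dbpedia.org/page/Mark_Twain" scope="external">DBpedia</xref>.</p>
  </body>
</topic>"""


def dita_to_triples(topic_xml):
    """Turn one topic into (subject, predicate, object) triples."""
    root = ET.fromstring(topic_xml)
    subject = BASE + root.get("id")
    triples = [(subject, DCTERMS + "title", root.findtext("title"))]
    for xref in root.iter("xref"):
        href = xref.get("href")
        if href:
            triples.append((subject, DCTERMS + "references", href))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (URIs in <>, literals quoted)."""
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = dita_to_triples(SAMPLE_TOPIC)
print(to_ntriples(triples))
```

The N-Triples text is the kind of output a triple store can load; how it is loaded into the MarkLogic Triple Index is outside this sketch.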

  16. Why RDF? • RDF complements DITA • Contains facts about DITA topics • Facts are stored in the Triple Index • Facts are used to: • Link internal and external documents • Derive other facts (inferencing) • Provide higher quality search results • RDF is an efficient storage and linking mechanism • MarkLogic turns RDF into Triples

  17. Why Triples? A Triple is a Subject-Predicate-Object (SPO) structure used to represent a fact. Triples let computers derive facts from other facts without human involvement. Example: • Ted lives in Chicago, Illinois • Ted lives near Wrigley Field • Ted has a roommate called Sam • Ted and Sam go to Wrigley Field to watch games From these facts, a computer can derive: • Sam lives in Chicago • Wrigley Field is in Chicago, Illinois • Chicago is in Illinois • Sam and Ted both live in the US • Etc…
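The Ted and Sam example can be sketched as a tiny forward-chaining loop over triples. The predicate names (`livesIn`, `locatedIn`, `roommateOf`) and the three rules are assumptions chosen for illustration, and "Illinois is in the US" is added as background knowledge; a real triple store would use its own rule language rather than this hand-written loop. The "lives near Wrigley Field" fact is left out because deriving a location from "near" needs a fuzzier rule.

```python
# Base facts as (subject, predicate, object) triples.
FACTS = {
    ("Ted", "livesIn", "Chicago"),
    ("Ted", "roommateOf", "Sam"),
    ("Chicago", "locatedIn", "Illinois"),
    ("Illinois", "locatedIn", "US"),  # assumed background knowledge
}


def infer(facts):
    """Apply three simple rules repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # Rule 1: roommates live in the same place.
                if p == "roommateOf" and p2 == "livesIn" and s2 == s:
                    new.add((o, "livesIn", o2))
                # Rule 2: living in X, which is located in Y, means living in Y.
                if p == "livesIn" and p2 == "locatedIn" and s2 == o:
                    new.add((s, "livesIn", o2))
                # Rule 3: locatedIn is transitive.
                if p == "locatedIn" and p2 == "locatedIn" and s2 == o:
                    new.add((s, "locatedIn", o2))
        if new <= known:
            return known
        known |= new


derived = infer(FACTS) - FACTS
for triple in sorted(derived):
    print(triple)
```

None of the derived facts were entered by a person; they follow from the four base facts and the rules.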

  18. How to add Triples? • Facts need to be curated. • Data provenance • Editors can add facts to DITA Topic Docs. • New world of Semantic Publishing • Eroni Kumana

  19. Profiles in Courage Example • Add Facts to Chapter 4 DITA XML: • “Profiles in Courage” Primary ISBN value is 0060854936 • John F. Kennedy is the Author Of “Profiles in Courage” • John F. Kennedy is a Person • John F. Kennedy was at the Solomon Islands in August 1943 • Eroni Kumana is a Person • Eroni Kumana was at the Solomon Islands in August 1943 • Eroni Kumana rescued John F. Kennedy • Eroni Kumana is mentioned in Chapter 4, Profiles in Courage • Semantic Event – NY Times News Feed • Eroni Kumana died on August 2, 2014 • Event Triggers Automatic Pub: • CMS automatically publishes “Profiles in Courage” web page with snippet to the specific Chapter referencing Eroni Kumana. • New web page also has link to like and/or purchase book.
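A small sketch of the event-triggered publishing idea: the curated facts from this slide are held as triples, and an incoming "died on" event about a known person is matched against the "mentioned in" facts to decide which pages to publish. The predicate names, the `book/chapter` string convention, and the `pages_to_publish` helper are hypothetical; the actual CMS integration is not described in the slides.

```python
# Curated facts from the slide, written as (subject, predicate, object).
FACTS = [
    ("Profiles in Courage", "primaryISBN", "0060854936"),
    ("John F. Kennedy", "authorOf", "Profiles in Courage"),
    ("John F. Kennedy", "isA", "Person"),
    ("Eroni Kumana", "isA", "Person"),
    ("Eroni Kumana", "rescued", "John F. Kennedy"),
    ("Eroni Kumana", "mentionedIn", "Profiles in Courage/Chapter 4"),
]


def match(facts, s=None, p=None, o=None):
    """Return the facts matching a pattern; None acts as a wildcard."""
    return [
        t for t in facts
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def pages_to_publish(facts, event):
    """For a 'diedOn' event about a known person, list the chapters to feature."""
    person, predicate, _date = event
    if predicate != "diedOn" or not match(facts, person, "isA", "Person"):
        return []
    pages = []
    for _, _, location in match(facts, person, "mentionedIn"):
        book, chapter = location.split("/", 1)
        isbn = match(facts, book, "primaryISBN")
        pages.append({
            "book": book,
            "chapter": chapter,
            "isbn": isbn[0][2] if isbn else None,
        })
    return pages


# Event as it might arrive from a news feed (date from the slide).
event = ("Eroni Kumana", "diedOn", "2014-08-02")
pages = pages_to_publish(FACTS, event)
print(pages)
```

The ISBN lookup is what would let the generated page carry a purchase link for the book.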

  20. Book Process Steps • Step 1: Editorial (Manuscript, docx) • Transform 1: Word 2 DITA • Step 2: Generate DITA XML (xml) • Transform 2: DITA 2 ICML • Step 3: Generate ICML (icml) • Download ICML • Step 4: Composition (InDesign, indd) • Transform 3: DITA 2 RDF • Step 3: Generate Transient RDF (rdf) -> ML Triple Index
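The three transforms on this slide can be laid out as a simple plan of input and output files. This sketch only derives file names from the manuscript name; it does not run any transform, and the naming convention (same base name, different extension) is an assumption.

```python
from pathlib import PurePosixPath

# (transform name, input extension, output extension), as named on the slide.
TRANSFORMS = [
    ("Transform 1: Word 2 DITA", ".docx", ".xml"),
    ("Transform 2: DITA 2 ICML", ".xml", ".icml"),
    ("Transform 3: DITA 2 RDF", ".xml", ".rdf"),
]


def plan(manuscript):
    """List (transform, input file, output file) for one manuscript."""
    src = PurePosixPath(manuscript)
    if src.suffix != ".docx":
        raise ValueError("expected a .docx manuscript")
    return [
        (name, str(src.with_suffix(in_ext)), str(src.with_suffix(out_ext)))
        for name, in_ext, out_ext in TRANSFORMS
    ]


# Hypothetical manuscript file name.
steps = plan("profiles-in-courage.docx")
for name, source, target in steps:
    print(f"{name}: {source} -> {target}")
```

Transforms 2 and 3 both read the DITA XML, so the RDF output is a side branch of the pipeline rather than an extra step before composition.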

  21. SAN Drives 500 GB – 100 GB / Disk • RSuite Tomcat Server • Non XML Disks 1-5 • Temp Directories: 1. XSLT Transforms 2. File Uploads • MySQL Disk • MarkLogic Nodes 1-3: 4 CPU - 2 Core / CPU each • Per node, Disk 1 (Index): one 600 GB forest (Forests 1, 3, 5) • Per node, Disk 2 (Triples): one 600 GB forest (Forests 2, 4, 6) • Per node, Disk 3: Backup, 300 GB

  22. De-Silo-ize • Custom APIs are used to communicate between silos. • Silos: DAM, Web Host Provider, ISBN DB, ebook Store, Published Docs, CMS

  23. Hub Spoke – No Silos • Uses standardized RDF “connectors” to communicate. • Systems: DAM, ISBN DB, Web Host Provider, ebook Store, Published Docs, CMS

  24. Call To Action • Contribute to DITA RDF Project https://github.com/ColinMaudry/dita-rdf/blob/master/README.md • Build a Knowledge Engine
