Documentation Costs Avoided using Python and other Open Standards
Andrew Jonathan Fine
Operating Systems Software Organization
Engines, Systems, and Services
Honeywell International
Original Core Data Flow
Single Python application
• set of front end translators
• content inserter
• post-processing formatter
Front End Translator
• Selected by caller
• Caller specifies input file containing corporate data
• Extracts components from file
  • Pictures
  • Tables
  • Paragraphs
• Saves to Python dictionary
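The front-end contract described above can be illustrated with a small sketch. Everything here is hypothetical: the toy input format, the translator name `translate_toy`, the registry `TRANSLATORS`, and the dictionary keys are assumptions made for illustration, not the original translators (the code is written for current Python 3).

```python
# Hypothetical sketch of a front-end translator: the toy input format,
# the function names, and the dictionary keys are illustrative assumptions.

def translate_toy(text):
    """Split a toy text format into pictures, tables, and paragraphs."""
    components = {"pictures": [], "tables": [], "paragraphs": []}
    table = []
    for line in text.splitlines():
        line = line.strip()
        if "|" in line:
            # A line with '|' separators is treated as one table row.
            table.append([cell.strip() for cell in line.split("|")])
            continue
        if table:
            # A non-table line closes the table collected so far.
            components["tables"].append(table)
            table = []
        if line.startswith("PICTURE:"):
            components["pictures"].append(line[len("PICTURE:"):].strip())
        elif line:
            components["paragraphs"].append(line)
    if table:
        components["tables"].append(table)
    return components

# The caller selects a translator by name (one entry per input format).
TRANSLATORS = {"toy": translate_toy}

def run_front_end(kind, text):
    """Run the translator chosen by the caller and return its dictionary."""
    return TRANSLATORS[kind](text)

sample = "Intro text\nPICTURE: fig1.png\nName | Type\nstatex | Integer\n"
result = run_front_end("toy", sample)
```

A later stage such as the inserter would then pick the components it needs out of `result` by key.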
Inserter
• Caller selects components from the Python dictionaries made by the front ends for the respective documents.
• Inserter creates a Word document.
• Inserter uses Python/COM to insert the components into the document.
Back End Formatter
• Scans corporate Word document template
• Scans Word document made by inserter
• Makes final style corrections
Why?
The flow was designed to cope with changes in requirements!
• New projects
• New teams
• New data source formats
• New standards for existing formats
First front-end translator
Takes pictures, tables, and data from a recursive property list constructed by BEACON, a visual programming tool for aerospace industry software.
(… actual design of translator outside the scope of this paper…)
Initial Design of Inserter
• Straightforward use of the principles demonstrated in Mark Hammond's book, Python Programming on Win32.
• The book contains a chapter with a thorough treatment of how to have Python use the Word 97 COM object model to create and manipulate a Word document.
Problems!!!
• Must cope with huge amounts of corporate data, such as table cells.
• The COM interface is slow when each new element is created individually.
• Detailed typesetting of elements is hard to reuse.
What I wanted:
• Faster conversion
• Existing standard
• Callable from Python
What I found:
• Faster conversion (OpenJade)
• Existing standard (DocBook SGML)
Why Call from Python?
• New scripting language to replace islands of automation (Perl, MS-DOS, internal test stand controller language).
• Easier to connect the islands once they are written in Python.
• Open source, thus continuously peer reviewed.
• Tremendous user base! Plenty of wrappers written in Python around open source libraries supporting open standards.
… so I wrote a Python wrapper around some DocBook rules …
Revised Core Data Flow
• Python wrapper writes DocBook SGML
• OpenJade translates DocBook SGML to Word RTF
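The second step of this flow can be sketched as a call to the OpenJade command line from Python. This is a minimal sketch, not the original tool chain: the helper names and file names are hypothetical, and it assumes an `openjade` executable on the PATH, using its `-t` (output type), `-d` (DSSSL stylesheet), and `-o` (output file) options.

```python
import subprocess

def openjade_command(sgml_path, rtf_path, stylesheet_path):
    """Build the argument list that asks OpenJade for RTF output."""
    return [
        "openjade",
        "-t", "rtf",              # backend: emit RTF
        "-d", stylesheet_path,    # DSSSL stylesheet to execute
        "-o", rtf_path,           # output file
        sgml_path,                # DocBook SGML written by the Python wrapper
    ]

def convert_to_rtf(sgml_path, rtf_path, stylesheet_path):
    """Run OpenJade; raises CalledProcessError if the conversion fails."""
    subprocess.run(openjade_command(sgml_path, rtf_path, stylesheet_path),
                   check=True)

# File names below are placeholders for illustration only.
command = openjade_command("table.sgml", "table.rtf", "local.dsl")
```

Building the argument list separately from running it keeps the command easy to log and to check without OpenJade installed.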
Input to OpenJade as local DocBook SGML
<!DOCTYPE informaltable SYSTEM "C:\Local.dtd">
<informaltable frame='all'>
  <tgroup cols='2' colsep='1' rowsep='1' align='center'>
    <colspec colname='Name' colwidth='75' align='left'></colspec>
    <colspec colname='Type' colwidth='64' align='center'></colspec>
    <thead>
      <row>
        <entry><emphasis role='bold'>Name</emphasis></entry>
        <entry><emphasis role='bold'>Type</emphasis></entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry><phrase role='xe' condition='italic'>statex</phrase></entry>
        <entry>Integer</entry>
      </row>
      <row>
        <entry><phrase role='xe' condition='italic'>statey</phrase></entry>
        <entry>Long</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
from DocBook import DocBook

class ItalicIndexPhrase (DocBook.Rules.Phrase):
    "italic indexible text phrase"
    TITLE = DocBook.Rules.Phrase
    def __init__ (self, text):
        DocBook.Rules.Phrase.__init__ (self, 'xe', 'italic')
        self.data = [ text ]

class NameCell (DocBook.Rules.Entry):
    "table row cell describing name of identifier (italic and indexible text!)"
    TITLE = DocBook.Rules.Entry
    def __init__ (self, text):
        DocBook.Rules.Entry.__init__ (self)
        self.data = [ ItalicIndexPhrase (text) ]

class StorageCell (DocBook.Rules.Entry):
    "table row cell describing storage type of identifier (ordinary text)"
    TITLE = DocBook.Rules.Entry
    def __init__ (self, text):
        DocBook.Rules.Entry.__init__ (self)
        self.data = text

class TRow (DocBook.Rules.Row):
    "each row in application's informal table body"
    TITLE = DocBook.Rules.Row
    def __init__ (self, binding):
        (identifier, storage) = binding
        DocBook.Rules.Row.__init__ (self, [ NameCell (identifier), StorageCell (storage) ])

class TBody (DocBook.Rules.TBody):
    "application's informal table body"
    TITLE = DocBook.Rules.TBody
    def __init__ (self, items):
        DocBook.Rules.TBody.__init__ (self, map (TRow, items))

class TGroup (DocBook.Rules.TGroup):
    "application's informal table group"
    COLSPECS = [ DocBook.Rules.ColSpec ('Name', 75, 'left'),
                 DocBook.Rules.ColSpec ('Type', 64, 'center') ]
    SHAPE = [ '2', '1', '1', 'center' ]
    TBODY = TBody

class InformalTable (DocBook.Rules.InformalTable):
    "application's informal table"
    TGROUP = TGroup

class Example (DocBook):
    'example application of DocBook formatting class'
    SECTION = str (InformalTable)
    def __call__ (self):
        self.data = [ InformalTable ()(self.data) ]
        return DocBook.__call__ (self)

if __name__ == '__main__':
    print Example ([('statex', 'Integer'), ('statey', 'Long')]) ()

Python code to translate data into OpenJade input in local DocBook SGML (based on Python to DocBook sample wrapper class DocBook)
Using class DocBook
• Class DocBook, from DocBook.py in Appendix F, is the top-level callable interface class.
• The application inherits from class DocBook.
• The contents of the application inherit from the classes contained in DocBook.Rules.
• Use overrides to specify structure, formatting, and text.
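The inherit-and-override idea can be shown with a self-contained stand-in. This sketch does not reproduce the DocBook module from Appendix F: the base class `Element` and its `TAG` and `ATTRS` attributes are hypothetical, chosen only to show how subclasses can fix structure and formatting while the caller supplies text. It is written for current Python 3.

```python
# Hypothetical stand-in for the rule classes; not the Appendix F module.

class Element:
    """Generic SGML element; subclasses override TAG and ATTRS."""
    TAG = None
    ATTRS = ()  # sequence of (attribute name, value) pairs

    def __init__(self, children=()):
        self.children = list(children)

    def __str__(self):
        attrs = "".join(" %s='%s'" % pair for pair in self.ATTRS)
        body = "".join(str(child) for child in self.children)
        return "<%s%s>%s</%s>" % (self.TAG, attrs, body, self.TAG)

class Entry(Element):
    TAG = "entry"

class Row(Element):
    TAG = "row"

class TBody(Element):
    TAG = "tbody"

class ItalicIndexPhrase(Element):
    """Formatting override: italic, indexable phrase."""
    TAG = "phrase"
    ATTRS = (("role", "xe"), ("condition", "italic"))

class NameCell(Entry):
    """Structure override: a cell that always wraps its text in a phrase."""
    def __init__(self, text):
        super().__init__([ItalicIndexPhrase([text])])

class TRow(Row):
    """Structure override: one (identifier, storage type) pair per row."""
    def __init__(self, binding):
        identifier, storage = binding
        super().__init__([NameCell(identifier), Entry([storage])])

bindings = [("statex", "Integer"), ("statey", "Long")]
sgml = str(TBody(TRow(binding) for binding in bindings))
```

Because formatting lives in class attributes and structure in constructors, the application code that supplies `bindings` never touches markup directly.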
OpenJade
• OpenJade is an open source DSSSL execution engine available from SourceForge.
• DSSSL is an ISO standard for typesetting specification and document conversion.
• OpenJade reads the DocBook DSSSL stylesheets and our local DSSSL stylesheets, if any.
• OpenJade executes the DSSSL against the SGML source text to write a final document for later loading into a word processor.
DocBook Post-Processing using Word Automation with Python/COM
• DocBook/OpenJade emits RTF whose Word style identifier names differ from those in the corporate Word DOT file.
• Much faster to change a document using Python/COM than to create one!
• Cannibalized Python code from the first draft of the inserter to create the post-processor.
• Reads the RTF, applies the changes, and saves the result as the final DOC.
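A post-processing pass of this kind might look roughly like the following. This is an untested sketch under several assumptions, not the original post-processor: the style names in `STYLE_MAP` are invented, the target styles are assumed to exist already in the opened document, and the COM part requires Windows, Word, and the pywin32 package. Only the pure renaming helper runs without Word.

```python
# Hypothetical mapping from emitted style names to corporate style names.
STYLE_MAP = {
    "Heading 1": "Corporate Heading 1",
    "Normal": "Corporate Body",
}

def corrected_style(name, style_map=STYLE_MAP):
    """Return the corporate style name, or the name unchanged if unmapped."""
    return style_map.get(name, name)

def restyle_document(rtf_path, doc_path, style_map=STYLE_MAP):
    """Open the RTF in Word, rename paragraph styles, save as DOC.

    Paths should be absolute, since Word resolves them itself.
    """
    import win32com.client  # pywin32; imported here so the helper above
                            # stays usable on machines without Word
    word = win32com.client.Dispatch("Word.Application")
    try:
        document = word.Documents.Open(rtf_path)
        for paragraph in document.Paragraphs:
            old_name = paragraph.Style.NameLocal
            new_name = corrected_style(old_name, style_map)
            if new_name != old_name:
                paragraph.Style = new_name
        document.SaveAs(doc_path, FileFormat=0)  # 0 is wdFormatDocument
        document.Close()
    finally:
        word.Quit()
```

Only paragraphs whose style name changes are touched, which fits the observation above that editing an existing document is far cheaper than building one element by element.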
Return on Investment
5 projects ranging from 30 BEACON files to 150, average about 75 files.
Each project has 2 releases per year where each file must generate hard copy.

Previously (cut/paste by hand):
  Each project release:
    1/5 * 75 * 4 hours                 =     60 hours
    3/5 * 75 * 8 hours                 =    360 hours
    1/5 * 75 * 16 hours                =    240 hours
                                          -----
                                            660 hours
  Two releases per year:          * 2  =  1,320 hours
  Five projects needing releases: * 5  =  6,600 hours
  Two year period (2002-2003):    * 2  = 13,200 hours
                                         ------
  Total effort avoided:                  13,200 hours

Automated:
  Automated releases over 2 year period:              160 hours
  My effort (12 * 140 hours per labor month):       1,680 hours
  Total investment:                                 1,840 hours

Net effort avoided, 2002-3:                        11,360 hours
Net avoided by customers 2002-3 at $100/hour:   1,136,000 dollars
Net labor years avoided 2002-3 at 1680 hours/year:   6.76 years
Headcount avoided per year:                          3.38 people
ROI (Total effort avoided / total invested) 2002-3:  7.17
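The arithmetic on this slide can be checked with a few lines of Python. The variable names are illustrative; every figure comes from the slide itself.

```python
# Re-derive the return-on-investment figures from the slide's inputs.
average_files = 75
# (share of files in fifths, manual hours per file)
effort_mix = [(1, 4), (3, 8), (1, 16)]

hours_per_release = sum(fifths * average_files // 5 * hours
                        for fifths, hours in effort_mix)       # 660
releases_per_year = 2
projects = 5
years = 2

total_avoided = hours_per_release * releases_per_year * projects * years

automated_hours = 160
developer_hours = 12 * 140          # 12 labor months at 140 hours each
total_invested = automated_hours + developer_hours

net_avoided = total_avoided - total_invested
net_dollars = net_avoided * 100     # at $100 per hour
labor_years = net_avoided / 1680    # at 1680 hours per labor year
headcount_per_year = labor_years / years
roi = total_avoided / total_invested

print(total_avoided, total_invested, net_avoided, net_dollars)
print(round(labor_years, 2), round(headcount_per_year, 2), round(roi, 2))
```

Note that the ROI figure divides the gross effort avoided (13,200 hours) by the investment, not the net effort avoided.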
Python and DocBook together
• Python connects our department's engineering-specific islands of automation.
• Python with DocBook created Word documents from engineering data.
• The combination of an open language with an open standard eliminated a real-world business process bottleneck.
• The return on investment was substantial.