
Literate programming with multiple languages

Literate programming with multiple languages. Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark. Russell V. Lenth, Department of Statistics & Actuarial Science, The University of Iowa, USA. DSC 2009, July 2009, Copenhagen, Denmark.


Presentation Transcript


  1. Literate programming with multiple languages • Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark • Russell V. Lenth, Department of Statistics & Actuarial Science, The University of Iowa, USA • DSC 2009, July 2009, Copenhagen, Denmark

  2. Take-home message • Literate programming: Combining text, code and results in one document • StatWeave does this • Supports text formats: • LaTeX / OpenOffice (OpenDocument Text) • In combination with one or several of the "engines" • SAS, R, S-plus, Maple, Stata, Matlab, shell… • StatWeave is • "Sweave for generalized values of LaTeX and S" • Java-based and hence portable • A great help in creating reproducible statistical analyses • Extensible: Add languages

  3. Overview – Combining code, documentation and results • Source document: Writing, SAS statements, More writing, R statements, Even more writing, More SAS statements, More writing… • Final document: Writing, SAS statements, SAS output, SAS graphics, More writing, R statements, R output, Even more writing, SAS statements, SAS output, More writing…

  4. Example: R + LaTeX

  5. Example: R + LaTeX

  6. Example: R + LaTeX

  7. Example: SAS + OpenDocument Text

  8. Example: SAS + OpenDocument Text

  9. What is literate programming? • Term coined by Knuth (1984): • Create software as works of literature: • Embed source code into descriptive text (rather than the opposite) • Software should follow flow of thoughts and logic • Should be designed to be readable by humans (and not only by compilers / programs). • Some systems for literate programming (in statistics) • Sweave (Leisch 2002) • R code in LaTeX documents • odfWeave (Kuhn and Coulter 2007) • R code in OpenOffice documents • SASweave (Lenth and Højsgaard 2007) • SAS / R code in LaTeX documents • StatWeave • SAS / R / Maple / S-plus / Stata / Matlab / shell… code in LaTeX and OpenOffice documents

  10. Why literate programming? • Reproducible statistical analysis • Research, consulting • Document exactly what has been done • Possible to re-run if data change • Maintain one document only (at least in principle) • Manuals, course notes etc. • Shown output guaranteed to be result of shown code

  11. StatWeave • StatWeave created by Russ Lenth, University of Iowa, USA • Available: http://www.cs.uiowa.edu/~rlenth/StatWeave/ • StatWeave is still under development, but is becoming "mature" and stable. • Source file is a regular text document with code chunks added (marked with special tags) • Two basic operations • Weaving: Process source file into a single document with code listings, output listings, graphs… • Tangling: Extract code from source file to run later • Weaving is useful for reproducible statistical analysis
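
The tangling step described on slide 11 can be sketched in a few lines of Python. This is an illustration only: the `<<engine>>=` and `@` chunk markers are borrowed from the noweb/Sweave style as an assumption, and the function name `tangle` is hypothetical. StatWeave uses its own tags and its own implementation.

```python
import re

# Hypothetical chunk markers in the noweb/Sweave style: "<<engine>>=" opens a
# chunk and a line containing only "@" closes it. StatWeave's own tags differ.
CHUNK_RE = re.compile(r"^<<(\w+)>>=\n(.*?)^@$", re.MULTILINE | re.DOTALL)


def tangle(source: str) -> list[tuple[str, str]]:
    """Return (engine, code) pairs for every chunk, in document order."""
    return [(m.group(1), m.group(2)) for m in CHUNK_RE.finditer(source)]


SOURCE = """Some introductory text.
<<R>>=
x <- c(1, 2, 3)
mean(x)
@
More text between chunks.
<<SAS>>=
proc print data=a; run;
@
Closing text.
"""

chunks = tangle(SOURCE)
for engine, code in chunks:
    print(f"[{engine}]")
    print(code, end="")
```

Weaving would go one step further: run each extracted chunk through its engine and splice the listings and output back into the surrounding text.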

  12. Running StatWeave • Command-line interface: • statweave SAS-HelloWorld-swv.odt • statweave --tangle SAS-HelloWorld-swv.odt • statweave --keepall SAS-HelloWorld-swv.odt • Graphical User Interface:
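
For scripted use, the three command-line invocations on slide 12 could be assembled from Python. The helper `statweave_command` below is hypothetical; it only builds the argument list and uses no flags beyond `--tangle` and `--keepall`, which are the ones shown on the slide.

```python
def statweave_command(source_file: str, tangle: bool = False,
                      keepall: bool = False) -> list[str]:
    """Build the argument list for one statweave call (hypothetical helper)."""
    cmd = ["statweave"]
    if tangle:
        cmd.append("--tangle")   # flag as shown on the slide
    if keepall:
        cmd.append("--keepall")  # flag as shown on the slide
    cmd.append(source_file)
    return cmd


src = "SAS-HelloWorld-swv.odt"
for cmd in (
    statweave_command(src),
    statweave_command(src, tangle=True),
    statweave_command(src, keepall=True),
):
    print(" ".join(cmd))
```

On a machine where `statweave` is installed and on the path, each list could be handed to `subprocess.run(cmd, check=True)`.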

  13. Example: SAS + ODT • Set global options (for SAS code) • Inline evaluation of expressions

  14. Example: SAS + ODT

  15. Example: SAS + ODT • Output can be saved for later use and display

  16. Code reuse and argument substitution • Save code chunks for later execution • Pass arguments to code chunks • Simplest case: Not unlike a macro…
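
The macro-like idea on slide 16 (save a chunk once, run it later with different arguments) can be mimicked with plain text substitution. This sketch assumes nothing about StatWeave's notation: the `$name` placeholders come from Python's `string.Template`, and the names `saved_chunks`, `expand` and `print_head` are invented for illustration.

```python
from string import Template

# Hypothetical store of named, reusable chunks. The "$name" placeholder syntax
# is Python's string.Template, not StatWeave's own notation.
saved_chunks = {
    "print_head": Template("proc print data=$dataset (obs=$n); run;"),
}


def expand(name: str, **args: str) -> str:
    """Fill in the arguments of a saved chunk; a missing argument raises KeyError."""
    return saved_chunks[name].substitute(**args)


first = expand("print_head", dataset="sashelp.class", n="5")
second = expand("print_head", dataset="work.results", n="10")
print(first)
print(second)
```

Using `substitute` rather than `safe_substitute` is deliberate: a forgotten argument fails loudly instead of sending a half-filled statement to the engine.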

  17. Example: SAS + ODT - code reuse and argument substitution • Customize display and output (tables) by reusable code chunk

  18. Example: SAS + ODT - code reuse and argument substitution

  19. Example: Multiple languages - SAS, R and DOS together • Can use different engines in the same source file • Use SAS when appropriate; use R when appropriate; use Maple when appropriate… • Weaving: • SAS/R/XX chunks assembled into separate code files. • Code files are processed in order of first appearance in the source file
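
The weaving rule on slide 19 (one code file per engine, processed in order of first appearance) can be modelled as follows. This is a simplified sketch of the rule as stated on the slide, not StatWeave's implementation; `assemble_by_engine` and the sample chunks are made up for the example.

```python
def assemble_by_engine(chunks: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group chunk code by engine, keeping engines in order of first appearance."""
    files: dict[str, list[str]] = {}
    for engine, code in chunks:
        # dicts preserve insertion order, so the first chunk of each engine
        # fixes that engine's position in the processing order.
        files.setdefault(engine, []).append(code)
    return files


# Chunks in document order: SAS, R, SAS, R.
chunks = [
    ("SAS", "data a; x = 1; run;"),
    ("R", "x <- 1"),
    ("SAS", "proc print data=a; run;"),
    ("R", "print(x)"),
]

files = assemble_by_engine(chunks)
order = list(files)
for engine in order:
    print(engine, len(files[engine]), "chunks")
```

In this model both SAS chunks land in the same file ahead of any R code, which suggests why a chunk that needs results from another engine's earlier chunk can cause trouble.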

  20. Example: Multiple languages

  21. Example: Multiple languages

  22. Example: Multiple languages

  23. Example: Multiple languages

  24. Example: Multiple languages

  25. Example: Multiple languages • Synchronization issue: a SAS chunk depends on data from an R chunk, which in turn depends on data from a SAS chunk… • Solution: The restart option will restart the engines

  26. Example: Maple + LaTeX

  27. Example: Maple + LaTeX

  28. Example: Maple + ODT • Differentiate y= sin(x) xxx • Output is ugly, but it reads:

  29. Odds and ends – calling the shell • Want to list all StatWeave / OpenOffice source files: *-swv.odt

  30. Code chunks are processed as a whole • Code chunks are processed as a "unit", so in general one cannot split a call to proc xxxx over several chunks • Thus the following is illegal:

  31. … one exception in SAS: IML

  32. Summary • Reproducible statistical analyses • Integrate text, code and results in one document • Several text formats • Several languages • This talk (and the examples) available at http://genetics.agrsci.dk/~sorenh/misc/ • All credit is due to Russ Lenth, the creator of StatWeave. Thanks!!!!
