PDF Generation with iText
Presented by Greg Holling
What is iText?
- Java/C# library
- Open source
- Generates PDF on the fly
- Servlet- and JSP-friendly
  - PDF can be generated by a servlet
- Supports lots of PDF functionality
  - Bookmarks, watermarks
  - PDF forms
  - Digital signatures
The Good
- Mature library
- Deep & broad PDF support
- Open source
- Easy to create “preview” PDF from servlet/JSP
- Active user base & mailing list
- Training & consulting available
- Lots of online examples
- Book: “iText in Action” (preview of 2nd edition)
The Bad
- Javadoc (and code comments) are often sparse
- Need the book to use iText effectively
- eBook costs $35, and has problems
  - Current edition is somewhat dated
  - New edition is incomplete (MEAP)
  - Explanations are buried in examples
  - Some information is difficult to find
The Ugly
- Multiple ways to accomplish something
  - Sort of like Unix, but...
  - Sometimes one works, and the other doesn't
- Responses on the mailing list often begin with “did you read the book?”
  - Or the corollary: “That method isn't intended to be used that way...”
- Example: getYLine()
  - Javadoc says “gets the Y Line”
  - Book: no description, only a source example
Background
- Goal: brochures for community college students
  - Students create brochures
  - Admin customizes brochure look & feel
- Pricing: monthly subscription for the college
- Web-based
  - Deployed on Windows Server 2003
- Original plan: shrinkwrap
  - Complex distribution and pricing
  - Inexperienced sysadmins
- Future: mobile deployment (students)
Software Stack
- JDK 1.6 + Servlet/JSP
- Tomcat 6.0.28
- Apache Commons FileUpload 1.2.1
- iText 2.0.8
- JDOM
- opencsv 2.1
- JUnit 4.8.2
- [TagSoup, HtmlCleaner, Flying Saucer, MongoDB]
DEMO
- Student interface
  - Generated PDF
- Administrative interface
  - PDF preview
iText “Hello World”
Cookie-cutter steps:
- Create a new Document object
  - Initializes margins, other generic properties
- Create a PdfWriter
  - Associates the document with a file/stream
  - Stream can be a ServletOutputStream
- Open the document
  - Prepares for writing
- Add content
- Close the document
iText Key Classes
- Document: margins, orientation, etc.
- PdfReader: reads an existing PDF
- PdfWriter: low-level output
  - Can be written to a ByteArrayOutputStream / ServletOutputStream
- PdfContentByte: “layer”, for low-level output
  - Can be overlaid
- PdfStamper: adds content to an existing PDF
- PdfCopy: combines pages from PDFs
More Key Classes
- Element: logical element
- Chunk: StringBuffer containing font info
- Phrase: ArrayList of Chunk, includes leading
- Paragraph: Phrases + newline + alignment
- List, ListItem: bulleted list
- Anchor: hypertext link
- ColumnText: a column of text & images
- PdfPTable / PdfPCell: a table
Fonts
- Two primary font classes:
  - BaseFont: font name, embedded?, font file name
  - Font: font size, other modifiers
- Font is used by most text-related classes
- BaseFont is used by PdfWriter
- Font constructor takes a BaseFont or Font.FontFamily object
  - BaseFont for embedded fonts
  - FontFamily for predefined fonts
Predefined Fonts
- All PDF readers are required to handle these
  - Readers may substitute a similar font (e.g. Helvetica => Arial)
  - Use embedded fonts to avoid substitution
- No space penalty for using these in a PDF
- Fonts: Courier, Helvetica, Symbol, Times, ZapfDingbats
  - Bold & italic variants for Courier, Helvetica, and Times (Symbol and ZapfDingbats have none)
Leading
- Pronounced like “sledding”
- Origin: lead separator inserted above a line
- PDF (iText): spacing above a line of text
- Alias: line spacing
- Note: 1 inch = 72 points (approx.)
- Note: spacing before a paragraph is different from leading
- Can be specified in points or as a % of font size
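The unit arithmetic on this slide can be sketched in plain Java. The class and method names below (`LeadingMath`, `inchesToPoints`, `leadingFromPercent`) are hypothetical helpers, not part of iText; they only show how an inch measurement or a percentage of the font size turns into a value in points. Values are kept as `float` on the assumption that they will be handed to a float-based API.

```java
/** Hypothetical helper for point-based layout arithmetic; not part of iText. */
class LeadingMath {
    /** Points per inch, following the slide's 72-points-per-inch rule. */
    static final float POINTS_PER_INCH = 72f;

    /** Converts a length in inches to points. */
    static float inchesToPoints(float inches) {
        return inches * POINTS_PER_INCH;
    }

    /** Converts leading given as a percentage of the font size to points. */
    static float leadingFromPercent(float fontSizePoints, float percent) {
        return fontSizePoints * percent / 100f;
    }

    public static void main(String[] args) {
        // A half-inch margin expressed in points.
        System.out.println(inchesToPoints(0.5f));
        // Leading for a 12-point font at 150% of the font size.
        System.out.println(leadingFromPercent(12f, 150f));
    }
}
```

With these helpers, a 12-point font at 150% yields 18 points of leading, and a half-inch margin is 36 points.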
Embedded Fonts
- Obtain font information from a file
  - Adobe Type 1 (.afm, .pfm, .pfb), TrueType (.ttf), OpenType (.otf)
  - OpenType gives the best cross-platform behavior
- Font file is specified when the BaseFont is created (BaseFont.createFont())
- Increase PDF size
  - Only the glyphs used in the PDF are embedded
  - Size increase may still be significant, esp. CJK
- Watch for licensing restrictions
Hypertext Links
- Can be included in a PDF
- To create:
  - Create a Chunk with the appropriate font color
  - chunk.setAction(new PdfAction(...));
  - Embed the Chunk in a Paragraph or other iText element
Graphics
- PdfContentByte can create rudimentary graphics
  - Line segments, solid or dashed
    - Color, line end/cap style, dash style
  - Filled or unfilled polygons
    - Fill color/tint can be specified
- All units are relative to the edge of the page
- stroke() renders the graphics
  - Nothing is rendered until stroke() is called
- NOTE: LineSeparator can be used for a horizontal line in the Document
iText and Java2D
- PdfTemplate.createGraphicsShapes() returns a Graphics2D object
  - Can be passed to a paint() method
  - The template object can be passed to PdfContentByte.addTemplate()
  - Allows arbitrary Java2D graphics in a PDF
- AffineTransform can be passed to some iText methods:
  - addImage()
  - setTextMatrix()
  - Image/text scaling, rotation, transformation
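The idea of handing a Graphics2D to a paint() method can be sketched with the JDK alone. In the sketch below, `BadgePainter`, `paint`, and `renderToImage` are hypothetical names, and the Graphics2D comes from a BufferedImage so the example runs without iText; the assumption is that the same `paint` method could equally receive the Graphics2D obtained from iText.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Hypothetical painter: draws with plain Java2D and knows nothing about PDF. */
class BadgePainter {
    static final int WIDTH = 200;
    static final int HEIGHT = 100;

    /** Paints a white background and a blue rectangle rotated about the center. */
    static void paint(Graphics2D g2) {
        g2.setColor(Color.WHITE);
        g2.fillRect(0, 0, WIDTH, HEIGHT);
        // Rotate everything drawn from here on by 15 degrees around the center.
        g2.transform(AffineTransform.getRotateInstance(
                Math.toRadians(15), WIDTH / 2.0, HEIGHT / 2.0));
        g2.setColor(Color.BLUE);
        g2.fill(new Rectangle2D.Float(50f, 25f, 100f, 50f));
    }

    /** Stand-in render target for this sketch: an in-memory image. */
    static BufferedImage renderToImage() {
        BufferedImage image =
                new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = image.createGraphics();
        try {
            paint(g2);
        } finally {
            g2.dispose(); // always release the graphics context
        }
        return image;
    }
}
```

Keeping the drawing code dependent only on Graphics2D means the same method can serve an on-screen preview and the PDF output.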
Images
- iText class: com.itextpdf.text.Image
- Image formats: JPEG, JPEG 2000, GIF, PNG, BMP, WMF, TIFF, JBIG2
- Color models: RGB, CMYK
  - NOTE: imageio throws an exception when reading CMYK images
- Operations: scaling, transparency, masking
- NOTE: scaling doesn't reduce image quality or size
  - Just affects rendering
  - Big image files => big PDFs
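Because scaling changes only the rendered size, the pixel data stays the same and the effective resolution goes up as the image is drawn smaller. A minimal sketch of that arithmetic follows; `ImageResolution` and `effectiveDpi` are hypothetical names, and the 72-points-per-inch figure is the rule of thumb used elsewhere in these slides.

```java
/** Hypothetical helper: effective resolution of a scaled image. */
class ImageResolution {
    /**
     * Returns the dots per inch that result from drawing an image of the
     * given pixel width across the given width on the page (in points).
     */
    static float effectiveDpi(int pixelWidth, float renderedWidthPoints) {
        float renderedWidthInches = renderedWidthPoints / 72f;
        return pixelWidth / renderedWidthInches;
    }

    public static void main(String[] args) {
        // The same 1800-pixel-wide image drawn 6 inches wide, then 3 inches wide.
        System.out.println(effectiveDpi(1800, 432f));
        System.out.println(effectiveDpi(1800, 216f));
    }
}
```

An 1800-pixel-wide image drawn 6 inches wide is 300 dpi; drawn 3 inches wide it is 600 dpi, with exactly the same number of bytes stored in the file.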
ColumnText
- Logical column, positioned explicitly on the page
  - Rectangular or complex shape
  - Content is added top-to-bottom
- go() renders content
  - Nothing happens until go()
- Can be used to make sure content will fit
  - go(true) simulates output
  - go(false) or go() renders content
PDF Preview in Servlet
- PdfWriter.getInstance() takes an OutputStream argument
  - Can be any OutputStream
  - Including ServletOutputStream
- This allows a servlet to generate a preview PDF
- PdfWriter => ServletOutputStream
  - Small PDFs only
- Temp file => ServletOutputStream
  - More flexibility, can be used for larger files
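The temp-file variant can be sketched with the JDK alone. Here `PreviewStreamer`, `PdfGenerator`, and `streamViaTempFile` are hypothetical names; the generator stands in for the iText calls, and the target stands in for the servlet's output stream, so the sketch runs without either library.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: generate into a temp file, then copy to the client. */
class PreviewStreamer {
    /** Hypothetical callback for whatever produces the PDF bytes. */
    interface PdfGenerator {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Runs the generator against a temp file, copies the finished file to the
     * target stream, and returns the number of bytes sent. The temp file is
     * deleted whether or not generation succeeds.
     */
    static long streamViaTempFile(PdfGenerator generator, OutputStream target)
            throws IOException {
        Path tempFile = Files.createTempFile("preview", ".pdf");
        try {
            try (OutputStream fileOut = Files.newOutputStream(tempFile)) {
                generator.writeTo(fileOut);
            }
            // Nothing reaches the client until the document is complete.
            return Files.copy(tempFile, target);
        } finally {
            Files.deleteIfExists(tempFile);
        }
    }
}
```

In a servlet, the target would presumably be the response's output stream. Since the file is complete before anything is sent, its size is known up front and a failed generation never leaves a half-written PDF in the browser.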
PdfStamper
- Adds content to an existing PDF
  - Can read and write a stream or byte array
  - Allows chaining of PDF generation ops
  - Content can be written on top or underneath
- Useful for:
  - Table of contents
  - “Page x of y” in header/footer
  - Watermarks or “Confidential” notation
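“Page x of y” needs a second pass because the total is known only once every page exists. The sketch below covers just the label text for that second pass; `PageLabels` and `pageXofY` are hypothetical names, and the stamping of each label onto its page is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: footer text for a second pass over a finished PDF. */
class PageLabels {
    /** Returns one label per page, in page order, for a known page count. */
    static List<String> pageXofY(int totalPages) {
        if (totalPages < 0) {
            throw new IllegalArgumentException("totalPages must not be negative");
        }
        List<String> labels = new ArrayList<String>(totalPages);
        for (int page = 1; page <= totalPages; page++) {
            labels.add("Page " + page + " of " + totalPages);
        }
        return labels;
    }
}
```

The page count would come from reading the PDF produced by the first pass.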
General iText Cautions
- 72 points = 1 inch (approx.)
- Units are float, not double
- Font + bold modifier ≠ bold font
- Spacing before paragraph ≠ leading
- Watch font licensing restrictions
- Images are automatically centered & resized if they reside in a PdfPCell
- ∑ image sizes ≈ PDF size
  - Scaling images doesn't affect PDF size
- Beware HTML caching, especially IE
PDF Size
- Big issue for this project
- Two primary things affect PDF size:
  - Images (scaling doesn't affect size/resolution)
  - Embedded fonts
- First example PDF was 10 MB+
  - Rejected by email server
  - 5+ second download
- Changing image size/resolution => 300k PDF
- Moral: use small, low-res images
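One way to follow the moral is to resample images before they ever reach the PDF, since scaling inside the PDF leaves the stored pixels untouched. The sketch below uses only the JDK; `ImageDownsampler` and `downsample` are hypothetical names, and bilinear interpolation is an assumed choice rather than anything the project is known to have used.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: reduce pixel dimensions before embedding an image. */
class ImageDownsampler {
    /** Returns a copy resampled to the target width, keeping the aspect ratio. */
    static BufferedImage downsample(BufferedImage source, int targetWidth) {
        float scale = targetWidth / (float) source.getWidth();
        int targetHeight = Math.max(1, Math.round(source.getHeight() * scale));
        BufferedImage result =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2 = result.createGraphics();
        try {
            g2.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Drawing into the smaller canvas discards the extra pixels.
            g2.drawImage(source, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g2.dispose();
        }
        return result;
    }
}
```

The result has fewer pixels to store, which is what actually shrinks the file; the target width can be chosen from the size the image will occupy on the page and the resolution that is acceptable for the output.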
IE Browser Caching
- GET requests only
- Symptoms: page not cleared
- Workaround: use POST or HTTP headers
- Also consider session.invalidate()
  - Note: doesn't help with tabs
- JSP workaround:
  <%
    response.setHeader("Cache-Control", "no-cache");
    response.setHeader("Pragma", "no-cache");
    response.setDateHeader("Expires", -1);
  %>
References
- iText website: http://www.itextpdf.com/
- Book: http://www.itextpdf.com/book/
- Examples (from the book): http://www.itextpdf.com/examples/index.php
Questions? Greg Holling 303/274-9001 greg@holling-co.com