Cocoon: An XML Web Publishing Framework From the Apache Project
Roland Schweitzer
Today’s Topics:
• Definitions
• Motivation
• Required Tools (Java, Apache Tomcat and Cocoon)
• Basic Cocoon Operation
• Matchers, Generators, Transformers and Serializers. Oh My!
• sitemap.xmap glues it all together.
OAR Web Shop
Cocoon
• An XML-based WWW publishing framework implemented as a Java Servlet.
• Web site content stored in XML files (or an RDBMS, LDAP server or other source) is transformed (mostly via XSLT) into new XML files (to exclude certain info, for example) and then serialized into human-usable output (like an HTML or PDF file).
Reusable Content
Motivation for using Cocoon
• We distribute climate data.
• Users (including scientists) find data via public search engines like Google.
• Public search engines index HTML content.
• NOAA and other scientific organizations use special-purpose search engines that use FGDC metadata (or DIF derived from FGDC).
Motivation continued
• These facts add up to maintaining separate “documents” for each purpose.
• XML and Cocoon offer (yet another) potential way out of the morass of many special-purpose document collections.
Suppose info was stored as XML
  <page>
    <title>Reynolds Sea Surface Temperature</title>
    <prefix>data.sst</prefix>
    <abstract>
      <para>The optimum interpolation (OI) SST analysis…</para>
    </abstract>
    <contact>
      <name>CDC Data Management Personnel</name>
      <address1>325 Broadway</address1>
      <phone>(303) 497-6244</phone>
      <email>cdcdata@cdc.noaa.gov</email>
    </contact>
    …
  </page>
The Power of XML Content
• Can be parsed with standard XML tools
• Can easily be used for purposes other than the Web
• Can be written with powerful XML GUI tools (e.g. XML Spy)
• (Might be) easier to maintain
Reusable Content
Schematic of the Solution Using Cocoon
[Figure labels: Cocoon; Some other process]
Required Tools
• On Solaris 7 and 8 I have used the binary distributions of:
  • Java 1.4.0 (java.sun.com)
  • Tomcat 4.0.4 (www.apache.org)
  • Cocoon 2.0.3 (xml.apache.org)
• At this time, these are the latest releases.
• Follow the installation instructions for each package.
Basic Operation
• Cocoon is based on pipelines:
[Figure labels: XML File; A Bit of Software (three times); New XML File (twice); Info to client (e.g. HTML to browser)]
Basic Operation
• Cocoon is based on pipelines. An XML document is pushed through a pipeline that starts with exactly one Generator (which reads a file, builds XML from an LDAP server, etc.), passes through zero or more Transformers (for example, to leave out sensitive information for external users), and ends with a Serializer that converts the XML to binary or character data for consumption by the client (e.g. a Web browser).
• The entire site could use only one pipeline.
Basic Operation
• If you need more than one pipeline…
• Matchers (wildcard and regular expression) and Selectors (Boolean expressions) can be used to control which pipeline processes the XML content.
Components
• Matchers, Generators, Transformers and Serializers are all Cocoon Components.
• Pipelines are built out of Components.
• Components are declared and pipelines are constructed in the sitemap.xmap file.
• The “Bit of Software” needed for each Component is provided by Cocoon or built by you.
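As a rough orientation, the overall shape of a sitemap.xmap file can be sketched as below. This is only a skeleton under the assumption of the Cocoon 2.0 sitemap namespace; the comments mark where the declarations and pipelines discussed in this talk would go, and it is not a complete working configuration.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">

  <!-- Part 1: declare the Components that pipelines are allowed to use -->
  <map:components>
    <!-- map:generators, map:transformers, map:serializers and
         map:matchers declarations go here -->
  </map:components>

  <!-- Part 2: build pipelines out of the declared Components -->
  <map:pipelines>
    <map:pipeline>
      <!-- map:match blocks go here; each one holds the
           generate / transform / serialize steps for a URI pattern -->
    </map:pipeline>
  </map:pipelines>

</map:sitemap>
```

Keeping declarations and pipelines in separate sections means a Component is declared once and then referred to by name (or used as the default) in as many pipelines as needed.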
Components (Matchers)
• Suppose you wanted Cocoon to handle these wildcard URI patterns:
  • http://www.cdc.noaa.gov/cocoon/data/*.html
  • http://www.cdc.noaa.gov/cocoon/data/*.pdf
• They could result in two pipelines with two different output types.
Components (Matchers)
• Need a “bit of software” that looks at http://www.cdc.noaa.gov/cocoon/data/data.sst.html and:
  • Matches the URL prefix www.cdc.noaa.gov/cocoon/data
  • Matches the extension “.html”
  • Extracts the wildcard part of the URL, data.sst
  • Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).
The WildCard Matcher
• We’re in luck!
• A Matcher Component already exists in Cocoon to do what we want.
• To use a Component we must declare it in the sitemap.xmap file that controls our Cocoon installation.
Declare the WildCard Matcher
In the sitemap.xmap configuration file:
  <map:matchers default="wildcard">
    <map:matcher name="wildcard" src="org.apache.cocoon.matching.WildcardURIMatcher"/>
    …
  </map:matchers>
Use the Matcher on a URI
• We’ve declared the Matcher Component.
• Now use it in our pipeline to grab the * part of the pattern and use that to specify the source XML file that will be sent through the pipeline.
Use the Matcher in a Pipeline
• This pipeline uses the default Matcher, which is the WildCard Matcher we declared earlier.
  <map:match pattern="data/*.html">
    <map:generate src="data/{1}.xml"/>
    …
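A non-default Matcher is chosen with the type attribute on map:match. The sketch below shows how the same URIs might be routed with a regular-expression matcher instead. It assumes the matcher is declared under the name "regexp" (a name chosen here for illustration) using the org.apache.cocoon.matching.RegexpURIMatcher class from the Cocoon 2.0.x distribution; the data/ and datastyle/ paths simply reuse the ones from this talk.

```xml
<!-- Declaration, inside map:components. The name "regexp" is an assumption. -->
<map:matchers default="wildcard">
  <map:matcher name="wildcard" src="org.apache.cocoon.matching.WildcardURIMatcher"/>
  <map:matcher name="regexp" src="org.apache.cocoon.matching.RegexpURIMatcher"/>
</map:matchers>

<!-- Use, inside map:pipeline. Group 1 of the expression becomes {1},
     just as the * does for the wildcard matcher. -->
<map:match type="regexp" pattern="^data/(.+)\.html$">
  <map:generate src="data/{1}.xml"/>
  <map:transform src="datastyle/HTMLstyle.xsl"/>
  <map:serialize/>
</map:match>
```

For a pattern this simple the wildcard matcher is the easier choice; a regular expression only pays off when the URI needs stricter validation than * can express.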
Now What?
• We have successfully declared and used a Matcher to decide which pipeline will process the first of our two example URIs.
• Now we need to declare and use a Generator, which is always the first step of the pipeline.
Components (Generators)
• Declare a generator in sitemap.xmap:
  <map:generators default="file">
    <map:generator name="file" src="org.apache.cocoon.generation.FileGenerator"/>
    …
  </map:generators>
Use the Generator in a Pipeline
• The File Generator was declared as the default.
• Its only job is to read a file from the file system.
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="data/*.html">
        <map:generate src="data/{1}.xml"/>
        …
Review: Matcher and Generator
• Need a “bit of software” that looks at http://www.cdc.noaa.gov/cocoon/data/data.sst.html and:
  • Matches the URL prefix www.cdc.noaa.gov/cocoon/data
  • Matches the extension “.html”
  • Extracts the wildcard part of the URL, data.sst
  • Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).
Review: Pipeline Components
• Conditional use of pipeline via the Matcher
• One Generator (FileGenerator)
• Zero or more Transformers (?)
• Ends with a Serializer (?)
Components (Transformers)
• Declare a Transformer:
  <map:transformers default="xslt">
    <map:transformer name="xslt" src="org.apache.cocoon.transformation.TraxTransformer">
      <use-request-parameters>false</use-request-parameters>
      <use-browser-capabilities-db>false</use-browser-capabilities-db>
    </map:transformer>
    …
  </map:transformers>
The XSLT Transformer
• Different from the previous declarations we’ve seen.
• This declaration includes two additional configuration parameters:
  <use-request-parameters>
  <use-browser-capabilities-db>
Add the Transformer to Pipeline
  <map:match pattern="data/*.html">
    <map:generate src="data/{1}.xml"/>
    <map:transform src="datastyle/HTMLstyle.xsl"/>
    …
The Stylesheet written in XSLT:
  <HTML>
    <HEAD>
      <TITLE><xsl:value-of select="/page/title"/></TITLE>
    </HEAD>
    <BODY>
    …
  <xsl:template match="/page/abstract">
    <h2>Abstract:</h2>
    <xsl:apply-templates select="para"/>
  </xsl:template>
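The slide shows only excerpts of the stylesheet. For readers who want something they can run, here is a minimal, self-contained XSLT 1.0 sketch that handles the elements of the sample page document shown earlier. It is not the author's HTMLstyle.xsl: the layout, the heading levels and the "Contact:" section are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root template: builds the HTML skeleton around the page content -->
  <xsl:template match="/page">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="abstract"/>
        <xsl:apply-templates select="contact"/>
      </body>
    </html>
  </xsl:template>

  <!-- Abstract: a heading followed by one paragraph per para element -->
  <xsl:template match="abstract">
    <h2>Abstract:</h2>
    <xsl:apply-templates select="para"/>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>

  <!-- Contact block: the mailto link is built with an attribute value template -->
  <xsl:template match="contact">
    <h2>Contact:</h2>
    <p>
      <xsl:value-of select="name"/><br/>
      <xsl:value-of select="address1"/><br/>
      <xsl:value-of select="phone"/><br/>
      <a href="mailto:{email}"><xsl:value-of select="email"/></a>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Because the root template only applies templates to abstract and contact, elements such as prefix are left out of the HTML, which is the kind of filtering a Transformer step is meant for.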
Components (Serializers)
• The last step of each Pipeline is a Serializer.
• It consumes XML (in the form of SAX events) and generates a character or byte stream for a client (Web browser, Acrobat Reader, etc.).
Declare the Serializer
In sitemap.xmap:
  <map:serializers default="html">
    <map:serializer mime-type="text/html" name="html" src="...HTMLSerializer">
      <buffer-size>1024</buffer-size>
    </map:serializer>
    …
  </map:serializers>
The Completed Pipeline
  <map:match pattern="data/*.html">
    <map:generate src="data/{1}.xml"/>
    <map:transform src="datastyle/HTMLstyle.xsl"/>
    <map:serialize/>
  </map:match>
Pipeline to make PDF output
  <map:match pattern="data/*.pdf">
    <map:generate src="data/{1}.xml"/>
    <map:transform src="stylesheets/FOstyle.xsl"/>
    <map:serialize type="fo2pdf"/>
  </map:match>
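The PDF pipeline refers to a serializer named fo2pdf, which has to be declared like every other Component. A declaration along the following lines should work, assuming the FOP-based serializer class that ships with the Cocoon 2.0.x distribution; check the class name and MIME type against the sitemap.xmap delivered with your installation.

```xml
<map:serializers default="html">
  <!-- the existing "html" serializer declaration stays as it is -->

  <!-- FOP-based serializer: consumes XSL-FO SAX events and writes PDF bytes -->
  <map:serializer name="fo2pdf"
                  mime-type="application/pdf"
                  src="org.apache.cocoon.serialization.FOPSerializer"/>
</map:serializers>
```

Since html remains the default, the HTML pipeline can keep using a bare map:serialize, while the PDF pipeline selects this serializer explicitly with type="fo2pdf".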
http://www.cdc.noaa.gov/cocoon/data/data.sst.html
http://www.cdc.noaa.gov/cocoon/data/data.sst.pdf
The Dreaded Demo
• Demo Data Set Descriptions at CDC.
Cocoon is all this and more!
• Action Components do complex initialization (e.g. get a database connection pool) during pipeline setup.
• Resource Components are internal reusable pipeline fragments.
• XSP and Logic Sheets offer capabilities similar to JSP with further separation of the logic.
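To make the idea of a reusable pipeline fragment concrete, the sketch below factors the transform and serialize steps of the HTML pipeline into a Resource and calls it from a match. The resource name "html-output" is hypothetical, and the exact behavior of map:call may differ between Cocoon releases, so treat this as an illustration rather than a tested configuration.

```xml
<!-- Declared once, alongside map:components and map:pipelines -->
<map:resources>
  <!-- "html-output" is a made-up name for the shared tail of a pipeline -->
  <map:resource name="html-output">
    <map:transform src="datastyle/HTMLstyle.xsl"/>
    <map:serialize/>
  </map:resource>
</map:resources>

<!-- Used inside map:pipeline: the match supplies the Generator,
     the Resource supplies the rest -->
<map:match pattern="data/*.html">
  <map:generate src="data/{1}.xml"/>
  <map:call resource="html-output"/>
</map:match>
```

Any other match that produces the same page vocabulary could call the same Resource, so a change of stylesheet is made in one place.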
Resources
• www.apache.org
• Inside XSLT by Steven Holzner (New Riders)
• Java and XSLT by Eric M. Burke (O’Reilly)
Reality Check!
• We have not (yet) put this system in production.
• Still designing the XML representation.
• Still learning about using Cocoon with a relational database.
• Considering using XSP pages.
Conclusions
• Cocoon offers the potential to use and reuse one bit of XML content for many purposes.
• Most operations for Web hosting the XML content are built into Cocoon.
• Further customization is possible by writing your own Components.
• Content is easily maintained and separated from presentation.