OKC Tools for XML Metadata Management Marlon Pierce Community Grids Lab Indiana University
Overview • We discuss systems we have built for managing XML metadata. • Applications include • Newsgroups • Bibtex-based citation managers • Glossary term and abbreviation managers • RIB compatible browsers • Running demos available from www.xmlnuggets.org. • Downloads of revised newsgroup application available soon. • Challenge: promote scientific metadata usage • Data provenance, HPC run archiving, etc.
Parts of the System • Each application has one or more XML schemas that serve as a data model. • The general system contains the following components: • Form wizards for creating valid XML instances for a particular application. • Publishers or “feeders” that post messages into the system. • Unique URI generators for storing each message. • Persistent storage of entries (Oracle and MySQL). • Readers that provide RSS-based catalogues of topics. • Support for threaded messages, keyword searching. • Role-based access control.
<?xml version="1.0"?> <rss version="0.91" xmlns:cg="http://grids.ucs.indiana.edu/okc/schema/cg/ver/1"> <channel> <title>Community Grids Project Reports</title> <image> <title>ptllogo</title> <url>http://www.communitygrids.iu.edu/img/smallLOGO.gif</url> <link>http://www.communitygrids.iu.edu</link> <description>Pervasive Technology Labs Logo</description> </image> <Item> <name>CORBA</name> <URI>glossary/C/CORBA</URI> <description>Common Object Request Broker Architecture is an open distributed object-computing infrastructure being standardised by the Object Management Group.</description> </Item> <!-- Other items deleted --> </channel> </rss>
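A reader for such a catalogue might look roughly like the following. This is a minimal Python sketch, not the OKC implementation (which is Java-based); the function names `list_entries` and `entry_uri` are hypothetical, and the `topic/first-letter/name` URI layout is only inferred from the single `glossary/C/CORBA` sample above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the catalogue shown above.
CATALOGUE = """<?xml version="1.0"?>
<rss version="0.91" xmlns:cg="http://grids.ucs.indiana.edu/okc/schema/cg/ver/1">
  <channel>
    <title>Community Grids Project Reports</title>
    <Item>
      <name>CORBA</name>
      <URI>glossary/C/CORBA</URI>
      <description>Common Object Request Broker Architecture</description>
    </Item>
  </channel>
</rss>"""


def list_entries(catalogue_xml):
    """Return (name, URI) pairs for every Item in the channel."""
    root = ET.fromstring(catalogue_xml)
    channel = root.find("channel")
    return [(item.findtext("name"), item.findtext("URI"))
            for item in channel.findall("Item")]


def entry_uri(topic, name):
    """Build a URI in the style of the sample (assumed layout, not confirmed)."""
    return f"{topic}/{name[0].upper()}/{name}"


if __name__ == "__main__":
    print(list_entries(CATALOGUE))
    print(entry_uri("glossary", "CORBA"))
```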
Sample Applications Overviews of newsgroup, citation manager, and BIDM applications.
Newsgroup System Features • Email and browser-based posting. • Supports attachments. • Multiple topic subscriptions • Periodic topic digests • Multiple user privileges • Read through browser only • Post through browser only • Email notification with/without attachments.
Citation Browser • Supports multiple schema descriptions based on bibtex • Journal articles, books, book chapters, conference proceedings, tech reports, theses • Import/upload bibtex into system, export topic to bibtex.
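The import/export step can be illustrated with a small Python sketch. It assumes a single, flat BibTeX entry whose field values are wrapped in one level of braces or in double quotes; nested braces, string macros, and the actual OKC citation schemas are out of scope, and the element names simply mirror the BibTeX field names. The sample entry is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Matches "@type{key, ...}" and "name = {value}" / name = "value" fields.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*)\}\s*$", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s*=\s*(?:\{([^{}]*)\}|\"([^\"]*)\")")

SAMPLE = """@article{example03,
  author = {A. Author},
  title = {An Example Article},
  year = "2003"
}"""


def bibtex_to_xml(entry_text):
    """Convert one flat BibTeX entry into an XML element."""
    match = ENTRY_RE.match(entry_text.strip())
    if match is None:
        raise ValueError("not a single BibTeX entry")
    entry_type, cite_key, body = match.groups()
    root = ET.Element(entry_type.lower(), {"key": cite_key})
    for name, braced, quoted in FIELD_RE.findall(body):
        ET.SubElement(root, name.lower()).text = braced if braced else quoted
    return root


def xml_to_bibtex(root):
    """Export the element produced by bibtex_to_xml back to BibTeX text."""
    fields = ",\n".join(f"  {child.tag} = {{{child.text}}}" for child in root)
    return f"@{root.tag}{{{root.get('key')},\n{fields}\n}}"


if __name__ == "__main__":
    entry = bibtex_to_xml(SAMPLE)
    print(ET.tostring(entry, encoding="unicode"))
    print(xml_to_bibtex(entry))
```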
RIB Compatible Applications • Basic system can be used with any schema, so we created a version using the Basic Interoperability Data Model (BIDM) • Developed by the RIB team • IEEE standard • BIDM has two important extensions that we do not currently support. • Asset certification • Intellectual property rights
Steps for a Metadata Generator • There were common tasks that we performed for each application: • Design an object model and create a W3C XML Schema to represent it. • Create a memory object model of the schema, i.e. corresponding Java classes. • Design an interface, i.e. HTML forms, for user inputs, and bind the interface with the memory model. • Let users input data. • Finally, generate XML based on input, and publish it. • Given these repetitive tasks, we have developed a general purpose tool that automates the creation of this process.
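The hand-written version of these steps can be sketched in a few lines. The example below is in Python rather than the Java used by the original system, and the `GlossaryEntry` class, `bind_form`, and `to_xml` are hypothetical names chosen for illustration: a small memory model, a binding step that accepts form input, and a step that generates XML from the bound object.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class GlossaryEntry:
    """Memory model for one schema type (stands in for a generated class)."""
    name: str = ""
    description: str = ""


def bind_form(model_cls, form_data):
    """Bind submitted form fields to the memory model, rejecting unknown ones."""
    known = {f.name for f in fields(model_cls)}
    unknown = set(form_data) - known
    if unknown:
        raise ValueError(f"fields not in the model: {sorted(unknown)}")
    return model_cls(**form_data)


def to_xml(instance, root_tag):
    """Generate an XML instance with one child element per model field."""
    root = ET.Element(root_tag)
    for f in fields(instance):
        ET.SubElement(root, f.name).text = str(getattr(instance, f.name))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    entry = bind_form(GlossaryEntry, {"name": "CORBA", "description": "An ORB standard"})
    print(to_xml(entry, "Item"))
```

Writing this by hand for every schema is the repetition that a generator removes.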
Generating XML Form Wizards How to convert XML schemas into web applications
SchemaWizard and XML • SchemaWizard maps XML Schema elements to HTML form elements through its schema parser, and creates the framework and logic for an XML form wizard. • Users use the newly generated wizards to create XML instances that conform to a schema and publish them to any destination, such as a publish/subscribe messaging system or an SMTP server. • XML form wizards are Web applications that also serve as validating XML editors and are customized through schema annotations.
SchemaWizard Architecture • The steps that take place in generating an XML form wizard: • SchemaWizard unpacks and deploys the Web application package into a Web server’s application repository (e.g. webapps under Tomcat). • The user provides the location of the schema. • The schema is read in to create an in-memory representation (SOM) of the schema and also to create Java classes. • SOM = Castor’s Schema Object Model. • The SOM API provides a convenient interface to access the W3C XML Schema structures. • Using the SOM, Castor SourceGenerator creates Java classes that correspond to the schema structures. These classes form the memory model (i.e. JavaBeans for JSP) and come with the necessary framework to parse and regenerate (unmarshal and marshal) XML instances. • The Java classes are compiled, and the binaries are placed into the new project’s directory structure.
[Figure: SchemaWizard Architecture. Components shown: Annotated XML Schema, Castor Schema Unmarshaller, Castor SOM, Castor SourceGenerator, Schema Parser, Velocity Templates, Web Application Template, JavaBeans, Java Compiler, Libraries, Classes, JSPs. Output: XML Form Wizard created as a Web Application. Arrows are numbered (1) through (8).]
SchemaWizard Architecture • The steps that take place in generating an XML form wizard (cont.): • Using the SOM once again, SchemaParser traverses the in-memory schema and collects structure information, i.e. names, types, whether element or attribute, complex or simple type. • Based on this information, the parser chooses which type of template will be used, stores the information in a Velocity context, and invokes the template engine to generate the program logic as JSP. The parser also gathers the schema annotations (e.g. page color, input sizes) at this level and places the parameters in the context. • The engine runs on the templates, placing each generated JSP in its directory and creating the interface based on the user’s schema.
[Figure: SchemaParser Data Flow and Action. Castor SOM supplies the schema object. SchemaParser traverses the schema for types, collects type information (individual types, JavaBeans info), and creates a Velocity context with the type info. It then decides on a template: Project page, Index page, Simple type, Enumerated simple type, Unbounded simple type, Complex type, or Unbounded complex type. The context and template go to the Velocity Template Engine, which produces the JSP.]
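The traverse-and-decide step can be approximated as follows. This Python sketch uses the standard library's ElementTree in place of Castor's SOM, and plain strings in place of Velocity templates; `choose_template`, `plan_templates`, and the sample schema are assumptions made for illustration. Only inline type definitions are handled, so references to named types are not resolved.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema used only to exercise the classification.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="topic">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="keyword" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="status">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="open"/>
              <xs:enumeration value="closed"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def choose_template(element):
    """Pick one of the template kinds named in the data-flow figure."""
    unbounded = element.get("maxOccurs") == "unbounded"
    if element.find(f"{XS}complexType") is not None:
        return "unbounded complex type" if unbounded else "complex type"
    enum_path = f"{XS}simpleType/{XS}restriction/{XS}enumeration"
    if element.find(enum_path) is not None:
        return "enumerated simple type"
    return "unbounded simple type" if unbounded else "simple type"


def plan_templates(schema_xml):
    """Traverse the schema in document order and pair each element with a template."""
    root = ET.fromstring(schema_xml)
    return [(el.get("name"), choose_template(el)) for el in root.iter(f"{XS}element")]


if __name__ == "__main__":
    for name, template in plan_templates(SCHEMA):
        print(f"{name}: {template}")
```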
XML Schema location is given to SchemaWizard. XML Form Wizard is generated. XML instance is marshaled.
Schema Annotations • Users can make cosmetic changes to the final project beforehand with annotations in the schema. • W3C XML Schema allows developers to embed user-defined languages into the schema using <xs:annotation> and <xs:appinfo> structures. • Annotations for the whole schema affect the whole page, e.g. page title, background color, default input size, leading numbers on or off, XML browsing on or off. <xs:annotation> <xs:appinfo source="title">SchemaWizard Output for Topics Schema </xs:appinfo> <xs:appinfo source="inputsize">30</xs:appinfo> <xs:appinfo source="bgcolor">#e0e0ff</xs:appinfo> <xs:appinfo source="leadingnumbers">false</xs:appinfo> <xs:appinfo source="showxml">true</xs:appinfo> </xs:annotation>
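Collecting these settings amounts to reading each `<xs:appinfo>` keyed by its `source` attribute. The Python sketch below shows one way to do that with the standard library; `read_annotations` is a hypothetical helper, and the original tool gathers the same information through its Java schema parser.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# The schema-level annotation from the slide, wrapped in a schema element.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:appinfo source="title">SchemaWizard Output for Topics Schema </xs:appinfo>
    <xs:appinfo source="inputsize">30</xs:appinfo>
    <xs:appinfo source="bgcolor">#e0e0ff</xs:appinfo>
    <xs:appinfo source="leadingnumbers">false</xs:appinfo>
    <xs:appinfo source="showxml">true</xs:appinfo>
  </xs:annotation>
</xs:schema>"""


def read_annotations(node):
    """Map each appinfo 'source' under node's annotation to its trimmed text."""
    settings = {}
    for appinfo in node.findall(f"{XS}annotation/{XS}appinfo"):
        settings[appinfo.get("source")] = (appinfo.text or "").strip()
    return settings


if __name__ == "__main__":
    print(read_annotations(ET.fromstring(SCHEMA)))
```

The same helper works on an `<xs:element>` node, since element-level annotations use the same structure.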
Schema Annotations • Annotations for individual structures override the schema annotations, e.g. the input size for each element. Labels for each element can also be defined, and input fields can be changed to larger text areas with a textarea parameter giving the number of rows, or to password fields with a password parameter whose value is set to true. <xs:annotation> <xs:appinfo source="label">User Password</xs:appinfo> <xs:appinfo source="inputsize">15</xs:appinfo> <xs:appinfo source="password">true</xs:appinfo> </xs:annotation> … <xs:annotation> <xs:appinfo source="label">Memo</xs:appinfo> <xs:appinfo source="textarea">5</xs:appinfo> </xs:annotation>
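The override rule can be sketched as a dictionary merge followed by a choice of form control. The Python below is illustrative only: `render_field` is a hypothetical function, the fallback input size of 20 is an assumption, and the generated HTML is a guess at the shape of the output rather than what the SchemaWizard templates actually emit.

```python
from html import escape


def render_field(name, schema_settings, element_settings):
    """Render one form control; element-level settings override schema-level ones."""
    settings = {**schema_settings, **element_settings}
    label = escape(settings.get("label", name))
    field_name = escape(name, quote=True)
    if "textarea" in settings:
        # The textarea parameter carries the row count.
        rows = int(settings["textarea"])
        control = f'<textarea name="{field_name}" rows="{rows}"></textarea>'
    else:
        kind = "password" if settings.get("password") == "true" else "text"
        size = int(settings.get("inputsize", 20))  # 20 is an assumed fallback
        control = f'<input type="{kind}" name="{field_name}" size="{size}">'
    return f"<label>{label} {control}</label>"


if __name__ == "__main__":
    page_defaults = {"inputsize": "30"}
    print(render_field("pw", page_defaults,
                       {"label": "User Password", "inputsize": "15", "password": "true"}))
    print(render_field("memo", page_defaults, {"label": "Memo", "textarea": "5"}))
    print(render_field("title", page_defaults, {}))
```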
Title set Smaller input size Textarea, row count set to 5 Unbounded element with its own add/delete buttons XML browsing turned on Background set to gray
Access Rights, Controls and Roles Topic based permissions
System Access Control Overview • The core of the system contains a JMS-based publish/subscribe system. • Postings are thus based on JMS topics, or channels. • Access privileges (read/write by web, read/write by email, modify privileges) are enforced for each topic.
User Privileges • Users request access to specific topics/channels. • Granted by the administrator for that topic. • Can request • Read/write by browser • Read/write by email (newsgroups) • Receive/don't receive attachments. • The topic admin can edit these requests.
Topic Administrator Privileges • Topic admins can approve or revoke access to topics. • Can also modify individual privileges • Revoke post privilege, require email notification. • Have all other rights of users for that topic. • Topics can have multiple administrators. • A person can be a regular user of one group and administer another group.
Super-Administrator Privileges • A super admin manages an entire application. • Can create new topics. • Can assign administration privileges to users. • Can act as administrator and regular user of all topics.
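The three roles described above can be modelled compactly. The following Python sketch is an assumption-laden illustration, not the system's actual JMS-based enforcement: the privilege names, the `AccessControl` class, and its methods are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical privilege names for the per-topic rights listed in the slides.
PRIVILEGES = {"read_web", "write_web", "read_email", "write_email", "attachments"}


@dataclass
class Topic:
    name: str
    admins: set = field(default_factory=set)
    grants: dict = field(default_factory=dict)    # user -> granted privileges
    requests: dict = field(default_factory=dict)  # user -> requested privileges


class AccessControl:
    def __init__(self, super_admins):
        self.super_admins = set(super_admins)
        self.topics = {}

    def is_admin(self, user, topic):
        """Super admins act as administrators of every topic."""
        return user in self.super_admins or user in self.topics[topic].admins

    def create_topic(self, actor, name):
        if actor not in self.super_admins:
            raise PermissionError("only a super admin can create topics")
        self.topics[name] = Topic(name)

    def assign_admin(self, actor, topic, user):
        if actor not in self.super_admins:
            raise PermissionError("only a super admin can assign admins")
        self.topics[topic].admins.add(user)

    def request(self, user, topic, privileges):
        """A user asks for privileges; nothing is granted until approval."""
        unknown = set(privileges) - PRIVILEGES
        if unknown:
            raise ValueError(f"unknown privileges: {sorted(unknown)}")
        self.topics[topic].requests[user] = set(privileges)

    def approve(self, actor, topic, user):
        if not self.is_admin(actor, topic):
            raise PermissionError("only a topic admin can approve requests")
        t = self.topics[topic]
        t.grants.setdefault(user, set()).update(t.requests.pop(user, set()))

    def revoke(self, actor, topic, user, privilege):
        if not self.is_admin(actor, topic):
            raise PermissionError("only a topic admin can revoke privileges")
        self.topics[topic].grants.get(user, set()).discard(privilege)

    def allowed(self, user, topic, privilege):
        """Admins hold every user right; others need an explicit grant."""
        if self.is_admin(user, topic):
            return True
        return privilege in self.topics[topic].grants.get(user, set())


if __name__ == "__main__":
    ac = AccessControl({"root"})
    ac.create_topic("root", "glossary")
    ac.assign_admin("root", "glossary", "alice")
    ac.request("bob", "glossary", {"read_web", "write_web"})
    ac.approve("alice", "glossary", "bob")
    print(ac.allowed("bob", "glossary", "write_web"))
```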
Contact Info • See www.xmlnuggets.org for more information. • Email: marpierc@indiana.edu.