Wiley Coyote: From RDBs to XML. A Presentation by Lucas Espinosa, Pam Lach & Lee Shaw
Migrating to XML • Implementation • Current Data • PubMed/Google Interoperability • Usability • Cost • Discussion • Cost Benefit Analysis • Pros and Cons of the XML Philosophy
Implementation Process: At a Glance • XML dumps from our existing databases will need to be transformed to conform to the PubMed and Google schemas. • XSLT stylesheets can be used to change the structure of XML files. • To perform our administrative tasks, SQL queries will need to be rewritten as XQuery files, which can be processed by the open source processor SAXON.
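The restructuring step can be sketched in a few lines. This is a minimal illustration only: the plan calls for XSLT run through SAXON, whereas this sketch swaps in Python's standard-library ElementTree to do an equivalent restructuring. The dump layout and every element and attribute name (`table`, `row`, `isbn`, `products`, `product`) are hypothetical, not taken from our databases or from the PubMed or Google schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical database dump: one <row> per record, columns as attributes.
DUMP = """<table name="inventory">
  <row isbn="111" title="Intro to XML" price="40.00"/>
  <row isbn="222" title="Databases" price="55.50"/>
</table>"""


def restructure(dump_xml: str) -> ET.Element:
    """Turn attribute-style rows into element-style <product> entries."""
    source = ET.fromstring(dump_xml)
    target = ET.Element("products")
    for row in source.findall("row"):
        # The key column becomes an id attribute; other columns become children.
        product = ET.SubElement(target, "product", id=row.get("isbn"))
        ET.SubElement(product, "title").text = row.get("title")
        ET.SubElement(product, "price").text = row.get("price")
    return target


result = restructure(DUMP)
print(ET.tostring(result, encoding="unicode"))
```

An XSLT stylesheet would express the same mapping declaratively, which is easier to maintain once there are many source tables and target schemas.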
Implementation Process: At a Glance • We’ll need to know: • XSLT for transformations • XQuery for querying our documents • XPath for establishing referential integrity and key-based relationships • We’ll need to use: • oXygen for editing/authoring/validating (proprietary) • A designer to build forms and usable interfaces so that we won’t have to train the entire company in XML • SAXON (open source)
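To make the key-based relationship idea concrete, here is a rough sketch of resolving a foreign-key style reference with an XPath predicate and flagging references that point nowhere. It uses the limited XPath subset in Python's standard-library ElementTree rather than a full XPath or XQuery engine, and the document layout (`catalog`, `author`, `book`, `authorRef`) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: books point at authors through an authorRef key.
DOC = """<catalog>
  <authors>
    <author id="a1"><name>Ada Example</name></author>
    <author id="a2"><name>Ben Sample</name></author>
  </authors>
  <books>
    <book isbn="111" authorRef="a1"><title>Intro to XML</title></book>
    <book isbn="222" authorRef="a9"><title>Orphaned Title</title></book>
  </books>
</catalog>"""

root = ET.fromstring(DOC)


def author_name(root: ET.Element, book: ET.Element):
    """Follow a book's authorRef to the matching author, like a SQL join."""
    ref = book.get("authorRef")
    author = root.find(f"./authors/author[@id='{ref}']")
    return None if author is None else author.findtext("name")


def dangling_refs(root: ET.Element) -> list:
    """Return the ISBNs of books whose authorRef matches no author id."""
    return [
        book.get("isbn")
        for book in root.findall("./books/book")
        if author_name(root, book) is None
    ]


print(dangling_refs(root))
```

Unlike a relational database, nothing stops the dangling reference from being written in the first place; a check like this has to be run explicitly, or the constraint has to be declared in a schema and enforced by a validator.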
Implementation: Administrative Data • Current business practices that will need to be migrated from existing RDBMSs to XML file structures: • Administrative • Human Resources • Accounting • Purchasing • Ordering • Author Management • Intellectual Property Rights • Inventory
Implementation: Administrative Data • Proposed Tree Structures: http://ils.unc.edu/~plach/inls623/A6WCXMLtree.pdf
Implementation: PubMed • PubMed supports XML journal entries that conform to its published DTD: • http://www.ncbi.nlm.nih.gov/entrez/query/static/PubMed.dtd • The DTD is updated periodically, and we found older versions still hosted on their site • Newest version: 2.6 • Added VersionID and VersionDate to the article tag
Implementation: PubMed • 2000 Mandate: “Licensees are reminded that the NLM License Agreement to Lease NLM Databases requires licensees to comply with certain conditions. In particular is the requirement to add new records to MEDLINE products at least quarterly, apply corrections (maintenance) to records at least annually, and provide notification of retracted publications and corrections to dosage errors in MEDLINE abstracts to the users of your products in a timely fashion. Licensees who have not yet submitted the Usage Report Worksheet for 2000 must do so immediately as NLM is obligated to provide usage statistics to the U.S. Congress.” http://www.nlm.nih.gov/archive/20070503/bsd/2001_overview.html • Revised 2008: “NLM best practices recommendation calls for incorporating update files more frequently: from quarterly, as stated in the previous licenses, to 30 days after the update files become available at NLM; 90 days for complete database replacement files.” http://www.nlm.nih.gov/pubs/techbull/nd08/nd08_license.html
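The revised 2008 intervals translate into simple date arithmetic. The sketch below is only an illustration of the 30-day and 90-day windows quoted above; the function name and the example availability date are made up, and how NLM actually counts the days is an assumption here.

```python
from datetime import date, timedelta

# Windows taken from the 2008 recommendation quoted above.
UPDATE_WINDOW = timedelta(days=30)        # regular update files
REPLACEMENT_WINDOW = timedelta(days=90)   # complete database replacement files


def incorporation_deadline(available_on: date, full_replacement: bool = False) -> date:
    """Latest date by which a file made available on `available_on` should be loaded."""
    window = REPLACEMENT_WINDOW if full_replacement else UPDATE_WINDOW
    return available_on + window


# Hypothetical availability date, for illustration only.
print(incorporation_deadline(date(2009, 1, 1)))
print(incorporation_deadline(date(2009, 1, 1), full_replacement=True))
```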
Implementation: Google • Google Product Search • Free to use • Products submitted in XML form • Visible on Google.com • Google Commerce Search • Customized searching on our website • Supports XML • APIs for integration into current interface
Implementation: Usability • Challenges of transforming XML into more usable formats. Can we do this in-house?
Implementation: Cost • Financial: • Google Commerce/Product Search ($25,000) • oXygen Licenses (ca. $2,500) • Human Resources/Time • Training the company to use the new system • Hiring developers to build custom systems that create XML documents according to our specifications, as well as the interfaces and web forms necessary for day-to-day business transactions
Discussion: The XML Philosophy • Costs outweigh the benefits • Complicates our day-to-day operations but eases our ability to connect to PubMed and Google • No single technology is the solution. XML is good for some things but not others.