Rainbow: Bridging XML and Relational Databases. Design, Implementation, and Evaluation. MQP Project Members: Tien Vu, Mirek Cymer, John Lee. MQP Advisor: Prof. Elke A. Rundensteiner. Sponsor: Verizon Laboratories Incorporated.
HTML vs. XML • Microsoft, IBM, Informix, Oracle, Sun, ...
XML Data Management by RDBMS • Advantages: • Efficient query and analysis tools. • Mature database tools available. • Easy integration with existing business databases. • Issues: • Mapping between XML and the relational model. • Update propagation. • Query translation and optimization.
Motivation for Mapping • Query performance varies with how the data is mapped. • Flexible mapping: a fixed translation followed by restructuring. [Diagram: two alternate mappings of the same car data (make: Ford, model: Mustang, year: 2001).]
Rainbow Architecture [Diagram components: XML User, XML Query, XML Query Engine, XML Data Subsystem, Restructuring Subsystem, XML Manager, DTDM Manager, RDBMS, DTD, XML.]
Goals of our MQP • What: • Implement and evaluate restructuring subsystems within the large-scale Rainbow system. • How: • Learn about the database technologies and web tools. • Translate research ideas to software system design. • Practice software engineering techniques: • UML, engineer and reuse code. • Design an experimental test plan and test bed. • Conduct performance study and analysis.
Restructuring Subsystem [Diagram components: XML User, XML Query, XML Query Engine, XML Model, Mapping Query, Storage Subsystem, Restructuring (Restructure Operator Library, Restructurer), Relational Model, DTDM Manager, XML Manager, DTD, XML. Legend: Internal Process.]
Restructuring Operators • 11 Restructuring Operators: • Rename Item/Attribute • Switch Nesting • Pushup/Pushdown Attribute • Pushup/Pushdown Nesting • Split/Merge Nesting • Reference/Dereference
Mapping: Sequence of Restructure Operators
• A mapping is modeled as a sequence of reversible restructuring operators, each written as an operator name plus parameters.
• For example:
pushUpAttribute('account_number', 'value', 'invoice', 'account_number');
pushUpAttribute('bill_period', 'value', 'invoice', 'bill_period');
renameItem('invoice', 'summary');
[Diagram: invoice, with children account_num and bill_period that each hold a value, restructured into summary with account_num and bill_period.]
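The idea on this slide, a mapping as an ordered list of "operator name + parameters" steps, can be sketched in Java roughly as follows. This is a minimal illustration, not Rainbow's actual code: the class names `RestructureOp` and `Mapping` and the methods `then`, `steps`, and `undoOrder` are hypothetical. Because the slides do not give the parameter conventions of the inverse operators, `undoOrder` only returns the order in which steps would be undone and does not construct the inverses themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** One restructuring step: an operator name plus its string parameters (hypothetical class). */
final class RestructureOp {
    final String name;
    final List<String> params;

    RestructureOp(String name, String... params) {
        this.name = name;
        // Copy the array so later changes by the caller cannot alter this step.
        this.params = Collections.unmodifiableList(Arrays.asList(params.clone()));
    }

    /** Renders the step in the call syntax used on the slide. */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(name).append('(');
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append('\'').append(params.get(i)).append('\'');
        }
        return sb.append(");").toString();
    }
}

/** A mapping is an ordered list of restructuring steps (hypothetical class). */
final class Mapping {
    private final List<RestructureOp> steps = new ArrayList<RestructureOp>();

    /** Appends a step and returns this mapping so calls can be chained. */
    Mapping then(String name, String... params) {
        steps.add(new RestructureOp(name, params));
        return this;
    }

    List<RestructureOp> steps() {
        return Collections.unmodifiableList(steps);
    }

    /** Order in which the steps would be undone: last applied, first undone. */
    List<RestructureOp> undoOrder() {
        List<RestructureOp> reversed = new ArrayList<RestructureOp>(steps);
        Collections.reverse(reversed);
        return reversed;
    }

    /** The three-step invoice-to-summary example from the slide. */
    static Mapping invoiceExample() {
        return new Mapping()
            .then("pushUpAttribute", "account_number", "value", "invoice", "account_number")
            .then("pushUpAttribute", "bill_period", "value", "invoice", "bill_period")
            .then("renameItem", "invoice", "summary");
    }

    public static void main(String[] args) {
        for (RestructureOp op : invoiceExample().steps()) {
            System.out.println(op);
        }
    }
}
```

Keeping each step as plain data (a name and a parameter list) means a mapping can be stored, printed, or replayed without knowing how any single operator is implemented.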
SQLs for Push-Up Attributes
CREATE VIEW new.A (<all-columns>, a) AS SELECT A.<all-columns>, B.b FROM old.A, old.B WHERE B.pid = A.iid
CREATE VIEW new.B (<all-columns-but-b>) AS SELECT B.<all-columns-but-b> FROM old.B
[Diagram: push-up moves attribute b of child B into parent A, where it appears as attribute a.]
Example SQLs
• Inline: make.value into car as attribute make.
• Mapping:
pushUpAttribute('account_number', 'value', 'invoice', 'account_number');
• SQL statements:
CREATE VIEW new.invoice (iid, pid, account_number) AS SELECT invoice.iid, invoice.pid, account_number.value FROM old.invoice, old.account_number WHERE account_number.pid = invoice.iid
CREATE VIEW new.account_number (iid, pid) AS SELECT account_number.iid, account_number.pid FROM old.account_number
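To show how the two view definitions follow from the operator's parameters, here is a small Java sketch that assembles them as strings. It rests on assumptions: the parameter roles (child table, child column, parent table, new column name) are inferred from the example rather than stated in the slides, the class `PushUpAttributeSql` and its methods are hypothetical, and the column lists are passed in by hand, whereas a real system would presumably read them from its metadata. Identifiers are concatenated without any quoting or validation, which is acceptable only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds the two CREATE VIEW statements for a push-up attribute step (hypothetical helper). */
final class PushUpAttributeSql {
    private PushUpAttributeSql() {}

    /** Prefixes every column with its table name, e.g. invoice.iid. */
    private static String qualify(String table, List<String> cols) {
        List<String> out = new ArrayList<String>();
        for (String c : cols) {
            out.add(table + "." + c);
        }
        return String.join(", ", out);
    }

    /** New parent view: all existing parent columns plus the value pulled up from the child. */
    static String parentView(String child, String childCol, String parent,
                             String newCol, List<String> parentCols) {
        return "CREATE VIEW new." + parent
            + " (" + String.join(", ", parentCols) + ", " + newCol + ") AS"
            + " SELECT " + qualify(parent, parentCols) + ", " + child + "." + childCol
            + " FROM old." + parent + ", old." + child
            + " WHERE " + child + ".pid = " + parent + ".iid";
    }

    /** New child view: the child's columns without the one that was pushed up. */
    static String childView(String child, List<String> remainingCols) {
        return "CREATE VIEW new." + child
            + " (" + String.join(", ", remainingCols) + ") AS"
            + " SELECT " + qualify(child, remainingCols)
            + " FROM old." + child;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("iid", "pid");
        System.out.println(parentView("account_number", "value", "invoice", "account_number", keys));
        System.out.println(childView("account_number", keys));
    }
}
```

Called with the invoice parameters and the key columns `iid` and `pid`, the two methods produce the same statements as listed on the slide, each on a single line.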
Rainbow Implementation
Development Tools • Java: Visual Café 2, Javadocs, Java 2 • Oracle 8i, XML4J, JDBC 1.2, SQL queries
Code Facts • 44 total system classes • 17 classes of Rainbow • 27 classes reused • ? lines of system code • ? lines of Rainbow code • ? lines of code reused
Rainbow Test & Experimental Evaluation • Experimental Setup • Oracle 8i • Windows NT • Data • Created a DTD • Randomly generated XML • Hand translated queries • Factors • Type of query • Number of operations
Rainbow Conclusions • Technical accomplishments • Functional prototype system • Feasibility of Rainbow concepts • Automated test bed designed • Performance evaluations show that: • (Ideal) Moving up data on the embedded-relational-level yields better query performance for Join queries. • Knowledge gained • OO, Java, JDBC, SQL, RDBMS, XML, DTD • Teamwork & S/W Engineering & Software Reuse • Logistics of setting up an experiment • Future work • Experiment test plans and test beds to realize the full potential of the restructuring component.
Rainbow: XML and Relational Database. Design, Implementation, and Evaluation. Project Members: Tien Vu, Mirek Cymer, John Lee. Advisor: Elke A. Rundensteiner. Ph.D. Student: Xin Zhang. Sponsored by: Verizon Laboratories Incorporated. Visit Rainbow at http://davis.wpi.edu/dsrg/TJM/
XML: The Future of the Web
Benefits: • Efficient query and analysis tools. • Mature data warehousing support. • Easy integration with existing business databases.
Applications: • E-commerce • Web-based industries
<invoice>
<account_number>555 777-3158 573 234 </account_number>
<bill_period>Jun 9 - Jul 8, 2000</bill_period>
<carrier>Sprint</carrier>
<itemized_call no="1" date="JUN 10" number_called="973 555-8888" time="10:17pm" rate="NIGHT" min="1" amount="0.05" />
<itemized_call no="2" date="JUN 13" number_called="973 650-2222" time="10:19pm" rate="NIGHT" min="1" amount="0.05" />
<itemized_call no="3" date="JUN 15" number_called="206 365-9999" time="10:25pm" rate="NIGHT" min="3" amount="0.15" />
<total>$0.25</total>
</invoice>
XML and Relational Database • Problem • Many applications change their data very frequently. • e.g., flight reservation, online billing, inventory. • Current Solution • Reload the complete XML document whenever it changes, which is very expensive. • Rainbow Solution • Incrementally propagate XML document updates to the stored XML data. • Goal: XML repository implemented using an RDBMS • Approach: Flexible mapping • Features: • DTD metadata management in the RDB • Automatic schema creation • Incremental update propagation • XML query optimization
HTML vs. XML
HTML: <h1>Car</h1> <h2>Make</h2> <p>Ford Mustang <h2>Seats</h2> <p>5 <h2>Top Speed</h2> <p>70 m.p.h
XML: <h1>Car</h1> <make>Ford Mustang</make> <seats>5</seats> <speed units="mph">70</speed>