XML and Database
Hong Ki-Hyung (홍기형), Sungshin Women's University (성신여자대학교)
Contents
• Database, Web, and XML
• XML Database Systems
• Data Models
• Query Language and Processing
• Storage and Index
• Other issues
Database and Web, before XML
• DB: a back-end server for Web applications
  • CGI
  • JDBC
  • Embedded SQL
• Web
  • Information retrieval
  • Target to manage (Web DB)
Figure: three-tier Web application. Thin client: scripts, HTML. Middle tier: Web server (template engine, scripts, HTML templates) and application server (application code, mapping code). Back end.
XML
• eXtensible Markup Language
• An emerging standard for data representation and exchange on the Internet
  • See the XML catalog, http://www.xml.org
• Separates content from presentation
  • Easy to provide multiple views of the same data
• Easily parsed and self-describing
XML
• Extensible: a dynamic data model
• Simple: human-readable, easy to use
• Flexible: for handling complex data
• Portable: for cross-platform data exchange
• Standard: easy to integrate, widely adopted
Comparing HTML and XML Documents
XML Is All About Data
HTML example:
    <heading1> Invoice </heading1>
    <bold>To: Joe Bloggs <P>
    From: J. Abrams <P>
    Date: 2/1/1999 <P>
    Amount: $100 <P>
    Tax: 21% <P>
    Total $121 </bold>
• Data mixed with presentation
XML Is All About Data
XML example:
    <Invoice>
      <Customer> Joe Bloggs </Customer>
      <From> J. Abrams </From>
      <Date year='1999' month='2' day='1' />
      <Amount unit='Dollars'> 100 </Amount>
      <TaxRate> 21 </TaxRate>
      <Total currency="Dollars">121 </Total>
    </Invoice>
• Human readable
• Comes with tags
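Because the tags name each piece of data, a program can pick out the fields directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `read_invoice` and the returned dictionary keys are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The invoice from the slide, with straight quotes so that it parses.
INVOICE = """
<Invoice>
  <Customer> Joe Bloggs </Customer>
  <From> J. Abrams </From>
  <Date year='1999' month='2' day='1' />
  <Amount unit='Dollars'> 100 </Amount>
  <TaxRate> 21 </TaxRate>
  <Total currency="Dollars">121 </Total>
</Invoice>
"""

def read_invoice(xml_text):
    """Extract typed fields from the invoice (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    date = root.find("Date").attrib
    amount = float(root.findtext("Amount"))
    tax_rate = float(root.findtext("TaxRate"))
    return {
        "customer": root.findtext("Customer").strip(),
        "date": (int(date["year"]), int(date["month"]), int(date["day"])),
        "amount": amount,
        # Recompute the total from amount and tax rate (in percent).
        "computed_total": round(amount * (1 + tax_rate / 100), 2),
        "stated_total": float(root.findtext("Total")),
    }

invoice = read_invoice(INVOICE)
```

No layout information has to be stripped away first, which is the contrast with the HTML version of the same invoice.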
XML Is All About Data
XML example:
    <Invoice>
      <Customer>
        <Name> Joe Bloggs </Name>
        <Address> 25 Mall Road </Address>
      </Customer>
      <From> J. Abrams </From>
      <Date year='1999' month='2' day='1' />
      <Amount unit='Dollars'> 100 </Amount>
      <TaxRate> 21 </TaxRate>
      <Total unit="Dollars">121 </Total>
    </Invoice>
• Extensible
  • <Name> Joe Bloggs </Name>
  • <Address> 25 Mall Road </Address>
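One way to read "extensible" is that a consumer which only looks up the elements it needs keeps working when new elements appear. The sketch below illustrates this with two shortened versions of the invoice; `V1`, `V2`, `amount_of`, and `customer_name` are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Shortened invoices: V1 has a plain-text customer, V2 a structured one.
V1 = ("<Invoice><Customer>Joe Bloggs</Customer>"
      "<Amount unit='Dollars'>100</Amount></Invoice>")
V2 = ("<Invoice><Customer><Name>Joe Bloggs</Name>"
      "<Address>25 Mall Road</Address></Customer>"
      "<Amount unit='Dollars'>100</Amount></Invoice>")

def amount_of(xml_text):
    """Unaffected by the extension: it only looks at <Amount>."""
    return float(ET.fromstring(xml_text).findtext("Amount"))

def customer_name(xml_text):
    """Accept both the old text-only and the new structured <Customer>."""
    customer = ET.fromstring(xml_text).find("Customer")
    name = customer.find("Name")
    return (name.text if name is not None else customer.text).strip()
```

Code that reads the customer does need a small change, so extensibility here means tolerating additions, not that every consumer is untouched.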
XML Family of Standards
• XML
• DOM (Document Object Model)
• XML Namespaces
• XSL (style language)
• XQL (XML Query Language, an extension of XSL patterns)
• XML Data / DCD / Schema
• XUL (updates, future)
• …many more
Building Web Applications with XML
• Quickly react to changes
• Lower maintenance costs
• Does not depend on a single vendor
Figure: three-tier Web application with XML. Thin client: scripts, HTML. Middle tier: Web server / app server with XSL, application code, and DOM (standard API and template language). XML server. Back end: XML.
Legacy DBs for XML Applications
• XML as a new data-exchange format for legacy DB applications
• DB2XML
  • Transforms the results of database queries, or complete databases, into XML documents, or into HTML documents using XSLT stylesheets
  • DB2XML can be used:
    • as a standalone tool (with GUI or command line)
    • as a servlet to dynamically generate XML documents
    • through the DB2XML API
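The general idea of turning a query result into an XML document can be sketched in a few lines. This is not DB2XML's API; it is an illustrative sketch using Python's standard `sqlite3` and `xml.etree.ElementTree`, and the function `query_to_xml`, the tag names `result` and `row`, and the sample table are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_to_xml(conn, sql, root_tag="result", row_tag="row"):
    """Wrap each result row in <row>, with one child element per column.

    Assumes the column names are valid XML element names.
    """
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for column, value in zip(columns, record):
            cell = ET.SubElement(row, column)
            cell.text = "" if value is None else str(value)
    return root

# Hypothetical legacy table, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (customer TEXT, amount INTEGER)")
conn.execute("INSERT INTO invoice VALUES ('Joe Bloggs', 100)")

xml_text = ET.tostring(
    query_to_xml(conn, "SELECT customer, amount FROM invoice"),
    encoding="unicode",
)
```

The resulting document could then be passed to an XSLT processor to produce HTML, which is the second output path the slide mentions.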
XML Database Systems
• 3 approaches
  • Build special-purpose systems
    • Lore, Strudel
    • Best performance for XML data
  • Use object-oriented database systems
    • eXcelon, Monet, Ozone
    • Object-oriented modeling
  • Use relational database systems
    • Oracle, Microsoft
    • Mature, large market
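For the relational approach, one simple mapping stores every parent-child link of the document as a row in a single "edge" table. The sketch below is only one of several possible mappings and is not how any particular vendor stores XML; the table layout, the function `shred`, and the sample document are assumptions, and attributes are ignored to keep it short.

```python
import itertools
import sqlite3
import xml.etree.ElementTree as ET

def shred(conn, xml_text):
    """Store each element as one edge row: (source, ordinal, label, target, value)."""
    conn.execute(
        "CREATE TABLE edge (source INTEGER, ordinal INTEGER, "
        "label TEXT, target INTEGER, value TEXT)"
    )
    counter = itertools.count(1)

    def visit(element, parent_id, ordinal):
        node_id = next(counter)
        text = (element.text or "").strip() or None
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (parent_id, ordinal, element.tag, node_id, text),
        )
        # The ordinal column preserves the order of sibling elements.
        for position, child in enumerate(element, start=1):
            visit(child, node_id, position)

    visit(ET.fromstring(xml_text), 0, 1)  # 0 stands for "no parent"

INVOICE = ("<Invoice><Customer><Name>Joe Bloggs</Name>"
           "<Address>25 Mall Road</Address></Customer>"
           "<Amount unit='Dollars'>100</Amount></Invoice>")

conn = sqlite3.connect(":memory:")
shred(conn, INVOICE)

# The path Customer/Name becomes a self-join on the edge table.
customer_names = [row[0] for row in conn.execute(
    "SELECT child.value FROM edge AS parent "
    "JOIN edge AS child ON child.source = parent.target "
    "WHERE parent.label = 'Customer' AND child.label = 'Name'"
)]
```

Each step of a path expression costs one join in this layout, which hints at why the choice of mapping scheme matters for performance.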
Lore
Figure: Lore system architecture.
• Applications: textual interface, HTML GUI, API
• Query processor (queries): parsing, preprocessing (Lorel2OQL), query plan generator, query optimizer
• Data engine (also receives non-query requests): object manager, query operators, utilities, external data manager, physical storage
• External, read-only data sources
eXcelon
Oracle 8i
Data Models for XML
• XML is not a data model
• Structure of an XML document
  • an ordered list of elements
  • each element
    • may have a set of attributes
    • may have (sub)elements (nested elements)
• Structured data and full text mixed together
• DOM defines how to translate an XML document into a data structure for processing
• Need a true data model for XML data
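To make the DOM bullet concrete, the sketch below parses a small document with Python's standard `xml.dom.minidom` and walks the resulting node tree. The helper `outline` and the sample document are illustrative assumptions.

```python
from xml.dom import minidom

def outline(node, depth=0):
    """Return an indented listing of the elements, attributes, and text under node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = {name: value for name, value in node.attributes.items()}
        suffix = " " + str(attrs) if attrs else ""
        lines.append("  " * depth + node.tagName + suffix)
        # childNodes is ordered, matching the element order in the document.
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data.strip()))
    return lines

doc = minidom.parseString(
    "<Invoice><Date year='1999'/><Amount unit='Dollars'>100</Amount></Invoice>"
)
tree = outline(doc.documentElement)
```

The DOM gives a processing structure (nodes, attributes, ordered children) but no query semantics or schema, which is why the slide asks for a true data model.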
OEM: a Semi-structured Data Model
• Object Exchange Model (Lore)
• Semi-structured data
  • Self-describing structure, no fixed schema
  • The structure changes rapidly and unpredictably
• Labeled directed graph
  • Node: object (OID) or atomic value (leaf)
  • Labeled edge: object-subobject relationship
OEM, an example
    <DBGroup>
      <Member Name="유" Advisor="m1">
        <Age>28</Age>
      </Member>
      <Member ID="m1" Project="p1">
        <Name>박</Name>
        <Advisor>홍</Advisor>
      </Member>
      <Project ID="p1" Member="m1">
        <Title>XML DB</Title>
      </Project>
    </DBGroup>
Figure: OEM graph for the document above. The root object DBGroup (&1) has two Member edges and one Project edge leading to objects &2 to &4, each annotated with the attributes of its element. Age, Name, Advisor, and Title edges lead to objects &5 to &8, and each of these has a Text edge to an atomic value (&9 to &12) such as "28", "박", and "홍".
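The translation from the XML document to a labeled directed graph can be sketched as follows. This is an illustrative sketch, not Lore's implementation: the function `xml_to_oem` and its return shape are assumptions, and OIDs are assigned in depth-first order, so the numbers differ from those in the figure.

```python
import itertools
import xml.etree.ElementTree as ET

def xml_to_oem(xml_text):
    """Return (root_oid, edges, attributes, atoms) for an XML document.

    edges:      list of (parent_oid, label, child_oid)
    attributes: oid -> dict of the element's XML attributes
    atoms:      oid -> atomic (text) value of a leaf object
    """
    counter = itertools.count(1)
    edges, attributes, atoms = [], {}, {}

    def visit(element):
        oid = next(counter)
        if element.attrib:
            attributes[oid] = dict(element.attrib)
        text = (element.text or "").strip()
        if text:
            # As in the figure, text hangs off the object via a "Text" edge.
            leaf = next(counter)
            atoms[leaf] = text
            edges.append((oid, "Text", leaf))
        for child in element:
            edges.append((oid, child.tag, visit(child)))
        return oid

    return visit(ET.fromstring(xml_text)), edges, attributes, atoms

DBGROUP = """
<DBGroup>
  <Member Name="유" Advisor="m1"><Age>28</Age></Member>
  <Member ID="m1" Project="p1"><Name>박</Name><Advisor>홍</Advisor></Member>
  <Project ID="p1" Member="m1"><Title>XML DB</Title></Project>
</DBGroup>
"""

root_oid, edges, attributes, atoms = xml_to_oem(DBGROUP)
```

References such as Advisor="m1" stay plain attribute values here; resolving them into graph edges between objects would be a further step.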
Issues in Data Modeling
• How to view XML information simultaneously as both
  • a set of documents
  • a single large database
• No loss of information in XML
  • How to represent the ordering of elements
  • external/internal entities, processing instructions
XML DB Design
• When should attributes be used, and when subelements?
• Is a 1-to-1 relationship best represented using element nesting or IDREFs?
• How to translate the conceptual model (OEM?) into an XML encoding?
• Need to identify the relationship between DTDs and traditional DB schemas
Query Languages for XML DB
• Requirements
  • Path expressions
  • Queries over
    • structured and semistructured data
    • full text
    • the mixture of data elements and full text
• W3C, Query Languages for the Web, 1998
• QL for semistructured data
  • Lorel, UnQL
• XQL, XML-QL
XML-QL
• Syntax
    select <variable-list> where <XML-pattern>+
• Example
    select $n, $h
    where <person>
            <age=$a>
            <name> $n </name>
            <address> 서울 성북구 동선동3가 </address>
            [<hobby> $h </hobby>]
          </person>,
          $a > 18
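What the example query asks for can be approximated in ordinary code: for every person older than 18 at the given address, return the name and, if present, the hobby. The sketch below is not an XML-QL engine. It assumes that age is stored as an `<age>` child element and that the square brackets mark `<hobby>` as optional; the sample data and the function `select_name_hobby` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the address string comes from the slide.
PEOPLE = """
<people>
  <person><age>25</age><name>Kim</name>
    <address>서울 성북구 동선동3가</address><hobby>chess</hobby></person>
  <person><age>30</age><name>Lee</name>
    <address>서울 성북구 동선동3가</address></person>
  <person><age>17</age><name>Choi</name>
    <address>서울 성북구 동선동3가</address></person>
  <person><age>40</age><name>Park</name>
    <address>Busan</address></person>
</people>
"""

def select_name_hobby(xml_text, address, older_than=18):
    """Return (name, hobby) pairs; hobby is None when the element is absent."""
    results = []
    for person in ET.fromstring(xml_text).iter("person"):
        # Pattern match: the address must equal the constant in the query.
        if (person.findtext("address") or "").strip() != address:
            continue
        # Condition: $a > 18.
        age_text = person.findtext("age")
        if age_text is None or int(age_text) <= older_than:
            continue
        results.append((person.findtext("name").strip(), person.findtext("hobby")))
    return results

matches = select_name_hobby(PEOPLE, "서울 성북구 동선동3가")
```

A declarative query states only the pattern and the condition; the loop and the null handling above are what a query processor would have to supply.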
Issues in Query Processing
• The true requirements for an XML query language are not yet known
• Need to review all facets of traditional query processing
• Need to develop a new IR model
  • proximity in XML documents
  • similarity measure between XML elements
• How to integrate
  • the traditional (DB) query processing model and
  • the information retrieval model
• Optimization schemes
  • for not well-structured XML data
  • for queries mixed with full text retrieval and structured/semistructured search
Storage Structure and Indexing
• Clustering schemes for storing XML data
• New index types
  • for quickly finding certain elements, attributes, and more complex structural patterns
  • element orderings
• Determine the level of parsing for storing XML documents
• Based on the analysis of encoding patterns
  • merging identical text strings (sub-patterns) by using appropriate IDREFs
  • compression based on regular patterns
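As one illustration of an index for finding elements and attributes quickly, the sketch below builds an in-memory path index: a map from each root-to-element path to the elements reached by it, in document order. It is only a toy sketch; `build_path_index` and the `/@name` key convention for attributes are assumptions.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

def build_path_index(root):
    """Map each path such as /DBGroup/Member to its elements, in document order."""
    index = defaultdict(list)

    def visit(element, parent_path):
        path = parent_path + "/" + element.tag
        index[path].append(element)
        # Index attributes too, under path/@name.
        for name in element.attrib:
            index[path + "/@" + name].append(element)
        for child in element:
            visit(child, path)

    visit(root, "")
    return index

root = ET.fromstring(
    "<DBGroup>"
    "<Member Name='Yu' Advisor='m1'><Age>28</Age></Member>"
    "<Member ID='m1' Project='p1'><Name>Park</Name></Member>"
    "<Project ID='p1' Member='m1'><Title>XML DB</Title></Project>"
    "</DBGroup>"
)
index = build_path_index(root)
```

A lookup such as `index["/DBGroup/Member/@ID"]` then avoids scanning the whole document. A real system would need a persistent structure and support for more complex structural patterns.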
Issues in Various DB Features
• Full view support for XML
  • both virtual and materialized views
  • incremental maintenance
  • XSL as a view definition language
• Data integrity issues
  • What are constraints on XML data?
    • key, referential, domain
  • How to represent the constraints
  • How to check them when changes occur
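Key and referential constraints can be illustrated with ID-style attributes. The sketch below checks that every `ID` value is unique and that every reference points at an existing `ID`. Without a DTD the checker cannot know which attributes are references, so they are passed in explicitly; `check_constraints` and its parameters are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def check_constraints(root, id_attr="ID", ref_attrs=("Advisor", "Project", "Member")):
    """Return a list of violation messages (empty if the document is consistent)."""
    errors = []
    seen = set()
    # Key constraint: each ID value may appear only once.
    for element in root.iter():
        key = element.get(id_attr)
        if key is None:
            continue
        if key in seen:
            errors.append(f"duplicate {id_attr} '{key}'")
        seen.add(key)
    # Referential constraint: each reference must match some ID.
    for element in root.iter():
        for name in ref_attrs:
            target = element.get(name)
            if target is not None and target not in seen:
                errors.append(f"<{element.tag}> {name}='{target}' has no matching {id_attr}")
    return errors

GOOD = ET.fromstring(
    "<DBGroup><Member Advisor='m1'/><Member ID='m1' Project='p1'/>"
    "<Project ID='p1' Member='m1'/></DBGroup>"
)
BAD = ET.fromstring(
    "<DBGroup><Member Advisor='m9'/><Member ID='m1'/><Member ID='m1'/></DBGroup>"
)
good_errors = check_constraints(GOOD)
bad_errors = check_constraints(BAD)
```

This version rescans the whole document; checking efficiently when changes occur, as the slide asks, would call for incremental checks on only the updated elements.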
• Triggers
  • active database capabilities in XML
• Transaction control over XML databases
• Performance evaluation
  • need to make an appropriate benchmark for XML data
    • XML data set
    • query types
    • mix of queries and updates
References
• Research issues
  • Data Management for XML: Research Directions, http://www-db.stanford.edu/~widom/xml-whitepaper.html
  • More on Data Management for XML, http://www.cs.washington.edu/homes/alon/widom-response.html
• Storing XML data into RDBMSs
  • A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database, ercim.inria. publications/RR-3680
• XML Database Systems
  • http://www.xmlsoftware.com/database/