320 likes | 462 Views
XCube. XML For Data Warehouses By Sven Groot. Data warehouses. Contains data drawn from several databases and external sources Provide a comprehensive view of all aspects of an enterprise Complemented by increased emphasis on powerful analysis tools SQL is inadequate
E N D
XCube XML For Data Warehouses By Sven Groot
Data warehouses • Contains data drawn from several databases and external sources • Provide a comprehensive view of all aspects of an enterprise • Complemented by increased emphasis on powerful analysis tools • SQL is inadequate • OLAP: OnLine Analytic Processing
Data Warehousing External Data Sources Visualisation Extract Clean Transform Load Refresh OLAP Serves Metadata repository Data Mining Data Warehouse Operational Databases
OLAP • Multidimensional data model locid 8 10 10 13 pid 30 20 50 12 25 8 15 11 1 2 3 timeid
OLAP (cont’d) Sales • Multidimensional data as a relation Locations Products
OLAP (cont’d) • Dimension as hierarchies PRODUCT TIME LOCATION year country quarter category state week month city pname date
OLAP (cont’d) • Typical OLAP queries • Find the total sales • Find total sales for each city • Find total sales for each state • Find the top five products ranked by total sales • Possible to drill-down and roll-up on dimensions • Pivoting
eXtensible Markup Language • Contains nodes that may be processing instructions, elements, attributes, CDATA sections or comments. • Must be well-formed • Format can be defined by a DTD or XSD. • Multiple formats in one document using namespaces. • Can be transformed using XSLT <?xml version=“1.0” encoding=“utf-8”?> <!-- Library XML File --> <Library xmlns=“http://www.liacs.nl/~sgroot/library.xsd”> <Book isbn=“0072322063” title=“Database Management Systems”> <Author name=“Raghu Ramakrishnan” /> <Author name=“Johannes Gehrke” /> <Notes>Second Edition</Notes> </Book> </Library>
Data Warehouses Reloaded • Data warehousing occurs across departments all over the globe, and also across companies • External datasources might include WWW and other data warehouses • One flexible format for exchanging data cubes would be useful: XCube
XCube Scenarios • Download
XCube Scenarios (cont’d) • Query
XCube Scenarios (cont’d) • Generating • Conversion of any data into data cube • Using data from a warehouse in data cube
Requirements for online cubes • Support for multidimensional data model. • Support for conceptual distinction between schema, dimension and fact data. • Transportable over the network. • For flexibility and reuse linking and inclusion concepts needed • Extensible to adapt to different data models or new concepts • Easily convertible to and from various sources and formats • Possibly allow OLAP processing to reduce data transfer
XCube formats • XCubeSchema
XCube formats (cont’d) <multidimensionalSchemaversion="0.4" xmlns="http://www.xcube-open.org/V0_4/XCubeSchema.xcsd"> <cubeSchemaid="sale"> <factid="sales"/> <factid="revenue"/> <dimensionid="geography"granularity="branch"/> <dimensionid="product"granularity="article"/> </cubeSchema> <classSchema> <!-- geography --> <classLevelid="branch"> <attributeid="manager"/> <rollUptoLevel="city"/> </classLevel> <classLevelid="city"> <rollUptoLevel="region"/> </classLevel> <!-- ... --> <!-- product --> <classLevelid="article"> <attributeid="articleName"/> <attributeid="brand"/> <rollUptoLevel="productGroup"/> </classLevel> <classLevelid="productGroup"> <rollUptoLevel="productFamily"/> </classLevel> <!-- ... --> </classSchema> </multidimensionalSchema>
XCube formats (cont’d) • XCubeDimension
XCube formats (cont’d) <dimensionDataversion="0.4" xmlns="http://www.xcube-open.org/V0_4/XCubeDimension_base.xcsd"> <units> <entryunitType="currency"unit="EUR"/> </units> <classification> <!-- dimension: geography --> <levelid="country"> <nodeid="Germany"/> <nodeid="Switzerland"/> <nodeid="France"/> <!-- ... --> </level> <levelid="region"> <nodeid="Northern Germany"> <rollUptoNode="Germany"level="country"/> </node> <nodeid="Western Germany"> <rollUptoNode="Germany"level="country"/> </node> <nodeid="Eastern Germany"> <rollUptoNode="Germany"level="country"/> </node> <nodeid="Southern Germnamy"> <rollUptoNode="Germany"level="country"/> </node> <!-- ... --> </level> <!-- ... --> </classification> </dimensionData>
XCube formats (cont’d) • XCubeFact
<cubeFactsversion="0.4"xmlns="http://www.xcube-open.org/V0_4/XCubeFact_base.xcsd"><cubeFactsversion="0.4"xmlns="http://www.xcube-open.org/V0_4/XCubeFact_base.xcsd"> <cubeid="sale"> <cell> <dimensionid="geography"node="branch48"/> <dimensionid="product"node="MA-450"/> <dimensionid="time"node="2003-07-24"/> <factid="sales"value="3"/> <factid="revenue"value="960"/> </cell> <cell> <dimensionid="geography"node="branch75"/> <dimensionid="product"node="MA-450"/> <dimensionid="time"node="2003-07-24"/> <factid="sales"value="2"/> <factid="revenue"value="640"/> </cell> <!-- ... --> </cube> <!-- ... --> </cubeFacts>
XCube extended formats • XCubeText • Adds textual description for nearly every element. • Future version will allow separate files. • Allows different levels of detail (short, medium, long, html)
XCube extended formats (cont’d) • XCubeQuery • Organise interactive dialog between client and server • Meant to facilitate more efficient exchange of data • Consists of seven different query formats
XCubeQuery • List of available cubes • Request:<request> <getCubeSchemaList /></request> • Response:<cubeSchema id=”sale” /><cubeSchema id="purchase“ /><cubeSchema id="stock“ />
XCubeQuery (cont’d) • Getting the schema of a special cube • Request:<request> <getCubeSchema id=”sale”></request> • Response:<cubeSchema id="sale“ xmlns:xs=”http//www.w3.org/2001/XMLSchema”> <fact id="sales"> <defaultAggregate> <aggregation operator="sum"/> <aggregation operator="max"/> <aggregation operator="min"/> </defaultAggregate> </fact> <fact id="revenue"/> <dimension id="geography" granularity="branch"/> <dimension id="product" granularity="article"/> <dimension id="time" granularity="xs:date" stdLevel="true"/></cubeSchema><dataTypes/><unitTypes/>
XCubeQuery (cont’d) • Querying the Classification Schema • Request:<request> <getClassSchema> <dimension id=”time”/> <dimension id=”geography”/> </getClassSchema></request> • Response:
XCubeQuery (cont’d) <classSchemaxmlns:xs="http://www.w3.org/2001/XMLSchema"> <stdTimeClassLevelid="xs:gYearMonth"> <rollUptoLevel="quarter"/> </stdTimeClassLevel> <timeClassLevelid="quarter"timeBase="quarter"> <rollUptoLevel="xs:gYear"stdLevel="true"/> </timeClassLevel> <classLevelid="branch"> <attributeid="manager"/> <rollUptoLevel="city"/> </classLevel> <classLevelid="city"> <rollUptoLevel="region"/> <addKeylevel="region"/> </classLevel> <classLevelid="region"> <rollUptoLevel="country"/> </classLevel> <classLevelid="country"/> </classSchema> <dataTypes> <dataTypename="quarter"> <xs:restrictionbase="xs:gYearMonth"> <xs:patternvalue="[0-9]{4}-0[1-4]"/> </xs:restriction> </dataType> </dataTypes> <unitTypes/>
XCubeQuery (cont’d) • Querying Classification Nodes • Request:<request> <getClassNodes level=”branch”/></request> • Response:<level id="branch"> <node id="branch48"> <rollUp toNode="Frankfurt" level="city"/> <attribute id="manager" value="Meier"/> </node> <node id="branch75"> <rollUp toNode="Frankfurt" level="city"/> <attribute id="manager" value="Bauer"/> </node> <!-- ... --></level>
XCube extended formats (cont’d) • XCubeFunction • Still under development • Query XCube server about it’s functionality
XCube formats summary XCubeSchema XCubeDimension XCubeFact XCubeText XCubeQuery XCubeFunction
Related work • Common Warehouse Metamodel • MetaCube-X • XML for Analysis
Where from here • Basis for more complex and efficient infrastructure. • Combination with XML Web Services • Evolution of XCubeText • Create new data warehouses with XCube standards.
References • Wolfgang Hümmer, Andreas Bauer & Gunnar Hard; XCube – XML For Data Warehouses; DOLAP’03, November 7, 2003. • http://www.xcube-open.org • Raghu Ramakrishnan & Johannes Gehrke; Database Management Systems, second edition; McGraw-Hill, 2000 • T. Bray, J. Paoli, C.M. Sperberg-McQueen; E. Maler; Extensible Markup Language (XML) 1.0 (Second Edition) W3C Recommendation 6 October 2000http://www.w3.org/TR/REC-xml