
The magic is in the glue: XQuery + Cloud. Daniela Florescu, Oracle


Presentation Transcript


  1. The magic is in the glue: XQuery + Cloud. Daniela Florescu, Oracle

  2. My personal history • PhD in object-oriented query processing/optimization • Loved the database theory and practice (relational, object-oriented, semi-structured) • Got really interested in it, and thought it was important… • ….then I joined Oracle.

  3. … after 4 years in Oracle • Applications are the really important issue • How to develop, deploy, maintain, evolve, customize • Databases are a side effect • Customers are educated to think they need them • DB are only useful as part of a general application architecture • Customer is the king • If they don’t make $$$, you don’t either • Customers are in pain building apps right now

  4. Agenda • Current pain in building apps • What can XQuery do for customers ? • What can the Cloud do for customers ? • How do we put them together ? • How do XQuery+Cloud solve the problem ? • Some open research problems

  5. Imagine I am a customer, I need to build a new app. • How much does it cost • Cost of developing the app (salaries) • Cost of deploying the app • Hardware, software licenses, maintenance • Loss of income because of mis-provisioning • Do I have to pay up front? • Is the cost proportional with the income ?

  6. Other questions ? • How fast can I deliver the app • Quicker on the market than my competitors ? • How good the application is • More customers for the app. => more income • Acceptable operational characteristics ? • Can I adapt if something changes ? • Operational characteristics • Functionality • Can I customize the same app in a different vertical / different set of customers ? • Is there a risk in the technology ?

  7. Customer concerns • Cost • Time to market • Flexibility • Customizability • Sustainability • Risk • Often a tradeoff

  8. Different classes of customers • Enterprise (e.g. Bank of America) • Cost • Sustainability • Risk • Customizability • Flexibility • Time to market • Government agency (e.g. DoD) • Sustainability • Cost • Time to market (?) • Flexibility (?) • Customizability • Risk • Consumer (e.g. Craigslist) • Time to market • Cost • Flexibility • Customizability • Sustainability • Risk

  9. Typical enterprise app stack • Communication (XML, REST, WS) • Application logic (Java, C#) • Database (SQL) • Vendors: Oracle, IBM, SAP, Microsoft

  10. Cost ? $$$$! • Cost of developing the app • Cost of deploying the app (hardware, software licenses, maintenance) • Loss of income because of mis-provisioning • Do I have to pay up front? • Is the cost proportional with the income ? • Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)

  11. Time to market ? Years! • How fast can I deliver the app • Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)

  12. Flexibility ? Customizability? Hardly any ! • Can I adapt if something changes ? • Operational characteristics • Functionality • Can I customise it to a different vertical? • Oracle experience: for every $1M for Oracle app licenses, customers pay $2M to customize it. (SAP experience even worse :-) • Stack: Communication (XML, REST, WS); Application logic (Java, C#); Database (SQL)

  13. Two major evil points • Multi layer infrastructure • Schemas a pre-requisite • New apps: • Even the Oracle apps ! • New platforms: • Salesforce, GoogleApps, Facebook • Diagram labels: Communication; Application Logic (schema-less); put / get; Persistent (key, value) store (schema-less) • XQuery a possible solution.

  14. Another evil point • Lack of cost elasticity • Cost proportional with income • Lack of elasticity in performance • Response time independent of # clients The Cloud is the beginning of a solution.

  15. Agenda • Current pain in building apps • What can XQuery do for customers ? • What can the Cloud do for customers ? • How do we put them together ? • How do XQuery+Cloud solve the problem ? • Some open research problems

  16. Why XML ? • Covers all spectrum from structured data to textual information • Schema independent • Platform independent • Continuity with the basic Internet infrastructure (URI, HTML, HTTP)

  17. What is XQuery ? • A programming language for XML processing • Functional in style • Turing complete • Contains: • Navigation • Declarative query and aggregation (FLWOR) • Search (full text) • Declarative updates • Transforms • Scripting • Streaming and windowing • Error handling and second order expressions • Packaging (modules) • Has limitations (further)
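The slide lists error handling and second order (higher-order) expressions without showing them. A minimal sketch in standard XQuery 3.1, assuming a conforming processor; the function name local:safe-divide and the sample values are made up for illustration:

```xquery
xquery version "3.1";

(: Declared explicitly because not every processor predeclares the err prefix. :)
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical helper: integer division that reports failure as a string. :)
declare function local:safe-divide($x as xs:integer, $y as xs:integer) as xs:string {
  try {
    string($x idiv $y)
  } catch err:FOAR0001 {
    (: FOAR0001 is the standard "division by zero" error code. :)
    "division by zero"
  }
};

(: fn:for-each takes a function item as its second argument. :)
for-each((1, 0, 4), function($d) { local:safe-divide(8, $d) })
```

The query should return three strings, with the middle one produced by the catch branch rather than aborting the whole query.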

  18. History and status • Standard of the W3C • Good and bad • 10 years old • 40 existing implementations • Implemented in major databases • Best implementations in open source • If you have XML data, it is hard to avoid.

  19. Navigation • fn:doc("catalog.xml")/items/item • fn:doc("catalog.xml")/items//item • fn:doc("catalog.xml")/items//* • fn:doc("catalog.xml")/items/@item • fn:doc("parts.xml")/parts/part[partno = $i/partno] • $x/items/item
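To see how these path expressions differ, the sketch below runs them against a small inline document instead of catalog.xml. The element names and values are assumptions for illustration only; the slides do not show the actual file.

```xquery
xquery version "3.1";

(: Hypothetical stand-in for fn:doc("catalog.xml"); the real file is not shown in the slides. :)
let $catalog := document {
  <items>
    <item><partno>1</partno><price>10</price></item>
    <bundle>
      <item><partno>2</partno><price>25</price></item>
    </bundle>
  </items>
}
return (
  count($catalog/items/item),    (: direct children of items only :)
  count($catalog/items//item),   (: item elements at any depth below items :)
  count($catalog/items//*),      (: every descendant element of items :)
  count($catalog/items/@item)    (: attributes named item on the items element :)
)
```

The child step (/) sees only the top-level item, while the descendant step (//) also reaches the item nested inside bundle. The attribute step finds nothing here because the sample document has no attributes.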

  20. FLWOR for $i in fn:doc("catalog.xml")/items/item, $p in fn:doc("parts.xml")/parts/part[partno = $i/partno], $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno] order by $p/description, $s/suppname return $s • Also: group by, having, outer joins, etc.
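The grouping mentioned on this slide can be sketched with the group by clause standardized in XQuery 3.0. The inline data and the output element are assumptions for illustration; a where clause placed after group by plays the role of SQL's HAVING.

```xquery
xquery version "3.1";

(: Hypothetical inline data standing in for catalog.xml. :)
let $catalog := document {
  <items>
    <item><suppno>7</suppno><price>10</price></item>
    <item><suppno>7</suppno><price>30</price></item>
    <item><suppno>9</suppno><price>25</price></item>
  </items>
}
for $i in $catalog/items/item
let $supp := string($i/suppno)
group by $supp                (: after this clause, $i holds all items of one supplier :)
where count($i) ge 2          (: filter on the group, like HAVING :)
order by $supp
return
  <supplier suppno="{ $supp }" items="{ count($i) }" total="{ sum($i/price) }"/>
```

With this sample data only supplier 7 passes the filter, since supplier 9 has a single item.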

  21. Creation of new information <descriptive-catalog> { for $i in fn:doc("catalog.xml")/items/item, $p in fn:doc("parts.xml")/parts/part[partno = $i/partno], $s in fn:doc("suppliers.xml")/suppliers /supplier[suppno = $i/suppno] order by $p/description, $s/suppname return <item> { $p/description, $s/suppname, $i/price } </item> } </descriptive-catalog>

  22. Textual search • $doc ftcontains ( ( "mustang" ftand ({("great", "excellent")} any word occurs at least 2 times) ) window 11 words ftand ftnot "rust" ) same paragraph

  23. Declarative updates for $p in /inventory/part let $deltap := $changes/part[partno eq $p/partno] return replace value of node $p/quantity with $p/quantity + $deltap/quantity

  24. Transforms let $oldx := /a/b/x return copy $newx := $oldx modify (rename node $newx as "newx", replace value of node $newx with $newx * 2) return ($oldx, $newx)

  25. Streams and windowing for sliding window $w in (2, 4, 6, 8, 10, 12, 14) start at $s when fn:true() only end at $e when $e - $s eq 2 return <window>{ $w }</window> • Result of the above query: <window>2 4 6</window> <window>4 6 8</window> <window>6 8 10</window> <window>8 10 12</window> <window>10 12 14</window>
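For contrast with the sliding window above, the same query with a tumbling window produces non-overlapping windows. This is a sketch in standard XQuery 3.0 window syntax, assuming a processor that implements the window clause:

```xquery
xquery version "3.0";

(: Same input and conditions as the slide, but windows may not overlap. :)
for tumbling window $w in (2, 4, 6, 8, 10, 12, 14)
    start at $s when fn:true()
    only end at $e when $e - $s eq 2
return <window>{ $w }</window>
```

A tumbling window only starts once the previous one has ended, so this yields `<window>2 4 6</window>` and `<window>8 10 12</window>`. The trailing 14 starts a window that never meets its end condition, and `only end` discards such incomplete windows.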

  26. Scripting expressions block { declare $a as xs:integer := 0; declare $b as xs:integer := 1; declare $c as xs:integer := $a + $b; declare $fibseq as xs:integer* := ($a, $b); while ($c < 100) { set $fibseq := ($fibseq, $c); set $a := $b; set $b := $c; set $c := $a + $b; }; $fibseq; }
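The block syntax above comes from the XQuery scripting extension, which many processors do not implement. Assuming a processor that supports only standard XQuery, the same Fibonacci sequence can be sketched with a recursive function; the name local:fib-below is made up for illustration.

```xquery
xquery version "3.1";

(: Hypothetical helper: emits $a, then recurses with the next pair,
   stopping once the current value reaches $limit. :)
declare function local:fib-below(
  $limit as xs:integer,
  $a as xs:integer,
  $b as xs:integer
) as xs:integer* {
  if ($a ge $limit) then ()
  else ($a, local:fib-below($limit, $b, $a + $b))
};

(: Fibonacci numbers below 100, starting from 0 and 1 as on the slide. :)
local:fib-below(100, 0, 1)
```

The recursion replaces the mutable variables and the while loop: each call carries the state ($a, $b) that the scripting version updates with set.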

  27. Where can it be used in today’s architectures? • Databases • Middle tiers • Information dispatch • Transformation • Data integration • Browsers (see XQIB demo, WWW’09 paper) • Mobile devices (XQuery on iPhone anyone ?)

  28. XQuery’s real potential • Standalone programming language for information intensive applications • Can build extremely rich applications • Diagram labels: XML; XML; Application Logic (XQuery); XML

  29. Why XQuery ? • Because of XML • Schema independent • Continuity with basic Internet infrastructure • Continuity structured data <--> textual information • XQuery’s own advantages • Declarative • Single layer code • Open source friendly • Extra Goodies • Opportunity to rethink ACID transactions • Unique opportunities for introspection • Code and data migration • Sidebar: Cost, Time to market, Flexibility, Customizability, Sustainability, Risk

  30. Declarativity • Small number of lines of code • Development cost • Time to market • # bugs • Easy to optimize automatically • Easy to parallelize automatically • Especially important in the cloud • Easier to achieve elasticity in performance • Easier to generate automatically • Important for smart/non-developers UIs

  31. Declarativity, negative side • Fewer developers capable of writing such code • Easy to write, harder to read • Tools harder to make (e.g. debuggers) • Performance can be unstable • Despite that, in the history of CS we evolve in the direction of declarativity • Assembly, C, C++, Java, Haskell • Cobol, SQL

  32. Rethink transactions and data consistency • XQuery silent as ACID transactions go • On purpose ! • Are ACID transactions really needed ? • Are they really enforced in Web apps ? • No. • Open research field • Interaction of programming languages with new transactional models and new data consistency models

  33. Sigmod’08 • Data consistency is something to optimize, not an absolute requirement • Data consistency models [Tanenbaum] • Shared-Disk (Naïve approach) • No concurrency control at all • Eventual Consistency (Basic Protocol) • Updates become visible any time and will persist • No lost update on page level • Atomicity • All or no updates of a transaction become visible • Monotonic reads, Read your writes, Monotonic writes, ... • Strong Consistency • database-style consistency (ACID) via OCC • Data consistency a la carte

  34. Introspection opportunities • Closed world • Everything is (or will be) XML • Data, schemas, code, PULs, metadata, configs, runtime information • Unique opportunity to: • introspect at runtime all of them • reason about them • change them dynamically (not only data, but schemas, code and configuration) • Open research field: • Consequences on programming

  35. Why NOT XQuery • XML is complicated • XML Schema is hard/impossible to understand • XQuery is complicated • XQuery is incomplete (maybe research opport.?) • Missing a standard persistent data model • Missing DDL functionality (indexes, integrity constraints) • Missing basic functionalities (e.g. eval, function overloading) • Missing basic data modeling functionality (n:m relationships) • XQuery lacks a standard environment (e.g. J2EE) (maybe research opport.?) • No tools (debuggers, profilers) (maybe research opport.?) • Performance is not clear yet (certainly research opport !) • There are few XQuery developers (teaching opport.)

  36. Agenda • Current pain in building apps • What can XQuery do for customers ? • What can the Cloud do for customers ? • How do we put them together ? • How do XQuery+Cloud solve the problem ? • Some open research problems

  37. What is Cloud Computing ? • The „rental cars“ paradigm for computing • Commoditization of (certain aspects of ) Computing • CPU, storage, and network • Goal 1: Reduction of Cost • principle: fine-grained renting of resources • „pay as you go“ (elasticity of cost) • Goal 2: Simplification of Management • potentially infinite/unbreakable computing resources • potentially no administration • Goal 3: Elasticity of performance • Same resp time independently of workload • Note: does not work yet for DB or apps

  38. Case Study: Amazon AWS • EC2 : scalable virtual private servers using Xen. • S3 : WS based storage for applications • SQS : hosted message queue for web applications • SimpleDB : the core functionality of a database • Hadoop based functionality • Similar providers: IBM Blue Cloud, Microsoft Azure, (GoogleApp engine)

  39. The limits of the (Amazon) Cloud • Cloud Computing a great starting point • Unfortunately, only a fraction of the stack • Stack: Customization, Training, ...; Application; Application Server; DBMS; Hardware

  40. Making use of the Cloud • Solution 1 (conservative) • Take an existing application (Java+SQL, etc) and try to make it run on the cloud (e.g. make Oracle run on AWS) • Solution 2 (reactionary) • Create a fresh new infrastructure, specially designed for Web app requirements, to be deployed in the cloud Risk Benefit

  41. Solution 1 (conservative) • Take a traditional DBMS (e.g., Oracle, MySQL, ...) • Install it on an EC2 instance • Use S3 or EBS as a persistent store • Advantages • Traditional databases are available • Proven to work well; many tools • People trained and confident with them • Disadvantages • Traditional DBMS solve the wrong problem anyway (e.g. focus on consistency) • Traditional DBMS make the wrong assumptions (DB optimizers fail on virtualized hardware)

  42. Solution 2 (reactionary) • Rethink the whole system architecture • Do NOT use a traditional DBMS and app server • Create new breed of application server (with DB) • Run application server on n EC2 instances • Use S3 + distributed consistency protocols • Advantages and Disadvantages • Requires new breed of (immature) systems + tools • Solves the right problem and gets it right • Examples: • GoogleApps (Python in the cloud) • Sausalito (www.28msec.com) (XQuery in the cloud)

  43. Agenda • Current pain in building apps • What can XQuery do for customers ? • What can the Cloud do for customers ? • How do we put them together ? • How do XQuery+Cloud solve the problem ? • Some open research problems

  44. XQuery + AWS Cloud • Cookbook: • Take an existing XQuery processor • Partition the XML data on S3 • Map REST calls to XQuery programs • Run the XQuery programs on EC2 • Use SQS for (asynchronous) updates • Voila. • The magic is in the glue (XQuery proc. + AWS ) • Application Server + Web Server + Database • integrated XQuery based application stack for Web-based apps • fully SOA enabled • all pre-configured and lean (ZERO admin)
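The "map REST calls to XQuery programs" step of the cookbook can be pictured as a routing table from request paths to XQuery functions. This is only a sketch in standard XQuery 3.1, not the API of any real product: the routes, the parameter names, local:dispatch and the in-memory catalog (standing in for XML partitioned on S3) are all assumptions.

```xquery
xquery version "3.1";

(: Hypothetical in-memory stand-in for XML data that would live on S3. :)
declare variable $local:catalog := document {
  <items>
    <item><partno>1</partno><price>10</price></item>
    <item><partno>2</partno><price>25</price></item>
  </items>
};

(: Hypothetical routing table: REST path -> XQuery function. :)
declare variable $local:routes as map(xs:string, function(*)) := map {
  "/items": function($params as map(*)) {
    $local:catalog/items/item
  },
  "/item": function($params as map(*)) {
    $local:catalog/items/item[partno = $params("partno")]
  }
};

(: Looks up the handler for a path and applies it to the request parameters. :)
declare function local:dispatch($path as xs:string, $params as map(*)) as item()* {
  let $handler := $local:routes($path)
  return
    if (exists($handler)) then $handler($params)
    else <error status="404">{ $path }</error>
};

(: Simulates a request such as GET /item?partno=2. :)
local:dispatch("/item", map { "partno": "2" })
```

In a deployed system the web server layer would build the parameter map from the HTTP request and the handlers would read from persistent storage. Here both are simulated so the query runs on its own.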

  45. XQuery in the Cloud (connected)

  46. Customer concerns • Cost • Time to market • Flexibility • Customizability • Sustainability

  47. XQuery in the Cloud (no Server)

  48. XQuery in the Cloud (offline)

  49. Demo at www.28msec.com ! • Look at www.programmableweb.com for use cases ( consumer and enterprise mashups)

  50. Competitors: Internet • Web 2.0 Development Frameworks • E.g., Ruby on Rails, PHP / LAMP, ... • Deployment in the cloud still problematic • Google AppEngine, Facebook Apps • Proprietary programming model (Python-based) • Limited functionality • Vendor lock-in, privacy issues • Oracle on AWS, do-it-yourself on AWS • limited functionality and/or scalability
