Middleware Diarrhea and Other Ailments

Presentation Transcript


  1. Middleware Diarrhea and Other Ailments Michael Stonebraker Adjunct Professor Massachusetts Institute of Technology (stonebraker@lcs.mit.edu)

  2. Outline • Too much middleware • XML ailments • Web services ills • Our professional sickness

  3. Client-Server Got Replaced by N-Tier Computing • The Web • Gizmos • Scalability and management problems with client server

  4. Humility Lesson • We all sold client-server hard • during the 80’s • and even into the 90’s • Less than 10 years later • it is the worst idea on the planet • We should feel really dumb!

  5. N-Tier Computing Produced Lots of Middleware • App servers • EAI/messaging • ETL • Federators • Workflow • CMS • Portals • DBMS

  6. Middleware Diarrhea • Average enterprise has • one (or more) app servers • one (or more) EAI packages • one (or more) ETL packages • one (or more) portal products • one (or more) application packages • and maybe someday a federated DBMS

  7. All of these systems • Contain transformation engines • And often do function activation (app service) • And often have adapters to legacy systems • Huge overlap in functionality!!

  8. Less Moving Parts • Less systems • More uniformity • Less duplication

  9. Less Systems • Less system administrators • Less training • Less manuals • Less bugs • Less cross system issues

  10. More Uniformity • Every island has • memory management • security model • threading model • Less is better

  11. Less Duplication • Most of the islands support transformations • reasonable chance you will do each one 6 or more times • maintenance headache

  12. So How To Consolidate…… • Converge app server into OR DBMS • dumbest OR query is execute function • Remember that everything looks like a nail to the guy with the hammer!
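The point that the "dumbest OR query is execute function" can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for an object-relational DBMS, and `price_with_tax` is a hypothetical app-server component, not something from the talk.

```python
import sqlite3


def price_with_tax(amount, rate):
    """Hypothetical app-server component: price including tax."""
    return round(amount * (1.0 + rate), 2)


conn = sqlite3.connect(":memory:")
# Register the component so SQL can call it (name, argument count, callable).
conn.create_function("price_with_tax", 2, price_with_tax)

# The simplest possible query: no tables, just "execute function".
(result,) = conn.execute("SELECT price_with_tax(100.0, 0.25)").fetchone()
print(result)
```

Once the component is callable from SQL, the same function can also appear in queries over stored data, which is the sense in which an app server could converge into the DBMS.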

  13. Pictorially • [Diagram; recoverable labels: client, three components, two DBMSs]

  14. This Requires…. • DBMS to send queries to other DBMSs • I.e. be a data federator • Load balance also requires a federator

  15. Best of Breed Federators • Support schema heterogeneity • by executing OR functions • Support materialized views • to cache static data
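A rough sketch of the two federator features listed above, assuming SQLite as a stand-in: two attached in-memory databases play the role of remote sites, a user-defined function (`eur_to_usd`, with a made-up exchange rate) papers over the schema difference, and a plain table emulates a materialized view because SQLite has none. All table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two separate in-memory databases stand in for two remote DBMSs.
conn.execute("ATTACH DATABASE ':memory:' AS us_site")
conn.execute("ATTACH DATABASE ':memory:' AS eu_site")

# The sites disagree on table name, column names, and currency.
conn.execute("CREATE TABLE us_site.emp (name TEXT, salary_usd REAL)")
conn.execute("CREATE TABLE eu_site.staff (full_name TEXT, salary_eur REAL)")
conn.execute("INSERT INTO us_site.emp VALUES ('Ann', 100000.0)")
conn.execute("INSERT INTO eu_site.staff VALUES ('Bert', 80000.0)")


def eur_to_usd(amount):
    """Hypothetical conversion function; the rate is invented."""
    return amount * 1.25


conn.create_function("eur_to_usd", 1, eur_to_usd)

# Schema heterogeneity is resolved by executing a function in the query.
FEDERATED = """
SELECT name, salary_usd FROM us_site.emp
UNION ALL
SELECT full_name, eur_to_usd(salary_eur) FROM eu_site.staff
"""
rows = conn.execute(FEDERATED + " ORDER BY 1").fetchall()

# Emulated materialized view: cache the federated result locally.
conn.execute("CREATE TABLE all_emp_cache AS " + FEDERATED)
cached = conn.execute("SELECT * FROM all_emp_cache ORDER BY 1").fetchall()
print(rows, cached)
```

The cache is only safe for data that rarely changes, which matches the slide's "static data" qualifier; a real federator would also have to refresh it.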

  16. Less Moving Parts…. • Federators dominate ETL • ETL only supports “push” • federators do both “push” and “pull”

  17. Workflow • A collection of rules • who’s allowed to buy what • and who must approve it • Best considered as a boxes and arrows diagram • And compiled into components to run on an app server

  18. Workflow Framework -- PO’s • [Flowchart; recoverable labels: PO, “IT?”, “Big?”, manager, Laven, with yes/no branches]

  19. Data Intensive Workflow Should Move Inside an OR DBMS • GUI for “boxes and arrows” • Compiler for the diagram • processing steps become components • business rules become triggers • all data flow inside the DBMS • Worked great in Media/360
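As an illustration of "business rules become triggers", the sketch below routes purchase orders inside the database, again using SQLite as a stand-in for an OR DBMS. The tables, the 10000 threshold, and the status values are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE po (id INTEGER PRIMARY KEY, amount REAL, status TEXT DEFAULT 'new');
CREATE TABLE approval_queue (po_id INTEGER, approver TEXT);

-- Rule: a big PO must be approved by a manager.
CREATE TRIGGER route_big_po AFTER INSERT ON po
WHEN NEW.amount >= 10000
BEGIN
    INSERT INTO approval_queue VALUES (NEW.id, 'manager');
    UPDATE po SET status = 'pending' WHERE id = NEW.id;
END;

-- Rule: a small PO is approved automatically.
CREATE TRIGGER approve_small_po AFTER INSERT ON po
WHEN NEW.amount < 10000
BEGIN
    UPDATE po SET status = 'approved' WHERE id = NEW.id;
END;
""")

# The application only inserts rows; routing happens inside the DBMS.
conn.execute("INSERT INTO po (amount) VALUES (250.0)")
conn.execute("INSERT INTO po (amount) VALUES (50000.0)")

statuses = conn.execute("SELECT id, status FROM po ORDER BY id").fetchall()
queue = conn.execute("SELECT po_id, approver FROM approval_queue").fetchall()
print(statuses, queue)
```

Nothing polls the database and no PO data leaves it, which is the mechanism behind the performance claim on the next slide.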

  20. Why? • Big Big Big performance advantage • no polling of the DBMS • no data movement • easy to change! • Watch for Informix product in this area!

  21. Nirvana • One integrated system that does • federation • EAI • app service • With a single transformation system • Based on DBMS technology (or something else….)

  22. XML • Good for content storage and movement • Good as “on the wire” format for data movement • as long as you don’t need to send a lot of stuff fast • Bad for data storage!

  23. History Lesson • 1960’s • IMS and IDMS get traction • customers start complaining about rewriting everything when schema changes

  24. History Lesson • 1970 • Codd writes pioneering paper • starts a decade long argument between IMS/CODASYL advocates and Codd supporters

  25. Net-Net of Argument • Putting semantics into data order is bad • restricts storage options • Hidden meaning bad • no self-defining fields

  26. Net-Net of Argument • Data independence is good • schemas change often • don’t want to rewrite anything when this happens

  27. Net-Net of Argument • Complexity is bad • high level query languages are good • KISS arguments • Call these three premises “Codd’s laws”

  28. History Lesson • 1981 • Codd wins Turing Award • acknowledgement for being right

  29. XML in This Historical Light • Most of the bad features of IMS/Codasyl • allows semantics in data order • data independence will be a challenge • try updates on inverted hierarchies • look at IMS LDBs • more complex than Codasyl
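The "semantics in data order" objection can be made concrete with a small sketch. The `<route>` and `<stop>` element names are invented for illustration: two documents hold the same elements, yet they mean different things because position carries meaning.

```python
import xml.etree.ElementTree as ET

# Same elements, different order (hypothetical documents).
doc_a = "<route><stop>Boston</stop><stop>Albany</stop></route>"
doc_b = "<route><stop>Albany</stop><stop>Boston</stop></route>"

stops_a = [stop.text for stop in ET.fromstring(doc_a)]
stops_b = [stop.text for stop in ET.fromstring(doc_b)]

# In the documents, order is hidden data: the two routes differ.
order_matters = stops_a != stops_b

# As unordered rows they are identical, so the meaning has been lost.
same_rows = set(stops_a) == set(stops_b)

# The relational fix is to make the position an explicit, self-defining field.
explicit_a = {(seq, name) for seq, name in enumerate(stops_a)}
explicit_b = {(seq, name) for seq, name in enumerate(stops_b)}
print(order_matters, same_rows, explicit_a != explicit_b)
```

With the sequence number stored as a column, the storage layer is free to reorder rows without changing what the data means.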

  30. Our Field • We look a little silly saying • an idea renounced in the 1970’s • is back • Leading our colleagues to ask “What’s different?” • if somebody disproved Codd’s laws, they didn’t tell me…..

  31. How to Win the Turing Award Circa 2020 • 2000’s • XML data storage gets traction • 2010 • dust off Codd’s paper • Wait 10 years to be proven right

  32. In Any Case • Inline tags turn 1 Tbyte of EMP data into 10 Tbytes of EMP data • Won’t store anything big in native XML • will use something else…. • like what?
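The storage overhead of inline tags can be estimated with a quick calculation. This is a sketch with a made-up EMP record of four integer fields; the ratio it prints depends entirely on field names and values, so it illustrates the effect and does not reproduce the 10x figure on the slide.

```python
import struct

# Hypothetical EMP record with four integer fields.
emp = {"id": 1234, "dept": 17, "salary": 52000, "age": 41}

# Fixed-width binary layout: four little-endian 32-bit integers.
packed = struct.pack("<iiii", emp["id"], emp["dept"], emp["salary"], emp["age"])

# The same record with every field wrapped in opening and closing tags.
xml = "<emp>" + "".join(f"<{key}>{value}</{key}>" for key, value in emp.items()) + "</emp>"

ratio = len(xml.encode("utf-8")) / len(packed)
print(len(packed), len(xml), round(ratio, 2))
```

Longer tag names, namespaces, and attributes push the ratio higher, since every record repeats the schema that a DBMS would store once.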

  33. OR DBMS • XML is merely this year’s data type • Next year it will be WML or … • and there will be a next year….

  34. XML Schema • Contains the kitchen sink • Complexity run amok • diarrhea from the SGML types • Includes lots of known hard stuff • e.g. union types

  35. XQuery • Mostly syntactic sugar on OR SQL • // is a user-defined function in Informix OR engine • Try to keep the semantics close to OR SQL
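To show what "// is a user-defined function" might look like, the sketch below registers a descendant lookup as a SQL function over an XML text column. It assumes SQLite as a stand-in engine; `descendant_text` and the `docs` table are hypothetical and are not the Informix interface.

```python
import sqlite3
import xml.etree.ElementTree as ET


def descendant_text(doc, tag):
    """Text of the first element named `tag` at any depth, roughly //tag."""
    root = ET.fromstring(doc)
    for node in root.iter(tag):
        return node.text
    return None


conn = sqlite3.connect(":memory:")
conn.create_function("descendant_text", 2, descendant_text)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [
        ("<emp><name>Ann</name><addr><city>Boston</city></addr></emp>",),
        ("<emp><name>Bert</name><addr><city>Albany</city></addr></emp>",),
    ],
)

# Ordinary SQL; the path step is just a function call in the WHERE clause.
query = "SELECT id FROM docs WHERE descendant_text(body, 'city') = 'Boston'"
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

The query language stays SQL and XML becomes one more data type with its own functions, in line with the "this year's data type" remark two slides earlier.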

  36. Another History Lesson • Typical enterprise wanted data integration for business analysis badly • needed data in a variety of systems • in a variety of formats • often with no unique ids • often with incompatible semantics • 2 day delivery means lots of things • often dirty

  37. ETL Warehouse Projects of the 90’s • Well into 8 digits • Usually a factor of three behind schedule • Delivering a factor of 3 less stuff • Everybody dented their pick on semantic heterogeneity • which is hard, hard, hard • and not solved by the blizzard of 3 letter acronyms from Redmond

  38. Web Services • Will be a long time coming outside of simple domains (where there is no data integration to deal with) • E.g. catalog management • Grainger perspiration….

  39. The Depressing State of Affairs • ~50-75% of IT projects fail • if we built bridges, our profession would be fired • and the same mistakes are repeated over and over (excessive ambition, rolling specs, bad design, failure to load a large data set early)

  40. What To Do? • We typically don’t teach this stuff (and do a serious disservice to our students) • probably because we don’t (can’t) spend any time in industry to figure it out • Action item: at the very least read a couple of Robert L. Glass’s books

  41. The Depressing State of Affairs • Hardware “half-life” is 18 months • Software half-life is 18 years (or more)! • In 25 years we moved from • C to Java • SQL to XQuery

  42. What To Do? • Much higher level design environments • vis • workflow • special purpose languages (report writers,…) • And stop turning down papers on this stuff

  43. Grand Challenge • Improve application productivity (probability of success * programmer productivity) by 2 this decade
