Middleware Diarrhea and Other Ailments

Middleware Diarrhea and Other Ailments Michael Stonebraker Adjunct Professor Massachusetts Institute of Technology (stonebraker@lcs.mit.edu)

Outline • Too much middleware • XML ailments • Web services ills • Our professional sickness

Client-Server Got Replaced by N-Tier Computing • The Web • Gizmos • Scalability and management problems with client-server

Humility Lesson • We all sold client-server hard • during the 80’s • and even into the 90’s • Less than 10 years later • it is the worst idea on the planet • We should feel really dumb!

N-Tier Computing Produced Lots of Middleware • App servers • EAI/messaging • ETL • Federators • Workflow • CMS • Portals • DBMS

Middleware Diarrhea • Average enterprise has • one (or more) app servers • one (or more) EAI packages • one (or more) ETL packages • one (or more) portal products • one (or more) application packages • and maybe someday a federated DBMS

All of these systems • Contain transformation engines • And often do function activation (app service) • And often have adapters to legacy systems • Huge overlap in functionality!!

Fewer Moving Parts • Fewer systems • More uniformity • Less duplication

Fewer Systems • Fewer system administrators • Less training • Fewer manuals • Fewer bugs • Fewer cross-system issues

More Uniformity • Every island has • memory management • security model • threading model • Less is better

Less Duplication • Most of the islands support transformations • reasonable chance you will do each one 6 or more times • maintenance headache

So How To Consolidate…… • Converge app server into OR DBMS • dumbest OR query is execute function • Remember that everything looks like a nail to the guy with the hammer!
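The "dumbest query is execute function" idea can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for an OR DBMS, and `quote_price` is a hypothetical business function of the kind an app server might otherwise host.

```python
import sqlite3


def quote_price(quantity: int, unit_price: float) -> float:
    """Hypothetical business function (not from the talk)."""
    # Assumed rule: 10% discount on orders of 100 units or more.
    discount = 0.10 if quantity >= 100 else 0.0
    return round(quantity * unit_price * (1.0 - discount), 2)


conn = sqlite3.connect(":memory:")
# Register the function so that SQL statements can call it by name.
conn.create_function("quote_price", 2, quote_price)

# The "dumbest" query: no tables involved, it just activates a function.
(total,) = conn.execute("SELECT quote_price(?, ?)", (100, 2.5)).fetchone()

# The same function is also usable inside an ordinary query over stored data.
conn.execute("CREATE TABLE order_line (quantity INTEGER, unit_price REAL)")
conn.execute("INSERT INTO order_line VALUES (10, 4.0)")
(line_total,) = conn.execute(
    "SELECT quote_price(quantity, unit_price) FROM order_line"
).fetchone()
```

The client sends one statement either way, so function activation and data access share a single entry point.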

Pictorially • [Diagram: a client, three components, and two DBMSs]

This Requires…. • DBMS to send queries to other DBMSs • i.e., be a data federator • Load balancing also requires a federator

Best of Breed Federators • Support schema heterogeneity • by executing OR functions • Support materialized views • to cache static data
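A toy sketch of these two federator features, assuming two hypothetical sources (`east` and `west`) whose schemas disagree on column names and units. A function registered in one source reconciles the units, and a cached result plays the role of a materialized view. All table, column, and function names here are invented for illustration.

```python
import sqlite3

# Source 1 stores salaries in dollars.
east = sqlite3.connect(":memory:")
east.execute("CREATE TABLE emp (name TEXT, salary_usd REAL)")
east.execute("INSERT INTO emp VALUES ('Ann', 50000.0)")

# Source 2 uses different names and stores pay in cents.
west = sqlite3.connect(":memory:")
west.execute("CREATE TABLE staff (full_name TEXT, pay_cents INTEGER)")
west.execute("INSERT INTO staff VALUES ('Bob', 6000000)")
# Schema heterogeneity is handled by executing a function in the query.
west.create_function("cents_to_usd", 1, lambda cents: cents / 100.0)

# One query per source, each producing rows in the common (name, usd) shape.
SOURCES = [
    (east, "SELECT name, salary_usd FROM emp"),
    (west, "SELECT full_name, cents_to_usd(pay_cents) FROM staff"),
]

_cache = None  # stands in for a materialized view


def federated_emp(refresh: bool = False):
    """Return the unified employee list, recomputing only on refresh."""
    global _cache
    if _cache is None or refresh:
        rows = []
        for conn, query in SOURCES:
            rows.extend(conn.execute(query).fetchall())
        _cache = sorted(rows)
    return _cache
```

Calling `federated_emp()` repeatedly touches the sources once; `federated_emp(refresh=True)` re-pulls them, which is the trade-off a cache of mostly static data accepts.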

Fewer Moving Parts…. • Federators dominate ETL • ETL only supports “push” • federators do both “push” and “pull”

Workflow • A collection of rules • who’s allowed to buy what • and who must approve it • Best considered as a boxes-and-arrows diagram • And compiled into components to run on an app server

Workflow Framework -- PO’s • [Flowchart: a PO is routed through two yes/no decisions, “IT?” and “Big?”, with boxes labeled “manager” and “Laven”]

Data Intensive Workflow Should Move Inside an OR DBMS • GUI for “boxes and arrows” • Compiler for the diagram • processing steps become components • business rules become triggers • all data flow inside the DBMS • Worked great in Media/360
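A minimal sketch of "business rules become triggers", using SQLite as a stand-in for an OR DBMS. The tables, the 1000 threshold, and the 'manager' approver are assumptions made for illustration; they are not the routing from the talk's diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE po (
    id        INTEGER PRIMARY KEY,
    requester TEXT NOT NULL,
    amount    REAL NOT NULL,
    status    TEXT NOT NULL DEFAULT 'approved'
);
CREATE TABLE approval_queue (po_id INTEGER, approver TEXT);

-- Business rule as a trigger: a big PO must wait for a manager.
CREATE TRIGGER big_po_needs_approval
AFTER INSERT ON po
WHEN NEW.amount > 1000
BEGIN
    UPDATE po SET status = 'pending' WHERE id = NEW.id;
    INSERT INTO approval_queue VALUES (NEW.id, 'manager');
END;

-- Processing step as a trigger: clearing the queue entry releases the PO.
CREATE TRIGGER approval_releases_po
AFTER DELETE ON approval_queue
BEGIN
    UPDATE po SET status = 'approved' WHERE id = OLD.po_id;
END;
""")


def statuses():
    """Return (id, status) for every PO, ordered by id."""
    return conn.execute("SELECT id, status FROM po ORDER BY id").fetchall()


conn.execute("INSERT INTO po (requester, amount) VALUES ('ann', 50)")
conn.execute("INSERT INTO po (requester, amount) VALUES ('bob', 5000)")
before = statuses()

# The manager approves by removing the queue entry; nothing polls the tables.
conn.execute("DELETE FROM approval_queue WHERE po_id = 2")
after = statuses()
```

Every step runs where the data lives, so no outside process has to poll the DBMS or move rows to evaluate a rule.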

Why? • Big Big Big performance advantage • no polling of the DBMS • no data movement • easy to change! • Watch for Informix product in this area!

Nirvana • One integrated system that does • federation • EAI • app service • With a single transformation system • Based on DBMS technology (or something else….)

XML • Good for content storage and movement • Good as “on the wire” format for data movement • as long as you don’t need to send a lot of stuff fast • Bad for data storage!

History Lesson • 1960’s • IMS and IDMS get traction • customers start complaining about rewriting everything when schema changes

History Lesson • 1970 • Codd writes pioneering paper • starts a decade long argument between IMS/CODASYL advocates and Codd supporters

Net-Net of Argument • Putting semantics into data order is bad • restricts storage options • Hidden meaning bad • no self-defining fields

Net-Net of Argument • Data independence is good • schemas change often • don’t want to rewrite anything when this happens
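A small sketch of the data independence point, under the assumption of a toy `emp` table in SQLite and a hypothetical "|"-delimited record format for contrast. A query that names its columns survives a schema change; code that relies on field position does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary REAL)")
conn.execute("INSERT INTO emp VALUES ('Ann', 50000.0)")

QUERY = "SELECT name FROM emp WHERE salary > 40000"
before = conn.execute(QUERY).fetchall()

# Schema change: the application query is not rewritten.
conn.execute("ALTER TABLE emp ADD COLUMN dept TEXT")
after = conn.execute(QUERY).fetchall()


def salary_by_position(record: str) -> float:
    """Hypothetical reader that puts meaning into field order."""
    return float(record.split("|")[1])


old_record = "Ann|50000"
new_record = "Ann|Sales|50000"  # same data after a field was inserted

try:
    salary_by_position(new_record)
    positional_reader_broke = False
except ValueError:
    # "Sales" now sits where the salary used to be.
    positional_reader_broke = True
```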

Net-Net of Argument • Complexity is bad • high level query languages are good • KISS arguments • Call these three premises “Codd’s laws”

History Lesson • 1981 • Codd wins Turing Award • acknowledgement for being right

XML in This Historical Light • Most of the bad features of IMS/CODASYL • allows semantics in data order • data independence will be a challenge • try updates on inverted hierarchies • look at IMS logical databases (LDBs) • more complex than CODASYL

Our Field • We look a little silly saying • an idea renounced in the 1970’s • is back • Leading our colleagues to ask “What’s different?” • if somebody disproved Codd’s laws, they didn’t tell me…..

How to Win the Turing Award Circa 2020 • 2000’s • XML data storage gets traction • 2010 • dust off Codd’s paper • Wait 10 years to be proven right

In Any Case • Inline tags turn 1 Tbyte of EMP data into 10 Tbytes of EMP data • Won’t store anything big in native XML • will use something else…. • like what?
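The tag overhead can be estimated with a back-of-the-envelope sketch. The record layout and tag names below are assumptions chosen for illustration, so the ratio it computes depends on them and will not reproduce the 10x figure exactly.

```python
import struct


def as_xml(name: str, age: int, salary: int) -> bytes:
    """One EMP record with inline tags (hypothetical tag names)."""
    return (
        f"<employee><name>{name}</name><age>{age}</age>"
        f"<salary>{salary}</salary></employee>"
    ).encode("utf-8")


def as_packed(name: str, age: int, salary: int) -> bytes:
    """The same record in an assumed fixed-width binary layout:
    8-byte name, 1-byte age, 4-byte salary, no padding."""
    return struct.pack("<8sBI", name.encode("utf-8"), age, salary)


row = ("Ann", 30, 50000)
xml_bytes = len(as_xml(*row))
packed_bytes = len(as_packed(*row))

# Expansion factor caused by repeating the tags in every record.
ratio = xml_bytes / packed_bytes
print(f"XML: {xml_bytes} bytes, packed: {packed_bytes} bytes, ratio {ratio:.1f}x")
```

Longer tag names or shorter field values push the ratio up, since the tags are paid for once per record rather than once per schema.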

OR DBMS • XML is merely this year’s data type • Next year it will be WML or … • and there will be a next year….

XML Schema • Contains the kitchen sink • Complexity run amok • diarrhea from the SGML types • Includes lots of known hard stuff • e.g. union types

XQuery • Mostly syntactic sugar on OR SQL • // is a user-defined function in the Informix OR engine • Try to keep the semantics close to OR SQL
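To illustrate how a descendant step like // could be treated as an ordinary user-defined function, here is a sketch with SQLite as a stand-in engine. The `descendants` function and the `docs` table are hypothetical; this is not how Informix implements it.

```python
import json
import sqlite3
import xml.etree.ElementTree as ET


def descendants(doc: str, tag: str) -> str:
    """Return the text of every <tag> element in doc, in document order,
    encoded as a JSON array (SQLite scalar functions return one value)."""
    root = ET.fromstring(doc)
    return json.dumps([el.text for el in root.iter(tag)])


conn = sqlite3.connect(":memory:")
conn.create_function("descendants", 2, descendants)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO docs (body) VALUES (?)",
    (
        "<dept><emp><name>Ann</name></emp>"
        "<group><emp><name>Bob</name></emp></group></dept>",
    ),
)

# Roughly the effect of //name, written as a plain SQL function call.
(names_json,) = conn.execute(
    "SELECT descendants(body, 'name') FROM docs"
).fetchone()
names = json.loads(names_json)
```

XML is then just one more column type with its own functions, and the rest of the query stays ordinary SQL.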

Another History Lesson • Typical enterprise wanted data integration for business analysis badly • needed data in a variety of systems • in a variety of formats • often with no unique ids • often with incompatible semantics • “2-day delivery” means lots of things • often dirty

ETL Warehouse Projects of the 90’s • Well into 8 digits • Usually a factor of three behind schedule • Delivering a factor of 3 less stuff • Everybody dented their pick on semantic heterogeneity • which is hard, hard, hard • and not solved by the blizzard of 3 letter acronyms from Redmond

Web Services • Will be a long time coming outside of simple domains (where there is no data integration to deal with) • E.g. catalog management • Grainger perspiration….

The Depressing State of Affairs • ~50-75% of IT projects fail • if we built bridges, our profession would be fired • and the same mistakes are repeated over and over (excessive ambition, rolling specs, bad design, failure to load a large data set early)

What To Do? • We typically don’t teach this stuff (and do a serious disservice to our students) • probably because we don’t (can’t) spend any time in industry to figure it out • Action item: at the very least read a couple of Robert L. Glass’s books

The Depressing State of Affairs • Hardware “half-life” is 18 months • Software half-life is 18 years (or more)! • In 25 years we moved from • C to Java • SQL to XQuery

What To Do? • Much higher level design environments • vis • workflow • special purpose languages (report writers,…) • And stop turning down papers on this stuff

Grand Challenge • Improve application productivity (probability of success * programmer productivity) by 2 this decade
