Stuck in the middle: challenges and trends in optimizing middleware Daniel M. Yellin Director of Software Technology IBM Research June 2001
With thanks to a cast of players • Paul Dantzig • Joe Hellerstein • Doug Kimelman • Chet Murthy • Darrell Reimer • Gabi Zodik • …
Outline • Middleware introduction • Paradigm shift from programs to compusystems • Optimizing middleware
In the middle of distributed components Directory App 1 App 3 Data base Customer repository App 2 GLUE Horizontal Composition
In the middle of application and network services Application Layer of abstraction Vertical Composition Middleware Service 3 Service 1 Service 2 Service 4
Middleware sales growing by leaps and bounds! Example: Revenue for Integration Broker Suites In millions of $ Source: Application Integration Middleware Market, Gartner, 9/01
Why the demand? • Business trends requiring integration • Enterprise Resource Planning (ERP), Supply Chain Management (SCM), Customer Relationship Management (CRM)
Why the demand? • Business trends requiring integration • ERP, SCM, CRM • Internet has accelerated the drive • New e-business processes that tie together multiple applications
Why the demand? • Business trends requiring integration • ERP, SCM, CRM • Internet has accelerated the drive • New e-business processes that tie together multiple applications • Business to business (B2B) will further accelerate this trend
Typical company web architecture [Figure: numbered request flow (steps 1 to 5). Components shown: Web Browser (HTML); Web Application Server (Servlet, Command Bean, Data Bean, JSP, Session Pool, JNDI, JDBC); CICS Connect; MQSeries; message-based interfaces; Business Integration Services (Messaging, Intelligent Routing, Message Transform., Agents, etc.); Back End Business Applications; LDAP Directory; Personalization Database; Remote Database]
Outline • Middleware “introduction” • Paradigm shift from programs to compusystems • Three principles to optimization
What does it all mean? • From micro to macro programming • Building the “application logic” itself often a small part of the overall development process • More and more of the development process is about making the pieces work together (connecting the dots)! void main() {int i = i+1; }
From programs to compusystems • Ecosystem: a community of animals and plants and the environment with which it is interrelated (Webster’s) • Compusystem: a community of software and hardware components and the middleware with which it is related (Yellin)
Characteristics of compusystems • They are complex • Many parts • Component interactions are often hard to understand • They are forever changing • Cyclic events throughout day, month, year,… • Permanent changes • New “species” introduced, compusystems may be combined,… See Information Week, April 2, 2001 “Conquering Complexity”
Characteristics of compusystems • They run forever • 24 x 7 • No such thing as stopping to test, debug, fix • They have a lot of history (Calcification) • This constrains what you can do • E.g., too much data to reformat • E.g., policy system cannot be shut down until last policy holder dies
Outline • Middleware “introduction” • Paradigm shift from programs to compusystems • Optimizing middleware
Typical performance issues • “Why are we getting such erratic and poor performance?” • Complex system • Can’t separate app v. platform issues • Component expertise doesn’t extend to system
3 principles to optimizing middleware • How can we better understand a compusystem to optimize it?
3 principles to optimizing middleware • Understand • How do you minimize the overhead of all the middleware? • avoid being stuck in the middle!
3 principles to optimizing middleware • Understand • New programming models • And how can we design for optimality when the system is forever changing?
3 Principles to optimizing middleware • Understand • New programming models • Adapt
Understand • Understanding systems is different than understanding individual organisms/platforms • Understanding components or even pointwise interactions is not the same as understanding the compusystem • Effects can • be non-local • interfere with one another • be visible only under heavy loads
Flows within a compusystem can be hard to follow, can spawn many threads [Figure: web architecture diagram. Components shown: Web Browser (HTML); Web Application Server (Servlet, Command Bean, Data Bean, JSP, Session Pool, JNDI, JDBC); LDAP; Personalization DB; Database; MQSeries; interfaces; Business Integration Services; Back End Applications]
And you have a lot of concurrent flows [Figure: the same web architecture diagram. Components shown: Web Browser (HTML); Web Application Server (Servlet, Command Bean, Data Bean, JSP, Session Pool, JNDI, JDBC); LDAP; Personalization DB; Database; MQSeries; interfaces; Business Integration Services; Back End Applications]
Example: Websphere Performance Toolkit [Figure: compusystem diagram. Components shown: Browser (HTML); App Server (Servlet, JSP, JVM); Messaging; LDAP; DB; Back End Apps. Labels: monitor, dashboard]
Websphere Performance Toolkit • Key concepts • Monitor: real time collection of “vital” signs from across the compusystem • Dashboard: information consolidated into a unified display • shows what is happening in the compusystem, and can help make apparent correlations between different parts of the compusystem
Websphere Performance Toolkit • “My site is slow, now what?” • dashboard gives immediate clues as to what to look for • what to display is critical • Helps find correlations • example: router was broken, was not “spraying” requests appropriately • could not tell from looking at single site, only by burst pattern (“wave” traveling from one machine to next …)
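The "wave" traveling from one machine to the next can be illustrated with a small sketch. It takes per-machine request counts sampled at fixed intervals and finds the time lag at which one machine's burst pattern best lines up with another's. The machine names, the sample data, and the function `best_lag` are all hypothetical; this is not the Websphere Performance Toolkit's interface, only one way such a correlation might be surfaced.

```python
def best_lag(a, b, max_lag):
    """Return the lag (in samples) at which series b best matches series a, delayed."""
    def score(lag):
        # Sum of products of overlapping samples: a[t] * b[t + lag].
        return sum(a[t] * b[t + lag] for t in range(len(a) - lag))
    return max(range(max_lag + 1), key=score)

# Hypothetical requests-per-interval samples from three machines behind one router.
samples = {
    "web1": [90, 10, 10, 90, 10, 10, 90, 10, 10],
    "web2": [10, 90, 10, 10, 90, 10, 10, 90, 10],
    "web3": [10, 10, 90, 10, 10, 90, 10, 10, 90],
}

lag_1_2 = best_lag(samples["web1"], samples["web2"], max_lag=2)
lag_1_3 = best_lag(samples["web1"], samples["web3"], max_lag=2)
print(lag_1_2, lag_1_3)
```

Each machine looks merely bursty on its own. Only the steadily increasing lag across machines suggests that requests are arriving in turn rather than being spread evenly.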
Stress testing the system • Spray the system with increasing loads • See what happens! [Figure: compusystem diagram. Components shown: Browser (HTML); App Server (Servlet, JSP, JVM); Messaging; LDAP; DB; Back End Apps. Labels: Stress rig, monitor, dashboard]
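A minimal sketch of a load ramp, assuming a stand-in for the system under test: `simulated_latency_ms` is a made-up model in which latency rises sharply near a fixed capacity, and `ramp` and all the numbers are hypothetical. A real stress rig would replace the stand-in with measurements taken through the monitors.

```python
def simulated_latency_ms(load_rps, capacity_rps=100.0, service_ms=10.0):
    """Stand-in for the real system: latency grows sharply as load nears capacity."""
    if load_rps >= capacity_rps:
        return float("inf")  # saturated: requests queue without bound
    return service_ms / (1.0 - load_rps / capacity_rps)

def ramp(measure, start_rps, step_rps, max_rps, limit_ms):
    """Apply increasing load; return (results, first load that breaks the latency limit)."""
    results = []
    load = start_rps
    while load <= max_rps:
        latency = measure(load)
        results.append((load, latency))
        if latency > limit_ms:
            return results, load
        load += step_rps
    return results, None

results, knee = ramp(simulated_latency_ms, start_rps=10, step_rps=10,
                     max_rps=200, limit_ms=40.0)
for load, latency in results:
    print(f"{load:4d} req/s -> {latency:6.1f} ms")
```

Stopping at the first load level that breaks the limit keeps the run short. A fuller rig would likely keep going to observe how the system degrades and recovers.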
Next steps • Analysis • Can we automate the discovery of common performance errors? • misconfiguration • mismatch of parameters for particular load signatures • Interactive trouble shooter • Autonomic middleware • Combine with predictive modeling techniques • Refine models as better data becomes available • Use models to suggest tests to be performed by stress rig
Understand (2) • Understanding not only the dynamic behavior of compusystems but also their static structure (code base) • Trade off exactness for scale • Need to understand dependencies across heterogeneous code artifacts (html, servlets, Java, Cobol,…) • Example: change in LDAP indexing had ripple effect on application. Latency went from 10 to 20 ms/request.
Example: Asset Locator Composed of two major modules • information gathering engine (crawler) • semantic search server "eColabra: An Enterprise Collaboration & Reuse Environment", Orit Edelstein, Avi Yaeli, Gabi Zodik, 4th International Workshop, NGITS'99, Zikhron-Yaakov, Israel, July 1999, http://iew3.technion.ac.il:8080/~ngits/
Repository analysis phase [Figure: repository analysis diagram. Labels: Server; Categorizer; Categorization Engine (hierarchy, reference); Relationship Analysis (find relations); rules (Java, html, ...); DB2 database & index server]
Run-time architecture [Figure: run-time architecture diagram. Labels: Client (HTML, lightweight JavaScript based; Eclipse Plugin); HTTP Server / Servlet Engine; Query/Result Processors; Query Processor (SQL, IR); Navigator; Relationship Analysis; Recursive package, category, hierarchy; XML; query/results; Search server; Database server; DB2 database; index server]
Asset Locator • Impact analysis • “What resources are affected if I …?” • Resource statistics • Reuse • Finding assets of a particular type, function,… • Understanding • Relationships between many different assets of many different types
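The impact-analysis question ("What resources are affected if I …?") can be read as reachability over a dependency graph. The sketch below assumes the relationships have already been extracted into a simple mapping. The asset names and the function `affected_by` are hypothetical and do not describe Asset Locator's actual data model.

```python
from collections import defaultdict, deque

def affected_by(change, depends_on):
    """Return every asset that directly or transitively depends on `change`."""
    # Invert the edges: for each asset, record which assets use it.
    used_by = defaultdict(set)
    for asset, deps in depends_on.items():
        for dep in deps:
            used_by[dep].add(asset)
    seen, queue = set(), deque([change])
    while queue:
        for user in used_by[queue.popleft()]:
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen

# Hypothetical relationships across heterogeneous artifacts.
depends_on = {
    "order.html": {"OrderServlet.java"},
    "OrderServlet.java": {"OrderBean.java"},
    "OrderBean.java": {"ORDERS table", "billing.cbl"},
    "report.jsp": {"ORDERS table"},
    "help.html": set(),
}
impacted = affected_by("ORDERS table", depends_on)
print(sorted(impacted))
```

A breadth-first walk over inverted edges scales with the number of relationships, which fits the slide's point about trading exactness for scale. The hard part in practice is extracting the edges, not traversing them.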
Outline • Middleware “introduction” • Paradigm shift from programs to compusystems • Optimizing middleware • Understanding • New programming models • Adaptive systems
New programming models • Compusystems have a lot of redundancy • In performing the same computations • In marshalling in and out of normal forms • Use caching intelligently • Cache consistency can be hard • Computations through the compusystem can have a long journey • Minimize length of pipeline • Shift computation and queues “stage left” • Minimize the holding of resources
Pipelines [Figure: request pipeline. Labels: Net router; Http server; plugin; cthreads; App server; Servlet engine; Orb; Queue (three); Resource connections]
Pipelines • Shift computations and queuing to the left [Figure: pipeline stages. Labels: Net router; Http server; App server]
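One way to picture shifting queuing to the left is admission control at the front of the pipeline: a request waits at the edge, holding no downstream threads or connections, until a backend slot is free. The class `FrontDoor` and its parameters are hypothetical, and the sketch is a single-threaded simulation, not a server component.

```python
from collections import deque

class FrontDoor:
    """Admit at most `backend_slots` requests downstream; queue the rest at the edge."""

    def __init__(self, backend_slots, max_waiting):
        self.free_slots = backend_slots
        self.max_waiting = max_waiting
        self.waiting = deque()

    def arrive(self, request):
        if self.free_slots > 0:
            self.free_slots -= 1
            return "admitted"   # proceeds into the rest of the pipeline
        if len(self.waiting) < self.max_waiting:
            self.waiting.append(request)
            return "queued"     # waits here, holding no downstream resources
        return "rejected"       # load is shed early instead of deep in the pipeline

    def complete(self):
        """A downstream request finished; hand its slot to the oldest waiter, if any."""
        if self.waiting:
            return self.waiting.popleft()  # the slot passes directly to this request
        self.free_slots += 1
        return None

door = FrontDoor(backend_slots=2, max_waiting=1)
outcomes = [door.arrive(r) for r in ["r1", "r2", "r3", "r4"]]
next_in = door.complete()
print(outcomes, next_in)
```

Because the only queue sits at the front, a request that reaches the later stages never waits while holding a resource connection.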
Massive web sites with dynamic content • Evolution of web sites • From static html, gif, jpeg • To dynamic html, gif, jpeg + database accesses, transactions + rich media • Massive number of changing “objects” • Objects may be assembled into many pages • Challenges • management • throughput • consistency
Using caching effectively • Four tier web serving architecture • Content management servers • Origin Servers • Origin Caches • Point of distribution caches • Generation upon demand Production upon availability • Intelligent cache expiration policies
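A rough sketch of one reading of "production upon availability": when an underlying object changes, the pages assembled from it are rebuilt and placed in the cache immediately, so requests never trigger page generation. The class `PageCache`, the page names, and the `render` function are hypothetical. This is an illustration of the idea, not the architecture's actual implementation.

```python
class PageCache:
    """Cache pages assembled from objects; rebuild pages when an object changes."""

    def __init__(self, render):
        self.render = render   # function: (page name, {object id: value}) -> content
        self.objects = {}      # object id -> current value
        self.page_deps = {}    # page name -> set of object ids the page is built from
        self.pages = {}        # page name -> cached content
        self.renders = 0       # count of page rebuilds, for illustration

    def define_page(self, page, object_ids):
        self.page_deps[page] = set(object_ids)

    def update_object(self, object_id, value):
        """Store the new value and rebuild only the pages that use this object."""
        self.objects[object_id] = value
        for page, deps in self.page_deps.items():
            if object_id in deps and all(d in self.objects for d in deps):
                values = {d: self.objects[d] for d in deps}
                self.pages[page] = self.render(page, values)
                self.renders += 1

    def get(self, page):
        return self.pages[page]  # served from cache; nothing is rendered per request

def render(page, values):
    return page + ": " + ", ".join(f"{k}={v}" for k, v in sorted(values.items()))

cache = PageCache(render)
cache.define_page("scores.html", ["score"])
cache.define_page("home.html", ["score", "headline"])
cache.update_object("score", 3)           # rebuilds scores.html; home.html lacks headline
cache.update_object("headline", "Final")  # rebuilds home.html only
cache.update_object("score", 4)           # rebuilds both pages
print(cache.get("home.html"))
```

Tracking which objects each page depends on is what keeps the cache consistent without expiring everything on every change.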
The effectiveness of caching "High-Performance Web Site Design Techniques", Arun Iyengar, Jim Challenger, Daniel Dias, and Paul Dantzig, IEEE Internet Computing, vol. 4, no. 2, March/April 2000.
Outline • Middleware “introduction” • Paradigm shift from programs to compusystems • Optimizing middleware • Understanding • New programming models • Adaptive systems
Adapt • It is hard to predict how compusystems will behave • The performance characteristics of the individual components are often not known • Performance characteristics change due to changing conditions • Periodic events (e.g., monthly billing cycles) • Permanent changes (e.g., new systems added)
Examples: adaptive server tuning • Goals • Generic agent for automated tuning • Automatically learn performance characteristics of target system(s) • Adapt system to changing workload and environment • Able to handle distribution, heterogeneity
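As a toy illustration of these goals, the sketch below hill-climbs a single tuning parameter against a measured score and re-tunes when the workload shifts. The function `tune`, the stand-in `throughput` model, and the pool-size parameter are all hypothetical. A real agent would measure the live system and would have to cope with noise, many parameters, and distribution.

```python
def tune(measure, value, step, low, high, rounds):
    """Hill-climb one integer parameter to maximize measure(value)."""
    best = measure(value)
    for _ in range(rounds):
        moved = False
        for candidate in (value + step, value - step):
            if low <= candidate <= high:
                score = measure(candidate)
                if score > best:
                    value, best, moved = candidate, score, True
                    break
        if not moved:
            break  # no neighbor is better: a local optimum under current conditions
    return value, best

def throughput(pool_size, optimum):
    """Stand-in for measuring the live system; peaks when pool_size equals optimum."""
    return 1000 - (pool_size - optimum) ** 2

# Learn a good setting for the current workload.
size, _ = tune(lambda p: throughput(p, optimum=40),
               value=10, step=5, low=5, high=100, rounds=50)

# The workload shifts (for example, a monthly billing cycle); re-tune from where we are.
size_after_shift, _ = tune(lambda p: throughput(p, optimum=25),
                           value=size, step=5, low=5, high=100, rounds=50)
print(size, size_after_shift)
```

The agent needs no model of the target system, only a way to set a parameter and observe a score, which is what makes a generic agent plausible.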