Extensible Scalable Monitoring for Clusters of Computers

Eric Anderson, U.C. Berkeley, Summer 1997 NOW Retreat

Overall Problem • Monitoring a cluster of cooperating computers • Different from client-server, where only the servers matter • Requires substantial information from all machines • 100s to 1000s of nodes • Client-server becomes a subset of this problem

Problems & Solutions • Cluster software and hardware are constantly evolving • Monitoring software must be extensible and flexible • Use relational tables • Failures will occur in the cluster • Monitoring software must detect and recover from failures • Use timestamps for weak synchronization • Scalability needed to hundreds of nodes • Need to efficiently transfer data from sources to sinks • Use hierarchy & hybrid push-pull protocol • Need to display statistics and information from all nodes • Use statistical aggregation plus color and shade to minimize information loss

Overview • Details of solutions • Handling evolving software • Detecting and recovering from failures • Scaling data management • Scaling visualization • Implementation • Architecture • Programs • Snapshot • Experience • Conclusion & Future Work

Problem: Clusters Evolve • Solution: Relational tables • Increases flexibility by decoupling data users from data providers • Increases extensibility by structuring data into independent tables • Increases extensibility by allowing additional columns in tables without breaking old programs • Retains performance through transparent use of indices • Improvement over tree structures in previous systems
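A minimal sketch of the "additional columns without breaking old programs" point. It uses Python's built-in sqlite3 as a stand-in for the database; the table name node_load and its columns are hypothetical and do not come from the original system.

```python
import sqlite3

def demo_schema_evolution():
    # In-memory database used only for illustration (assumption: not MiniSQL).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node_load (host TEXT, load_avg REAL)")
    # The index is transparent to queries: the SELECT below never names it.
    conn.execute("CREATE INDEX idx_node_load_host ON node_load (host)")
    conn.execute("INSERT INTO node_load VALUES ('n1', 0.5)")

    # An "old program" that names only the columns it needs.
    old_query = "SELECT host, load_avg FROM node_load WHERE host = ?"
    before = conn.execute(old_query, ("n1",)).fetchall()

    # A new data provider adds a column later.
    conn.execute("ALTER TABLE node_load ADD COLUMN mem_free_mb REAL")
    conn.execute("INSERT INTO node_load VALUES ('n2', 1.5, 2048.0)")

    # The old query still runs unchanged and returns the same rows.
    after = conn.execute(old_query, ("n1",)).fetchall()
    # A new program can read the added column; old rows show NULL (None).
    new = conn.execute(
        "SELECT host, mem_free_mb FROM node_load ORDER BY host"
    ).fetchall()
    conn.close()
    return before, after, new
```

The old query keeps working because it selects columns by name, which is the decoupling between data users and data providers that the slide describes.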

Problem: Failures Occur • Solution: Use timestamps • Loss of periodic updates to timestamps allows remote nodes to detect failures • Timestamps allow weak synchronization between databases • Better availability during failures, simpler recovery • Timestamps allow stale data to be eliminated • Only requires periodic purges rather than relying on programs to clean up after themselves • Reasons 2 & 3 are useful even in normal operation
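The three uses of timestamps can be sketched as follows. This is an illustration under stated assumptions, not the system's code: the function names, the dictionary row layout with a "ts" field, and the "newest timestamp wins" merge rule are all hypothetical.

```python
def find_failed(heartbeats, now, timeout):
    """Failure detection: hosts whose last update is older than `timeout` seconds."""
    return sorted(h for h, ts in heartbeats.items() if now - ts > timeout)

def merge_newest(local, remote):
    """Weak synchronization: per key, keep whichever copy has the newer timestamp.

    Assumption: each value is a dict carrying a "ts" field.
    """
    merged = dict(local)
    for key, row in remote.items():
        if key not in merged or row["ts"] > merged[key]["ts"]:
            merged[key] = row
    return merged

def purge_stale(rows, now, max_age):
    """Stale-data elimination: drop rows older than `max_age` seconds."""
    return [r for r in rows if now - r["ts"] <= max_age]
```

Because the merge only compares timestamps, two databases can exchange rows in any order after a failure and still converge on the newest values.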

Problem: Scalable Data Access • Solution: Hierarchy + efficient protocol • Hierarchy allows • Batching of data from different nodes (all data from routers) • Specialization to particular data (all data on processes) • Efficient protocol (Hybrid of push/pull) • Sink sends (SQL select command, interval, count ) to source • Changed data is extracted via SQL every interval seconds and forwarded to the sink count times • Sink can cancel requests at any time • Achieves the best of pull and push protocols in terms of wasted data transfers, freshness, and network bandwidth
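A rough sketch of the source side of the hybrid protocol, driven by an explicit clock so it can be run deterministically. Several details are assumptions rather than facts from the talk: sqlite3 stands in for the database, change detection uses a ts column and a single "?" placeholder in the query, every extraction round counts against count even when nothing changed, and the class and method names are invented for illustration.

```python
import sqlite3

class Subscription:
    """One sink request: (SQL select command, interval, count)."""
    def __init__(self, select_sql, interval, count, deliver):
        self.select_sql = select_sql  # assumed to contain one "?" for the last-seen time
        self.interval = interval
        self.remaining = count
        self.deliver = deliver        # callback standing in for the network send
        self.next_due = 0.0
        self.last_ts = -1.0
        self.cancelled = False

    def cancel(self):
        # The sink can cancel the request at any time.
        self.cancelled = True

class Source:
    def __init__(self, conn):
        self.conn = conn
        self.subs = []

    def subscribe(self, select_sql, interval, count, deliver):
        sub = Subscription(select_sql, interval, count, deliver)
        self.subs.append(sub)
        return sub

    def tick(self, now):
        """Run every due subscription once; push only rows changed since last round."""
        for sub in self.subs:
            if sub.cancelled or sub.remaining <= 0 or now < sub.next_due:
                continue
            rows = self.conn.execute(sub.select_sql, (sub.last_ts,)).fetchall()
            sub.remaining -= 1
            sub.next_due = now + sub.interval
            sub.last_ts = now  # assumes rows are stamped no later than `now`
            if rows:           # nothing changed: nothing is sent
                sub.deliver(rows)
        self.subs = [s for s in self.subs if not s.cancelled and s.remaining > 0]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (host TEXT PRIMARY KEY, load_avg REAL, ts REAL)")
    received = []
    source = Source(conn)
    sub = source.subscribe(
        "SELECT host, load_avg FROM stats WHERE ts > ?",
        interval=5, count=3, deliver=received.append,
    )
    conn.execute("INSERT INTO stats VALUES ('n1', 0.5, 1.0)")
    source.tick(now=2.0)   # due: new row is pushed
    source.tick(now=4.0)   # not due yet: no query, no transfer
    conn.execute("UPDATE stats SET load_avg = 0.9, ts = 6.0 WHERE host = 'n1'")
    source.tick(now=7.0)   # due: changed row is pushed
    sub.cancel()
    conn.execute("UPDATE stats SET load_avg = 2.0, ts = 11.0 WHERE host = 'n1'")
    source.tick(now=12.0)  # cancelled: nothing is sent
    conn.close()
    return received
```

The sink issues one request (the pull part) and then receives only changed rows at the requested interval (the push part), which is how the scheme avoids both repeated polling requests and unwanted pushes.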

Problem: Scalable Visualization • Solution: Statistical aggregation + use of shade & color to minimize information loss • Aggregate across similar variables (average load of 10 machines); show dispersion (std. dev.) as shade • Aggregate across variables from one node (utilization = max{disk,network,cpu}) • Both forms of aggregation at the same time — hierarchical aggregation • Use color to draw attention to special things (nodes down) to limit visual overload
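The two forms of aggregation, and their combination, can be sketched as below. The use of the population standard deviation, the linear mapping from dispersion to a shade value in [0, 1], and all function names are assumptions made for illustration.

```python
import statistics

def aggregate_group(values):
    """Aggregate one variable across similar machines: (mean, std. dev.)."""
    return statistics.mean(values), statistics.pstdev(values)

def node_utilization(disk, network, cpu):
    """Aggregate several variables from one node: utilization = max{disk, network, cpu}."""
    return max(disk, network, cpu)

def shade_for(std_dev, max_std):
    """Map dispersion to a shade in [0, 1]; the linear scale is hypothetical."""
    if max_std <= 0:
        return 0.0
    return min(std_dev / max_std, 1.0)

def hierarchical_aggregate(nodes):
    """Both forms at once: per-node max first, then mean and dispersion over nodes."""
    utils = [node_utilization(**n) for n in nodes]
    return aggregate_group(utils)
```

For example, two nodes with utilizations 0.5 and 1.0 would be drawn as one cell with value 0.75, and the dispersion of 0.25 would set how strongly the cell is shaded.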

Implementation Architecture • [Figure: hierarchy diagram with components labeled gather, node-level DB, forwarder, joinpush, mid-level DB, top-level DB, javaserver, and Java applet]

Implementation Details • Databases are MiniSQL • Freely available with source code • Implements subset of SQL • Forwarder implements source part of hybrid protocol • Using polling to get data from database • Joinpush implements merging part of hierarchy • Control of merge sources external to the program • Both forwarder & joinpush implemented in threaded C • Simpler implementation for blocking operations • Could be merged in with the database

Implementation Details, cont. • Gather implemented in Perl • Simpler to add new data sources, but would like threading • Somewhat inefficient, might re-implement in C • Javaserver implemented in Perl • Easier to extend with additional aggregation forms • Application-level proxy because Java applets cannot open arbitrary network connections • Javaclient implemented in Java • Allows clients to run in a browser anywhere in the world • Weak feedback to javaserver to control information displayed

Implementation Snapshot

Experience • Configuration information should be in the database • Had it in scattered files; the database collects it in one place • Reset-world operation very important • Puts system in known state • Useful for default destination of statistics of remote database • Minimizes load on monitored nodes • Potentially reduces fault tolerance • Browser user interface very useful • Limitations of Java very obnoxious

Conclusion • Four problems & solutions important for any cluster monitoring system • Evolution inherent in uses of clusters • Independent failures occur in all clusters • Scalability of data management needed for large clusters • Scalability of visualization also needed for large clusters • Implementation works and is initially useful; further deployment needed • Experience identified problems and places for improvement

Future Work • Automatic identification of statistics relevant to problems • Expect to be able to use Boolean disjunction learning algorithms • Tracking of long term trends and statistical measures • Self tuning of specialized databases based on usage • Addition of notification, repair components • Gathering of more statistics (via SNMP for example) • Distribution of system to external sites
