High Performance Monitoring WG on Storage Federations December 6, 2012 Andrew Hanushevsky, SLAC http://xrootd.org
Setting The Context • High Performance Monitoring • Collecting real-time information at statistically significant detail without impacting client or server performance that works at scale. • The relevant phrases • Real-time information • Statistically significant • Without impacting performance • At scale
At Scale • 1000s of users • 10,000 or more simultaneous jobs • 100,000 or more active files • Geographically distributed across • Thousands of data servers • Hundreds of millions of files • Hundreds of petabytes of data • Potentially billions of events every second!
Without Impacting Performance • This requires careful collection & reporting • Many trade-offs but generally • Highly encoded data to minimize traffic • Typically implies binary encoding • Offloading information serialization • More on this at the end • Network protocol that is fast and does not block • Typically implies using UDP
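The two trade-offs above (compact binary records and a non-blocking UDP send) can be sketched roughly as follows. The record layout, the `EVENT_READ` code, and all function names are assumptions made for illustration; this is not the xrootd wire format.

```python
import socket
import struct

# Hypothetical record layout (NOT the xrootd wire format), network byte order:
# 1-byte event code, 4-byte file id, 8-byte offset, 4-byte length = 17 bytes.
EVENT_FORMAT = "!BIQI"
EVENT_READ = 1  # hypothetical event code


def encode_event(code, file_id, offset, length):
    """Pack one event into a fixed-size binary record."""
    return struct.pack(EVENT_FORMAT, code, file_id, offset, length)


def decode_event(payload):
    """Unpack a record produced by encode_event."""
    return struct.unpack(EVENT_FORMAT, payload)


def make_sender():
    """Create a UDP socket that never blocks the calling thread."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock


def send_events(sock, collector, events):
    """Send a batch of events as one datagram; drop the batch on any error."""
    packet = b"".join(encode_event(*event) for event in events)
    try:
        sock.sendto(packet, collector)
        return True
    except OSError:
        # Monitoring data is disposable: losing a packet beats stalling the server.
        return False
```

A caller would batch events for one window and hand them to `send_events(sock, (host, port), events)`, where the collector address is whatever the site configures.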
Statistically Significant I • All events need not be 100% time accurate • No need to time-stamp each event • We can’t, as server performance would suffer • So, we can report events in time-windows • Events are statistically post-distributed in the window • Note that events are reported in occurrence order • Any event is disposable • This means we can lose events • Allows use of non-blocking UDP packets for reporting
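As a rough illustration of post-distributing events in a window, a collector could assign approximate timestamps after the fact. The sketch below assumes even spacing across the window, which is only one possible choice; the function name is hypothetical.

```python
def distribute_in_window(window_start, window_end, event_count):
    """Assign evenly spaced approximate timestamps to events in one window.

    Events arrive in occurrence order without individual time-stamps, so the
    i-th event is placed at the midpoint of the i-th equal slice of the window.
    """
    if event_count == 0:
        return []
    step = (window_end - window_start) / event_count
    return [window_start + (i + 0.5) * step for i in range(event_count)]


# Five events reported for a 10-second window starting at t=100.
timestamps = distribute_in_window(100.0, 110.0, 5)
```

The order of the returned timestamps matches the order in which the events were reported, so occurrence order is preserved.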
Statistically Significant II • Statistical significance relies on a large sample • We want the big picture • This is monitoring not accounting! • Build it up using a large number of events • And we can get a large number every second • But we don’t expect to get every event • This helps us achieve high performance • Yet provides a reasonably accurate picture
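A small simulation can illustrate why losing some events still leaves a reasonably accurate picture. It assumes synthetic read sizes and a loss rate of 20% that is independent of the event content; if loss correlated with load or event type, the sample would be biased and this argument would not hold.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simulate_loss(events, loss_rate, rng):
    """Keep each event with probability (1 - loss_rate), independently."""
    return [e for e in events if rng.random() >= loss_rate]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic read sizes in bytes (assumption: uniform between 0 and 1 MB).
read_sizes = [int(rng.random() * 1_000_000) for _ in range(100_000)]

# Drop roughly 20% of the events, as lossy UDP reporting might.
received = simulate_loss(read_sizes, 0.2, rng)

# Compare the statistic computed from the lossy sample with the true value.
relative_error = abs(mean(received) - mean(read_sizes)) / mean(read_sizes)
```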
Real Time Information • Reporting events close to the time they happen • Regulated by the size of the window • Typically a matter of seconds (e.g. 5 or 10, maybe longer) • What information? • Practically anything that might happen... • Logins and logouts • File operations (open, close, remove, etc.) • File I/O (i.e. reads and writes) • Request redirections
A Practical Implementation • xrootd provides a wide range of monitoring data at high performance • Information is broken out into streams • Asynchronous information packets for • Periodic summary data • Summary stream • Low event rate allows for it to be XML based • Real time detail data • F, M, R, T streams • Potentially high event rates necessitate binary format
Why Streams? • Allows one to easily • Group related information together • Independently select the level of detail in each group • Route information to different collectors • These can be specialized for each stream • Control the performance impact of each stream • Streams can be selectively enabled • Makes it easier to handle the raw data
The Summary Stream • Summary data periodically reported • Very large amount of data available • http://xrootd.org/doc/prod/xrd_monitoring.htm • Selectable by category • Centrally collected • Collector merges reporters • Fed into your favorite monitoring system • Ganglia, GRIS, Nagios, MonALISA, etc. • Relatively low amount of traffic – negligible impact
The Real Time Streams • Easily > 50 MB/sec of complex inter-related asynchronous monitoring data • Collector needs to be fast and robust • May need to cross-reference certain streams • Store the data in an easily analyzable format • E.g. MySQL or ROOT files • Condense the information for suitable rendering • Send it to the rendering agent • E.g. via ActiveMQ to the dashboard • High amount of traffic – high impact
The Real Time M Stream • The Map stream • Server, user, and file names mapped to binary IDs • The IDs are used in other streams as backward refs • Allows >100x compression of redundant information • Gross file events • Purges (auto-removals) & stage-ins (auto-transfers) • Client generated event data • Job name, site, and performance data • Selectable detail levels • Typically, less than 1% overhead
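The name-to-ID mapping idea can be sketched as a small dictionary that hands out an integer the first time a name is seen and queues a map record for the collector. The class name, the record shape, and the ID numbering are assumptions for illustration, not the actual M stream format.

```python
class NameMap:
    """Map long names to small integer ids, announcing each name only once."""

    def __init__(self):
        self._ids = {}     # name -> id
        self.pending = []  # (id, name) records still to be sent on the map stream

    def id_for(self, name):
        """Return the id for a name, allocating one on first use."""
        ident = self._ids.get(name)
        if ident is None:
            ident = len(self._ids) + 1
            self._ids[name] = ident
            # Only this first sighting carries the full name; later events
            # in other streams refer back to it by id.
            self.pending.append((ident, name))
        return ident


names = NameMap()
first = names.id_for("/store/data/run1/file1.root")   # hypothetical path
again = names.id_for("/store/data/run1/file1.root")
other = names.id_for("/store/data/run1/file2.root")
```

Every later event for the same file then costs a few bytes for the ID rather than the full path, which is where the savings on redundant information come from.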
The Real Time F Stream • The File stream • Per-file I/O summary information • Bytes read, written vs method used • Sigma values for byte and operation counts • Per-file I/O progress information • Periodic report on bytes transferred • Selectable detail levels • 1 to 3% overhead
The Real Time R Stream • The Redirect stream • Source to destination redirect information • Operation causing the redirect • Generated by any server that redirects clients • No selectable detail levels • Pretty much all of the information is needed • About 1% overhead
The Real Time T Stream • The Trace stream • Per-file I/O information • Offset and bytes read or written for each operation • Identical to a seek trace • Selectable detail levels • 3 to 5% overhead
Back To Offloading • Recall xrootd monitoring is async multi-stream • This means that the collector must time-order the data, as the server does not do this • Each packet has enough information to do this • We do this because serialization is very expensive • Extremely high impact in a multi-threaded application • The hard work is offloaded to another server • Allows the data server to concentrate on delivering user data not monitoring data
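A minimal sketch of the collector-side ordering, assuming each packet header carries a server identifier, the start time of its window, and a per-server sequence number. These field names are hypothetical stand-ins for the information each packet carries, and sequence-number wrap-around is ignored.

```python
from collections import namedtuple

# Hypothetical packet header fields; the payload stays opaque here.
Packet = namedtuple("Packet", "server window_start seq payload")


def time_order(packets):
    """Order packets by window start, then by server and sequence number."""
    return sorted(packets, key=lambda p: (p.window_start, p.server, p.seq))


def missing_sequences(packets, server):
    """List sequence numbers never received from one server (lost datagrams)."""
    seen = sorted(p.seq for p in packets if p.server == server)
    if not seen:
        return []
    present = set(seen)
    return [s for s in range(seen[0], seen[-1] + 1) if s not in present]


# Packets as they might arrive: out of order, with one datagram lost.
arrived = [
    Packet("srvA", 110, 3, b"..."),
    Packet("srvB", 100, 1, b"..."),
    Packet("srvA", 100, 1, b"..."),
]
ordered = time_order(arrived)
lost = missing_sequences(arrived, "srvA")
```

Doing the sort on the collector keeps the data server free of locks or queues that would otherwise serialize its threads around monitoring output.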
Conclusion I • High performance monitoring is hard work • It requires minute attention to detail • Data formats • Work load distribution • Non-blocking internal data structures • Information flow • We estimate that for xrootd it took about four person-years to achieve an extremely low level of server performance impact • Making real-time monitoring practical at scale
Conclusion II • Federations create an extreme scale system • Viewed as a single complex big data system • The outlined information is needed to assess it • Only practical with high performance monitoring • In essence • High performance real-time monitoring is a must to properly track federated storage systems