Magpie: Profiling for Performance Analysis of Distributed Systems
Rebecca Isaacs (joint work with Paul Barham)
4 July 2002
What is Magpie?
• A tool for characterising the workload of a distributed system based on detailed observations of system activity
• Online measurements are taken by a set of distributed profiling components
• System resource consumption is accounted to individual requests
  • e.g. CPU, disk accesses and network bandwidth used by an HTTP request in a web server
• Offline processing of the recorded data derives a characterisation of the system workload
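As a rough illustration of accounting resource consumption to individual requests, the sketch below sums tagged observations per request. The record layout `(request_id, resource, amount)`, the resource names and the request identifiers are assumptions made for this example; the slides do not describe Magpie's actual log format.

```python
from collections import defaultdict


def account(observations):
    """Sum resource usage per request from (request_id, resource, amount) records."""
    totals = defaultdict(lambda: defaultdict(int))
    for request_id, resource, amount in observations:
        totals[request_id][resource] += amount
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {rid: dict(usage) for rid, usage in totals.items()}


# Hypothetical observations: the identifiers and amounts are made up.
observations = [
    ("req-1", "cpu_cycles", 1200),
    ("req-2", "cpu_cycles", 300),
    ("req-1", "disk_bytes", 4096),
    ("req-1", "cpu_cycles", 800),
    ("req-2", "net_bytes", 1024),
]
per_request = account(observations)
```

The offline stage would then work from totals like `per_request` rather than from raw events.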
Motivation: Performance Modelling
• Goal is to derive a generative model of the system workload suitable for input to a performance modeller
• Scope (currently) is multi-tier server farms running .NET web sites
• Advantages of Magpie:
  • Acquire a workload description with less human effort than conventional benchmarking
  • Extract a detailed model from a ‘representative’ system
  • Not just a long-term average across all transactions
  • Measure with a realistic mix of transaction types
  • Build a probabilistic model of the usage profile which includes “hidden” transaction types, e.g. error conditions
  • Complex behaviour may not be easily observable manually, e.g. the web transaction type discriminator is not necessarily the URL
Profiling Components (1)
• Windows XP has efficient low-level event tracing built in to the kernel
• Perfinfo is a command-line tool for turning on or off tracing of specific system activities
• Magpie runs perfinfo on both servers (web and SQL) to capture:
  • Context switches
  • File IO
  • Disk IO
  • Network send and receive
  • Process and thread creation and deletion
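To show why context-switch events matter, here is a minimal sketch that turns a context-switch trace into on-CPU time per thread. It assumes a single CPU and an event layout of `(timestamp, old_thread, new_thread)`; both are assumptions for illustration, not the format that perfinfo emits.

```python
from collections import defaultdict


def cpu_time_per_thread(switches, end_time):
    """Sum on-CPU time per thread for one CPU.

    `switches` is a list of (timestamp, old_tid, new_tid) tuples sorted by
    timestamp; `end_time` closes the interval of the last running thread.
    """
    usage = defaultdict(int)
    running = None  # thread currently on the CPU
    since = None    # timestamp at which it was switched in
    for timestamp, _old_tid, new_tid in switches:
        if running is not None:
            usage[running] += timestamp - since
        running, since = new_tid, timestamp
    if running is not None:
        usage[running] += end_time - since
    return dict(usage)


# Hypothetical trace: timestamps in cycles, thread ids are made up.
switch_trace = [(100, 0, 7), (250, 7, 9), (400, 9, 7), (450, 7, 0)]
thread_usage = cpu_time_per_thread(switch_trace, end_time=500)
```

Charging these per-thread totals to requests additionally needs to know which request each thread was serving at the time, which is what the request identifiers recorded by the other components provide.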
Profiling Components (2)
• ISAPI filter
  • DLL loaded into IIS (web server) process
  • Filter registers with IIS to receive particular event notifications
  • Can examine and modify both incoming and outgoing streams of data
• Magpie ISAPI filter
  • Allocates a unique identifier to each incoming request and adds it to the HTTP header
  • Records cycle counter + resource usage at entry and exit
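The tagging idea can be sketched outside IIS. The example below is not an ISAPI filter; it only mimics the two steps named on the slide. The header name `X-Magpie-Request-Id`, the identifier format and the use of `time.perf_counter_ns()` as a stand-in for the cycle counter are all assumptions.

```python
import itertools
import time

REQUEST_ID_HEADER = "X-Magpie-Request-Id"  # hypothetical header name
_next_id = itertools.count(1)
filter_log = []  # (event, request_id, timestamp) records


def on_request_start(headers):
    """Allocate a unique id, add it to the headers, and log the entry time."""
    request_id = f"{next(_next_id):08x}"
    headers[REQUEST_ID_HEADER] = request_id
    filter_log.append(("enter", request_id, time.perf_counter_ns()))
    return request_id


def on_request_end(request_id):
    """Log the exit time for a previously tagged request."""
    filter_log.append(("exit", request_id, time.perf_counter_ns()))


headers = {"Host": "example.test"}
rid = on_request_start(headers)
on_request_end(rid)
```

Because the identifier travels in the header, later components can read it back instead of guessing which request they are serving.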
Profiling Components (3)
• HTTP Module
  • Part of ASP.NET
  • Each request is processed by multiple HTTP modules, e.g. session, authentication etc.
• Magpie HTTP Module
  • Stores request identifier in (managed) thread local state
  • Records cycle counter, managed thread id + resource usage
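Thread-local state is what lets code deeper in the call stack find the current request identifier without it being passed explicitly. The sketch below uses Python's `threading.local` as an analogue of managed thread-local state; the function names are hypothetical.

```python
import threading

_state = threading.local()  # each thread sees its own attributes


def begin_request(request_id):
    """Remember the request id for the calling thread only."""
    _state.request_id = request_id


def current_request_id():
    """Return the calling thread's request id, or None if none is set."""
    return getattr(_state, "request_id", None)


seen = {}


def worker(request_id):
    begin_request(request_id)
    # Any code running later on this thread can recover the id.
    seen[request_id] = current_request_id()


threads = [threading.Thread(target=worker, args=(rid,)) for rid in ("a", "b")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The main thread never called `begin_request`, so `current_request_id()` returns `None` there while each worker sees only its own value.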
Profiling Components (4)
• Common Language Runtime Profiling API
  • Two COM interfaces:
    • Profiler implements the “notifications API”, e.g. function enter/leave, thread mapping, garbage collection
    • Runtime implements an API which allows the profiler to get more information
• Magpie CLR Profiler
  • Monitors CLR → OS thread mappings
  • Records thread ids, cycle counter + resource usage
  • Intercepts JIT compilation of relevant ADO.NET functions
    • Inserts calls to profiling functions
    • Modifies SQL stored procedure invocations
Profiling Components (5)
• SQL Profiler
  • Logs selected events (can be user defined) to table or file
• Magpie SQL Profiling
  • Wraps original stored procedures
  • Runs an extended stored procedure to get cycle counter + resource usage stats before and after executing the original request
  • Generates trace events before and after executing the original request
    • Recorded by the SQL Profiler in the output trace
    • Data includes request identifier, cycle counter + resource usage
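The wrapping pattern can be illustrated with a decorator. This is an analogy in Python rather than SQL Server code: `profiled`, `get_customer` and the trace record layout are hypothetical, and `time.perf_counter_ns()` stands in for the cycle counter.

```python
import functools
import time

sql_trace = []  # (phase, procedure, request_id, timestamp) records


def profiled(proc):
    """Wrap a procedure so a trace event is emitted before and after it runs."""
    @functools.wraps(proc)
    def wrapper(request_id, *args, **kwargs):
        sql_trace.append(("before", proc.__name__, request_id, time.perf_counter_ns()))
        try:
            return proc(*args, **kwargs)
        finally:
            # Emit the closing event even if the procedure raises.
            sql_trace.append(("after", proc.__name__, request_id, time.perf_counter_ns()))
    return wrapper


@profiled
def get_customer(customer_id):
    # Stand-in for the original stored procedure.
    return {"id": customer_id}


row = get_customer("bad0019d", 42)
```

The wrapper takes the request identifier as an extra leading argument, which mirrors the way the modified invocation has to carry the identifier across to the database tier.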
Magpie: Measurement Infrastructure
[Diagram: Client(s) send requests to Web Server(s), which call SQL Server(s). On the web server, inside the w3wp process: ISAPI Filter (tag each request), HTTPModule (store request id in TLS), ASP.Net, Biz Logic, ADO.Net front end with patched IL (modify SQL RPC), CLR / Jit Compiler, CLR Profiler & IL Patcher. On the SQL server: Wrappers (wrap stored procs with profiling), Stored Procs, DBMS, Cache, SQL Profiler. In each kernel, PerfInfo records context switches, disk and file IO, network send and receive. Each profiling component writes a log; observations are ordered by cycle counter.]
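Ordering observations by cycle counter amounts to merging several already-sorted logs. The sketch below assumes each log is a list of `(cycle_counter, source, event)` tuples taken on one machine, so that the counters are directly comparable; how counters are related across machines is not covered by the slides and is not attempted here.

```python
import heapq


def merge_logs(*logs):
    """Merge logs that are each sorted by cycle counter into one ordered list."""
    return list(heapq.merge(*logs, key=lambda record: record[0]))


# Hypothetical per-component logs; counter values and event names are made up.
isapi_log = [(100, "isapi", "enter"), (900, "isapi", "exit")]
clr_log = [(300, "clr", "sql call")]
kernel_log = [(150, "kernel", "context switch"), (500, "kernel", "net send")]

timeline = merge_logs(isapi_log, clr_log, kernel_log)
```

`heapq.merge` reads its inputs lazily, so the same approach works for logs that are too large to sort in memory as a single list.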
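What really happens in a simple request?
[Diagram: the Client requests http://someurl.aspx from the Web Server, which sends a SQL request to the SQL Server; data returns to the Web Server, and a web page returns to the Client.]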
Magpie observations of CPU used by one request
[Figure: per-thread timelines for request bad0019d. IIS threads: ASP.564, IIS.918, IIS.9b4 (time axis labelled 39.65s and 40.15s). SQL threads: SQL.fa4, SQL.f5c (time axis labelled 38.32s and 38.68s). Legend: IIS, ASP.NET, ADO.NET, Other, Ready, Blocked.]
Models of the simple request
[Figure: two panels. "Typically assumed structure": IIS and SQL stages labelled 20%, 80% and 100%. "Actual structure observed by Magpie": IIS and SQL.]
Simulation Case Study
• Compare SEQUENTIAL transaction with PIPELINED transaction:
  • Saturation test with 1000 requests
  • Equal resource demands (22ms comp IIS, 20ms SQL, 3x1k net)
Constructing Models with Machine Learning?
• Learn probabilistic models of resource usage by different request types
• Possibly apply coupled hidden Markov models?
[Figure: state sequences over time for Web and SQL, with states including Process Req, Blocked, Compute, Disk IO, Receive Pkt, Send Pkt and Waiting.]
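For readers unfamiliar with hidden Markov models, the sketch below computes the likelihood of an observation sequence with the forward algorithm for a plain discrete HMM. It is not a coupled HMM, and the states, observation symbols and every probability are invented for illustration; nothing here is taken from Magpie data.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under a discrete HMM using the forward algorithm."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical model: hidden state is the tier doing the work,
# observations are the kind of resource event that was seen.
states = ("web", "sql")
start_p = {"web": 0.8, "sql": 0.2}
trans_p = {
    "web": {"web": 0.6, "sql": 0.4},
    "sql": {"web": 0.5, "sql": 0.5},
}
emit_p = {
    "web": {"cpu": 0.7, "disk": 0.1, "net": 0.2},
    "sql": {"cpu": 0.3, "disk": 0.6, "net": 0.1},
}

likelihood = forward(["cpu", "disk", "net"], states, start_p, trans_p, emit_p)
```

Comparing such likelihoods across models fitted to different request types is one way a recorded event sequence could be classified, under the assumption that each type has its own model.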
Future Work
• Investigate ways of extracting models from the data, esp. machine learning
• Use Magpie to learn parameters in the “live” system in order to calibrate hardware device models (very speculative)
• Explore other types of distributed system, e.g. peer-to-peer