Fay: Extensible Distributed Tracing from Kernels to Clusters
Úlfar Erlingsson, Google Inc. • Marcus Peinado, Microsoft Research • Simon Peter, Systems Group, ETH Zurich • Mihai Budiu, Microsoft Research
Wouldn't it be nice if…
• We could know what our clusters were doing?
• We could ask any question, easily, using one simple-to-use system
• We could collect answers extremely efficiently, so cheaply we might even ask continuously
Let's imagine…
• Applying data-mining to cluster tracing
• Bag-of-words technique: compare documents without structural knowledge (sketched below)
• N-dimensional feature vectors
• K-means clustering
• Can apply to clusters, too!
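A minimal, self-contained sketch of the bag-of-words step (the example data and names below are hypothetical, not Fay code): each document becomes an N-dimensional frequency vector over a shared vocabulary, and the same trick treats each machine's system-call activity as a "document".

    using System;
    using System.Linq;

    // Bag-of-words sketch: turn each "document" into an N-dimensional
    // frequency vector over a shared vocabulary, ignoring all structure.
    class BagOfWordsSketch
    {
        static void Main()
        {
            // Stand-ins for documents; for Fay, think "system calls per machine".
            var docs = new[] { "read write read open", "open close open open" };

            // Shared vocabulary: every distinct word across all documents.
            var vocab = docs.SelectMany(d => d.Split(' ')).Distinct().ToArray();

            // One feature vector per document: the count of each vocabulary word.
            var vectors = docs
                .Select(d => vocab.Select(w => d.Split(' ').Count(t => t == w))
                                  .ToArray())
                .ToArray();

            foreach (var v in vectors)
                Console.WriteLine(string.Join(", ", v));
        }
    }

These frequency vectors are what K-means then groups into behavioral clusters.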
Cluster-mining with Fay
• Automatically categorize cluster behavior, based on system-call activity
• Without measurable overhead on the execution
• Without any special Fay data-mining support
Fay K-Means Behavior-Analysis Code

    var kernelFunctionFrequencyVectors =
        cluster.Function(kernel, "syscalls!*")
               .Where(evt => evt.time < Now.AddMinutes(3))
               .Select(evt => new { Machine  = fay.MachineID(),
                                    Interval = evt.Cycles / CPS,
                                    Function = evt.CallerAddr })
               .GroupBy(evt => evt,
                        (k, g) => new { key = k, count = g.Count() });

    Vector Nearest(Vector pt, Vectors centers) {
        var near = centers.First();
        foreach (var c in centers)
            if (Norm(pt - c) < Norm(pt - near)) near = c;
        return near;
    }

    Vectors OneKMeansStep(Vectors vs, Vectors cs) {
        return vs.GroupBy(v => Nearest(v, cs))
                 .Select(g => g.Aggregate((x, y) => x + y) / g.Count());
    }

    Vectors KMeans(Vectors vs, Vectors cs, int K) {
        for (int i = 0; i < K; ++i)
            cs = OneKMeansStep(vs, cs);
        return cs;
    }
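The snippets above lean on Vector/Vectors helpers with element-wise arithmetic and a Norm; a minimal sketch of what those could look like (assumed shapes, not Fay's actual types):

    using System;
    using System.Linq;

    // Assumed helper type for the K-means snippets: element-wise + and -,
    // scalar /, and a Euclidean norm. "Vectors" is just IEnumerable<Vector>.
    class Vector
    {
        public double[] E;
        public Vector(double[] e) { E = e; }

        public static Vector operator +(Vector a, Vector b) =>
            new Vector(a.E.Zip(b.E, (x, y) => x + y).ToArray());

        public static Vector operator -(Vector a, Vector b) =>
            new Vector(a.E.Zip(b.E, (x, y) => x - y).ToArray());

        public static Vector operator /(Vector a, int n) =>
            new Vector(a.E.Select(x => x / n).ToArray());
    }

    static class VectorMath
    {
        // Euclidean norm, as used by Nearest() above.
        public static double Norm(Vector v) => Math.Sqrt(v.E.Sum(x => x * x));
    }

With "using static VectorMath;" and an alias like "using Vectors = System.Collections.Generic.IEnumerable<Vector>;", the Nearest/OneKMeansStep/KMeans snippets should compile roughly as written.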
Fay vs. Specialized Tracing
• Could've built a specialized tool for this
  • Automatic categorization of behavior (Fmeter)
• Fay is general, but can efficiently do
  • Tracing across abstractions, systems (Magpie)
  • Predicated and windowed tracing (Streams), as sketched below
  • Probabilistic tracing (Chopstix)
  • Flight recorders, performance counters, …
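For instance, predicated and windowed tracing need no special support; here is a hypothetical query in the same style as the Fay queries elsewhere in this deck (the threshold and window length are made up):

    // Hypothetical Fay-style query: a 2-minute window (time predicate)
    // restricted to slow Read calls (duration predicate), counted per machine.
    var slowReadsPerMachine =
        cluster.Function("iolib!Read")
               .Where(evt => evt.time < Now.AddMinutes(2))   // window
               .Where(evt => evt.Cycles > 1000000)           // predicate
               .GroupBy(evt => fay.MachineID(),
                        (m, g) => new { Machine = m, SlowReads = g.Count() });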
Key Takeaways
Fay: Flexible monitoring of distributed executions
• Can be applied to existing, live Windows servers
• Single query specifies both tracing & analysis
  • Easy to write & enables automatic optimizations
• Pervasively data-parallel, scalable processing
  • Same model within machines & across clusters
• Inline, safe machine-code at tracepoints
  • Allows us to do computation right at the data source
K-Means: Single, Unified Fay Query

    var kernelFunctionFrequencyVectors =
        cluster.Function(kernel, "*")
               .Where(evt => evt.time < Now.AddMinutes(3))
               .Select(evt => new { Machine  = fay.MachineID(),
                                    Interval = evt.Cycles / CPS,
                                    Function = evt.CallerAddr })
               .GroupBy(evt => evt,
                        (k, g) => new { key = k, count = g.Count() });

    Vector Nearest(Vector pt, Vectors centers) {
        var near = centers.First();
        foreach (var c in centers)
            if (Norm(pt - c) < Norm(pt - near)) near = c;
        return near;
    }

    Vectors OneKMeansStep(Vectors vs, Vectors cs) {
        return vs.GroupBy(v => Nearest(v, cs))
                 .Select(g => g.Aggregate((x, y) => x + y) / g.Count());
    }

    Vectors KMeans(Vectors vs, Vectors cs, int K) {
        for (int i = 0; i < K; ++i)
            cs = OneKMeansStep(vs, cs);
        return cs;
    }
Fay is Data-Parallel on Cluster
• View the trace query as a distributed computation; use the cluster itself for the analysis
• System-call trace events: Fay does early aggregation & data reduction, since it knows what's needed for the later analysis
• K-Means analysis: Fay builds an efficient processing plan from the query
Fay is Data-Parallel within Machines
• Early aggregation, inline, in the OS kernel
  • Reduces dataflow & kernel/user transitions
• Data-parallel per core/thread (sketched below)
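A toy illustration of the same idea in plain C# (not Fay's in-kernel implementation; the event stream here is fabricated): keep a small histogram per worker, and merge only those partial results, instead of shipping every event across a boundary.

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    class EarlyAggregationSketch
    {
        static void Main()
        {
            // Fabricated "trace events": one function id per event.
            int[] events = Enumerable.Range(0, 1000000)
                                     .Select(i => i % 64)
                                     .ToArray();

            // One partial histogram per worker (a stand-in for per-core state);
            // only these small dictionaries cross the "boundary", not the events.
            var partials = new ConcurrentBag<Dictionary<int, int>>();
            Parallel.ForEach(Partitioner.Create(0, events.Length), range =>
            {
                var local = new Dictionary<int, int>();
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    local.TryGetValue(events[i], out int n);
                    local[events[i]] = n + 1;
                }
                partials.Add(local);
            });

            // Late, cheap merge of the per-worker partials.
            var totals = partials.SelectMany(d => d)
                                 .GroupBy(kv => kv.Key, kv => kv.Value)
                                 .ToDictionary(g => g.Key, g => g.Sum());

            Console.WriteLine($"{totals.Count} distinct functions counted");
        }
    }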
Processing w/o Fay Optimizations
[Diagram: K-Means pipeline, system calls feeding clustering]
• Collect data first (on disk)
• Reduce later
• Inefficient; can suffer data overload
Traditional Trace Processing
[Diagram: K-Means pipeline, system calls feeding clustering]
• First log all data (a deluge)
• Process later (centrally)
• Compose tools via scripting
Takeaways so far
Fay: Flexible monitoring of distributed executions
• Single query specifies both tracing & analysis
• Pervasively data-parallel, scalable processing
Safety of Fay Tracing Probes
• A variant of XFI used for safety [OSDI'06]
  • Works well in the kernel or any address space
  • Can safely use existing stacks, etc.
  • Used instead of a language interpreter (DTrace)
• Arbitrary, efficient, stateful computation (see the sketch below)
  • Probes can access thread-local/global state
  • Probes can try to read any address
  • I/O registers are protected
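To make the probe contract concrete, here is a hypothetical probe body sketched in C# (real Fay probes are native machine code rewritten and verified by XFI; every name below is an assumption for illustration only):

    using System;

    // Assumed per-thread probe state, purely for illustration.
    class ProbeState { public long Count, Sum; }

    static class ProbeSketch
    {
        [ThreadStatic] static ProbeState state;

        // Sketch of a stateful entry probe: bump thread-local state and
        // attempt a guarded read of the traced function's first argument.
        static void EntryProbe(IntPtr arg1)
        {
            if (state == null) state = new ProbeState();
            state.Count += 1;                     // thread-local state

            if (TryReadInt64(arg1, out long v))   // "try to read any address"
                state.Sum += v;
        }

        // Stand-in for an XFI-guarded load: a failed or disallowed read
        // (e.g., of I/O registers) yields false instead of faulting.
        static bool TryReadInt64(IntPtr addr, out long value)
        {
            value = 0;
            return false; // placeholder only; no real memory probing here
        }
    }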
Key Takeaways, Again
Fay: Flexible monitoring of distributed executions
• Single query specifies both tracing & analysis
• Pervasively data-parallel, scalable processing
• Inline, safe machine-code at tracepoints
Installing and Executing Fay Tracing
• Fay runtime on each machine
• Fay module in each traced address space
• Tracepoints at hotpatched function boundaries
[Diagram: the tracing runtime takes a query and creates probes; trace events flow over ETW from kernel- and user-space targets, where XFI-verified Fay probes run at hotpatched tracepoints (~200 cycles)]
Low-level Code Instrumentation
Module with a traced function Foo:

    Caller:
            ...
            e8 ab 62 ff ff      call Foo
            ...

            ff 15 08 e7 06 00   call [Dispatcher]   ; placed just before Foo
    Foo:    eb f8               jmp  Foo-6          ; patched 1st opcode
            cc cc cc                                ; padding
    Foo2:   57                  push rdi            ; original function body
            ...
            c3                  ret

Fay platform module:

    Dispatcher:
            t = lookup(return_addr)
            ...
            call t.entry_probes      ; XFI-verified Fay probes (PF3, PF4, PF5)
            ...
            call t.Foo2_trampoline
            ...
            call t.return_probes
            ...
            return  /* to after call Foo */

• Replace the 1st opcode of each traced function: Foo's first instruction becomes a short jump to a 6-byte indirect call placed in the padding just before Foo
• That call enters the Fay dispatcher via a trampoline
• The dispatcher calls the entry probes, the original function body (Foo2), and the return probes
What's Fay's Performance & Scalability?
• Fay adds 220 to 430 cycles per traced function
• Fay adds 180% CPU to trace all kernel functions
• Both approx. 10x faster than DTrace, SystemTap
[Chart: null-probe overhead, slowdown (x) vs. cycles]
Fay Scalability on a Cluster
• Fay tracing memory allocations, in a loop:
  • Ran workload on a 128-node, 1024-core cluster
  • Spread work over 128 to 1,280,000 threads
  • 100% CPU utilization
• Fay overhead was 1% to 11% (mean 7.8%)
More Fay Implementation Details
• Details of query-plan optimizations
• Case studies of different tracing strategies
• Examples of using Fay for performance analysis
• Fay is based on LINQ and Windows specifics
  • Could build on Linux using Ftrace, Hadoop, etc.
• Some restrictions apply currently
  • E.g., skew towards batch processing due to Dryad
Conclusion
• Fay: Flexible tracing of distributed executions
• Both expressive and efficient
  • Unified trace queries
  • Pervasive data-parallelism
  • Safe machine-code probe processing
• Often as efficient as purpose-built tools
A Fay Trace Query

    from io in cluster.Function("iolib!Read")
    where io.time < Now.AddMinutes(5)
    let size = io.Arg(2)        // request size in bytes
    group io by size/1024 into g
    select new { sizeInKilobytes = g.Key,
                 countOfReadIOs  = g.Count() };

• Aggregates read activity in the iolib module
• Across the cluster, both user-mode & kernel
• Over 5 minutes
A Fay Trace Query (cont.)
The same query, read as tracing plus aggregation:
• Specifies what to trace
  • 2nd argument of the Read function in iolib
• And how to aggregate
  • Group into KB-size buckets and count