Monkey See, Monkey Do: A Tool for TCP Tracing and Replaying
Yu-Chung Cheng, Stefan Savage, Geoffrey M. Voelker
Dept. of Computer Science & Engineering, University of California, San Diego
Urs Hölzle, Neal Cardwell
USENIX '04
Motivation
• Server benchmarks: the “what if” question
  • Ex: how will changing the server in some way impact the client experience?
• Current solutions
  • Synthetic workload testing: risk-free, convenient, not very realistic
  • Real user testing: realistic, risky, more hassle
• Best of both worlds?
  • Realistic, convenient and risk-free
Goal
• Develop a tool, Monkey, that provides the best of both worlds
  • Monkey See: captures real traces, infers network and client parameters
  • Monkey Do: replays traces via emulation
• Applied Monkey to Google workload
  • Captured Google search requests
  • Replayed and evaluated against Google servers
• Evaluation
  • How realistic is Monkey’s replay?
  • Can Monkey answer the “what if” questions?
Monkey Design
[Figure: Monkey See captures a trace between a real client (132.239.1.1) and the Google web server and extracts trace info. Monkey Do replays it over a LAN through a client emulator, a network emulator, and a server emulator on the Google server kernel, with address remapping between 132.239.1.1 and 192.168.0.1.]
• Example of extracted trace info
  • HTTP GET /search?q=Monkey
  • TCP delayed ACK policy, receiver buffer size
  • Client/server: delay 90 ms, 1% loss, 384 kbps, MTU 1500
  • Server/client: delay 90 ms, 0% loss, 128 kbps, MTU 1500
Monkey See
• Goal: capture Google search traces and infer client/network parameters
• Trace capture in front of server farm
  • Random connection sampling
• Extracting info for each TCP connection
  • Search query from HTTP header
  • Path delays from RTT
  • Bandwidth from ACK spacing
  • Loss rate by counting retransmissions
  • Path MTU from TCP MSS option
  • Use of TCP delayed ACKs by observing ACK sequences
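The per-connection extraction above can be sketched roughly as follows. This is a minimal illustration, not Monkey's actual code: the `Packet` record and `infer_params` function are hypothetical names, the RTT is taken from the handshake as seen at the server, the one-way delay is assumed to be half the RTT, and the MTU is assumed to be the MSS plus 40 bytes of IPv4 and TCP headers without options.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    """Hypothetical, simplified record of one captured TCP packet."""
    ts: float            # capture timestamp in seconds
    from_client: bool    # True if sent by the client
    seq: int = 0         # sequence number of the first payload byte
    ack: int = 0         # cumulative ACK number
    length: int = 0      # payload bytes
    syn: bool = False
    mss: int = 0         # MSS option value (SYN packets only)


def infer_params(pkts: List[Packet]) -> dict:
    client_syn = next(p for p in pkts if p.from_client and p.syn)
    synack = next(p for p in pkts if not p.from_client and p.syn)
    hs_ack = next(p for p in pkts
                  if p.from_client and not p.syn and p.ts > synack.ts)
    rtt = hs_ack.ts - synack.ts  # handshake RTT seen in front of the server

    # Loss rate: server data segments whose sequence number was already seen.
    data = [p for p in pkts if not p.from_client and p.length > 0]
    seen, retx = set(), 0
    for p in data:
        if p.seq in seen:
            retx += 1
        seen.add(p.seq)
    loss = retx / len(data) if data else 0.0

    # Bandwidth: bytes newly acknowledged over the time spanned by the ACKs.
    first_seq = min((p.seq for p in data), default=0)
    advancing: List[Packet] = []
    for p in pkts:
        if p.from_client and not p.syn and p.length == 0 and p.ack > first_seq:
            if not advancing or p.ack > advancing[-1].ack:
                advancing.append(p)
    bw_bps: Optional[float] = None
    if len(advancing) >= 2 and advancing[-1].ts > advancing[0].ts:
        acked = advancing[-1].ack - advancing[0].ack
        bw_bps = 8 * acked / (advancing[-1].ts - advancing[0].ts)

    return {
        "rtt_s": rtt,
        "one_way_delay_s": rtt / 2,   # assumption: symmetric path
        "loss_rate": loss,
        "bandwidth_bps": bw_bps,
        "mtu": client_syn.mss + 40,   # assumption: IPv4, no header options
    }
```

A real implementation would also need to handle sequence-number wraparound, reordering, and delayed ACKs, which this sketch ignores.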
Assumptions
[Figure: client, Google web server, and Google DB, with TCP data packets and TCP ACK packets flowing between client and web server.]
1. Congestion/loss only happens on the data path, not on the ACK path
2. No congestion in the Google intranet
3. Well-provisioned Google web server
Monkey Do
• Goal: replay traces by emulating the client and network environment
• Network emulator (dummynet)
  • Sets up virtual dummynet pipes as data and ACK paths
• Client emulator
  • Establishes connections to the server and sends out search queries
• Server emulator
  • Emulates the effect of caching on DB search delay
  • Runs on the same kernel (Linux) as the real Google web daemon
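As an illustration of the network emulator step, the sketch below builds `ipfw` commands for one dummynet pipe per direction. It is not Monkey's own setup: `PathParams` and `pipe_commands` are hypothetical helpers, the two addresses are placeholders, and the parameter values are the example numbers from the design slide, assuming the lossy 384 kbps pipe is the server-to-client data path and the 128 kbps pipe is the ACK path.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathParams:
    """Hypothetical container for one direction of an emulated path."""
    delay_ms: int
    loss: float      # packet loss probability, 0.0 to 1.0
    bw_kbps: int


def pipe_commands(pipe_id: int, src: str, dst: str, p: PathParams) -> List[str]:
    """Return the ipfw commands that shape traffic from src to dst."""
    cfg = f"ipfw pipe {pipe_id} config bw {p.bw_kbps}Kbit/s delay {p.delay_ms}ms"
    if p.loss > 0:
        cfg += f" plr {p.loss}"  # plr = packet loss rate
    rule = f"ipfw add pipe {pipe_id} ip from {src} to {dst}"
    return [cfg, rule]


# Placeholder addresses for the emulated client and the server under test.
CLIENT, SERVER = "192.168.0.1", "192.168.0.2"

# Assumption: data path is server -> client, ACK path is client -> server.
data_path = PathParams(delay_ms=90, loss=0.01, bw_kbps=384)
ack_path = PathParams(delay_ms=90, loss=0.0, bw_kbps=128)

commands = (pipe_commands(1, SERVER, CLIENT, data_path)
            + pipe_commands(2, CLIENT, SERVER, ack_path))
```

The commands are only generated as strings here. Running them requires root privileges on a host with dummynet enabled.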
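Replay in action: real Google search
[Figure: timeline between client, web server, and DB server. The client calls connect() and completes the TCP handshake, then sends the HTTP GET. The web server calls write(HEADER) to send the static header (HTTP 200 OK), waits for the DB query delay, then calls write(RESULT) to send the search result.]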
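Replay in action: replay version
[Figure: timeline between client emulator and server emulator. The client emulator calls connect() and completes the TCP handshake, then sends the HTTP GET. The server emulator calls write(HEADER) to send the static header (HTTP 200 OK), waits for the DB query delay, then calls write(RESULT) to send the search result.]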
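The replay timeline can be sketched with plain sockets as below. This is an illustrative toy, not the Monkey emulators: `serve_once`, `replay_query`, and `run_demo` are hypothetical names, the header and result bytes are made up, and the DB query delay is a fixed sleep rather than a value taken from a trace. The elapsed time is measured at the client here, whereas the talk's metric is measured at the server.

```python
import socket
import threading
import time

# Made-up static header; the slides only say it is an HTTP 200 OK.
HEADER = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"


def serve_once(listener: socket.socket, db_delay_s: float, result: bytes) -> None:
    """Server emulator: header, emulated DB query delay, then the result."""
    conn, _ = listener.accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:    # read until end of HTTP GET
            chunk = conn.recv(4096)
            if not chunk:
                return
            request += chunk
        conn.sendall(HEADER)                 # write(HEADER)
        time.sleep(db_delay_s)               # DB query delay
        conn.sendall(result)                 # write(RESULT)


def replay_query(addr, query: str):
    """Client emulator: connect(), send the search query, read the reply."""
    start = time.monotonic()
    with socket.create_connection(addr) as s:
        s.sendall(f"GET /search?q={query} HTTP/1.0\r\n\r\n".encode())
        body = b""
        while True:
            chunk = s.recv(4096)
            if not chunk:                    # server closed the connection
                break
            body += chunk
    return body, time.monotonic() - start


def run_demo(db_delay_s: float = 0.05):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # any free local port
    listener.listen(1)
    result = b"<html>results for Monkey</html>"
    server = threading.Thread(target=serve_once,
                              args=(listener, db_delay_s, result))
    server.start()
    body, elapsed = replay_query(listener.getsockname(), "Monkey")
    server.join()
    listener.close()
    return body, elapsed
```

Calling `run_demo()` runs one emulated query over the loopback interface and returns the reply bytes with the client-side elapsed time.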
Evaluation
• Replay Validation
  • How well does Monkey reproduce the workload?
• Predictive Replay
  • How well does Monkey predict?
• Metric: Google search response time
  • Time from when the server receives the request until it finishes sending the result
• Trace taken on a weekday in November 2003
Replay Validation Experiment
• How well does Monkey reproduce the workload?
• Replay the workload without any changes and compare the results
[Figure: Monkey See measures Torig with real clients against the Google web daemon on the Google server kernel. Monkey Do measures Treplay with the Monkey client/network emulator against the Monkey server emulator on the Google server kernel.]
Replay Validation Result
[Figure: CDF of relative error in search response time per connection. x-axis: relative error of search response time [ms]; y-axis: % connections.]
• 86% of connections have a response time within ±20% error
• Monkey tends to underestimate for long-RTT connections
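A validation summary of this kind could be computed along the following lines. The slides do not define relative error, so this sketch assumes it is (Treplay − Torig) / Torig per connection. The function names are hypothetical and the sample times are invented for illustration, not taken from the talk.

```python
from typing import List, Tuple


def relative_errors(t_orig: List[float], t_replay: List[float]) -> List[float]:
    """Per-connection relative error; negative means the replay was faster."""
    return [(r - o) / o for o, r in zip(t_orig, t_replay)]


def fraction_within(errors: List[float], bound: float = 0.20) -> float:
    """Fraction of connections whose relative error is within +/- bound."""
    return sum(1 for e in errors if abs(e) <= bound) / len(errors)


def empirical_cdf(values: List[float]) -> List[Tuple[float, float]]:
    """Sorted (value, cumulative fraction) pairs, ready for plotting."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Invented response times in milliseconds for four connections.
t_orig = [100.0, 200.0, 400.0, 500.0]
t_replay = [110.0, 150.0, 420.0, 500.0]

errors = relative_errors(t_orig, t_replay)
within_20 = fraction_within(errors)   # share of connections within +/-20%
cdf = empirical_cdf(errors)
```

With this sign convention, an underestimate by the replay, as reported for long-RTT connections, shows up as a negative error.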
Predictive Replay
• Is Monkey’s replay predictive?
  • Compare replay result with result from real server changes
• Experiment
  • Search response time vs. TCP initial congestion window size (cwin)
[Figure: two Monkey See captures of real clients against the Google web daemon, at 13:30-14:15 and 14:16-15:06, on Google server kernels with cwin=1 and cwin=3, giving Tw1 and Tw3. Monkey Do replays through the Monkey client/network emulator against the Monkey server emulator on a Google server kernel with cwin=3, giving Treplay.]
Predictive Replay Result
[Figure: CDF of search response times for Tw3, Tw1, and Treplay. x-axis: search response time [ms]; y-axis: % connections.]
Discussion
• Monkey can replay Google search traces well, but …
• Can Monkey replay non-Google web traffic, or any TCP traffic?
  • Possible, if the assumptions hold
  • Caveats: dependencies in disk/network I/O events, multiple TCP connections
• Is a server emulator required?
  • No, if the server performance distribution is “memoryless”. Otherwise, yes.
Conclusion
• Monkey is a new tool for server benchmarking (“what if” questions)
  • Currently domain-specific
  • Accurate for short flows
• Best of both worlds
  • Realism of real user testing
  • Safety and convenience of synthetic tools
• It is possible to build such tools for specific domains
Q & A
• Questions?
http://ramp.ucsd.edu/projects/monkey/