Example: Rumor Performance Evaluation

Andy Wang, CIS 5930-03 Computer Systems Performance Analysis

Motivation • Optimistic peer replication is popular • Intermittent connectivity • Availability of replicas for concurrent updates • Convergence and correctness for updates • Examples: Rumor, Coda, Ficus, Lotus Notes, Outlook Calendar, CVS

Background • Replication provides high availability • Optimistic replication allows immediate access to any replicated item, at the risk of permitting concurrent updates • Reconciliation process makes replicas consistent (for peer-to-peer reconciliation, two replicas at a time)

Background Continued • Conflicts occur when different replicas of the same file are updated subsequent to the previous reconciliation

Optimistic Replication Example • Connected: Log on Desktop (10:00 Update, 10:25 Update) and Log on Portable (10:00 Update, 10:25 Update) are identical • Disconnected: Log on Desktop gains 10:40 Update; Log on Portable gains 10:51 Update

Example Continued • Disconnected: Log on Desktop (10:00 Update, 10:25 Update, 10:40 Update); Log on Portable (10:00 Update, 10:25 Update, 10:51 Update) • Connected • Run reconciliation • Detect a conflict • Propagate updates • After reconciliation, both logs hold 10:00 Update, 10:25 Update, 10:40 Update, 10:51 Update
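The desktop/portable example can be sketched in a few lines of Python. This is only an illustration of the idea, assuming each replica keeps a simple time-stamped update log and that both logs share the same prefix from the previous reconciliation; the function name `reconcile` and the `last_sync_len` parameter are hypothetical, and this is not Rumor's actual mechanism.

```python
def reconcile(desktop_log, portable_log, last_sync_len):
    """Merge two update logs and report whether a conflict occurred.

    Assumption: both logs are identical up to index last_sync_len,
    which marks the state at the previous reconciliation.
    """
    new_on_desktop = desktop_log[last_sync_len:]
    new_on_portable = portable_log[last_sync_len:]

    # A conflict exists when BOTH replicas were updated since the last sync.
    conflict = bool(new_on_desktop) and bool(new_on_portable)

    # Propagate updates: both replicas end up with the union, in time order.
    merged = desktop_log[:last_sync_len] + sorted(new_on_desktop + new_on_portable)
    return merged, conflict


desktop = ["10:00", "10:25", "10:40"]
portable = ["10:00", "10:25", "10:51"]
merged, conflict = reconcile(desktop, portable, last_sync_len=2)
print(merged, conflict)
```

Both replicas would then hold the merged log; whether the conflict can be resolved automatically or needs the user depends on the file type and is outside this sketch.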

Goal • Understand the cost characteristics of the reconciliation process for Rumor

Services • Reconciliation • Exchange file system states • Detect new and conflicting versions • If possible, automatically resolve conflicts • Else, prompt user to resolve conflicts • Propagate updates

Outcomes • Two reconciled replicas become consistent for all files and directories • Some files remain inconsistent and require user to resolve conflicts

Metrics • Time • Elapsed time • From the beginning to the completion of a reconciliation request • User time (time spent using CPU) • System time (time spent in the kernel) • Failure rate • Number of incomplete reconciliations and infinite loops (none observed)
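The three time metrics can be collected around any command roughly as follows. This is a minimal sketch, not the monitor used in the study: it assumes a Unix-like system (where `os.times()` reports child CPU times), and the helper name `time_command` and the stand-in child command are hypothetical.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion; return (elapsed, user, system) in seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # waits for the child to finish
    elapsed = time.perf_counter() - start
    after = os.times()
    # CPU time charged to waited-for children: user mode vs. kernel mode.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return elapsed, user, system


# Stand-in workload; a real run would invoke the reconciliation command here.
elapsed, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
print(f"elapsed={elapsed:.3f}s user={user:.3f}s system={system:.3f}s")
```

Elapsed time includes waiting on disk and network, so it is normally at least as large as user plus system time for a single-threaded process.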

Metrics Not Measured • Disk access time • Requires complex instrumentation • E.g., buffering, logging, etc. • Network and memory resources • Not heavily used • Correctness • Difficult to evaluate

Monitor Implementation • Top-level Perl time command • Figure: Reconciliation Process, with components labeled Perl library, Spool-to-dump, Recon, Spool-to-dump, C++, Scanner, Rfindstored, Rrecon, Server

Parameters • System parameters • CPU (speed of local and remote servers) • Disk (bandwidth, fragmentation level) • Network (type, bandwidth, reliability) • Memory (size, caching effects, speed) • Operating system (type, version, VM management, etc.)

Parameters (Continued) • Workload parameters • Number of replicas • Number of files and directories • Number of conflicts and updates • Size of volumes (file size)

Workloads • Update characteristics extracted from Geoff Kuenning’s traces

Experimental Settings • Machine model: Dell Latitude XP • CPU: x486 100 MHz • RAM: 36 MB • Ethernet: 10 Mb/s • Operating system: Linux 2.0.x • File system: ext3

Experimental Settings (Continued) • Should have documented the following as well • CPU: L1 and L2 cache sizes • RAM: brand and type • Disk: brand, model, capacity, RPM, and the size of on-disk cache • File system version

Experimental Design • 2^5 full factorial design • Linear regression or multivariate linear regression to model major factors • Target: 95% confidence interval

2^5 Full Factorial Design • Number of replicas: 2 and 6 • Number of files: 10 and 1,000 • File size: 100 and 22,000 bytes • Number of directories: 10 and 100 • Number of updates: 10 and 450 • Capped at 10 updates for 10 files • Number of conflicts: 0 /* typical */
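The run list for the five two-level factors above can be enumerated mechanically. The sketch below uses the levels from the slide; the dictionary keys and the function name `full_factorial` are hypothetical labels chosen for illustration.

```python
from itertools import product

# Factor levels as listed on the slide (low, high).
LEVELS = {
    "replicas": (2, 6),
    "files": (10, 1000),
    "file_size_bytes": (100, 22000),
    "directories": (10, 100),
    "updates": (10, 450),
}


def full_factorial(levels):
    """Return one dict per run, covering every combination of levels."""
    names = list(levels)
    runs = []
    for combo in product(*(levels[name] for name in names)):
        run = dict(zip(names, combo))
        # Slide rule: with only 10 files, updates are capped at 10.
        if run["files"] == 10:
            run["updates"] = 10
        run["conflicts"] = 0  # held constant at the typical value
        runs.append(run)
    return runs


runs = full_factorial(LEVELS)
print(len(runs))  # 2**5 combinations
for run in runs[:3]:
    print(run)
```

Because of the cap, the two update levels coincide whenever the file count is 10, so the update effect in those runs is estimated from identical settings.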

2^5 Full Factorial Analysis • Experiment errors < 3%

Variation of Effects • All major effects significant at 95% confidence interval
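For readers unfamiliar with how effects and their share of variation are obtained in a two-level design, here is a small sketch of the sign-table method for an unreplicated 2^k design. The function name `effects_2k` and the 2^2 response values are hypothetical, made up only to keep the arithmetic easy to follow; they are not measurements from the Rumor study.

```python
from itertools import combinations, product
from math import prod


def effects_2k(factor_names, responses):
    """Sign-table analysis of an unreplicated 2^k design.

    responses maps each tuple of coded levels (-1 or +1 per factor)
    to the observed response for that run.
    """
    k = len(factor_names)
    runs = list(product((-1, 1), repeat=k))
    n = len(runs)
    effects = {}
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            label = "*".join(factor_names[i] for i in idx)
            # Effect = (sign column . response column) / 2^k
            effects[label] = sum(
                prod(run[i] for i in idx) * responses[run] for run in runs
            ) / n
    # Total variation SST = 2^k * sum of squared effects (mean excluded).
    sst = n * sum(q * q for q in effects.values())
    variation = {label: n * q * q / sst for label, q in effects.items()}
    return effects, variation


# Hypothetical 2^2 example: times in seconds for (files, replicas) levels.
responses = {(-1, -1): 5.0, (-1, 1): 7.0, (1, -1): 13.0, (1, 1): 15.0}
effects, variation = effects_2k(["files", "replicas"], responses)
print(effects)
print(variation)
```

In this made-up data the file factor accounts for most of the variation, which mirrors the kind of dominance the slides report for the number of files. The sketch divides by SST, so it assumes the responses are not all identical.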

Residuals vs. Predicted Time • Clusters caused by dominating effects of files

Residuals vs. Experiment Numbers • Residuals are nearly homoscedastic

Quantile-Quantile Plot • Residuals are approximately normally distributed
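The points of a normal quantile-quantile plot can be computed without a plotting library. The sketch below pairs each sorted residual with the standard normal quantile at probability (i - 0.5)/n, one common plotting-position convention; the function name `qq_points` and the residual values are hypothetical.

```python
from statistics import NormalDist


def qq_points(residuals):
    """Return (theoretical normal quantile, observed residual) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals, for illustration only.
points = qq_points([0.3, -1.2, 0.8, -0.1, 0.2])
for theo, obs in points:
    print(f"{theo:+.3f}  {obs:+.3f}")
```

If the residuals are roughly normal, the pairs fall close to a straight line; curvature or distinct segments suggest a non-normal or mixed distribution.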

Multivariate Regression • Number of replicas: 2 • Number of files: 4 levels, 10-600 • File size: 22,000 bytes • Number of directories: 4 levels, 10-60 • Number of updates: 0 • Number of conflicts: 0 /* typical */ • Number of repetitions: 5 per data point

Multivariate Regression • Experiment errors < 7% • All coefficients are significant
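A multivariate linear model of the form time = b0 + b1 * files + b2 * directories can be fitted by ordinary least squares. The sketch below solves the normal equations in plain Python; the helper names, the intermediate factor levels, and the synthetic response are all hypothetical (the slides give only the ranges 10-600 and 10-60), so the numbers illustrate the method and say nothing about Rumor.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(xs, ys):
    """Least-squares fit; xs is a list of predictor tuples. Returns [b0, b1, ...]."""
    rows = [(1.0, *x) for x in xs]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical 4 x 4 grid of (files, directories) and a synthetic response.
file_levels = (10, 200, 400, 600)
dir_levels = (10, 25, 45, 60)
xs = [(f, d) for f in file_levels for d in dir_levels]
ys = [2.0 + 0.05 * f + 0.1 * d for f, d in xs]
b0, b1, b2 = fit_linear(xs, ys)
print(b0, b1, b2)
```

With real measurements, each data point would be the mean of its repetitions, and the residuals from this fit are what the diagnostic plots examine.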

Residuals vs. Predicted Time • Elapsed time shows a bimodal trend • User time shows an exponential trend

Residuals vs. Experiment Numbers • Not so good for elapsed time and user time

Quantile-Quantile Plot • Residuals are not normally distributed for elapsed time and user time

Log Transform (User Time) • ANOVA tests failed miserably

Residual Analyses (User Time) • No indications that transforms can help…

Possible Explanations • i-node related factors • Number of files per directory block • Crossing block boundary may cause anomalies • Caching effects • Reboot needed across experiments

Linear Regression • Number of files: 100, 150, 200, 250, 252, 253, 300, 350, 400, 450 • Test for the boundary-crossing condition as the number of files exceeds one block • Note that Rumor has hidden files • Number of repetitions: 5 per data point • Flush cache (reboot) before each run

Linear Regression • R^2 > 80% • All coefficients are significant
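The coefficient of determination for a one-factor model, time = b0 + b1 * files, can be computed as shown below. The file counts are the levels listed for this experiment; the times are hypothetical values with an artificial jump after 252 files, invented purely to show how a step in the data can still yield a fairly high R^2. The function name `simple_regression` is also hypothetical.

```python
def simple_regression(xs, ys):
    """Least-squares line through (xs, ys); returns (b0, b1, r_squared)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total
    return b0, b1, 1.0 - sse / sst


files = [100, 150, 200, 250, 252, 253, 300, 350, 400, 450]
# Hypothetical elapsed times (seconds), with a step after 252 files.
times = [12.0, 17.0, 22.0, 27.0, 27.2, 35.3, 40.0, 45.0, 50.0, 55.0]
b0, b1, r2 = simple_regression(files, times)
print(f"b0={b0:.3f} b1={b1:.4f} R^2={r2:.3f}")
```

A high R^2 alone does not validate the model; a step like the one in this made-up data would still show up as two clusters in the residual plots.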

Residuals vs. Predicted Time • Elapsed time shows a bimodal trend • User time shows an exponential trend

Residuals vs. Experiment Numbers • Elapsed time shows a rising bimodal trend • Randomization of experiments may help

Quantile-Quantile Plot • Residuals for elapsed time are not normally distributed • Perhaps piecewise normal

Possible Explanations • i-node related factors: No • Caching effects: No • Hidden factors: Maybe • Bugs: Maybe

Conclusion • Identified the number of files as the dominating factor for Rumor running time • Observed the existence of an unknown factor in the Rumor performance model
