Experimental Lifecycle

Experimental Lifecycle
[Diagram stages: Vague idea; “groping around” experiences; Initial observations; Hypothesis; Model; Experiment; Data, analysis, interpretation; Results & final presentation]
1. Understand the problem, frame the questions, articulate the goals.
• A problem well-stated is half-solved.
• Why, not just what

What can go wrong at this stage?
• Never understanding the problem well enough to crisply articulate the goals / questions / hypothesis.
• Getting invested in some solution before making sure a real problem exists. Getting invested in any desired result. Not being unbiased enough to follow proper methodology.
• Any biases should be working against yourself.
• Fishing expeditions (groping around forever).
• Having no goals but building the apparatus for them first.
• Swiss Army knife of simulators?

Strong Inference (J. Platt)
• Science advances by excluding alternative hypotheses.
• Experiments should be designed to disprove a hypothesis.
• A hypothesis that cannot be falsified doesn’t lead anywhere meaningful.
• Any conclusion that is not an exclusion is insecure.

Steps
Intellectual challenge: to do this efficiently
0. Identify the problem or observed phenomenon
1. Devise alternative hypotheses
2. Devise experiments with alternative outcomes, each of which will exclude one or more hypotheses
3. Carry out the experiments to get a clean result
4. Repeat with subhypotheses

Logical Tree
[Diagram: Problem at the root, branching to Alt 1, …, Alt n; Alt 1 branches further to Alt 1a and Alt 1b]
• Our conclusion X might be invalid if alternative hypothesis 1, alternative hypothesis 2, …, or alternative hypothesis n holds.
• We describe experiments to eliminate alternatives.
• We proceed along the branches not eliminated.
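The logical tree can be sketched as a small data structure. This is only an illustration: the `Hypothesis` class, the `open_leaves` helper, and the choice of which alternatives get eliminated are all hypothetical, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    """One node of the logical tree (hypothetical structure)."""
    name: str
    eliminated: bool = False
    children: List["Hypothesis"] = field(default_factory=list)


def open_leaves(node: Hypothesis) -> List[str]:
    """Names of the leaf hypotheses on branches not yet eliminated."""
    if node.eliminated:
        return []  # an excluded branch is pruned together with its subtree
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in open_leaves(child)]


# Tree shaped like the slide's diagram, with n = 2 for brevity.
alt1a = Hypothesis("Alt 1a")
alt1b = Hypothesis("Alt 1b")
alt1 = Hypothesis("Alt 1", children=[alt1a, alt1b])
alt2 = Hypothesis("Alt 2")
problem = Hypothesis("Problem", children=[alt1, alt2])

# Suppose (purely for illustration) experiments exclude Alt 2 and Alt 1b.
alt2.eliminated = True
alt1b.eliminated = True

remaining = open_leaves(problem)
print(remaining)  # ['Alt 1a']
```

Each experiment marks one or more nodes as eliminated; the next round of experiments targets whatever `open_leaves` still returns.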

Multiple Hypotheses
• One can become emotionally “attached” to a single hypothesis.
• Temptation to demonstrate it is right, to make facts fit the theory.
• Multiple working hypotheses turn research into a competition among ideas rather than among personal agendas.
• Gets at the issue of bias.

“Support Activities” in Science
• Surveys and taxonomy
• Experimental infrastructure development
• Measurements and tables (e.g., file system usage studies)
• Theoretical/abstract models
Useful provided they contribute to the chain of discovery, but not as ends in themselves.

The Question
Apply to one’s own thinking (but also useful in someone else’s talk):
• What experiment could disprove your hypothesis?
or
• What hypothesis does your experiment disprove?

Applying Strong Inference to Computer Systems Research
This has not been our culture:
• “Mine is better than theirs,” with experiments that show this affirmatively (without an honest attempt to show otherwise).
• Non-hypotheses: statements that really can’t be shown to be false, e.g. “This system does what it was designed to do” (true by definition).
• Negative results are a hard sell to publish.
The issue is scientific effectiveness.

A Good Example
Wolman et al., “On the scale and performance of cooperative Web proxy caching,” SOSP ’99
Question: Should multiple proxies cooperate in order to increase client populations, improve hit ratios, and reduce latency?

Logical tree
[Diagram: root “Coop web caching works”, with branches “Decrease object latency, ideal case”, “Increase hit ratio, ideal case”, …, “Increase hit ratio, real case”]

Experiments
• Web traces at UW and Microsoft
• Simulation:
  • Infinite cache size (no capacity misses)
  • Single proxy (sees all information, no overhead)
  • 2 cases:
    • Ideal caching: all documents cached regardless of “cacheability”
    • Respecting cacheability
• Upper bound on performance
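The two simulated cases can be illustrated with a toy trace-driven simulation of a single proxy with an infinite cache. This is a minimal sketch, not the paper's simulator: the trace format (a URL plus a per-request cacheability flag), the `hit_ratio` function, and the six-request trace are all made up for illustration, and document expiration and updates are ignored.

```python
def hit_ratio(trace, respect_cacheability):
    """Hit ratio of one proxy with an infinite cache over a request trace.

    trace: list of (url, cacheable) pairs; this format is an assumption.
    respect_cacheability: if False, every document is cached (ideal case).
    """
    cache = set()  # infinite cache: nothing is ever evicted
    hits = 0
    for url, cacheable in trace:
        if url in cache:
            hits += 1
        elif cacheable or not respect_cacheability:
            cache.add(url)  # first reference is a compulsory miss
    return hits / len(trace) if trace else 0.0


# Hypothetical six-request trace; "b" is marked uncacheable.
trace = [("a", True), ("b", False), ("a", True),
         ("b", False), ("c", True), ("a", True)]

ideal = hit_ratio(trace, respect_cacheability=False)  # 3 hits out of 6
real = hit_ratio(trace, respect_cacheability=True)    # 2 hits out of 6
print(f"ideal={ideal:.2f} real={real:.2f}")
```

Because the cache is infinite and one proxy sees every request, the only misses are first references and uncacheable documents, which is what makes the result an upper bound on what cooperation could achieve.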

Beyond the knee, no significant improvement. Single proxy enough here.

Little impact on latency beyond small populations

Discussion
• What do you think computer scientists are doing wrong?
• Why doesn’t this approach seem natural to us?
• How can we improve?
• Will systems research look significantly different if strong inference can be applied regularly?

Discussion Next Time: Exercise in Strong Inference
• Pick one paper that seems like an important scientific advance and recast its experimental evaluation in terms of hypotheses and experiments to exclude (as a logical tree).

Back of the Envelope (SEESAW)
• Sending: sW
• Receiving: rW
• Listening: iW
• Sleeping: zW
What information do we need to know?
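One way to make the back-of-the-envelope estimate concrete is sketched below. It assumes that sW, rW, iW, and zW are per-state power draws in watts, and that the missing information is the fraction of time a node spends in each state plus its battery capacity. The function names and all numbers are hypothetical placeholders, not SEESAW measurements.

```python
def average_power(power_w, duty):
    """Time-weighted average power draw in watts.

    power_w: per-state power in watts (the slide's s, r, i, z).
    duty: fraction of time spent in each state; must sum to 1.
    """
    if abs(sum(duty.values()) - 1.0) > 1e-9:
        raise ValueError("duty-cycle fractions must sum to 1")
    return sum(power_w[state] * duty[state] for state in power_w)


def lifetime_hours(battery_joules, power_w, duty):
    """Estimated node lifetime: battery energy divided by average power."""
    return battery_joules / average_power(power_w, duty) / 3600.0


# Placeholder values, chosen only to exercise the arithmetic.
power_w = {"send": 0.060, "recv": 0.030, "listen": 0.030, "sleep": 0.00003}
duty = {"send": 0.01, "recv": 0.02, "listen": 0.07, "sleep": 0.90}
battery_joules = 20000.0

print(f"{lifetime_hours(battery_joules, power_w, duty):.0f} hours")
```

Running the same estimate for nodes with different duty cycles shows where energy consumption is unbalanced, which is the kind of opportunity the hypothesis slide refers to.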

Hypothesis (SEESAW)
• An asymmetric MAC protocol can extend network lifetime by balancing energy consumption (battery depletion).
• An asymmetric protocol does not waste energy
  • in control overhead,
  • in message loss and retransmission.
• An asymmetric protocol can be automatically tuned.
  • can be hand-tuned.
  • can be tuned off-line algorithmically.
• An asymmetric protocol has acceptable performance
  • message latency
  • message throughput
• There is opportunity in balancing.

Experimental Lifecycle