
Experimentation in Computer Systems Research

This article discusses the importance of experimentation in computer systems research and provides a systematic approach for conducting experiments. It covers topics such as problem framing, selecting metrics and parameters, choosing measurement or simulation techniques, data analysis, and workload characterization. The article also highlights common mistakes to avoid and offers insights into the range of experimental systems research at Duke University.

gcampos

Presentation Transcript


  1. Experimentation in Computer Systems Research • Why: “It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are – if it doesn’t agree with the experiment, it’s wrong.” R. Feynman

  2. Why? W. Tichy in “Should Computer Scientists Experiment More?” argues for experimentation. • Model / theory testing As Feynman’s quotation suggests, an experiment can identify flaws in a theory (e.g. underlying assumptions that are violated by reality) • Exploration Go where no system has gone before (i.e., open whole new areas for investigation) Demonstrate / identify the importance of a potential new research problem

  3. Experimental Lifecycle • Stages: vague idea, “groping around” experiences, initial observations, hypothesis, model, experiment, data / analysis / interpretation, results & final presentation • Evidence of real problem, justification, opportunity, feasibility, understanding • Boundary of system under test, workload & system parameters that affect behavior • Questions that test the model • Metrics to answer questions, factors to vary, levels of factors

  4. Common Practice • Stages: vague idea, model, experiment, data / analysis / interpretation, results & final presentation • No preliminary investigation • No hypothesis precisely articulated • No iteration

  5. Lots of Ways to Attack Experimentation • Not general – only applies to the “system under test”. • Not forward-looking – motivations and observations based on the past, not the future. • Lack of representative workloads – inadequate benchmarks. • No culture of independent replication of others’ experiments. • Real data can be messy. • Learn to do it “right”

  6. A Systematic Approach • Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved. • Must remain objective • Be able to answer “why” as well as “what” • Select metrics that will help answer the questions. • Identify the parameters that affect behavior • System parameters (e.g., HW config) • Workload parameters (e.g., user request patterns) • Decide which parameters to study (vary).
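One way to make "decide which parameters to study" concrete is to list each factor with its levels and enumerate the full factorial design. The sketch below is only illustrative: the factor names and levels (`cache_mb`, `num_disks`, `request_rate`, `read_fraction`) are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
system_factors = {"cache_mb": [64, 256], "num_disks": [1, 4]}
workload_factors = {"request_rate": [100, 1000], "read_fraction": [0.5, 0.9]}


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# Merge system and workload factors into a single design.
design = full_factorial({**system_factors, **workload_factors})
print(len(design), "parameter combinations")
for config in design[:3]:
    print(config)
```

The number of combinations is the product of the level counts, so it grows quickly as factors are added. That growth is one reason to fix some parameters and vary only the ones the questions depend on.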

  7. A Systematic Approach • Select technique: • Measurement of prototype implementation: How invasive? Can we quantify interference of monitoring? Can we directly measure what we want? • Simulation – how detailed? Validated against what? • Repeatability • Select workload • Representative? • Community acceptance • Availability
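The question "can we quantify interference of monitoring?" can be approached, in the simplest case, by measuring what the instrumentation itself costs. The following is a rough sketch, assuming the instrumentation consists of clock reads with Python's `time.perf_counter_ns`; the helper name `timer_overhead_ns` is made up for this example.

```python
import time


def timer_overhead_ns(samples=100_000):
    """Estimate the mean cost of one clock read, in nanoseconds.

    The estimate also includes loop overhead, so it is an upper bound
    on the cost of the clock read itself.
    """
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    end = time.perf_counter_ns()
    return (end - start) / samples


overhead = timer_overhead_ns()
print(f"approx. {overhead:.1f} ns per timestamp")
```

If the events being timed take only a few multiples of this overhead, the monitoring is likely to perturb the result and a less invasive technique may be needed.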

  8. A Systematic Approach • Run experiments • How many trials? How many combinations of parameter settings? • Sensitivity analysis on other parameter values. • Analyze and interpret data • Statistics, dealing with variability, outliers • Data presentation • Where does it lead us next? • New hypotheses, new questions, a new round of experiments
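For "how many trials?" and "dealing with variability", a common starting point is to report the mean of repeated trials together with a confidence interval. The sketch below assumes independent, roughly normally distributed trial results; the sample latencies are made up, and `mean_ci95` is a hypothetical helper, not something from the course.

```python
import math
import statistics

# Two-sided 95% Student-t critical values for 1..10 degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci95(samples):
    """Return (mean, half_width) of a 95% confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials")
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error
    # Assumption: beyond the table, fall back to the normal quantile,
    # which gives a slightly too narrow interval for moderate n.
    crit = T_95.get(n - 1, statistics.NormalDist().inv_cdf(0.975))
    return mean, crit * sem


latencies_ms = [10.0, 12.0, 11.0, 13.0, 9.0]  # made-up trial results
mean, half_width = mean_ci95(latencies_ms)
print(f"{mean:.2f} +/- {half_width:.2f} ms (95% CI, n={len(latencies_ms)})")
```

If the interval is too wide to answer the question, that is a signal to run more trials or to look for the source of the variability.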

  9. Summary of Topics • Choosing measurement or simulation techniques • Metrics • Workload selection: standard benchmark suites, micro benchmarks, synthetic benchmarks, representativeness • Monitoring: instrumentation techniques, timing issues, data collection, intrusiveness • Data analysis / statistics: misleading with data • Workload characterization • Workload generators • Experimental design • Data presentation • Different kinds of simulators: event driven, trace driven, execution driven • Validation
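To give a feel for the simulator kinds listed above, here is a minimal sketch of an event-driven simulation whose input is a trace. It models a single FIFO server, which is an assumption made for this example; the trace values and the function name `simulate_fifo_server` are invented.

```python
import heapq
from collections import deque

ARRIVAL, DEPARTURE = 0, 1  # arrivals sort before departures at equal times


def simulate_fifo_server(trace):
    """trace: list of (arrival_time, service_time) pairs.

    Returns each job's response time (departure minus arrival).
    """
    events = [(arrival, ARRIVAL, job) for job, (arrival, _) in enumerate(trace)]
    heapq.heapify(events)
    waiting = deque()
    busy = False
    response = [None] * len(trace)
    while events:
        now, kind, job = heapq.heappop(events)  # next event in time order
        if kind == DEPARTURE:
            response[job] = now - trace[job][0]
            busy = False
        else:
            waiting.append(job)
        if not busy and waiting:
            # Start the next queued job and schedule its departure.
            nxt = waiting.popleft()
            busy = True
            heapq.heappush(events, (now + trace[nxt][1], DEPARTURE, nxt))
    return response


trace = [(0.0, 2.0), (1.0, 2.0), (5.0, 1.0)]  # made-up (arrival, service) times
print(simulate_fifo_server(trace))
```

The simulated clock jumps from one event to the next instead of advancing in fixed steps, which is the defining feature of the event-driven style. Validation would mean comparing such output against measurements of a real system.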

  10. Range of Experimental Systems Research at Duke • In support of • Benchmarking (e.g., Fstress) • Workload characterization / tools (e.g., Cprof, Trickle-down) • Simulators (e.g., Modelnet) • Testbeds (e.g., PlanetLab) • Used in evaluation of systems • Almost everything else we do here.

  11. How Course Will Work • Webpage //www.cs.duke.edu/courses/spring03/cps296.6 • Approximately 40 pages per week in Jain • Reading list – papers from the literature (coming soon) • Course project (leverage whatever experimental projects you have to do for thesis, 2nd year project, coursework). • Systematic approach, experimental design decisions made explicitly and justified • Mini-conference during exam week. • 2 exams over readings • Class sessions: 40 minutes lecture, discussion • Assigned expertise

  12. Discussion • Introduce yourself. • What are your current research interests? • What conference represents your research community? • Do you have an on-going project? • What are your reasons for taking the course? • e.g. my advisor made me do it • Do you have prior expertise we can benefit from?
