Empirical Research Methods in Computer Science

Empirical Research Methods in Computer Science Lecture 1, Part 1 October 12, 2005 Noah Smith http://nlp.cs.jhu.edu/~nasmith/erm

Empiricism empeiros: experienced (peira = trial or test) cf. rationalism

Exploration & Experiment • Exploratory Data Analysis (lecture ≈5) • Hypothesis Testing (lectures 1,2) explore visualize summarize model experiment confirm yes/no?

Computer What? • Theory • Algorithms, Computation • Practice • Software Engineering, Application Areas • Systems • OS, Architecture

Who cares? • anyone who wants to do research • anyone who wants to follow research (i.e., read papers) • anyone who wants to be able to make smart decisions / draw conclusions • anyone who likes thinking critically

Basic Research Questions

Basic Research Questions int foo() { ... }

Why bother? int foo() { ... } int foo() { ... } int foo() { ... } int foo() { ... } int foo() { ... } int foo() { ... }

Variation → Statistics determinism isn’t good enough any more! int foo() { ... }

Statistics, in this Course • Nonparametric tests • Sampling Later: • Parametric tests (when and why)

Warning • Theory (complexity analysis, etc.) is important, too! • Many phenomena aren’t surprising if you know your math.

Goals • Know how to look for the interesting experiments • Know how to construct experiments • Know how to analyze the results • Be critical of all claims • Develop an aesthetic for good empirical work!

Empiricism is FUN! Especially in computer science!

Basic Course Information • instructors: Noah and David {n,d}asmith@cs.jhu.edu • Wednesdays 4-5:15 pm • no class Thanksgiving week • homeworks (65%); final exam (30%)

About Us • Combined 19 years of experience in CS; 36 years programming • Autodidact empiricists • Research interests in statistical modeling and machine learning (Eisner/Yarowsky lab) • NEB 332

Plan • Hypothesis testing, statistics (2) • Case study: runtime (2) • Exploratory data analysis (1) • Parametric testing, modeling (1-2) • Statistical analysis of computer programs (1)

MO • Come to class. • Send us feedback anytime. • What do you want to know? • Bring us papers.

Empirical Research Methods in Computer Science Lecture 1, Part 2 October 12, 2005 David Smith

Terminological Prelude • Populations • Population distributions • “All possible files”. How big? • Samples • Sampling distributions • “Files on my system” • Statistics • Functions of data • “Size of my files” • Models • Parameters

And now for some data

Abnormality

The Bootstrap • Simulates the sampling distribution • Proposed by Efron in 1979 • Anticipated by permutation tests, jackknife, cross-validation • From original sample of size n, draw B samples of size n with replacement and calculate the statistic on each

Sampling Distributions μ μ μ μ μ

Bootstrapping the Mean

What’s Going On? • Why is bootstrap distribution normal? • Remember, this is a mean • Linearity of Expectation • Central Limit Theorem • Closed form standard error for means

More Heavy Tails

Sampling Still Normal

Bivariate Data

Compression Performance

Bootstrapping Correlation

Error, Confidence, Testing • Standard error from sampling distribution • Confidence intervals: bounding error probability (e.g. to 5%) • Hypothesis testing: how likely is a particular statistic under our assumptions?

Hypothesis Testing • One-sample • “Are these data normal/Poisson/…?” • Two-sample • “Are these two samples from the same distribution?” • Paired-sample • “Is this technique better than that?”

Your First Assignment • Data compression • Three-way tradeoff • Compression • Speed • Loss • Degenerate cases (cat, echo ‘’, …) • Unknown distribution of input

Empirical Research Methods in Computer Science