Programming and Simulations
Frank Witmer
6 January 2011
Outline
• General programming tips
• Programming loops
• Simulation
• Distributions
• Sampling
• Bootstrapping
General Programming Tips
• Use meaningful variable names
• Include more comments than you think necessary
• Debugging your code
  • Because R is interpreted, variables defined outside functions remain in the workspace for inspection if execution stops with an error
  • Built-in debugging support: debug(), browser(), trace()
  • But generally adding print statements in functions is sufficient
• Syntax highlighting!
  • http://sourceforge.net/projects/npptor/
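As a minimal sketch of the print-statement approach to debugging, the hypothetical function `mean_positive()` below (not part of the original slides) reports an intermediate value with `cat()` before returning its result. The built-in tools are mentioned only in a comment because they are interactive.

```r
# Hypothetical helper: mean of the positive values in a numeric vector.
mean_positive <- function(values) {
  positive <- values[values > 0]
  # Temporary trace output; remove once the function behaves as expected.
  cat("kept", length(positive), "of", length(values), "values\n")
  mean(positive)
}

result <- mean_positive(c(-2, 4, 6, -1))

# For interactive step-through instead, one could call debug(mean_positive)
# before running the function, or place browser() inside the function body.
```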
Loops
• Because R is an interpreted language, every iteration of a loop is evaluated step by step by the interpreter, which adds overhead
• So avoid explicit loops for computationally intensive analysis where a vectorized alternative exists
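To illustrate the point about loops, here is a small sketch that computes the same result twice: once with an explicit loop and once with a single vectorized operation. The vector `x` and its length are arbitrary choices for illustration.

```r
x <- seq_len(100000)

# Loop version: one interpreted step per element.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: a single call that operates on the whole vector.
squares_vec <- x^2

# Check that both approaches agree.
same_result <- isTRUE(all.equal(squares_loop, squares_vec))
```

Wrapping each version in `system.time()` is one way to compare their run times on a given machine.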
For & While loop syntax
for (variable in sequence) {
    expression
    expression
}
while (condition) {
    expression
    expression
}
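A concrete instance of both loop forms may help. The values and variable names below are made up for illustration.

```r
# for loop: sum the integers 1 to 10.
total <- 0
for (i in 1:10) {
  total <- total + i
}

# while loop: count the halvings needed to bring 100 below 1.
value <- 100
halvings <- 0
while (value >= 1) {
  value <- value / 2
  halvings <- halvings + 1
}
```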
if/else control statements
if ( condition1 ) {
    expression1
} else if ( condition2 ) {
    expression2
} else {
    expression3
}
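The same if / else if / else pattern, filled in with a hypothetical helper `describe_sign()` that labels a number by its sign:

```r
# Hypothetical helper: label a single number by its sign.
describe_sign <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

labels <- c(describe_sign(3), describe_sign(-1.5), describe_sign(0))
```

Note that each `else` sits on the same line as the preceding closing brace. At the top level of a script, R would otherwise treat the `if` statement as complete before reaching the `else`.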
Ways to avoid loops (sometimes)
• tapply: apply a function (FUN) to a variable within each level of a grouping variable
• lapply: apply a function (FUN) to each element of a given list; returns a list
• sapply: same as lapply, but simplifies the output to a vector or matrix where possible
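A short sketch of the three functions on toy data. The `scores`, `group`, and `items` objects are invented for illustration.

```r
# Toy data invented for illustration.
scores <- c(10, 20, 30, 40, 50, 60)
group  <- c("a", "a", "b", "b", "c", "c")

# tapply: mean score within each group.
group_means <- tapply(scores, group, mean)

# lapply: apply length() to each list element; the result is a list.
items <- list(first = 1:3, second = 1:5)
lengths_list <- lapply(items, length)

# sapply: the same computation, simplified to a named vector.
lengths_vec <- sapply(items, length)
```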
Data simulation
• Can simulate data using the standard distribution functions, identified by core names such as norm and pois
• Use the ‘r’ prefix to generate random values from a distribution
  • rnorm(numVals, mean, sd)
  • rpois(numVals, lambda), where lambda is the mean
• Use set.seed() if you want your simulated data to be reproducible
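The following sketch draws a few values from each distribution and shows how `set.seed()` makes the draws repeatable. The seed and the parameter values are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed chosen for illustration

normal_draws  <- rnorm(5, mean = 10, sd = 2)
poisson_draws <- rpois(5, lambda = 3)

# Resetting the same seed reproduces the same draws.
set.seed(42)
normal_again <- rnorm(5, mean = 10, sd = 2)

reproducible <- identical(normal_draws, normal_again)
```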
Sampling
• Sample from a dataset using: sample(dataset, numItems, replace), where replace is TRUE or FALSE
• Can be used to simulate survey results or to bootstrap statistical estimates
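A minimal sketch of `sample()` on a vector and on the rows of a data frame. The `survey` data frame is hypothetical. For a data frame, the sketch samples row indices, since calling `sample()` directly on a data frame would sample its columns.

```r
set.seed(1)  # arbitrary seed chosen for illustration

# Hypothetical survey frame.
survey <- data.frame(id = 1:20,
                     response = rep(c("yes", "no"), times = 10))

# Sample 5 values from a vector without replacement.
picked_ids <- sample(survey$id, 5, replace = FALSE)

# Sample 10 rows of a data frame by sampling row indices with replacement.
row_idx   <- sample(nrow(survey), 10, replace = TRUE)
resampled <- survey[row_idx, ]
```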
Bootstrap overview
• Method to measure the accuracy of estimates from a sample empirically
• For a sample of size n, draw many random samples, also of size n, with replacement, and recompute the estimate on each
• Two ways to bootstrap regression estimates
  • residual resampling: add resampled regression residuals to the fitted values to form a new dependent variable & re-estimate
  • data resampling: resample complete cases (rows) of the original data with replacement and re-estimate the coefficients
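Both regression bootstraps can be sketched in a few lines. The data below are simulated purely for illustration, and the seed, sample size, and number of replicates (`n_boot`) are arbitrary assumptions. The residual variant here uses the usual formulation, adding resampled residuals to the fitted values of the original model.

```r
set.seed(2011)  # arbitrary seed chosen for illustration

# Simulated data, invented for illustration: y depends linearly on x.
n <- 50
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)
dat <- data.frame(x = x, y = y)

fit    <- lm(y ~ x, data = dat)
n_boot <- 500  # number of bootstrap replicates (arbitrary)

# Data (case) resampling: resample rows with replacement and refit.
slope_cases <- replicate(n_boot, {
  idx <- sample(n, n, replace = TRUE)
  coef(lm(y ~ x, data = dat[idx, ]))[["x"]]
})

# Residual resampling: add resampled residuals to the fitted values and refit.
fitted_vals <- fitted(fit)
resids      <- resid(fit)
slope_resid <- replicate(n_boot, {
  y_star   <- fitted_vals + sample(resids, n, replace = TRUE)
  boot_dat <- data.frame(x = dat$x, y = y_star)
  coef(lm(y ~ x, data = boot_dat))[["x"]]
})

# Bootstrap standard errors of the slope estimate.
se_cases <- sd(slope_cases)
se_resid <- sd(slope_resid)
```

The two standard errors can be compared with the one reported by `summary(fit)`. The case-resampling version makes fewer assumptions about the error structure, while the residual version keeps the original x values fixed.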