Introduction to R • Lecture 1: Getting Started • Andrew Jaffe • 8/30/10
Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator
About the Course • Series of 7 seminars • Covers the usage of R • Platform for beginning analyses • NOT covering statistics • Good programming etiquette • Bring your laptop – there will be breaks to allow you to practice the code
About the Course • This seminar is 1 unit pass/fail • To pass, attend 5 out of 7 seminars • Very little outside work
About the Course • Some learning objectives include: • Importing/exporting data • Data management • Performing calculations • Recoding variables • Producing graphics • Installing packages • Writing functions
About the Course • Course communication via E-mail • Lectures and code will be hosted on my webpage • http://www.biostat.jhsph.edu/~ajaffe/rseminar.html
About the Instructor • 3rd year PhD student in Genetic Epi program, concurrent MHS in Bioinformatics • Learned R five years ago, been using regularly the last two
Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment
What is R? • R is a language and environment for statistical computing and graphics • R is an open source implementation of the S language, which was developed at Bell Laboratories • R is both open source and open development • http://www.r-project.org/
What is R? • Pros: • Free • Tons of packages, very flexible • Multiple datasets at any given time • Cons: • Much more “programming” oriented • Minimal interface • (These are my personal opinions)
What is R? • Often a good first step for data cleaning and manipulation • Then, export data to Stata or SAS for Epi analyses
What is R? • [Figure: R windows labeled “Console” and “Script”]
Installing R • http://cran.r-project.org/
Installing R - Windows • Windows: click “base” and download
Installing R - Windows • Click the link to the latest build
Installing R - Mac • Mac: click the latest package’s .pkg file
Installing R • Double click the downloaded file • Hit ‘next’ a few times • Use default settings • Finish installing
Installing a Text Editor • Windows: R’s built-in text editor is terrible • It’s essentially Windows Notepad • We will download a much better one • Mac: R’s built-in text editor is sufficient • Color coding, signals parenthesis closing, etc. • I suggest using this until you think you need a better one
Installing a Text Editor • I prefer Notepad++: • http://notepad-plus-plus.org/ • Download the current version: http://download.tuxfamily.org/notepadplus/5.7/npp.5.7.Installer.exe • Install on your computer using defaults
Interfacing with R • Scripts: documents that contain reproducible R code and functions that you can send to the console (and save) • Files are designated with the “.R” extension • You can “source” scripts (more later) • Console: Type commands directly into the console • Good for looking at your data, trying things, and plotting
Interfacing with R - Mac • Mac: File > New Script • This opens the default text editor • To send a line of code to the R console, press Apple+Enter when the cursor is anywhere on that line • Highlight chunks of code and press Apple+Enter to send
Interfacing with R - Windows • Using the default text editor, pressing Ctrl+R sends lines to the console • However, we want to use Notepad++ • We need to download one more thing…
Interfacing with R - Windows • “NppToR”: Notepad++ to R • http://sourceforge.net/projects/npptor/ • It must be running when R and Notepad++ are open • When properly configured, press F8 to send lines of code, or highlighted chunks, to the console • I will help configure this after class today
Interfacing with R – Windows • More detailed instructions for installing NppToR • http://sourceforge.net/apps/mediawiki/npptor/index.php?title=Installing
Writing Scripts • The comment symbol is # (pound) in R • Comment liberally - you should be able to understand a script after not seeing it for 6 months • Lines of #’s are useful to separate sections • Useful for designating headers
Writing Scripts
#################
# Title: Demo R Script
# Author: Andrew Jaffe
# Date: 7/30/10
# Purpose: Demonstrate comments in R
###################
# this is a comment, nothing to the right of it gets read
# this # is still a comment; you can use as many #'s as you want
# sometimes you have a really long comment, like explaining what you
# are doing for a step in analysis. Take it to a second line
Writing Scripts • Some common etiquette: • You can use spaces (more generally, “white space”) liberally within functions and commands as well • Try to keep a reasonable number of characters per line; many commands can be broken into multiple lines • More to come later…
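A small sketch of the white space and line-breaking etiquette described above. The numbers and the name `total` are made up for illustration and do not come from the course script.

```r
# Cramped: valid R, but hard to read
total<-sum(1,2,3,4,5,6)

# Same command with white space around the operator and after commas
total <- sum(1, 2, 3, 4, 5, 6)

# Same command broken across two lines; the unclosed parenthesis
# tells R that the command continues on the next line
total <- sum(1, 2, 3,
             4, 5, 6)

total  # 21
```

All three versions do the same thing; only readability changes.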
R as a Calculator • The R console functions as a full calculator • Try playing around with it: • +, -, *, / are add, subtract, multiply, and divide • ^ or ** is power • ( and ) work with order of operations
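A few lines to try in the console, as a quick sketch of the operators listed above. The numbers are arbitrary.

```r
2 + 3        # add: 5
7 - 4        # subtract: 3
6 * 5        # multiply: 30
10 / 4       # divide: 2.5
2 ^ 3        # power: 8
2 ** 3       # same as 2 ^ 3
1 + 2 * 3    # multiplication happens first: 7
(1 + 2) * 3  # parentheses happen first: 9
```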
Assignment • The assignment operator assigns a value to a name • R accepts two operators: “<-” and “=” • E.g.: x=8 (remember white space: x = 8, x <- 8) • Variable names are case-sensitive • E.g.: X and x are different • Set x = 8, and try using calculator functions on x
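The exercise on this slide might look something like the following sketch. The value 3 given to `X` is an arbitrary choice, used only to show case sensitivity.

```r
x <- 8      # assign 8 to the name x
x = 8       # same effect with the other operator
x           # typing the name prints its value: 8

x + 2       # 10
x / 4       # 2
x ^ 2       # 64

X <- 3      # a different variable: names are case-sensitive
x * X       # 24
```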
Assignment • ‘Assignment’ literally puts whatever is on the right side of the operator into your left-hand side variable • Note that although you can name variables almost anything, you might run into issues if you give them the same names as default R functions • Notepad++ turns functions red/pink so you know…
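One way the naming issue can show up, as a hedged sketch: here a user-defined function is deliberately given the name of the built-in `mean`. The replacement function is hypothetical and exists only to make the problem visible.

```r
mean(c(1, 2, 3))                        # built-in function: 2

# Hypothetical user-defined object that reuses a built-in name
mean <- function(x) "not an average"
mean(c(1, 2, 3))                        # now calls the user-defined version

base::mean(c(1, 2, 3))                  # the built-in is still reachable via its package: 2

rm(mean)                                # delete the user-defined copy
mean(c(1, 2, 3))                        # built-in again: 2
```

Removing the user-defined object with `rm()` restores normal behavior, because the built-in was never changed, only hidden.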
Examples of assignment, introducing R data • Enough to get R up and running if this is the only class you attend • We will see them in much more detail over the next three sessions
Assignment
status <- c("case", "case", "case", "control", "control", "control")
status
class(status)
table(status)
factor(status)
# alternatively: status <- c(rep("case", 3), rep("control", 3))
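A short sketch checking that the two ways of building `status` on this slide give the same vector. The names `status_rep`, `status_each`, and `status_f` are introduced here for illustration, and the `each =` spelling is an extra variant that is not on the slide.

```r
status      <- c("case", "case", "case", "control", "control", "control")
status_rep  <- c(rep("case", 3), rep("control", 3))
status_each <- rep(c("case", "control"), each = 3)  # another spelling, not from the slides

identical(status, status_rep)    # TRUE
identical(status, status_each)   # TRUE

class(status)                    # "character"
table(status)                    # counts: 3 case, 3 control

status_f <- factor(status)       # same values, stored as a factor
class(status_f)                  # "factor"
levels(status_f)                 # "case" "control"
```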
Assignment
web <- "http://www.biostat.jhsph.edu/~ajaffe/code/lec1_code.R"
class(web)
source(web)
# You also don't have to save tables/data you find online to your disk
# (note: read.table works for most things; the examples below aren't tables, though)
scan(web, what = character(0), sep = "\n")
scan("http://www.google.com", what = character(0))
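If the course URL is not reachable, the same `source()` and `scan()` calls can be tried on a local file. The sketch below writes a tiny stand-in script to a temporary file; the file contents and the names `script_path`, `script_lines`, and `y` are hypothetical and are not the course's `lec1_code.R`.

```r
# Hypothetical stand-in for the course script: a temporary local file
script_path <- tempfile(fileext = ".R")
writeLines(c("# demo script", "y <- 2 + 3"), script_path)

class(script_path)    # "character": a path or URL is just a string
source(script_path)   # runs every line of the file
y                     # 5, created by the sourced script

# scan() reads the file back as text, one element per line
script_lines <- scan(script_path, what = character(0), sep = "\n")
script_lines
```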
Assignment
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE) # this is sourced in
class(mat)
mat
mat + mat
mat * mat
mat %*% mat
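The last two lines of this slide look similar but do different things: `*` multiplies entry by entry, while `%*%` is matrix multiplication. A worked sketch with the same `mat` follows; the names `elementwise` and `product` are introduced here for illustration.

```r
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
mat
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4

elementwise <- mat * mat   # each entry times itself
elementwise
#      [,1] [,2]
# [1,]    1    4
# [2,]    9   16

product <- mat %*% mat     # rows times columns
product
#      [,1] [,2]
# [1,]    7   10
# [2,]   15   22
```

Without `byrow = TRUE`, `matrix()` fills column by column, so the 2 and 3 would swap places.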
Assignment
class(dat) # dat is also sourced in
head(dat)
table(dat$sex, dat$status)
…To be continued…
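To try these calls without the sourced script, a made-up stand-in for `dat` can be built by hand. This sketch assumes `dat` is a data frame with columns `sex` and `status`, which is inferred only from the `dat$sex` and `dat$status` calls on the slide; the real object and its values may differ.

```r
# Hypothetical stand-in for the sourced 'dat'; the real object may differ
dat <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c(rep("case", 3), rep("control", 3))
)

class(dat)    # "data.frame"
head(dat)     # first rows (at most 6)

# Cross-tabulate sex (rows) by status (columns)
tab <- table(dat$sex, dat$status)
tab
```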