
Introduction to R Lecture 1: Getting Started


  1. Introduction to R, Lecture 1: Getting Started • Andrew Jaffe • 8/30/10

  2. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator

  3. About the Course • Series of 7 seminars • Covers the usage of R • Platform for beginning analyses • NOT covering statistics • Good programming etiquette • Bring your laptop – there will be breaks to allow you to practice the code

  4. About the Course • This seminar is 1 unit pass/fail • To pass, attend 5 out of 7 seminars • Very little outside work

  5. About the Course • Some learning objectives include: • Importing/exporting data • Data management • Performing calculations • Recoding variables • Producing graphics • Installing packages • Writing functions

  6. About the Course • Course communication via E-mail • Lectures and code will be hosted on my webpage • http://www.biostat.jhsph.edu/~ajaffe/rseminar.html

  7. About the Instructor • 3rd year PhD student in Genetic Epi program, concurrent MHS in Bioinformatics • Learned R five years ago, been using regularly the last two

  8. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  9. What is R? • R is a language and environment for statistical computing and graphics • R is an open source implementation of the S language, which was developed at Bell Laboratories • R is both open source and open development • http://www.r-project.org/

  10. What is R? • Pros: • Free • Tons of packages, very flexible • Multiple datasets at any given time • Cons: • Much more "programming" oriented • Minimal interface • (These are my personal opinions)

  11. What is R? • Often a good first step for data cleaning and manipulation • Then, export data to Stata or SAS for Epi analyses

  12. What is R? • [Screenshot: the R console and a script window]

  13. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  14. Installing R • http://cran.r-project.org/

  15. Installing R - Windows • Windows: click “base” and download

  16. Installing R - Windows • Click the link to the latest build

  17. Installing R - Mac • Mac: click the latest package’s .pkg file

  18. Installing R • Double click the downloaded file • Hit ‘next’ a few times • Use default settings • Finish installing

  19. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  20. Installing a Text Editor • Windows: R's built-in text editor is terrible • It's essentially Windows Notepad • We will download a much better one • Mac: R's built-in text editor is sufficient • Color coding, signals when parentheses are closed, etc. • I suggest using this until you think you need a better one

  21. Installing a Text Editor • I prefer Notepad++: • http://notepad-plus-plus.org/ • Download the current version: http://download.tuxfamily.org/notepadplus/5.7/npp.5.7.Installer.exe • Install on your computer using defaults

  22. Installing a Text Editor

  23. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  24. Interfacing with R • Scripts: documents that contain reproducible R code and functions that you can send to the console (and save) • Files are designated with the “.R” extension • You can “source” scripts (more later) • Console: Type commands directly into the console • Good for looking at your data, trying things, and plotting
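As a minimal sketch of what "sourcing" a script means, the snippet below writes a tiny script to a temporary file and then runs it with source(). The file contents and the variable name total are made up for illustration and are not part of the course materials.

```r
# Create a small script file on disk (a temporary path, so nothing is left behind)
script_path <- tempfile(fileext = ".R")
writeLines(c("# demo script", "total <- 2 + 3"), script_path)

# source() runs every line of the file, as if it had been typed into the console
source(script_path)
total

# Clean up the temporary file
unlink(script_path)
```

Anything the script creates (here, total) is available in the console afterwards.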

  25. Interfacing with R - Mac • Mac: File → New Script • This opens the default text editor • To send a line of code to the R console, press Apple+Enter when the cursor is anywhere on that line • Highlight chunks of code and press Apple+Enter to send

  26. Interfacing with R - Windows • Using the default text editor, pressing Ctrl+R sends lines to the console • However, we want to use Notepad++ • We need to download one more thing…

  27. Interfacing with R - Windows • “NppToR”: Notepad++ to R • http://sourceforge.net/projects/npptor/ • It must be running when R and Notepad++ are open • When properly configured, press F8 to send lines of code, or highlighted chunks, to the console • I will help configure this after class today

  28. Interfacing with R – Windows • More detailed instructions for installing NppToR • http://sourceforge.net/apps/mediawiki/npptor/index.php?title=Installing

  29. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  30. Writing Scripts • The comment symbol is # (pound) in R • Comment liberally - you should be able to understand a script after not seeing it for 6 months • Lines of #’s are useful to separate sections • Useful for designating headers

  31. Writing Scripts

          #################
          # Title: Demo R Script
          # Author: Andrew Jaffe
          # Date: 7/30/10
          # Purpose: Demonstrate comments in R
          ###################
          # this is a comment, nothing to the right of it gets read
          # this # is still a comment - you can use as many #'s as you want
          # sometimes you have a really long comment, like explaining what you
          # are doing for a step in analysis. Take it to a second line

  32. Writing Scripts • Some common etiquette: • You can use spaces (more generally "white space") within functions and commands liberally as well • Try to keep a reasonable number of characters per line; many commands can be broken into multiple lines • More to come later…
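To illustrate the white space point, here is a small sketch showing that the same command gives the same result whether it is written compactly or spread over several lines. The heights vector is a hypothetical example, not course data.

```r
# Hypothetical example values
heights <- c(150, 160, 170, 180)

# Everything on one line, no spaces: legal, but hard to read
m1<-mean(heights,na.rm=TRUE)

# The same call with spaces, broken across lines after the commas
m2 <- mean(
  heights,
  na.rm = TRUE
)

# Both versions compute the same value
identical(m1, m2)
```

R keeps reading until the parentheses are closed, which is why a line break after a comma is safe.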

  33. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  34. R as a Calculator • The R console functions as a full calculator • Try to play around with it: • +, -, *, / are add, subtract, multiply, and divide • ^ or ** is power • ( and ) work with order of operations
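A few lines to try in the console; the numbers are arbitrary and chosen only to show each operator and the effect of parentheses.

```r
2 + 3        # addition: 5
7 - 4        # subtraction: 3
6 * 5        # multiplication: 30
9 / 2        # division: 4.5
2 ^ 3        # power: 8
2 ** 3       # same as 2 ^ 3
2 + 3 * 4    # multiplication happens first: 14
(2 + 3) * 4  # parentheses change the order: 20
```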

  35. Lecture 1 • Course overview • What is R? • Installing R • Installing a text editor • Interfacing text editor with R • Writing scripts • Using R as a calculator • Assignment

  36. Assignment • The assignment… operator: assigning a value to a name • R accepts two operators: "<-" and "=" • E.g., x=8 (remember white space: x = 8, x <- 8) • Variable names are case-sensitive • E.g., X and x are different • Set x = 8, and try using calculator functions on x
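The exercise above might look like the following sketch. The names y and X and their values are only there to illustrate the two operators and case sensitivity.

```r
x <- 8            # assignment with the arrow operator
y = 8             # assignment with the equals sign
identical(x, y)   # TRUE: both operators stored the same value

X <- 2            # a different variable from x, because names are case-sensitive

x + 2             # 10
x ^ 2             # 64
x / X             # 4
```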

  37. Assignment • 'Assignment' literally puts whatever is on the right side of the operator into your left-hand side variable • Note that although you can name variables anything, you might run into some issues naming things the same as default R functions • Notepad++ turns function names red/pink so you know…
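A short sketch of the kind of naming issue meant here, using the built-in functions mean and sum as arbitrary examples. It is an illustration of how R resolves names, not an exhaustive list of pitfalls.

```r
# A number stored under a function's name: calls still work, because R
# looks for a function when the name is used as a call
mean <- 10
mean(c(1, 2, 3))   # 2
mean               # 10, which is easy to misread later
rm(mean)           # remove our variable; the built-in function is untouched

# A function stored under a built-in name does change what a call returns
sum <- function(...) "not the built-in sum"
sum(1, 2)          # "not the built-in sum"
rm(sum)            # remove our version
sum(1, 2)          # 3 again
```

Removing the variable with rm() makes the built-in function visible again.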

  38. Examples of assignment, introducing R data • Enough to get R up and running if this is the only class you attend • We will see them in much more detail over the next three sessions

  39. Assignment

          status <- c("case", "case", "case", "control", "control", "control")
          status
          class(status)
          table(status)
          factor(status)
          # alternatively:
          status <- c(rep("case", 3), rep("control", 3))
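The sketch below repeats the slide's commands while storing the results, so the difference between a character vector and a factor is easier to see. The names status2, counts, and status_f are introduced here only for illustration.

```r
status  <- c("case", "case", "case", "control", "control", "control")
status2 <- c(rep("case", 3), rep("control", 3))
identical(status, status2)   # TRUE: rep() builds the same vector

class(status)                # "character"

counts <- table(status)      # how many times each value occurs
counts[["case"]]             # 3

status_f <- factor(status)   # same values, stored as a categorical variable
class(status_f)              # "factor"
levels(status_f)             # "case" "control"
```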

  40. Assignment • web <- "http://www.biostat.jhsph.edu/~ajaffe/code/lec1_code.R" • class(web) • source(web) • You also don't have to save tables/data you find online to your disk (note: read.table works for most things, but the examples below aren't tables) • scan(web, what = character(0), sep = "\n") • scan("http://www.google.com", what = character(0))
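To try read.table and scan without a network connection, the sketch below feeds both functions an in-memory string through their text argument. The small table is invented for illustration; with a real URL you would pass the address as the first argument instead, as on the slide.

```r
# Hypothetical data standing in for a file found online
txt <- "id sex status
1 M case
2 F control
3 F case"

# read.table parses the text into rows and columns (a data frame)
dat_demo <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
dim(dat_demo)        # 3 rows, 3 columns
dat_demo$status

# scan with sep = "\n" just returns the raw lines as character strings
lines_demo <- scan(text = txt, what = character(0), sep = "\n", quiet = TRUE)
length(lines_demo)   # 4 lines, including the header
```

read.table gives structure to tabular data, while scan is the fallback for text that is not a table.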

  41. Assignment

          mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE) # this is sourced in
          class(mat)
          mat
          mat + mat
          mat * mat
          mat %*% mat
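The last two commands look similar but do different things: * multiplies element by element, while %*% is matrix multiplication. A worked sketch with the same matrix follows; the names elementwise and matprod are introduced here for illustration.

```r
# byrow = TRUE fills the matrix row by row:
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

# Element-wise product: each entry is multiplied by itself
elementwise <- mat * mat
elementwise   # rows: 1 4 / 9 16

# Matrix product: rows of the first matrix times columns of the second
matprod <- mat %*% mat
matprod       # rows: 7 10 / 15 22
```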

  42. Assignment • class(dat) # dat is also sourced in • head(dat) • table(dat$sex, dat$status) • …To be continued…
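The real dat comes from the sourced course script, and its contents are not shown on the slides. As a stand-in, the sketch below builds a small hypothetical data frame with sex and status columns so the same commands can be run offline.

```r
# Hypothetical stand-in for the course data set
dat_example <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c(rep("case", 3), rep("control", 3)),
  stringsAsFactors = FALSE
)

class(dat_example)     # "data.frame"
head(dat_example, 3)   # first three rows

# Cross-tabulate the two columns: rows are sex, columns are status
tab <- table(dat_example$sex, dat_example$status)
tab
tab["F", "case"]       # 2 female cases in this made-up data
```

The $ operator pulls a single column out of a data frame by name.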

  43. Questions?
