
A gentle introduction to R




  1. A gentle introduction to R

  2. Loading R • If you want to install R on your own computer go to http://www.r-project.org/

  3. Course slides http://core.brc.iop.kcl.ac.uk/events/a-gentle-introduction-to-r/ Short link: http://bit.ly/XcHS46 Slides and data from the previous course are also linked there; they cover reading data into R and some basic plots.

  4. Course Structure
     • NOTE: not all slides will be covered; extra slides have been included for those who wish to learn more. We will spend ~20 minutes on each topic.
     • Five sections:
       • R help and documentation
       • Loading R packages
       • Data types in R
       • Working with data structures
       • Package example – partial correlation

  5. section 1 R Help and Documentation

  6. Overview
     • Typically you will want help, documentation, or examples, demos and tutorials on:
       • Packages e.g. stats, Biobase
       • Functions e.g. mean, sd
       • Datasets e.g. trees, iris
       • Operators e.g. + - <- &

  7. Online Help
     help.start()
     Starts the hypertext (currently HTML) version of R's online documentation. There is a wealth of information here… You may want to start with "An Introduction to R", "R Data Import/Export", and "Search Engine and Keywords".

  8. Tips For Finding Things
     • apropos("topic") returns the names of objects whose names (partially) match topic
       > apropos("mean")
       [1] "colMeans"        ".colMeans"  "kmeans"       "mean"
       [5] "mean.data.frame" "mean.Date"  "mean.default" "mean.difftime"
       ...etc
     • Search for help on a topic:
       • help.search("topic") searches the help system
       • or use the shorthand ??"topic"
       help.search("mean")

  9. Tips For Finding Things
     • Double-tap the TAB key <tab x 2>: when typing a function name, pressing tab twice shows the auto-completion options; inside the parentheses of a function call it shows the available arguments.
       e.g. auto-completion:
       > data.<tab x 2>
       data.class  data.entry  data.frame
       then press tab twice to list the possible arguments taken by the data.frame function:
       > data.frame(<tab x 2>
       ...=  row.names=  check.rows=
       check.names=  stringsAsFactors=

  10. Displaying Documentation
      • When you already know the name of the package, dataset or function you are interested in, you can jump directly to the documentation, examples, demos etc.
      • To open the help documentation on "topic": help(topic) or the shorthand ?topic
        e.g. the mean function: help(mean) or ?mean
      • Look at the help page for the help function to see what else you can do with it:
        ?help
      • For special characters, use quotes "", e.g. to get the help page for the assignment operator: help("<-") or ?"<-"

  11. Displaying Documentation
      • To see the list of functions in a package:
        library(help="stats")
      • If your initial search fails to find anything, try:
        help("topic", try.all.packages=TRUE)
      • In future examples, you may encounter messages such as:
        Waiting to confirm page change...
        TO MOVE FORWARDS PRESS ENTER
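
The search tips above can also be tried non-interactively. This is a minimal sketch, assuming a default R session; the variable names hits and starts_with_mean are illustrative only.

```r
# Names of objects on the search path whose names contain "mean"
hits <- apropos("mean")
head(hits)

# apropos() accepts a regular expression, so "^mean" keeps only
# names that begin with "mean" (matching ignores case by default)
starts_with_mean <- apropos("^mean")
starts_with_mean

# exists() confirms that a name found this way is really defined
exists("mean")
```

Once a name is found, help(name) or ?name opens its documentation page as described on the previous slide.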

  12. Examples, Demos, Vignettes
      R packages are usually supplied with executable demonstrations (demo) and mini-tutorials (vignette).
      • Most* functions have a quick set of runnable examples; these typically use the pre-packaged data to demonstrate the function. The example code can also be seen in the documentation page for the function.
        example(functionName)
        e.g. example(mean) example(hist) example(persp)
        mean> x <- c(0:10, 50)
        mean> xm <- mean(x)
        mean> c(xm, mean(x, trim = 0.10))
        [1] 8.75 5.50
      • Package demos (only available for some packages) run a selection of examples in a package.
        • List all demos:
          demo(package = .packages(all.available = TRUE)) # for all installed packages
          demo() # only shows demos in attached packages
        • Run a demo from the list (the package must be loaded…):
          demo(topic) # topic is the demo's name; in the examples here it matches the package name
          library("graphics")
          demo(graphics)
          library(lattice)
          demo(lattice)
      * excluding internal functions that are not typically exposed to the user
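
The output of example(mean) shown above can be reproduced by hand, which also shows what the trim argument does. A small sketch; the variable names xt and manual are illustrative.

```r
x  <- c(0:10, 50)            # 12 values, with 50 as an outlier
xm <- mean(x)                # ordinary mean: 8.75
xt <- mean(x, trim = 0.10)   # trimmed mean: 5.5

# trim = 0.10 drops floor(12 * 0.10) = 1 value from each end of the
# sorted data, so the trimmed mean is the mean of the middle 10 values
manual <- mean(sort(x)[2:11])

c(xm, xt, manual)
```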

  13. Examples, Demos, Vignettes
      • Package Vignettes
        • These are like tutorials on a particular package (usually a PDF plus runnable code). They are an excellent way to learn R.
        • To see if the package has one, use:
          help(packageName)
          vignette() # lists vignettes for all installed packages; vignette(all = FALSE) restricts the list to attached packages
        • To open the vignette:
          vignette(topic, packagename)
          e.g. vignette("ExpressionSetIntroduction", package="Biobase")
        • You can copy the R code from the PDF and paste it into your R console to run the tutorial; alternatively, you can extract the R code.
        • Extract the R code from a vignette:
          rcode <- vignette("ExpressionSetIntroduction", package="Biobase")
          edit(rcode) # to see a text file extraction of all the R code from the vignette
        • Biobase (Bioconductor) also has an openVignette() function.

  14. The iris dataset
      • You’ll later be using the iris dataset, which comes as standard with R. Let’s take a look at the documentation for this.
      • Show the documentation for iris:
        ?iris
      • The iris dataset does not have a demo (it is just a dataset), but a number of other demos and examples make use of it:
        demo(graphics)
        example(pairs)

  15. Help by Topic
      • Sometimes you want to find out how to do something based on a particular workflow or area, and would like to know what R provides in this domain:
        • CRAN Task Views: http://cran.ma.imperial.ac.uk/web/views/
        • Bioconductor Workflows: http://www.bioconductor.org/help/workflows/

  16. Reference Material
      • Quick-R website: http://www.statmethods.net/
      • Bioconductor
        • Home: http://www.bioconductor.org/
        • Course material: http://www.bioconductor.org/help/course-materials/
        • Community help resource: http://www.bioconductor.org/help/community/
      • R-project: http://www.r-project.org/

  17. section 2 Loading R Packages

  18. Loading Packages
      • Confusingly, library and package are not the same thing in R.
      • A library is a directory in which installed packages are stored:
        .libPaths() # shows where your libraries are located
      • A package is a collection of functions, datasets, documentation etc. Packages are installed into a local library and loaded into memory to provide access to their contents.
        library() # lists the packages installed in your local libraries

  19. Loading Packages
      • Once you have installed a package you have access to it, but first you must load it!
      • To load a package into memory:
        search() # lists all attached packages
        library() # lists all packages installed in each library
        library("packageName") # loads the package packageName
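
The difference between an installed package and an attached one can be checked directly. A minimal sketch, assuming a default R session; it uses the tools package only because it ships with R but is not attached at start-up.

```r
# Directories (libraries) in which installed packages are stored
.libPaths()

# Packages currently attached to the search path
search()

# 'stats' is attached by default; 'tools' is installed but not attached
stats_attached_before <- "package:stats" %in% search()
tools_attached_before <- "package:tools" %in% search()

# library() attaches the package, after which it appears in search()
library("tools")
tools_attached_after <- "package:tools" %in% search()
```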

  20. Installing Packages
      • To access a package we first have to download it from an online repository* (or install it from a local file, not covered here).
        install.packages("e1071")
      • Select your CRAN mirror and it should download and install.
      * other online repositories can be set up with the repos argument of install.packages() or with options(repos = ...); .libPaths() only lists local library directories
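
A common pattern is to install a package only when it is missing. The sketch below is one way to do this; install_if_missing is a hypothetical helper written for this example (it is not part of R), and the default mirror URL is an assumption that can be swapped for any CRAN mirror.

```r
# Hypothetical helper: install a CRAN package only if it is not yet installed.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# 'stats' ships with R, so this call installs nothing and returns TRUE
ok <- install_if_missing("stats")

# For the package used on this slide (left commented out so that the
# sketch does not download anything when run):
# install_if_missing("e1071")
# library("e1071")
```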

  21. Alternatively
      To install Bioconductor packages go to: You can then install Bioconductor packages.

  22. Data Packages
      • R packages may also provide datasets (these are of the small, demonstrative variety).
      • To list the datasets available:
        data()
      • To load a dataset:
        data("aDataSet")
        e.g. data("iris")
      • Here is a selection of functions to inspect the dataset:
        summary(iris)
        class(iris) # tells me this is a data.frame
        pairs(iris[1:4], pch = 21, bg = c("red", "green3", "blue")[unclass(iris$Species)])
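
A few more base R functions are handy for a first look at a dataset. This sketch only uses the iris data that ships with R; the name col_means is illustrative.

```r
data("iris")

dim(iris)             # number of rows and columns
str(iris)             # type of each column and its first few values
head(iris, 3)         # first three rows
table(iris$Species)   # number of observations per species

# Mean of each of the four numeric measurement columns
col_means <- sapply(iris[1:4], mean)
col_means
```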

  23. Iris dataset • To find out more about this dataset go to: http://en.wikipedia.org/wiki/Iris_flower_data_set

  24. Pairs (figure: pairs plot of the iris dataset; species shown: Iris setosa, Iris versicolor, Iris virginica)

  25. section 3 DATA TYPES IN R

  26. Vectors 1-dimensional data structure. Vectors are the most basic structure in R. Even single strings or numbers are vectors of length 1.

  27. Vectors: Modes
      Character: char.eg <- c("one","two","three")
      Numeric:   num.eg <- c(1,2,10,32,4)
      Logical:   log.eg <- c(TRUE,TRUE,FALSE)
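
Because a vector holds a single mode, mixing modes in c() silently converts every element to the most general one. A short sketch; mixed and num.log are illustrative names.

```r
# Numbers and logicals combined with a string all become character
mixed <- c(1, "two", TRUE)
mode(mixed)    # "character"
mixed          # "1" "two" "TRUE"

# Logicals combined with numbers become numeric (TRUE -> 1, FALSE -> 0)
num.log <- c(1, TRUE, FALSE)
mode(num.log)  # "numeric"
num.log        # 1 1 0
```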

  28. Vectors: Creation, Information
      > eg1 <- "test"
      > eg2 <- c(10,20,321)
      > eg3 <- c(eg2,eg2)
      > mode(eg1)   #[1] "character"
      > mode(eg2)   #[1] "numeric"
      > mode(eg3)   #[1] "numeric"
      > length(eg1) #[1] 1
      > length(eg2) #[1] 3
      > length(eg3) #[1] 6

  29. Vectors: Creation, Information
      The mode of an object is the basic type of data it can contain. Objects also have a class. For simple vectors, the class is the same as the mode:
      > class(eg1) #[1] "character"
      > class(eg2) #[1] "numeric"
      > class(eg3) #[1] "numeric"
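
One case where class and mode differ for a simple vector is integers: their mode is "numeric" but their class is "integer". A small sketch using typeof() to show the underlying storage type; int.eg and dbl.eg are illustrative names.

```r
int.eg <- 1:3          # the colon operator creates integers
dbl.eg <- c(1, 2, 3)   # plain numbers are stored as doubles

mode(int.eg)     # "numeric"
class(int.eg)    # "integer"
typeof(int.eg)   # "integer"

mode(dbl.eg)     # "numeric"
class(dbl.eg)    # "numeric"
typeof(dbl.eg)   # "double"
```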

  30. Vectors: Subsetting
      Vectors are numbered from one. Elements are accessed with the subset operator [ ]:
      > a <- c("one","two","three")
      > a[1]
      [1] "one"

  31. Vectors: Subsetting
      > a <- c("one","two","three")
      > a[3]
      [1] "three"
      > a[1:2]
      [1] "one" "two"
      > a[c(FALSE,TRUE,FALSE)]
      [1] "two"

  32. Vectors: Naming
      > names(a) <- c("A","B","C")
      > a["B"]
          B
      "two"
      > a[c("A","C")]
            A       C
        "one" "three"
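
Two further subsetting forms are worth knowing alongside the positional, logical and name-based forms above: negative indices drop elements, and a logical condition selects the elements that satisfy it. A brief sketch reusing the example vectors from earlier slides:

```r
a <- c(A = "one", B = "two", C = "three")

# A negative index removes that position instead of selecting it
dropped <- a[-1]              # elements B and C remain

num.eg <- c(1, 2, 10, 32, 4)

# A comparison yields a logical vector, which can be used as the index
big <- num.eg[num.eg > 3]     # 10 32 4

# which() returns the positions where the condition is TRUE
pos <- which(num.eg > 3)      # 3 4 5
```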

  33. Factors Factors are vectors that know they contain nominal or ordinal variables. Many R functions know how to treat factors appropriately.

  34. Factors: Creation
      Create a nominal factor:
      > a <- c("pink","pink","blue","blue")
      > my.fac <- factor(a)
      > my.fac
      [1] pink pink blue blue
      Levels: blue pink
      Create an ordinal factor (levels are sorted alphabetically by default, so the order printed here is not the natural size order):
      > a <- c("small","small","medium","large","large")
      > my.fac <- ordered(a)
      > my.fac
      [1] small  small  medium large  large
      Levels: large < medium < small

  35. Factors: Useful functions
      > class(factor(c("one","one","two")))
      [1] "factor"
      > class(ordered(c("one","one","two")))
      [1] "ordered" "factor"
      > summary(my.fac)
       large medium  small
           2      1      2
      > levels(my.fac)
      [1] "large"  "medium" "small"
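
When the alphabetical default is not the order you want, the levels can be given explicitly. A minimal sketch using the same size data as the slides; size.fac is an illustrative name.

```r
a <- c("small", "small", "medium", "large", "large")

# Supply the levels in their intended order and mark the factor as ordered
size.fac <- factor(a, levels = c("small", "medium", "large"), ordered = TRUE)

levels(size.fac)               # "small" "medium" "large"

# Ordered factors support comparisons against a level
below.large <- size.fac < "large"

# Counts are reported in level order rather than alphabetically
table(size.fac)
```

With the levels set this way, functions that respect ordering (sorting, comparisons, plotting) treat small as the lowest category.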

  36. Matrices 2-dimensional data structure. All elements in a matrix must be the same mode. Basically just a vector with a dimension attribute.

  37. Matrices: Creation
      Combine vectors by row:
      > a <- c(1,2,3)
      > rbind(a,a,a)
        [,1] [,2] [,3]
      a    1    2    3
      a    1    2    3
      a    1    2    3
      Combine vectors by col:
      > cbind(a,a,a)
           a a a
      [1,] 1 1 1
      [2,] 2 2 2
      [3,] 3 3 3

  38. Matrices: Creation
      Use the matrix function to turn a vector into a matrix of specified dimensions:
      > a <- c(1,2,3,4,5,6)
      > matrix(a,nrow=2,ncol=3, byrow=TRUE)
           [,1] [,2] [,3]
      [1,]    1    2    3
      [2,]    4    5    6
      > matrix(a,nrow=2,ncol=3, byrow=FALSE)
           [,1] [,2] [,3]
      [1,]    1    3    5
      [2,]    2    4    6

  39. Matrices: Creation
      Directly set a dimension attribute on a vector:
      > a <- c(1,2,3,4,5,6)
      > dim(a) <- c(2,3)
      > a
           [,1] [,2] [,3]
      [1,]    1    3    5
      [2,]    2    4    6
      > class(a)
      [1] "matrix" "array"   # R >= 4.0.0; earlier versions print only "matrix"
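
Since a matrix is a vector with a dim attribute, its elements are stored column by column and can be reached with either one index or two. A short sketch illustrating this for a 2 x 3 matrix:

```r
a <- 1:6
dim(a) <- c(2, 3)      # a is now a 2 x 3 matrix, filled column-wise

attributes(a)          # the only attribute is $dim: 2 3

# Element (i, j) sits at vector position i + (j - 1) * nrow(a)
i <- 2; j <- 3
by.pair     <- a[i, j]
by.position <- a[i + (j - 1) * nrow(a)]

# Setting dim by hand gives the same object as matrix() with default byrow = FALSE
same <- identical(a, matrix(1:6, nrow = 2))
```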

  40. Matrices: Useful Functions
      > m <- matrix(a,nrow=2,ncol=3, byrow=FALSE)
      > dim(m)
      [1] 2 3
      > class(m)
      [1] "matrix" "array"   # R >= 4.0.0; earlier versions print only "matrix"
      > mode(m)
      [1] "numeric"
      > nrow(m)
      [1] 2
      > ncol(m)
      [1] 3

  41. Matrices: Naming
      > colnames(m) <- c("A","B","C")
      > rownames(m) <- c("One","Two")
      > m
          A B C
      One 1 3 5
      Two 2 4 6

  42. Matrices: Naming
      > dimnames(a) <- list(x=c("One","Two"), y=c("A","B","C"))
      > a
           y
      x     A B C
        One 1 3 5
        Two 2 4 6

  43. Matrices: Subsetting
      > m[1,1]          # first row, first col
      [1] 1
      > m[1:2,2]        # second col of rows 1 and 2
      > m["Two",]       # row "Two", all cols
      > m[,c("B","C")]  # all rows, cols "B" & "C"
      Returns a matrix, or a vector if you select only a single row/col.
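
The switch from matrix to vector when a single row or column is selected can be turned off with drop = FALSE. A minimal sketch on a matrix built like the one in the slides:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("One", "Two"), c("A", "B", "C")))

# Selecting one row returns a named vector by default
as.vec <- m["Two", ]

# drop = FALSE keeps the result as a 1 x 3 matrix
as.mat <- m["Two", , drop = FALSE]

dim(as.vec)   # NULL: no dimensions left
dim(as.mat)   # 1 3
```

Keeping the matrix shape is useful when later code calls nrow(), ncol() or indexes with two subscripts.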

  44. Arrays Extends the concept of a matrix to more than 2 dimensions. All elements must be the same class. A vector with a dimension attribute with >2 dimensions. You probably won't ever need one.

  45. Arrays: Creation
      Use the array function:
      > a <- 1:24
      > b <- array(a,c(3,4,2))
      > b
      , , 1
           [,1] [,2] [,3] [,4]
      [1,]    1    4    7   10
      [2,]    2    5    8   11
      [3,]    3    6    9   12
      , , 2
           [,1] [,2] [,3] [,4]
      [1,]   13   16   19   22
      [2,]   14   17   20   23
      [3,]   15   18   21   24

  46. Arrays: Creation
      Add a dim attribute to a vector:
      > a <- 1:24
      > dim(a) <- c(3,4,2)
      > class(a)
      [1] "array"
      > mode(a)
      [1] "numeric"
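
Arrays are indexed like matrices, with one subscript per dimension. A brief sketch on the same 3 x 4 x 2 array, also using apply() to summarise over the third dimension; slice.sums is an illustrative name.

```r
b <- array(1:24, dim = c(3, 4, 2))

b[2, 3, 1]       # row 2, column 3 of the first slice: 8
b[2, 3, 2]       # same cell in the second slice: 20

# Leaving subscripts empty selects everything along that dimension,
# so b[, , 1] is the first 3 x 4 slice as a matrix
dim(b[, , 1])

# apply() over margin 3 computes one value per slice
slice.sums <- apply(b, 3, sum)   # 78 222
```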

  47. Lists An ordered collection of R objects. Allows you to group related objects under a single name. No constraints on the components of the list: they don't have to have the same mode, class or dimensions.

  48. Lists: Creation
      names <- c("bob smith", "sarah jones")
      colours <- c("red","green","blue")
      nums <- matrix(1:25, nrow=5, ncol=5)
      my.list <- list(names, colours, nums)
      my.list
      [[1]]
      [1] "bob smith"   "sarah jones"
      [[2]]
      [1] "red"   "green" "blue"
      [[3]]
           [,1] [,2] [,3] [,4] [,5]
      [1,]    1    6   11   16   21
      [2,]    2    7   12   17   22
      [3,]    3    8   13   18   23
      [4,]    4    9   14   19   24
      [5,]    5   10   15   20   25

  49. Lists: Naming
      my.list <- list(people=names, fave.cols=colours)
      my.list
      $people
      [1] "bob smith"   "sarah jones"
      $fave.cols
      [1] "red"   "green" "blue"

  50. Lists: Naming
      my.list <- list(names, colours)
      names(my.list) <- c("people","fave.cols")
      my.list
      $people
      [1] "bob smith"   "sarah jones"
      $fave.cols
      [1] "red"   "green" "blue"
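
Once a list has names, its components can be pulled out with $ or [[ ]], while single brackets return a smaller list. A minimal sketch; the vector is called people here (rather than names, as in the slides) only to avoid shadowing the names() function.

```r
people  <- c("bob smith", "sarah jones")
colours <- c("red", "green", "blue")
my.list <- list(people = people, fave.cols = colours)

my.list$people               # the component itself, a character vector
my.list[["fave.cols"]][2]    # second favourite colour: "green"

# [ ] keeps the list wrapper, [[ ]] extracts the component
class(my.list["people"])     # "list"
class(my.list[["people"]])   # "character"

# Assigning to a new name appends a component
my.list$nums <- matrix(1:4, nrow = 2)
length(my.list)              # 3
```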
