A gentle introduction to R
Loading R • If you want to install R on your own computer go to http://www.r-project.org/
Course slides • http://core.brc.iop.kcl.ac.uk/events/a-gentle-introduction-to-r/ • Short link: http://bit.ly/XcHS46 • Links to the slides and data from the previous course are also available there; these cover reading data into R and some basic plots.
Course Structure • NOTE: not all slides will be covered; extra slides have been included for those who wish to learn more. We will spend ~20 minutes on each topic. • Five sections: • R help and documentation • Loading R packages • Data types in R • Working with data structures • Package example – partial correlation
section 1 R Help and Documentation
Overview • Typically you will want help, documentation, examples, demos or tutorials on: • Packages e.g. stats, Biobase • Functions e.g. mean, sd • Datasets e.g. trees, iris • Operators e.g. + - <- &
Online Help help.start() Start the hypertext (currently HTML) version of R's online documentation. There is a wealth of information here… You may want to start with "An Introduction to R", "R Data Import/Export", and "Search Engine and Keywords"
Tips For Finding Things
• apropos("topic") returns the names of functions that (partially) match topic
> apropos("mean")
[1] "colMeans" ".colMeans" "kmeans" "mean"
[5] "mean.data.frame" "mean.Date" "mean.default" "mean.difftime"
...etc
• Search for help on a topic
• help.search("topic") searches the help system
• or use the shorthand ??"topic"
> help.search("mean")
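The search tips above can be narrowed a little further. The sketch below relies on two documented features of apropos(): its first argument is treated as a regular expression, and its mode argument restricts the kind of object returned. The variable names are only illustrative.

```r
# apropos() treats its first argument as a regular expression,
# so "^mean" keeps only names that start with "mean".
starts.with.mean <- apropos("^mean")
starts.with.mean

# Restrict the matches to functions only
mean.functions <- apropos("mean", mode = "function")
"colMeans" %in% mean.functions
```

The exact list returned depends on which packages are attached in your session.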
Tips For Finding Things
• Double-tap the TAB key <tab x 2>: when typing a function name, pressing tab twice shows the auto-completion options; inside the parentheses of a function call it shows the available arguments.
• e.g. auto-completion:
> data.<tab x 2>
data.class data.entry data.frame
• then press tab twice to list the possible arguments taken by the data.frame function:
> data.frame(<tab x 2>
...= row.names= check.rows= check.names= stringsAsFactors=
Displaying Documentation • When you already know the name of the package, dataset or function you are interested in, you can jump directly to its documentation, examples, demos etc. • To open the help documentation on "topic": help(topic) or the shorthand ?topic • e.g. the mean function: help(mean) or ?mean • Look at the help page for the help function to see what else you can do with it: ?help • For special characters, use quotes "", e.g. to get the help page for the assignment operator: help("<-") or ?"<-"
Displaying Documentation • To see the list of functions in a package: • library(help="stats") • If your initial search fails to find anything, try: • help("topic", try.all.packages=TRUE) • In future examples, you may encounter messages such as: • Waiting to confirm page change... • TO MOVE FORWARDS PRESS ENTER
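As a complement to library(help="stats"), the sketch below lists a package's exported objects programmatically. It assumes the stats package is attached, which is the default in a standard R session; the variable name stats.objects is only illustrative.

```r
# List the objects exported by an attached package.
# Assumes "stats" is on the search path (the default in a standard session).
stats.objects <- ls("package:stats")

# How many objects does the package export?
length(stats.objects)

# Check whether particular functions are among them
c("sd", "median") %in% stats.objects
```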
Examples, Demos, Vignettes
R packages are usually supplied with executable demonstrations (demo) and mini-tutorials (vignette).
• Most* functions have a quick set of runnable examples; these typically use the pre-packaged data to demonstrate the function. The example code can also be seen in the documentation page for the function.
• example(functionName) e.g. example(mean), example(hist), example(persp)
> example(mean)
mean> x <- c(0:10, 50)
mean> xm <- mean(x)
mean> c(xm, mean(x, trim = 0.10))
[1] 8.75 5.50
• Package demos (only available for some packages)
• Run a selection of examples in a package
• List all demos
• demo(package = .packages(all.available = TRUE)) # for all installed packages
• demo() # only shows demos in attached packages
• Run a demo from the list (the package must be loaded…)
• demo(demoName), where the demo often shares its package's name
> library("graphics")
> demo(graphics)
> library(lattice)
> demo(lattice)
* excluding internal functions that are not typically exposed to the user
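The output of example(mean) shown above can be checked by hand. This is a minimal sketch of what trim = 0.10 does for the same 12-value vector: the lowest and highest 10% of observations (one value at each end here) are dropped before averaging. The names xm.trimmed and by.hand are only illustrative.

```r
x <- c(0:10, 50)

# Plain mean: the outlier 50 pulls it upwards
xm <- mean(x)                       # 8.75

# Trimmed mean: drops floor(12 * 0.10) = 1 value from each end
xm.trimmed <- mean(x, trim = 0.10)  # 5.5

# The same result computed by hand from the sorted values
x.sorted <- sort(x)
by.hand <- mean(x.sorted[2:(length(x) - 1)])
c(xm, xm.trimmed, by.hand)
```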
Examples, Demos, Vignettes
• Package vignettes
• These are like tutorials on a particular package (usually a PDF plus runnable code). They are an excellent way to learn R.
• To see if the package has one, use:
• help(packageName)
• vignette() # lists the vignettes of all installed packages
• To open the vignette:
• vignette(topic, package = "packageName")
• e.g. vignette("ExpressionSetIntroduction", package="Biobase")
• You can copy the R code from the PDF and paste it into your R console to run the tutorial; alternatively, you can extract the R code:
• rcode <- vignette("ExpressionSetIntroduction", package="Biobase")
• edit(rcode) # opens a text file containing all the R code from the vignette
• Biobase (Bioconductor) also has an openVignette() function.
The iris dataset • You’ll later be using the iris dataset, which comes as standard with R. Let’s take a look at the documentation for this. • Show the documentation for iris: • ?iris • The iris dataset does not have a demo (it is just a dataset), but a number of demos and examples make use of it: • demo(graphics) • example(pairs)
Help by Topic • Sometimes you want to find out how to do something based on a particular workflow or area and would like to know what R provides in this domain: • CRAN Task Views • http://cran.ma.imperial.ac.uk/web/views/ • Bioconductor Workflows • http://www.bioconductor.org/help/workflows/
Reference Material • QuickR website • http://www.statmethods.net/ • Bioconductor • Home: • http://www.bioconductor.org/ • Course material: • http://www.bioconductor.org/help/course-materials/ • Community help resource: • http://www.bioconductor.org/help/community/ • R-project • http://www.r-project.org/
section 2 Loading R Packages
Loading Packages • Confusingly, library and package are not the same thing in R. • A library is a directory in which packages are installed: • .libPaths() # will show you where these are located • A package is a collection of functions, datasets, documentation etc. Packages can be downloaded into a local library and loaded into memory to provide access to their contents. • library() # lists the packages installed in your local libraries
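To make the library/package distinction concrete, the sketch below asks R where its libraries are and which packages each one contains. It uses only base functions (.libPaths() and installed.packages()); the variable names are only illustrative.

```r
# Directories (libraries) that R searches for installed packages
lib.dirs <- .libPaths()
lib.dirs

# installed.packages() returns a character matrix with one row per package;
# the "LibPath" column records the library each package lives in.
pkgs <- installed.packages()
head(pkgs[, c("Package", "LibPath")])

# Base packages such as "stats" are always present
"stats" %in% rownames(pkgs)
```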
Loading Packages • Once you have installed a package you have access to it, but you must first load it! • To load a package into memory: • search() # lists all attached packages • library() # lists all packages in each library • library("packageName") # loads the package packageName
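The effect of library() on the search path can be seen by comparing search() before and after the call. The sketch below uses the grid package only because it ships with every R installation; any installed package would do.

```r
# "grid" ships with every R installation, so it is safe to use for illustration
before <- search()

library("grid")   # load and attach the package

after <- search()
setdiff(after, before)   # entries added to the search path by library()
```

If grid was already attached in your session, the difference will be empty.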
Installing Packages • To access a package we first have to download it from an online repository* (or install it from a local file, not covered here). • install.packages("e1071") • Select your CRAN mirror and the package should download and install. * other online repositories can be used via the repos argument of install.packages() or options(repos = ...); .libPaths() only controls the local library directories
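Because install.packages() downloads from the network every time it is called, scripts often guard the call. The helper below is a hypothetical sketch (install.if.missing is not part of R); it relies on requireNamespace(), which returns FALSE when a package is not installed.

```r
# Hypothetical helper: install a package only when it is not already available.
install.if.missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  install.packages(pkg)        # needs network access and a CRAN mirror
  invisible(TRUE)
}

# "stats" is always installed, so this call never touches the network
did.install <- install.if.missing("stats")
did.install
```

Calling install.if.missing("e1071") would download the package from the slide only on the first run.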
Alternatively • Or, to install Bioconductor packages go to: • You can then install Bioconductor packages.
Data Packages • R packages may also provide datasets (these are usually small and demonstrative) • To list the datasets available: • data() • To load a dataset: • data("aDataSet") • e.g. • data("iris") • Here is a selection of functions to inspect the dataset: • summary(iris) • class(iris) # tells me this is a data.frame • pairs(iris[1:4], pch = 21, bg = c("red", "green3", "blue")[unclass(iris$Species)])
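A few more inspection calls, plus a look at how the pairs() call above chooses its colours. This is a sketch using only base functions on the built-in iris data; the names species.codes and point.colours are only illustrative.

```r
data("iris")      # load the dataset into the workspace

class(iris)       # "data.frame"
dim(iris)         # 150 rows, 5 columns
names(iris)       # column names
levels(iris$Species)

# unclass() turns the Species factor into integer codes 1, 2, 3,
# which the pairs() call uses to pick one colour per species
species.codes <- unclass(iris$Species)
point.colours <- c("red", "green3", "blue")[species.codes]
table(point.colours)
```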
Iris dataset • To find out more about this dataset go to: http://en.wikipedia.org/wiki/Iris_flower_data_set
[Figure: pairs plot of the iris dataset; species shown: Iris setosa, Iris versicolor, Iris virginica]
section 3 Data Types in R
Vectors • 1-dimensional data structure. • Vectors are the most basic structure in R. • Even single strings or numbers are vectors of length 1.
Vectors: Modes
• Character: char.eg <- c("one","two","three")
• Numeric: num.eg <- c(1,2,10,32,4)
• Logical: log.eg <- c(TRUE,TRUE,FALSE)
Vectors: Creation, Information
> eg1 <- "test"
> eg2 <- c(10,20,321)
> eg3 <- c(eg2,eg2)
> mode(eg1) #[1] "character"
> mode(eg2) #[1] "numeric"
> mode(eg3) #[1] "numeric"
> length(eg1) #[1] 1
> length(eg2) #[1] 3
> length(eg3) #[1] 6
Vectors: Creation, Information
The mode of an object is the basic type of data it can contain. Objects also have a class. For simple vectors, the class is usually the same as the mode (integer vectors such as 1:3 are an exception: their mode is "numeric" but their class is "integer"):
> class(eg1) #[1] "character"
> class(eg2) #[1] "numeric"
> class(eg3) #[1] "numeric"
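Mode and class usually agree for simple vectors, but not always. The sketch below contrasts a vector built with c() with one built with the ":" operator; int.eg is an illustrative name, and typeof() is used to show the underlying storage type.

```r
eg2 <- c(10, 20, 321)   # created with c(): stored as double
int.eg <- 1:3           # created with ":" : stored as integer

mode(eg2)      # "numeric"
class(eg2)     # "numeric"

mode(int.eg)   # "numeric"
class(int.eg)  # "integer"

# typeof() reports the internal storage type
typeof(eg2)    # "double"
typeof(int.eg) # "integer"
```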
Vectors: Subsetting
Vectors are numbered from one. Elements are accessed with the subset operator [ ]:
> a <- c("one","two","three")
> a[1]
[1] "one"
Vectors: Subsetting
> a <- c("one","two","three")
> a[3]
[1] "three"
> a[1:2]
[1] "one" "two"
> a[c(FALSE,TRUE,FALSE)]
[1] "two"
Vectors: Naming
> names(a) <- c("A","B","C")
> a["B"]
    B
"two"
> a[c("A","C")]
      A       C
  "one" "three"
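The three ways of subsetting a vector shown on the last few slides (by position, by logical mask, by name) can be compared side by side. A minimal sketch; the by.* variable names are only illustrative.

```r
a <- c("one", "two", "three")
names(a) <- c("A", "B", "C")

by.position <- a[c(1, 3)]                 # positions 1 and 3
by.logical  <- a[c(FALSE, TRUE, FALSE)]   # keep elements where TRUE
by.name     <- a[c("A", "C")]             # look up by element name

# Names travel with the selected elements; unname() drops them
unname(by.logical)

# Selecting by position or by name gives the same result here
identical(by.position, by.name)
```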
Factors • Factors are vectors that know they contain nominal or ordinal variables. • Many R functions know how to treat factors appropriately.
Factors: Creation
Create a nominal factor:
> a <- c("pink","pink","blue","blue")
> my.fac <- factor(a)
> my.fac
[1] pink pink blue blue
Levels: blue pink
Create an ordinal factor:
> a <- c("small","small","medium","large","large")
> my.fac <- ordered(a)
> my.fac
[1] small  small  medium large  large
Levels: large < medium < small
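In the ordinal example above the levels are sorted alphabetically (large < medium < small), which is probably not the intended order. A minimal sketch of one way to set the order explicitly, using the levels argument of ordered(); the variable name size.fac is only illustrative.

```r
a <- c("small", "small", "medium", "large", "large")

# Declare the level order explicitly instead of relying on alphabetical sorting
size.fac <- ordered(a, levels = c("small", "medium", "large"))
size.fac
levels(size.fac)

# Comparisons now follow the declared order
size.fac[1] < size.fac[3]   # is "small" less than "medium"?

# The underlying integer codes reflect the same order
as.integer(size.fac)
```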
Factors: Useful functions
> class(factor(c("one","one","two")))
[1] "factor"
> class(ordered(c("one","one","two")))
[1] "ordered" "factor"
> summary(my.fac)
 large medium  small
     2      1      2
> levels(my.fac)
[1] "large"  "medium" "small"
Matrices • 2-dimensional data structure. • All elements in a matrix must be the same mode. • Basically just a vector with a dimension attribute.
Matrices: Creation
Combine vectors by row:
> a <- c(1,2,3)
> rbind(a,a,a)
  [,1] [,2] [,3]
a    1    2    3
a    1    2    3
a    1    2    3
Combine vectors by col:
> cbind(a,a,a)
     a a a
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
Matrices: Creation
Use the matrix function to turn a vector into a matrix of specified dimensions:
> a <- c(1,2,3,4,5,6)
> matrix(a,nrow=2,ncol=3, byrow=TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
> matrix(a,nrow=2,ncol=3, byrow=FALSE)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Matrices: Creation
Directly set a dimension attribute on a vector:
> a <- c(1,2,3,4,5,6)
> dim(a) <- c(2,3)
> a
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> class(a) # only "matrix" before R 4.0.0
[1] "matrix" "array"
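The two creation routes (the matrix function and a dim attribute) produce the same object. The sketch below checks this and shows that removing the attribute gives back a plain vector; the names values and same are only illustrative.

```r
values <- c(1, 2, 3, 4, 5, 6)

# Route 1: the matrix() function, filling column by column
m <- matrix(values, nrow = 2, ncol = 3, byrow = FALSE)

# Route 2: attach a dim attribute to a copy of the vector
a <- values
dim(a) <- c(2, 3)

same <- identical(a, m)   # do both routes give the same object?
same
attributes(a)             # the only attribute is $dim

# Removing the attribute turns the matrix back into a plain vector
dim(a) <- NULL
is.matrix(a)
```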
Matrices: Useful Functions
> m <- matrix(a,nrow=2,ncol=3, byrow=FALSE)
> dim(m)
[1] 2 3
> class(m) # only "matrix" before R 4.0.0
[1] "matrix" "array"
> mode(m)
[1] "numeric"
> nrow(m)
[1] 2
> ncol(m)
[1] 3
Matrices: Naming
> colnames(m) <- c("A","B","C")
> rownames(m) <- c("One","Two")
> m
    A B C
One 1 3 5
Two 2 4 6
Matrices: Naming
> dimnames(a) <- list(x=c("One","Two"), y=c("A","B","C"))
> a
     y
x     A B C
  One 1 3 5
  Two 2 4 6
Matrices: Subsetting
> m[1,1] # first row, first col
[1] 1
> m[1:2,2] # rows 1 and 2 of the second col
> m["Two",] # row "Two", all cols
> m[,c("B","C")] # all rows, cols "B" & "C"
Will return a matrix, or a vector if you're only selecting a single row/col.
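The matrix-or-vector behaviour described above can be checked directly. The sketch below also uses the drop = FALSE argument of the subset operator, which is not covered on the slide, to keep a single row as a matrix; the variable names are only illustrative.

```r
m <- matrix(1:6, nrow = 2, ncol = 3,
            dimnames = list(c("One", "Two"), c("A", "B", "C")))

two.cols <- m[, c("B", "C")]   # several columns: still a matrix
one.row  <- m["Two", ]         # a single row: simplified to a named vector

is.matrix(two.cols)
is.matrix(one.row)

# drop = FALSE keeps the result as a 1 x 3 matrix
one.row.matrix <- m["Two", , drop = FALSE]
dim(one.row.matrix)
```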
Arrays • Extends the concept of a matrix to more than 2 dimensions. • All elements must be the same class. • A vector with a dimension attribute of more than 2 dimensions. • You probably won't ever need one.
Arrays: Creation
Use the array function:
> a <- 1:24
> b <- array(a,c(3,4,2))
> b
, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

, , 2

     [,1] [,2] [,3] [,4]
[1,]   13   16   19   22
[2,]   14   17   20   23
[3,]   15   18   21   24
Arrays: Creation
Add a dim attribute to a vector:
> a <- 1:24
> dim(a) <- c(3,4,2)
> class(a)
[1] "array"
> mode(a)
[1] "numeric"
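Array elements are addressed with one index per dimension, in the same way as matrices. A short sketch using the 3 x 4 x 2 array from the slides; the indexing examples themselves are additions for illustration.

```r
b <- array(1:24, dim = c(3, 4, 2))

dim(b)            # 3 rows, 4 columns, 2 slices

b[2, 3, 1]        # row 2, column 3 of the first slice
second.slice <- b[, , 2]   # the whole second slice
is.matrix(second.slice)    # a single slice is simplified to a 3 x 4 matrix
```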
Lists • An ordered collection of R objects. • Allows you to group related objects under a single name. • No constraints on the components of the list: they don't have to have the same mode, class or dimensions.
Lists: Creation
> names <- c("bob smith", "sarah jones")
> colours <- c("red","green","blue")
> nums <- matrix(1:25,c(5,5))
> my.list <- list(names, colours, nums)
> my.list
[[1]]
[1] "bob smith"   "sarah jones"

[[2]]
[1] "red"   "green" "blue"

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
Lists: Naming
> my.list <- list(people=names, fave.cols=colours)
> my.list
$people
[1] "bob smith"   "sarah jones"

$fave.cols
[1] "red"   "green" "blue"
Lists: Naming
> my.list <- list(names, colours)
> names(my.list) <- c("people","fave.cols")
> my.list
$people
[1] "bob smith"   "sarah jones"

$fave.cols
[1] "red"   "green" "blue"
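Once a list has names, its components can be pulled out by name. The sketch below is an addition for illustration, reusing the example data from the slides; it contrasts $ and [[ ]], which return the component itself, with single brackets, which return a shorter list.

```r
my.list <- list(people    = c("bob smith", "sarah jones"),
                fave.cols = c("red", "green", "blue"))

names(my.list)

# $name and [["name"]] both extract the component itself
my.list$people
second.colour <- my.list[["fave.cols"]][2]
second.colour

# Single brackets return a (shorter) list instead
sub.list <- my.list["people"]
class(sub.list)
```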