Introduction to R
Why use R • It's free! • It is powerful and fairly widely used, with lots of online posts about it • It implements S, an object-oriented programming language that lets you create objects with ‘properties’, ‘methods’, and ‘classes’ • Community-based development and debugging
Documentation and Installation • Official R website: • http://r-project.org/ • FAQs are very useful • Has mailing lists focusing on various aspects of R • Help, development, packages • Installation: http://cran.r-project.org • Choose the precompiled binary for your operating system • PC: best to install it under c:\R\ rather than c:\Program Files\R\ • RStudio makes it easier to interact with R: • http://rstudio.org/
Let the fun begin! • Double click on the R icon • Opens up a ‘command window’ • The > sign is the prompt, which indicates you are ready to enter commands • Now we must learn how to ‘talk’ to R so that it understands us
Objects • R operates on entities technically known as objects, which belong to specific classes with their own methods for storing information, for example: • vectors (class “numeric”, for example): a vector contains a succession of elements of the same nature: numeric (real numbers), complex, logical (TRUE, FALSE) or character • matrices (class “matrix”): matrices are successions of column vectors that all contain elements of the same nature • dataframes (class “data.frame”): dataframes are also successions of column vectors, but unlike matrices their columns may be of different natures. A character vector (for example gender) can thus sit next to a numeric vector (for example age) • lists (class “list”): lists are ordered sequences of objects which can be of any mode: the first object of the list may be a vector, the second a matrix and the third another list.
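As a minimal sketch of the four kinds of object described above (the names ages, gender, m, people and stuff are made up for illustration and are not part of the original slides):

```r
# Vectors: every element has the same nature
ages   <- c(23, 35, 41)          # numeric vector
gender <- c("F", "M", "F")       # character vector

# Matrix: column vectors that all share one nature
m <- cbind(ages, ages * 2)

# Dataframe: columns may be of different natures
people <- data.frame(gender = gender, age = ages)

# List: elements can be anything, including another list
stuff <- list(ages, m, list("nested", TRUE))

class(ages)            # class of the numeric vector
class(gender)          # class of the character vector
inherits(m, "matrix")  # TRUE if m is a matrix
class(people)          # class of the dataframe
class(stuff)           # class of the list
```

inherits() is used for the matrix because, depending on the R version, class(m) may report more than one class name.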
Functions • Functions are also objects; everything is an object! • Format: • functionname(argument1, argument2, ...) • Arguments can be of any class
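A short sketch of the calling format, and of a function being passed around like any other object. The helper half is hypothetical, defined here only for illustration:

```r
x <- c(4, 9, 16)

class(sqrt)                          # a function has a class, like any object
roots <- sqrt(x)                     # functionname(argument1)
rounded <- round(3.14159, digits = 2)  # arguments can also be given by name

half <- function(v) v / 2            # hypothetical helper defined by the user
halves <- sapply(x, half)            # a function used as an argument
```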
Useful functions • Basic functions: • >c(x,y,z) # create a vector from the elements x, y and z given in parentheses. • >cbind(x,y,z) # create a matrix whose columns are the vectors x, y and z. • >length(x) # give the length of a vector or a list x. • >sum(x) # calculate the sum of the elements of the vector or matrix x. • >colMeans(x) # calculate the mean of each column of the matrix or dataframe x (use mean(x) for a vector). • >max(x);min(x) # give the largest or smallest value of the vector or matrix x. • >summary(x) # provide generic summary information for an object. • Functions for finding out info about objects: • >ls() # list the names of all the objects available in the R environment. • >dimnames(x) # give the row and column names of a matrix or a dataframe x. • >dimnames(x)[[1]] # give the row names of a matrix or a dataframe x. • >dimnames(x)[[2]] # give the column names of a matrix or a dataframe x. • >class(x) # give the class of an object. • Misc useful functions: • >help(x) # open the R help page for the function (or other object) x. • >barplot(x) # display a barplot of the elements of the vector x. • >as.matrix(x) # change the class of a dataframe x to class “matrix”. Often useful because the operations available for objects of class “data.frame” and “matrix” differ. • >as.data.frame(x) # change the class of a matrix x to class “data.frame”. • >q() # quit R. • >rm(x) # delete the object x from the R environment. • >x11() # open an empty graphical window.
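To try several of these functions together, here is a small sketch on made-up vectors (the values are chosen only for illustration):

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6)
z <- c(7, 8, 9)

m <- cbind(x, y, z)        # 3 x 3 matrix with columns x, y, z

n <- length(x)             # number of elements in x
total <- sum(m)            # sum of every element of m
col.means <- colMeans(m)   # one mean per column
largest <- max(m)
smallest <- min(m)

col.names <- dimnames(m)[[2]]   # column names taken from the vector names

df <- as.data.frame(m)     # matrix -> dataframe
back <- as.matrix(df)      # dataframe -> matrix
class(df)
summary(df)                # generic summary, one block per column
```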
Mathematical operations • + addition • / division • - subtraction • ^ power • * multiplication • sqrt() square root
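A quick sketch of these operators at work. The numbers are arbitrary; note that the operators apply element by element when given a vector:

```r
a <- 7
b <- 2

a + b        # addition
a - b        # subtraction
a * b        # multiplication
a / b        # division
a ^ b        # power
sqrt(49)     # square root

v <- c(1, 4, 9)
doubled <- v * 2           # each element is multiplied by 2
roots <- sqrt(v)           # each element gets its square root

# ^ is evaluated before *, which is evaluated before +
combined <- 2 + 3 * 4 ^ 2
```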
Assignment • Assignment gives a name to an object of any class, so that the object can be reused later. The function that does this is assign(), but its shortcuts “<-” and “=” are usually preferred.
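The three forms can be compared side by side. This is only a sketch; the object names a, b and d are arbitrary:

```r
assign("a", 10)   # the function form: the name is given as a character string
b <- 10           # the usual shortcut
d = 10            # also works at the prompt

same.ab <- identical(a, b)   # the objects created are the same
same.bd <- identical(b, d)

# Any class of object can be assigned, not just numbers
v <- c(1, 2, 3)
f <- sqrt                    # even a function
f.of.v <- f(v)
```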
Exercise 1 • Imagine you wanted to evaluate the function f(x,y) = sqrt((3x^2+2y)/((x+y)(x-y))) for the following pairs: • (3,2), (6,4), (9,6), (12,8), (15,10) • Very time consuming by hand, but we can use our newfound friend, R! • >x<-c(3,6,9,12,15) # create the vector x. • >y<-c(2,4,6,8,10) # create the vector y. • >f.xy<-sqrt((3*x^2+2*y)/((x+y)*(x-y))) # evaluate the function for every pair at once. • To analyse the results, you could also create a matrix containing the results vector f.xy together with the vectors x and y: • >Results<-cbind(x,y,f.xy)
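The exercise can be run as a single script. The sketch below repeats the slide's commands and adds one hand check for the first pair, (3,2), where the expression under the square root is (27+4)/(5*1) = 6.2:

```r
x <- c(3, 6, 9, 12, 15)    # first coordinate of each pair
y <- c(2, 4, 6, 8, 10)     # second coordinate of each pair

# The arithmetic is applied element by element, so all five pairs
# are evaluated in one command
f.xy <- sqrt((3 * x^2 + 2 * y) / ((x + y) * (x - y)))

# Hand check for the first pair
check.first <- sqrt(6.2)

# One row per pair, with columns x, y and f.xy
Results <- cbind(x, y, f.xy)
Results
```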
Accessing elements of an object • Use the coordinates of the desired information in the object via square brackets “[ ]”. • if x is a vector: name.vector[coordinate] • if x is a matrix: name.matrix[row.coordinate,column.coordinate] • if x is a data.frame: name.dataframe[row.coordinate,column.coordinate] • if x is a list: name.list[[coordinate]] • Vectors: • >a[c(2,3)] or a[2:3] # the second and the third elements of the vector a. • >f.xy[3] # the third element of the vector f.xy. • >f.xy[ ] # the entire vector f.xy.
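A small sketch of vector and list access. The vector a and the list lst are made up for illustration; the negative index on the fifth line is an extra that drops an element instead of selecting it:

```r
a <- c(10, 20, 30, 40)

second.third <- a[c(2, 3)]   # same result as a[2:3]
third <- a[3]                # a single element
everything <- a[]            # empty brackets return the whole vector
without.first <- a[-1]       # a negative coordinate drops that element

lst <- list(first = a, second = "text")
elem <- lst[[2]]             # double brackets: the element itself
sub <- lst[2]                # single brackets: a list of length one
```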
Accessing lists and matrices • Lists: • >A<-dimnames(Results) # A contains a list of the row and column names of the matrix Results. • >A[[2]] # the second element of A contains the column names of Results. • >A[[2]][3] # the name of the third column of Results. • Matrices: • >M1[1,3] # first row, third column • >M1[1:2,2:3] # first two rows of the 2nd and 3rd columns • >Results[3,1] # third element of the first column of Results. • >Results[ ,1] # entire first column of Results. • >Results[2, ] # entire second row of Results. • >Results[-2,-2] # Results without its second row and its second column. • >Results[c(1,2),"f.xy"] # first and second elements of the column called "f.xy"; column names can be used as coordinates.
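The commands above can be tried on concrete objects. In this sketch Results is rebuilt as in Exercise 1, and M1 is a hypothetical 3 x 3 matrix, since the slides do not define it:

```r
x <- c(3, 6, 9, 12, 15)
y <- c(2, 4, 6, 8, 10)
f.xy <- sqrt((3 * x^2 + 2 * y) / ((x + y) * (x - y)))
Results <- cbind(x, y, f.xy)

A <- dimnames(Results)     # list: row names first, column names second
A[[1]]                     # NULL here, because cbind() set no row names
third.name <- A[[2]][3]    # name of the third column

M1 <- matrix(1:9, nrow = 3)   # hypothetical matrix, filled column by column
corner <- M1[1, 3]            # first row, third column
block <- M1[1:2, 2:3]         # rows 1-2 of columns 2-3

first.col <- Results[, 1]             # entire first column
second.row <- Results[2, ]            # entire second row
trimmed <- Results[-2, -2]            # drop row 2 and column 2
by.name <- Results[c(1, 2), "f.xy"]   # select a column by its name
```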
Exercise 2 • Read in dataset: • hs0 <- read.table("http://www.ats.ucla.edu/stat/R/notes/hs0.csv", header=T, sep=",") • List the first 20 rows for all columns • What type of object is hs0? How can you verify this? • Create a new object called scores with just the first 20 observations of the read, write, math and science scores • Display the dimensions of the new object • Display the length of just the math scores • Get basic statistics for scores • Find the mean of the science scores • A plain mean() doesn’t work: it returns NA because some scores are missing. You need to exclude the NAs (na.rm=T) • Go back to the full dataset and find the mean of the write and math scores for each program type
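One possible way to approach these steps, sketched on a small made-up dataframe so that it runs without downloading anything. The object hs.toy, its values, and the column name prgtype are assumptions for illustration only; the real hs0 file may name its columns differently, and the sketch takes the first 3 rows where the exercise asks for 20:

```r
# Toy stand-in for hs0 (invented values, one missing science score)
hs.toy <- data.frame(
  prgtype = c("academic", "general", "academic", "vocational", "general", "academic"),
  read    = c(57, 68, 44, 63, 47, 44),
  write   = c(52, 59, 33, 44, 52, 52),
  math    = c(41, 53, 54, 47, 57, 51),
  science = c(47, 63, NA, 53, 50, 58)
)

head(hs.toy, 3)                 # first rows, all columns
class(hs.toy)                   # verify the type of object

scores <- hs.toy[1:3, c("read", "write", "math", "science")]
dim(scores)                     # rows and columns of the new object
length(scores$math)             # length of just the math scores
summary(scores)                 # basic statistics, NAs are counted

sci.raw <- mean(hs.toy$science)                  # NA because of the missing value
sci.mean <- mean(hs.toy$science, na.rm = TRUE)   # missing value excluded

# Mean of write and math for each program type
by.prog <- aggregate(cbind(write, math) ~ prgtype, data = hs.toy, FUN = mean)
by.prog
```

aggregate() with a formula is one choice among several; tapply() applied to one score at a time would give the same group means.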