Introduction to S-Plus
by Francesco Ferretti
Analysis of Biological Data Course, Winter term 2007, Dalhousie University
Introduction
• S-Plus and R are statistical programs that use the S language.
• S was developed at AT&T's Bell Labs in the 1970s by Rick Becker, John Chambers and Allan Wilks.
• In 1987 Douglas Martin, at the University of Washington, founded the company that is now Insightful Corporation. It made S more popular, made it compatible with many hardware platforms, and provided the necessary support for technical and statistical problems. S became S-Plus.
• R was created in the early 1990s by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand; the R Core Team formed in 1997. R is similar to S-Plus and freely available.
S-Plus and R
• Flexible and powerful statistical programs
• Particularly appealing for their graphical capabilities
• Can be problematic with large amounts of data; SAS is more powerful in these cases
GUI (Graphical User Interface)
• Main toolbar and several windows
• Object Explorer: overview of what is available on the system
  • Computational engine: data frames, lists, matrices, vectors
  • Interface objects: search path, menu items, toolbars, dialogs
  • Document objects (outputs): graph sheets, scripts and reports
• The Object Explorer displays all the objects in your working directory
GUI (Graphical User Interface)
• Import data
  • File>Import Data>From File
• Export data
  • File>Export Data>To File
  • Choose among the data frames present in your working directory, then give the location and extension
• Creating graphs
  • Highlight a dataset in the Object Explorer
  • Select variables (Ctrl-select)
  • Click on 2D Plots
  • Choose the preferred graph type
• Save graphs
  • Default format is *.sgr (S-Plus graph sheet)
  • Alternatively, choose another picture format with File>Export Graph..., then specify location, name and extension and click OK
GUI (Graphical User Interface)
• Summary statistics
  • In the Object Explorer, select a data frame
  • On the main toolbar, select Statistics>Summary Statistics
  • Select the data, variables and statistics to be shown, then click OK
Programming mode
• Full potential and flexibility of S-Plus. Highly recommended!
• While the GUI can perform many of the S-Plus commands and functions, programming mode lets you solve virtually any problem you will encounter in data manipulation, analysis and plotting.
• Command window
  • Can be used step by step, interactively
• Writing functions
  • Using a text editor (Notepad, Emacs, EditPlus, etc.) or directly on the command line
Command line (the basics)
• S-Plus is case sensitive
• # starts a comment
• ? calls help, e.g. ?mean
• q() quits S-Plus
• <- is the assignment sign. It associates a value or a function with a variable name
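A minimal sketch of these basics, written to run in R (which these slides describe as similar to S-Plus; it has not been checked in S-Plus itself). The object names are illustrative only.

```r
# Everything after a # sign is a comment and is ignored
x <- 10      # assign the value 10 to the name x
X <- 20      # a different object: names are case sensitive
x == X       # FALSE, because x and X hold different values

# ?mean      # would open the help page for mean()
# q()        # would end the session
```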
Use of S-Plus in programming mode
• Calculator
  • *, /, +, -, log, exp, sqrt, ^, sin, cos
  • Follows the usual arithmetic rules: * and / before + and -, and () before * and /
• Manipulating data
• Fitting models to data
• Plotting graphs
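The precedence rules can be checked directly at the prompt. The lines below are a small illustration, assuming R as the interpreter; the variable names are made up for the example.

```r
a <- 2 + 3 * 4      # multiplication before addition: 14
b <- (2 + 3) * 4    # parentheses first: 20
p <- 2^3            # power: 8
r <- sqrt(16)       # square root: 4
l <- log(exp(1))    # log() is the natural logarithm: 1
s <- sin(0)         # trigonometric functions take radians: 0
```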
Logical Values
• Boolean values: TRUE, FALSE (abbreviated T, F)
• < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to), == (equal to), != (not equal to)
• Conditional expressions and operators
  • if, else, ifelse()
  • & (and), | (or)
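As an illustration, the comparisons and conditionals listed above can be applied to a short vector. This is a sketch assuming R; the vector and the labels "large" and "small" are invented for the example.

```r
x <- c(1, 5, 7, 8)

gt4  <- x > 4               # FALSE TRUE TRUE TRUE
both <- x >= 5 & x != 7     # FALSE TRUE FALSE TRUE
any1 <- x < 2 | x == 8      # TRUE FALSE FALSE TRUE

# ifelse() works element by element on a whole vector
size <- ifelse(x > 4, "large", "small")

# if / else tests a single condition
if (length(x) == 4) {
  msg <- "four values"
} else {
  msg <- "not four values"
}
```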
Brackets
• () enclose the arguments of functions and group arithmetic calculations
• [] index objects
  • after x<-c(1,5,7,8), x[3] returns 7
• {} enclose groups of commands
  • Function bodies
  • if/else statements
  • Loops
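The three kinds of brackets can be seen side by side in a short sketch (assuming R; the loop and the names total and value are only illustrative).

```r
x <- c(1, 5, 7, 8)     # () enclose the arguments of c()

third <- x[3]          # [] pick out an element: 7
ends  <- x[c(1, 4)]    # several positions at once: 1 8
large <- x[x > 4]      # a logical condition as index: 5 7 8

total <- 0
for (value in x) {     # {} group the commands of the loop body
  total <- total + value
}
# total is now 21
```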
S-Plus common objects
• Vector
  • Ordered group of numbers or strings
  • x<-c(45,29,27)
  • z<-c(180,180,165)
  • y<-c("Hall","Francesco","Sara")
• Matrix
  • "Rectangular layout of cells, each one containing a value"
  • AH<-matrix(c(45,29,27,180,180,165),nrow=3)
  • AH<-matrix(c(x,z),nrow=3)
• Array
  • Multidimensional matrix
• Data frame
  • AHP<-data.frame(x,z,y)
• List
  • Groups together data that do not share the same structure. Outputs and summaries are returned as lists, and you can access or use parts of them.
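The objects above can be built and inspected as follows. This is a sketch assuming R; the list component names (numbers, labels, table) and the array contents are hypothetical, chosen only for illustration.

```r
x <- c(45, 29, 27)
z <- c(180, 180, 165)
y <- c("Hall", "Francesco", "Sara")

# Matrix: x fills the first column, z the second
AH <- matrix(c(x, z), nrow = 3)
AH[2, 1]                 # row 2, column 1: 29

# Array: a matrix with more than two dimensions
arr <- array(1:24, dim = c(2, 3, 4))

# Data frame: columns may hold different types
AHP <- data.frame(x, z, y)
AHP$z                    # one column, accessed by name
young <- AHP[AHP$x < 40, ]   # the rows where x is below 40

# List: components of different structure, accessed with $
info <- list(numbers = x, labels = y, table = AH)
info$labels[2]           # "Francesco"
```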
Functions
• Set of commands performed on specified variables
  • y<-mean(x), or, for a vector of four values, y<-(x[1]+x[2]+x[3]+x[4])/4, or y<-sum(x)/4, or y<-sum(x)/length(x)
• You can build your own functions
  • On the command line: SD<-function(x){sqrt(var(x))}
  • The function is saved in your working directory and can then be called as SD(x)
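The equivalence of the different ways of computing a mean, and the user-defined SD function from the slide, can be checked with a few lines. This is a sketch assuming R and a four-element example vector.

```r
x <- c(1, 5, 7, 8)

m1 <- mean(x)                          # built-in function
m2 <- (x[1] + x[2] + x[3] + x[4]) / 4  # written out by hand
m3 <- sum(x) / length(x)               # works for any length

# A user-defined function: standard deviation as the
# square root of the variance
SD <- function(x) {
  sqrt(var(x))
}

s <- SD(x)
```

All three means equal 5.25. Dividing by length(x) rather than a fixed 4 keeps the expression correct when the vector changes size.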
Functions
• Anatomy of a function: the function name, its arguments in (), and the body of the function (a set of commands) in {}
• Create a file with an .s extension (file.s, a sort of library where you can store one or more functions)
• Open an editor
• Write the function:
    # this function creates the dataset "buddy" and
    # plots its variables one against the other
    buddy<-function(){
      x<-c(2,3,5,6,8,10)
      y<-c(4,6,10,12,16,20)
      buddy<-data.frame(x,y)
      plot(buddy$x,buddy$y,xlab="x",ylab="y",type="l")
      print(buddy)
    }
• Save the file as an s file: c:\buddy.s
• Load the file with source("c:\\buddy.s")
• Access the function by calling it as buddy()
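The save-then-source workflow can be tried without touching c:\ by writing to a temporary file. This is a sketch assuming R: tempfile(fileext = ) is an R feature (in S-Plus you would supply a path by hand), and the function spread is a hypothetical example, not part of the course material.

```r
# Write a small function definition to a temporary .s file
path <- tempfile(fileext = ".s")
writeLines(c(
  "# returns the width of the range of a numeric vector",
  "spread <- function(x) {",
  "  max(x) - min(x)",
  "}"
), path)

# source() reads the file and defines the function
source(path)

# The function can now be called like any other
width <- spread(c(2, 3, 5, 6, 8, 10))   # 10 - 2 = 8
```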
Use of S-Plus in programming mode (Manipulation of data)
• Datasets are never ready for analysis as they come
• Importing datasets: read.table()
• Subsetting objects
• Creating new variables
  • seq(), rep(), sort(), unique(), length()
• Merging and binding datasets
  • merge(), cbind(), rbind()
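A short walk through these steps, as a sketch assuming R. The data frames fish and regions and all their values are invented for the example, and the file is a temporary one written only so that read.table() has something to import.

```r
# Hypothetical data, made up for this example
fish    <- data.frame(id = c(1, 2, 3, 4), len = c(120, 300, 95, 210))
regions <- data.frame(id = c(1, 2, 3, 4), region = c("N", "S", "S", "N"))

# Importing: write a text file, then read it back with read.table()
path <- tempfile()
write.table(fish, path, row.names = FALSE)
fish2 <- read.table(path, header = TRUE)

# Subsetting: keep rows that satisfy a condition
big <- fish[fish$len > 100, ]

# Creating a new variable from an existing one
fish$len.m <- fish$len / 100

# Helper functions
s <- seq(1, 10, by = 3)            # 1 4 7 10
r <- rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"
o <- sort(c(3, 1, 2))              # 1 2 3
u <- unique(c(1, 1, 2))            # 1 2
n <- length(fish$id)               # 4

# Merging on a shared key, and binding columns or rows
both <- merge(fish, regions, by = "id")
wide <- cbind(fish, flag = fish$len > 200)
long <- rbind(fish, fish[1, ])
```

merge() matches rows by the values of id, while cbind() and rbind() simply glue objects together side by side or one under the other, so they require matching row counts or matching columns respectively.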
Graphical analysis
• Plotting to the active device: an S-Plus window or a file
  • pdf.graph(file="",horizontal="")
  • postscript(file="",horizontal="")
  • graphsheet(file="",format="")
• Important functions: par(), plot(), hist(), boxplot(), pairs()
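Plotting to a file follows the pattern: open a device, draw, close the device. The sketch below assumes R, where pdf() stands in for the S-Plus pdf.graph() device named on the slide; the x and y values are borrowed from the buddy example, and the output goes to a temporary file.

```r
x <- c(2, 3, 5, 6, 8, 10)
y <- c(4, 6, 10, 12, 16, 20)

out <- tempfile(fileext = ".pdf")
pdf(file = out)              # open a file device (R analogue of pdf.graph)

par(mfrow = c(2, 2))         # par() sets layout: 2 x 2 panels per page
plot(x, y, xlab = "x", ylab = "y", type = "l")   # line plot
plot(x, y, xlab = "x", ylab = "y")               # scatter plot
hist(y)                      # histogram
boxplot(x, y)                # side-by-side boxplots

pairs(data.frame(x, y))      # scatterplot matrix, drawn on a new page

dev.off()                    # close the device so the file is written
```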
Fitting a model to data
• Take the SharkLife data
• Summary of the data: summary()
• EDA (Exploratory Data Analysis): pairs(), hist(), boxplot(), plot()
• Fitting a linear regression model between Lmax and birth.size: model1<-lm()
• Checking the model (using statistics and plots): summary(model1), plot(model1)
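The workflow can be sketched as follows, assuming R. The SharkLife dataset is not reproduced in these slides, so the data frame below contains made-up numbers that only borrow the column names Lmax and birth.size; treating birth.size as the response is also an assumption.

```r
# Made-up numbers, NOT the SharkLife data
shark <- data.frame(
  Lmax       = c(100, 150, 200, 250, 300, 350, 400, 450),
  birth.size = c(30, 38, 52, 60, 71, 80, 95, 101)
)

summary(shark)                       # summary of the data

# Fit a straight line: birth.size as a function of Lmax
model1 <- lm(birth.size ~ Lmax, data = shark)

summary(model1)                      # coefficients, R-squared, tests
est <- coef(model1)                  # intercept and slope

# Diagnostic plots, written to a temporary file
pdf(tempfile(fileext = ".pdf"))
pairs(shark)                         # exploratory scatterplot matrix
par(mfrow = c(2, 2))
plot(model1)                         # residual and leverage plots
dev.off()
```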
Programming mode
• Script window
  • A mode where you can write programs, run them and keep track of your operations for future work
  • File>New>Script File
Useful Reference Books
• The Basics of S-Plus by Krause A. and Olson M.
• Statistical Computing: An Introduction to Data Analysis using S-Plus by Crawley M.J.
• Modern Applied Statistics with S-Plus by Venables W.N. and Ripley B.D.
• ...and much more on the Internet